Python - Can't get a string to encode properly
I'm trying to grab some data; here is the code:
    import requests
    from bs4 import BeautifulSoup

    url = 'http://www.privredni-imenik.com/firma/68225-a_expo'
    r = requests.get(url)
    soup = BeautifulSoup(r.content, "html.parser")
    g_data = soup.find_all("div", {"class": "podaci"})
    for i in g_data:
        some = i.text.encode('utf-8', 'ignore')
        print(some)

It works, but the result looks like this:
b'a & l expo preduze\xc4\x86e za proizvodnju
where \xc4\x86 should be represented as the letter Ć.
How can I make this work?
You already have a string, so just print the text:
    In [18]: g_data = soup.find_all("div", {"class":"podaci"})

    In [19]: for i in g_data:
       ....:     some = i.text
       ....:     print(some)
       ....:
    a & l expo preduzeĆe za proizvodnju, trgovinu usluge doo 11070 beograd vladimira popovtelefaksmatični broj: 17461460 informacije o delatnostima koje obavlja ova firma: » organizovanje sastanaka sajmova

    In [20]: print(type(some))
    <class 'str'>

    In [21]: print(type(some.encode('utf-8', 'ignore')))
    <class 'bytes'>
You are encoding the string to bytes with
i.text.encode('utf-8', 'ignore')
There is no need for that at all; just print the text.
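To see why the b'...' output appears, here is a minimal sketch that needs no network access. The sample word is a hypothetical stand-in for i.text, not the actual page content. It shows that \xc4\x86 is simply the UTF-8 byte pair for Ć, and that decoding gives the letter back:

```python
# Hypothetical sample text standing in for i.text from the scraped page.
text = "PREDUZEĆE"

# str -> bytes: .encode() returns a bytes object. Printing it shows the
# b'...' repr, with every non-ASCII byte written as a \x escape.
encoded = text.encode("utf-8")
print(type(encoded), encoded)

# bytes -> str: decoding with the same codec recovers the original text,
# so the two escaped bytes turn back into the single letter.
decoded = encoded.decode("utf-8")
print(type(decoded), decoded)
```

So if you only have bytes, call .decode('utf-8') before printing; if you already have a str, as with i.text, print it directly.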