Python - Get text content from p tag
I am trying to get the description text content of each block on this page:
https://twitter.com/search?q=data%20mining&src=typd&vertical=default&f=users
The HTML p tag looks like this:
<p class="profilecard-bio u-dir" dir="ltr" data-aria-label-part=""><a href="http://t.co/kwtdyfn6dc" rel="nofollow" dir="ltr" data-expanded-url="http://dataminingblog.com" class="twitter-timeline-link" target="_blank" title="http://dataminingblog.com"><span class="invisible">http://</span><span class="js-display-url">dataminingblog.com</span><span class="tco-ellipsis"><span class="invisible"> </span></span></a> covers current challenges, interviews leading actors , book reviews related data mining, analytics , data science.</p>
My code:
    productdivs = soup.findAll('div', attrs={'class' : 'profilecard-content'})
    for div in productdivs:
        print div.find('p', attrs={'class' : 'profilecard-bio u-dir'}).text
Is anything wrong here? I am getting this exception:
    Traceback (most recent call last):
      File "twitter_user_scrapper.py", line 91, in getimagelist
        print div.find('p', attrs={'class' : 'profilecard-bio u-dir'}).text
    AttributeError: 'NoneType' object has no attribute 'text'
The issue might be that some div with class profilecard-content does not have a child p element with class profilecard-bio u-dir. When that happens, the following returns None:
    div.find('p', attrs={'class' : ['profilecard-bio', 'u-dir']})
and that is the reason you are getting the AttributeError. You should take the return value of the call above, save it in a variable, check whether it is None, and read its text only if it is not None.
Also, you should give the class as a list of classes, not as a single string:
attrs={'class' : ['profilecard-bio', 'u-dir']}
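One thing worth knowing about the list form: as far as I know, in BeautifulSoup 4 a list of classes matches a tag that has any one of the listed classes, not necessarily all of them. If you need both classes to be present, a CSS selector is one option. The sketch below assumes Python 3 with the bs4 package installed, and the sample markup is made up for illustration:

```python
from bs4 import BeautifulSoup

# Made-up sample markup, only for illustrating the matching rules.
html = """
<p class="profilecard-bio u-dir">both classes</p>
<p class="u-dir">only u-dir</p>
<p class="other">neither</p>
"""
soup = BeautifulSoup(html, "html.parser")

# A list of classes matches a tag that has ANY of the listed classes.
any_match = soup.find_all("p", attrs={"class": ["profilecard-bio", "u-dir"]})

# A CSS selector with chained classes matches only tags that have ALL of them.
all_match = soup.select("p.profilecard-bio.u-dir")

print([p.text for p in any_match])  # ['both classes', 'only u-dir']
print([p.text for p in all_match])  # ['both classes']
```

For the Twitter markup in the question either form should find the bio paragraph, so this only matters if other p tags on the page share one of the two classes.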
Example:
    productdivs = soup.findAll('div', attrs={'class' : 'profilecard-content'})
    for div in productdivs:
        elem = div.find('p', attrs={'class' : ['profilecard-bio', 'u-dir']})
        if elem:
            print elem.text
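If you want to try the None check without hitting Twitter, here is a self-contained sketch of the same idea. It assumes Python 3 with the bs4 package installed. The sample HTML and the extract_bios helper are hypothetical and not part of the original script:

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup: the second card has no bio paragraph.
html = """
<div class="profilecard-content">
  <p class="profilecard-bio u-dir">Covers data mining and analytics.</p>
</div>
<div class="profilecard-content"></div>
"""

def extract_bios(markup):
    """Return the bio text of each card, skipping cards without a bio."""
    soup = BeautifulSoup(markup, "html.parser")
    bios = []
    for div in soup.find_all("div", attrs={"class": "profilecard-content"}):
        elem = div.find("p", attrs={"class": ["profilecard-bio", "u-dir"]})
        if elem is not None:  # find() returns None when nothing matches
            bios.append(elem.get_text(strip=True))
    return bios

print(extract_bios(html))  # ['Covers data mining and analytics.']
```

The card without a bio is skipped instead of raising AttributeError. If you would rather keep a placeholder for such cards, append an empty string in an else branch.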