How do I extract data from multiple related web pages in Android using Jsoup?
Well, I have been working on an app to display the news headings and contents from the site http://www.myagdikali.com
I am able to extract the data from 'myagdikali.com/category/news/national-news/', but there are only 10 posts on that page, and there are links to other pages (1, 2, 3...) such as myagdikali.com/category/news/national-news/page/2.
All I need to know is: how do I extract the news from every possible page under /national-news? Is it even possible using Jsoup?
Till now, my code to extract data from a single page is:
    public View onCreateView(LayoutInflater inflater, ViewGroup container, Bundle savedInstanceState) {
        View rootView = inflater.inflate(R.layout.fragment_all, container, false);
        int i = getArguments().getInt(NEWS);
        String topics = getResources().getStringArray(R.array.topics)[i];
        switch (i) {
            case 0:
                url = "http://myagdikali.com/category/news/national-news";
                new NewsExtractor().execute();
                break;
            .....

    [EDIT]

    private class NewsExtractor extends AsyncTask<Void, Void, Void> {
        String title;

        @Override
        protected Void doInBackground(Void... params) {
            while (status == OK) {
                currentURL = url + String.valueOf(page);
                try {
                    response = Jsoup.connect(currentURL).execute();
                    status = response.statusCode();
                    if (status == OK) {
                        Document doc = response.parse();
                        Elements urlLists = doc.select("a[rel=bookmark]");
                        for (org.jsoup.nodes.Element urlList : urlLists) {
                            String src = urlList.text();
                            myLinks.add(src);
                        }
                        title = doc.title();
                    }
                } catch (IOException e) {
                    e.printStackTrace();
                }
                page++;
            }
            return null;
        }
EDIT: When I extract data from a single page without the loop, it works. After adding the while loop, I get an error stating "No adapter attached".
Actually, I am loading the extracted data into a RecyclerView, and onPostExecute looks like this:
    @Override
    protected void onPostExecute(Void aVoid) {
        layoutManager = new LinearLayoutManager(getActivity());
        recyclerView.setLayoutManager(layoutManager);
        myRecyclerViewAdapter = new MyRecyclerViewAdapter(getActivity(), myLinks);
        recyclerView.setAdapter(myRecyclerViewAdapter);
    }
Since you know the URL of the pages you need, http://myagdikali.com/category/news/national-news/page/X (where X is the page number, between 2 and 446), you can loop through the URLs. You'll need to use Jsoup's Response to make sure that the page exists (the number 446 can change; I believe it increases).
The code should look like this:
    final String URL = "http://myagdikali.com/category/news/national-news/page/";
    final int OK = 200;
    String currentURL;
    int page = 2;
    int status = OK;
    Connection.Response response = null;
    Document doc = null;

    while (status == OK) {
        currentURL = URL + String.valueOf(page); // add the page number to the URL
        response = Jsoup.connect(currentURL)
                .userAgent("Mozilla/5.0")
                .ignoreHttpErrors(true) // otherwise a 404 throws HttpStatusException instead of returning a status
                .execute(); // you may add a timeout etc. here
        status = response.statusCode();
        if (status == OK) {
            doc = response.parse();
            // extract the info you need
        }
        page++;
    }
This is of course not working code: you'll have to add try-catch statements, but the compiler will help you with that. Hope this helps you.
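As a rough illustration of the try-catch handling mentioned above, here is a minimal sketch of the same page loop with the network call hidden behind a small interface, so the loop can be exercised without a connection. The names PageResult, PageFetcher, PagedCrawlSketch and the maxPages cap are hypothetical and not part of the original code. In the real app the fetcher would presumably wrap Jsoup.connect(url).userAgent("Mozilla/5.0").ignoreHttpErrors(true).execute() and read the headlines with select("a[rel=bookmark]").

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Result of fetching one page: the HTTP status plus the headlines found on it. */
final class PageResult {
    final int status;
    final List<String> headlines;

    PageResult(int status, List<String> headlines) {
        this.status = status;
        this.headlines = headlines;
    }
}

/** Hypothetical abstraction over the network call (Jsoup in the real app). */
interface PageFetcher {
    PageResult fetch(String url) throws IOException;
}

final class PagedCrawlSketch {
    static final int OK = 200;

    /**
     * Requests baseUrl + 2, baseUrl + 3, ... and stops when a page returns a
     * non-200 status, a request throws, or maxPages pages have been requested.
     */
    static List<String> collectHeadlines(String baseUrl, PageFetcher fetcher, int maxPages) {
        List<String> all = new ArrayList<>();
        for (int page = 2; page < 2 + maxPages; page++) {
            try {
                PageResult result = fetcher.fetch(baseUrl + page);
                if (result.status != OK) {
                    break; // the page does not exist, so we are past the last one
                }
                all.addAll(result.headlines);
            } catch (IOException e) {
                // Stop on a network error instead of looping forever.
                break;
            }
        }
        return all;
    }

    public static void main(String[] args) {
        // Fake fetcher: pretends that only pages 2 and 3 exist.
        PageFetcher fake = url -> {
            if (url.endsWith("/2") || url.endsWith("/3")) {
                return new PageResult(OK, Arrays.asList("headline from " + url));
            }
            return new PageResult(404, Collections.<String>emptyList());
        };
        List<String> headlines = collectHeadlines("http://example.invalid/page/", fake, 10);
        System.out.println(headlines);
    }
}
```

The loop exits on the first exception because an error that is only logged would leave the status unchanged and the loop would never terminate. The cap on the number of pages is an extra safety net and can be set well above the current page count.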
Edit:
1. I've edited the code: I had to send the userAgent string in order to get a response from the server.
2. The code runs on my machine. It prints lots of ????, but that is because I don't have the proper fonts installed.
3. The error you're getting comes from the Android part, your Views. You haven't posted that piece of code...
4. Try to add the userAgent; it might solve it.
5. Please add the error and the code you're running to the original question by editing it; it's much more readable.