Need to parse image src from HTML page then display it
Question
I'm currently trying to develop an app that visits the following site (http://lulpix.com), parses the HTML, and gets the img src from the following section:
<div class="pic rounded-8" style="overflow:hidden;"><div style="margin:0 0 36px 0;overflow:hidden;border:none;height:474px;"><img src="http://lulpix.com/images/2012/April/13/4f883cdde3591.jpg" alt="All clogged up" title="All clogged up" width="319"/></div></div>
It's of course different every time the page is loaded, so I cannot give a direct URL to an asynchronous gallery of images, which is what I intend to build. For instance:
Load page > Parse img src > Download async to ImageView > Reload lulpix.com > Start again
Then place each of these in an image view that the user can swipe left and right to browse.
So the TL;DR is: how can I parse the HTML to retrieve the URL, and has anyone got any experience with libraries for displaying images?
Many thanks.
Recommended answer
Here's an AsyncTask that connects to lulpix and fakes a referrer and user-agent (lulpix apparently tries to block scraping with some pretty lame checks). Start it like this in your Activity:
new ForTheLulz().execute();
The resulting Bitmap is downloaded in a pretty lame way (no caching, and no check for whether the image has already been downloaded), and error handling is pretty much non-existent, but the basic concept should be OK.
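// Imports for the Activity source file that contains this task.
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;

import android.graphics.Bitmap;
import android.graphics.BitmapFactory;
import android.os.AsyncTask;
import android.widget.ImageView;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

// Declare this as an inner class of your Activity so that findViewById() is available.
class ForTheLulz extends AsyncTask<Void, Void, Bitmap> {
    @Override
    protected Bitmap doInBackground(Void... args) {
        Bitmap result = null;
        try {
            // Fetch the page, posing as a browser that arrived from Google.
            Document doc = Jsoup.connect("http://lulpix.com")
                    .referrer("http://www.google.com")
                    .userAgent("Mozilla/5.0 (Windows; U; WindowsNT 5.1; en-US; rv1.8.1.6) Gecko/20070725 Firefox/2.0.0.6")
                    .get();
            if (doc != null) {
                // Find the wrapper div, then the first img inside it.
                Elements elems = doc.getElementsByAttributeValue("class", "pic rounded-8");
                if (elems != null && !elems.isEmpty()) {
                    Element elem = elems.first();
                    elems = elem.getElementsByTag("img");
                    if (elems != null && !elems.isEmpty()) {
                        elem = elems.first();
                        // absUrl() resolves a relative src against the page URL;
                        // it returns an empty string if no absolute URL can be built.
                        String src = elem.absUrl("src");
                        if (!src.isEmpty()) {
                            URL url = new URL(src);
                            InputStream is = url.openStream();
                            try {
                                result = BitmapFactory.decodeStream(is);
                            } finally {
                                is.close();
                            }
                        }
                    }
                }
            }
        } catch (IOException e) {
            // Error handling goes here
        }
        return result;
    }

    @Override
    protected void onPostExecute(Bitmap result) {
        // Runs on the UI thread, so it is safe to touch views here.
        ImageView lulz = (ImageView) findViewById(R.id.lulpix);
        if (result != null) {
            lulz.setImageBitmap(result);
        } else {
            // Your fallback drawable resource goes here
            // lulz.setImageResource(R.drawable.nolulzwherehad);
        }
    }
}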
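As noted above, the download does no caching and never checks whether an image has already been fetched. Here is a minimal sketch of one way to add that check, written in plain Java so it can run off-device: a small least-recently-used cache keyed by the image URL. UrlCache and UrlCacheDemo are hypothetical names that are not part of the code above, the example.com URLs are placeholders, and the String values stand in for Bitmap objects. On Android, android.util.LruCache may well be a more natural fit.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: a small least-recently-used cache keyed by image URL. */
class UrlCache<V> {
    private final Map<String, V> entries;

    UrlCache(final int maxEntries) {
        // accessOrder = true keeps entries ordered from least to most recently used.
        this.entries = new LinkedHashMap<String, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, V> eldest) {
                // Called after each insertion; evict once the limit is exceeded.
                return size() > maxEntries;
            }
        };
    }

    /** Returns the cached value for this URL, or null if it is not cached. */
    synchronized V get(String url) {
        return entries.get(url);
    }

    synchronized void put(String url, V value) {
        entries.put(url, value);
    }

    /** Membership check that does not count as a "use" of the entry. */
    synchronized boolean contains(String url) {
        return entries.containsKey(url);
    }
}

class UrlCacheDemo {
    public static void main(String[] args) {
        // Strings stand in for Bitmaps so the sketch runs outside Android.
        UrlCache<String> cache = new UrlCache<>(2);
        cache.put("http://example.com/a.jpg", "bitmap A");
        cache.put("http://example.com/b.jpg", "bitmap B");
        cache.get("http://example.com/a.jpg");             // touch A, so B is now the eldest
        cache.put("http://example.com/c.jpg", "bitmap C"); // over the limit: evicts B

        System.out.println(cache.contains("http://example.com/a.jpg")); // true
        System.out.println(cache.contains("http://example.com/b.jpg")); // false
        System.out.println(cache.contains("http://example.com/c.jpg")); // true
    }
}
```

In doInBackground, one could look up the resolved src in such a cache before opening the stream, and store the decoded Bitmap after a successful download. The methods are synchronized because the lookup would happen on a background thread.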