Need to parse image src from HTML page then display it

Problem description

I'm currently trying to develop an app that visits the following site (http://lulpix.com), parses the HTML, and gets the img src from the following section:

<div class="pic rounded-8" style="overflow:hidden;"><div style="margin:0 0 36px 0;overflow:hidden;border:none;height:474px;"><img src="http://lulpix.com/images/2012/April/13/4f883cdde3591.jpg" alt="All clogged up" title="All clogged up" width="319"/></div></div>

It is of course different every time the page is loaded, so I cannot give a direct URL to an asynchronous gallery of images, which is what I intend to build. For instance:

Load page > parse img src > download async to ImageView > reload lulpix.com > start again

Then place each of these in an image view from which the user can swipe left and right to browse.

So the TL;DR is: how can I parse the HTML to retrieve the URL, and has anyone got any experience with libraries for displaying images?

Many thanks.

Recommended answer

Here's an AsyncTask that connects to lulpix and fakes a referrer and user agent (lulpix apparently tries to block scraping with some pretty lame checks). Start it like this from your Activity:

new ForTheLulz().execute();

The resulting Bitmap is downloaded in a pretty lame way (no caching, and no check whether the image has already been downloaded), and error handling is pretty much non-existent, but the basic concept should be OK.

// Imports for the top of the Activity's source file.
import android.graphics.Bitmap;
import android.graphics.BitmapFactory;
import android.os.AsyncTask;
import android.widget.ImageView;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

// Declared as an inner class of the Activity so that findViewById() is available.
class ForTheLulz extends AsyncTask<Void, Void, Bitmap> {
    @Override
    protected Bitmap doInBackground(Void... args) {
        Bitmap result = null;
        try {
            // Fetch and parse the page, sending a fake referrer and user agent.
            Document doc = Jsoup.connect("http://lulpix.com")
                    .referrer("http://www.google.com")
                    .userAgent("Mozilla/5.0 (Windows; U; WindowsNT 5.1; en-US; rv1.8.1.6) Gecko/20070725 Firefox/2.0.0.6")
                    .get();
            // Find the <div class="pic rounded-8"> wrapper, then the first <img> inside it.
            Elements elems = doc.getElementsByAttributeValue("class", "pic rounded-8");
            if (!elems.isEmpty()) {
                elems = elems.first().getElementsByTag("img");
                if (!elems.isEmpty()) {
                    // absUrl() resolves a relative src against the page URL
                    // and returns "" if no absolute URL can be built.
                    String src = elems.first().absUrl("src");
                    if (!src.isEmpty()) {
                        InputStream is = new URL(src).openStream();
                        try {
                            result = BitmapFactory.decodeStream(is);
                        } finally {
                            is.close();
                        }
                    }
                }
            }
        } catch (IOException e) {
            // Error handling goes here
        }
        return result;
    }

    // Runs on the UI thread once doInBackground() has returned.
    @Override
    protected void onPostExecute(Bitmap result) {
        ImageView lulz = (ImageView) findViewById(R.id.lulpix);
        if (result != null) {
            lulz.setImageBitmap(result);
        } else {
            // Your fallback drawable resource goes here
            // lulz.setImageResource(R.drawable.nolulzwherehad);
        }
    }
}
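
On the relative-URL caveat: if you would rather not rely on the HTML parser for this, a src value can be resolved against the page URL with plain java.net.URI. This is only a rough sketch; the class name SrcResolver and the method resolve are made up for illustration and are not part of the answer's code. Note the trailing slash on the base URL: URI.resolve is known to mishandle a base with an empty path.

```java
import java.net.URI;

// Hypothetical helper, not part of the original answer.
class SrcResolver {
    /**
     * Resolves an img src (absolute or relative) against the URL of the page
     * it was found on. Throws IllegalArgumentException if either string is
     * not a syntactically valid URI (for example, if it contains spaces).
     */
    static String resolve(String pageUrl, String src) {
        return URI.create(pageUrl).resolve(src.trim()).toString();
    }

    public static void main(String[] args) {
        // Relative src: resolved against the page URL.
        System.out.println(resolve("http://lulpix.com/", "images/2012/April/13/4f883cdde3591.jpg"));
        // Already absolute src: returned unchanged.
        System.out.println(resolve("http://lulpix.com/", "http://lulpix.com/images/2012/April/13/4f883cdde3591.jpg"));
    }
}
```

Both calls in main should print the same absolute URL, which can then be handed to new URL(...) as in doInBackground().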

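As for the missing caching: one simple way to avoid downloading the same image twice is a small in-memory LRU cache keyed by the image URL, checked before url.openStream() is called. The sketch below is an assumption-laden illustration, not the answer's method: the class name UrlLruCache is invented, it is generic so that it compiles off-device (on Android the value type would be Bitmap), and it is not thread-safe. Android also ships android.util.LruCache, which is probably the better fit in a real app.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical in-memory cache keyed by image URL; not part of the original answer.
class UrlLruCache<V> extends LinkedHashMap<String, V> {
    private final int maxEntries;

    UrlLruCache(int maxEntries) {
        // accessOrder = true keeps entries ordered from least to most recently used.
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    // Called by LinkedHashMap after each put(); returning true evicts the eldest entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<String, V> eldest) {
        return size() > maxEntries;
    }

    public static void main(String[] args) {
        UrlLruCache<String> cache = new UrlLruCache<>(2);
        cache.put("http://lulpix.com/images/a.jpg", "bitmap A");
        cache.put("http://lulpix.com/images/b.jpg", "bitmap B");
        cache.get("http://lulpix.com/images/a.jpg");             // a.jpg is now the most recently used
        cache.put("http://lulpix.com/images/c.jpg", "bitmap C"); // evicts b.jpg
        System.out.println(cache.keySet());
    }
}
```

In doInBackground() you would look up src in the cache first and only download and decode on a miss, storing the decoded result afterwards.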