Issue Extracting HTML Data Via Android


Question

I have an Android app based on jsoup that I'm using to pull data from an HTML table; however, I'm unable to extract data from the following URL:

http://sheriff.org/apps/arrest/results.cfm?lname=&fname=

I simply need a bit of assistance figuring out how to parse the data from this particular table.

I know I need to change a parameter here:

Document doc = Jsoup.connect(params[0]).get();
Element tableHeader = doc.select("tr").first();

for (Element element : tableHeader.children()) {
    aa.add(element.text().toString());
}

This is my first time extracting HTML data via Java/Android, and I'm not sure exactly how it can be done.

Any input is greatly appreciated.

SOURCE:

public class MainActivity extends Activity {
    Context context;
    ArrayList<String> aa = new ArrayList<String>();
    ListView lv;
    final String URL = "http://example.com";

    @Override
    public void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.activity_main);
        lv = (ListView) findViewById(R.id.listView1);
        new MyTask().execute(URL);
    }

    private class MyTask extends AsyncTask<String, Void, String> {
        ProgressDialog prog;
        String title = "";

        @Override
        protected void onPreExecute() {
            prog = new ProgressDialog(MainActivity.this);
            prog.setMessage("Loading....");
            prog.show();
        }

        @Override
        protected String doInBackground(String... params) {
            try {
                Document doc = Jsoup.connect(params[0]).get();
                Element tableHeader = doc.select("tr").first();

                for (Element element : tableHeader.children()) {
                    aa.add(element.text().toString());
                }

                title = doc.title();
            } catch (IOException e) {
                e.printStackTrace();
            }
            return title;
        }

        @Override
        protected void onPostExecute(String result) {
            super.onPostExecute(result);
            prog.dismiss();
            ArrayAdapter<String> adapter = new ArrayAdapter<String>(MainActivity.this, android.R.layout.simple_list_item_1, aa);
            lv.setAdapter(adapter);
        }
    }
}

HTML:

<table class="datagrid">
    <tr>
        <th>User Name</th>
        <th>Date</th>
        <th>Time</th>
        <th>Location</th>
    </tr>

    <tr>
        <td><a href="redirector.cfm?ID=c4e7a7ea-0832-4cdb-9b38-4cbdde8c07bc&page=1&&amp;lname=&amp;fname=" title="501207593">501207593&nbsp;</a></td>
        <td>LASTNAME, FIRSTNAME&nbsp;</td>
        <td>M&nbsp;</td>
        <td>Location1</td>
    </tr>

    <tr>
        <td><a href="redirector.cfm?ID=6dfb8f0b-949a-49a1-b3bf-b361544ee5d8&page=1&&amp;lname=&amp;fname=" title="501302750">501302750&nbsp;</a></td>
        <td>LASTNAME, FIRSTNAME&nbsp;</td>
        <td>M&nbsp;</td>
        <td>Location2</td>
    </tr>

    <tr>
        <td><a href="redirector.cfm?ID=b638597e-0319-4eea-a2d4-d763d43125eb&page=1&&amp;lname=&amp;fname=" title="531201804">531201804&nbsp;</a></td>
        <td>LASTNAME, FIRSTNAME&nbsp;</td>
        <td>M&nbsp;</td>
        <td>Location3</td>
    </tr>

Solution

Alternatively, you can download the entire content behind a URL with HttpURLConnection and save it to a file:

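// Assumes url is a java.net.URL, and storeDir and filename are Strings defined elsewhere.
HttpURLConnection con = (HttpURLConnection) url.openConnection();
InputStream is = con.getInputStream();
FileOutputStream fos = new FileOutputStream(storeDir + "/" + filename);

// Copy the response body to the file one byte at a time until end of stream.
int data = 0;
while ((data = is.read()) != -1) {
    fos.write(data);
}

is.close();
fos.flush();
fos.close();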
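
The loop above reads one byte per call, which can be slow for larger pages. As a minimal sketch of the same copy done through a buffer, the helper below moves data in 8 KB chunks. The class name StreamCopy, the method name copy, and the buffer size are assumptions made for illustration and are not part of the original answer; the main method uses an in-memory stream as a stand-in for con.getInputStream() so that it runs without a network connection.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class; the name StreamCopy is not from the original answer.
class StreamCopy {

    // Copies everything from in to out using a fixed-size buffer and
    // returns the number of bytes copied. The caller closes both streams.
    static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[8192]; // assumed buffer size
        long total = 0;
        int read;
        while ((read = in.read(buffer)) != -1) {
            out.write(buffer, 0, read);
            total += read;
        }
        out.flush();
        return total;
    }

    public static void main(String[] args) throws IOException {
        // In-memory stand-in for con.getInputStream(), so the sketch runs offline.
        byte[] page = "<table class=\"datagrid\"></table>".getBytes(StandardCharsets.UTF_8);
        try (InputStream in = new ByteArrayInputStream(page);
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            long copied = copy(in, out);
            System.out.println(copied + " bytes copied");
        }
    }
}
```

In the Android code above, the same helper could presumably be called as StreamCopy.copy(is, fos) in place of the while loop, from a background thread such as doInBackground.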

For more information, see http://dev-androidapps.blogspot.com/2013/09/web-download.html
