JSoup parsing data from within a tag

Views: 184 | Published: 2019/1/8 20:40:51 | Tags: java, parsing, jsoup

Problem description

I am managing to parse most of the data I need, except for one item: it is contained within the a href tag, and I need the number that appears after "mmsi=".

<a href="/showship.php?mmsi=235083844">Sunsail 4013</a>

My current parser, shown below, fetches all the other data I need. I tried a few things; the commented-out code occasionally returns unspecified for an entry. Is there any way I can add to my code so that, when the data is returned, the number "235083844" is returned before the name "Sunsail 4013"?

try {
    File input = new File("shipMove.txt");
    Document doc = Jsoup.parse(input, null);
    Elements tables = doc.select("table.shipInfo");
    for (Element element : tables) {
        Elements tdTags = element.select("td");
        //Elements mmsi = element.select("a[href*=/showship.php?mmsi=]");
        // Iterate over all 'td' tags found
        for (Element td : tdTags) {
            // Print its text if not empty
            final String text = td.text();
            if (!text.isEmpty()) {
                System.out.println(text);
            }
        }
    }
} catch (IOException e) {
    // TODO Auto-generated catch block
    e.printStackTrace();
}

An example of the parsed data and the HTML file is here.

Recommended answer

You can use attr on an Element object to retrieve a particular attribute's value.
Use substring to get the required value if the String pattern is consistent.

Code

// Using just your anchor html tag
String html = "<a href=\"/showship.php?mmsi=235083844\">Sunsail 4013</a>";
Document doc = Jsoup.parse(html);

// Just selecting the anchor tag, for your implementation use a generic one
Element link = doc.select("a").first();

// Get the attribute value
String url = link.attr("href");

// Check for nulls here and take the substring from '=' onwards
String id = url.substring(url.indexOf('=') + 1);
System.out.println(id + " "+ link.text());

This gives:

235083844 Sunsail 4013
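
The substring call above takes everything after the first '=' in the href, so it relies on mmsi being the only query parameter. If the links could carry other parameters, a regular expression is one way to be more defensive. The following is a minimal sketch using only java.util.regex. The class and method names (MmsiExtractor, extractMmsi) and the "unknown" fallback are hypothetical, and the two hard-coded strings stand in for the values of link.attr("href") and link.text().

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MmsiExtractor {
    // Matches "mmsi=<digits>" whether it is the first or a later query parameter.
    private static final Pattern MMSI = Pattern.compile("[?&]mmsi=(\\d+)");

    // Returns the digits after "mmsi=", or empty if href is null or has no such parameter.
    static Optional<String> extractMmsi(String href) {
        if (href == null) {
            return Optional.empty();
        }
        Matcher m = MMSI.matcher(href);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        String href = "/showship.php?mmsi=235083844"; // stand-in for link.attr("href")
        String name = "Sunsail 4013";                  // stand-in for link.text()
        // Prints: 235083844 Sunsail 4013
        System.out.println(extractMmsi(href).orElse("unknown") + " " + name);
    }
}
```

Returning an Optional keeps the "no mmsi found" case explicit, so the caller decides whether to print a placeholder or skip the entry.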

Modify the condition in your for loop:

...
    for (Element td : tdTags) {
        // Print its text if not empty
        final String text = td.text();
        if (!text.isEmpty()) {
            // First anchor inside this cell, or null if the cell has none
            Element link = td.getElementsByTag("a").first();
            if (link != null) {
                // Get the attribute value
                String url = link.attr("href");

                // Take the substring from '=' onwards
                String id = url.substring(url.indexOf('=') + 1);
                System.out.println(id + " " + text);
            } else {
                System.out.println(text);
            }
        }
    }
...

The above code would print the desired output.
