Jsoup image tag extraction
Question
I need to extract an image tag using jsoup from this HTML:
<div class="picture">
<img src="http://asdasd/aacb.jpgs" title="picture" alt="picture" />
</div>
I need to extract the src of this img tag. I am using this code, but I am getting a null value:
Element masthead2 = doc.select("div.picture").first();
String linkText = masthead2.outerHtml();
Document doc1 = Jsoup.parse(linkText);
Element masthead3 = doc1.select("img[src]").first();
String linkText1 = masthead3.html();
Recommended answer
Here's an example to get the image source attribute:
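import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class ImageSrcExample {
    public static void main(String... args) {
        Document doc = Jsoup.parse("<div class=\"picture\"><img src=\"http://asdasd/aacb.jpgs\" title=\"picture\" alt=\"picture\" /></div>");
        Element img = doc.select("div.picture img").first(); // first img nested inside div.picture
        String imgSrc = img.attr("src"); // value of the src attribute
        System.out.println("Img source: " + imgSrc);
    }
}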
The div.picture img selector finds the image element under the div.
The main extraction methods on an element are:
- attr(name), which gets the value of an element's attribute,
- text(), which gets the text content of an element (e.g. in <p>Hello</p>, text() is "Hello"),
- html(), which gets an element's inner HTML (for <div><img></div>, html() = <img>), and
- outerHtml(), which gets an element's full HTML (for <div><img></div>, outerHtml() = <div><img></div>).
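The four methods above can be compared side by side with a small sketch. The class name ExtractionMethodsDemo, the helper pictureDiv(), and the extra <p>Hello</p> in the sample markup are assumptions added for illustration; they are not part of the original question. Note that html() on the img itself comes back empty, since an img has no child nodes, which may be why the code in the question produced no usable value.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Element;

public class ExtractionMethodsDemo {
    // Hypothetical sample markup: the question's snippet plus a <p> so text() has something to return.
    static final String HTML = "<div class=\"picture\">"
            + "<img src=\"http://asdasd/aacb.jpgs\" title=\"picture\" alt=\"picture\" />"
            + "<p>Hello</p></div>";

    // Parse the sample and return the wrapping div (hypothetical helper).
    static Element pictureDiv() {
        return Jsoup.parse(HTML).select("div.picture").first();
    }

    public static void main(String[] args) {
        Element div = pictureDiv();
        Element img = div.select("img").first();

        System.out.println("attr:      " + img.attr("src"));    // value of the src attribute
        System.out.println("text:      " + div.text());         // text of the div and its children
        System.out.println("html:      " + div.html());         // markup inside the div
        System.out.println("outerHtml: " + div.outerHtml());    // the div tag plus its contents
        System.out.println("img html:  [" + img.html() + "]");  // img has no children, so this is empty
    }
}
```

The exact whitespace printed by html() and outerHtml() depends on jsoup's pretty-printing, so compare the structure of the output, not the line breaks.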
You don't need to reparse the HTML as in your current example. Either select the correct element in the first place using a more specific selector, or call the element.select(string) method on the element you already have to winnow down.
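As a rough sketch of the second option, the element returned by the first query can be queried again directly, with no outerHtml() and Jsoup.parse() round trip. The class name NarrowingDemo and the helper pictureSrc are hypothetical, and the null checks are an added precaution, because first() returns null when nothing matches.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class NarrowingDemo {
    // Returns the src of the first img inside div.picture, or null if either is missing.
    static String pictureSrc(String html) {
        Document doc = Jsoup.parse(html);
        Element picture = doc.select("div.picture").first();
        if (picture == null) {
            return null; // no matching div in this document
        }
        // Query the already-selected element instead of reparsing its HTML.
        Element img = picture.select("img[src]").first();
        return img == null ? null : img.attr("src");
    }

    public static void main(String[] args) {
        String html = "<div class=\"picture\">"
                + "<img src=\"http://asdasd/aacb.jpgs\" title=\"picture\" alt=\"picture\" />"
                + "</div>";
        System.out.println(pictureSrc(html));                          // src of the image in the sample
        System.out.println(pictureSrc("<div class=\"other\"></div>")); // no div.picture, so null
    }
}
```

Returning null here is only one design choice; an Optional<String> or an empty string would work equally well, depending on what the calling code expects.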