Unable to get the full content using selector

Views: 102 Published: 2020/5/4 8:40:48 Tags: python python-3.x web-scraping css-selectors lxml


Question

I've written some selectors in Python to get some items and their values. I want to scrape the items, not style them. However, when I run my script, it only gets the item labels and can't reach the values, which are separated from the labels by a "br" tag. How can I grab them? I do not wish to use XPath in this case. Thanks in advance.

Here are the elements:

html = '''
<div class="elems"><br>
    <ul>
    <li><b>Item Name:</b><br>
            titan
                </li>
        <li><b>Item No:</b><br>
                23003400
                    </li>
        <li><b>Item Sl:</b><br>
            2760400
                </li>
        </ul>
    </div>
'''

Here is my script with CSS selectors in it:

from lxml import html as e

root = e.fromstring(html)
for items in root.cssselect(".elems li"):
    item = items.cssselect("b")[0].text_content()
    print(item)

Upon execution, the result I'm getting:

Item Name:
Item No:
Item Sl:

The result I'm after:

Item Name: titan
Item No: 23003400
Item Sl: 2760400

Recommended answer

The easiest solution: the values are inside the "li" tag, not the "b" tag, so read the text of each "li" instead.

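from lxml import html as e

root = e.fromstring(html)
# The values are text inside each <li>, so read the text of the whole <li>
for li in root.cssselect(".elems li"):
    # Collapse the newlines and indentation around <br> into single spaces
    print(' '.join(li.text_content().split()))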
