Python text parsing between two words

Views: 116 · Published: 2020/9/20 6:31:58 · Tags: python, beautifulsoup

Problem description

I'm using BeautifulSoup and want to extract all the text between two words on a webpage.

For example, imagine the following website text:

This is the text of the webpage. It is just a string of a bunch of stuff and maybe some tags in between.

I want to pull out everything on the page that starts with "text" and ends with "bunch".

In this case I'd want only:

text of the webpage. It is just a string of a bunch

However, there could be multiple instances of this on a page.

What is the best way to do this?

Here is my current setup:

#!/usr/bin/env python
import re

from mechanize import Browser
from BeautifulSoup import BeautifulSoup

mech = Browser()
urls = [
    "http://ca.news.yahoo.com/forget-phoning-business-app-sends-text-instead-100143774--sector.html",
]


def visible(element):
    # Ignore text whose parent tag is never rendered on the page
    if element.parent.name in ['style', 'script', '[document]', 'head', 'title']:
        return False
    # Ignore HTML comments
    elif re.match('<!--.*-->', str(element)):
        return False
    # Otherwise keep it: this is text we need
    else:
        return True


for url in urls:
    page = mech.open(url)
    html = page.read()
    soup = BeautifulSoup(html)
    texts = soup.findAll(text=True)

    # filter() keeps only the items of texts for which visible() returns True
    visible_texts = filter(visible, texts)

    for line in visible_texts:
        print(line)
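
One possible way to get the spans described above, offered as a minimal sketch rather than a definitive answer: once the visible text has been joined into a single string, a non-greedy regular expression can collect every run that starts with one word and ends with the other. The helper name `extract_between` and the `sample` string are assumptions made for illustration; they are not part of the setup above. The sketch matches plain substrings, so "text" inside "context" would also count as a start.

```python
import re


def extract_between(text, start, end):
    """Return every shortest span of text that begins with start and ends with end.

    Hypothetical helper for illustration only.
    """
    # re.escape treats the two words literally; .*? is non-greedy, so each
    # match stops at the first end word after a start word. re.DOTALL lets
    # a span continue across line breaks.
    pattern = re.compile(re.escape(start) + r".*?" + re.escape(end), re.DOTALL)
    return pattern.findall(text)


# Sample string taken from the question; on a real page this would be the
# visible text joined together, e.g. " ".join(visible_texts).
sample = ("This is the text of the webpage. It is just a string of a bunch "
          "of stuff and maybe some tags in between.")

for match in extract_between(sample, "text", "bunch"):
    print(match)
```

Because `findall` returns all non-overlapping matches, several occurrences on one page come back as separate list items. If only whole words should count, wrapping each escaped word in `\b` would be one way to tighten the pattern.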
