Python, BeautifulSoup - Parsing out a Tweet

Problem description

I have a piece of HTML I took from the source of my Twitter timeline, shown here:

http://pastebin.com/deefvbYw

That's one Tweet I'll use as an example. I can't for the life of me get it to cooperate. I want it to show:

Dmitri @TheFPShow "I do this all the time... youtube.com/watch?v=DF9WP8…"

If anyone could offer some suggestions that'd be great.

Solution

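from bs4 import BeautifulSoup  # assumes BeautifulSoup 4 on Python 3

# twit is assumed to hold the tweet's HTML (the pastebin content) as a string
soup = BeautifulSoup(twit, 'html.parser')

# Display name: the first child of the <strong> tag
name_tag = soup('strong', {'class': 'fullname js-action-profile-name show-popup-with-id'})
user = name_tag[0].contents[0]

# Handle: the span's first two child tags hold the "@" and the screen name
action_tag = soup('span', {'class': 'username js-action-profile-name'})
at_sign = action_tag[0].contents[0].contents[0]
show_name = action_tag[0].contents[1].contents[0]

# Tweet body: the leading text node, then the link carrying the expanded URL
twit_text = soup('p', {'class': 'js-tweet-text'})
message = twit_text[0].contents[0]
url = twit_text[0].contents[1]['data-expanded-url']

print(user, at_sign, show_name, message, url)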

The output:

Dmitri @ TheFPShow I do this all the time...  http://www.youtube.com/watch?v=DF9WP87KNPk
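
If the exact layout from the question is wanted (name, handle with no space after the "@", then the quoted text), a variant along these lines might work. It is only a sketch: SAMPLE_TWEET is a hypothetical stand-in for the pastebin HTML, built from the class names and the data-expanded-url attribute used in the answer, and parse_tweet is a made-up helper name. It assumes BeautifulSoup 4 and looks tags up by a single class rather than by the full class string.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the pastebin HTML. Only the class names and the
# data-expanded-url attribute are taken from the answer; the rest is assumed.
SAMPLE_TWEET = """
<div class="tweet">
  <strong class="fullname js-action-profile-name show-popup-with-id">Dmitri</strong>
  <span class="username js-action-profile-name"><s>@</s><b>TheFPShow</b></span>
  <p class="js-tweet-text">I do this all the time...
    <a href="#" data-expanded-url="http://www.youtube.com/watch?v=DF9WP87KNPk">youtube.com/watch?v=DF9WP8…</a>
  </p>
</div>
"""


def parse_tweet(html):
    """Pull the name, handle, text and expanded URL out of one tweet."""
    soup = BeautifulSoup(html, "html.parser")
    # class_ matches one class even when the attribute lists several.
    name = soup.find("strong", class_="fullname")
    handle = soup.find("span", class_="username")
    body = soup.find("p", class_="js-tweet-text")
    link = body.find("a")
    return {
        "name": name.get_text(strip=True),
        # The "@" and the screen name sit in separate child tags;
        # get_text joins them into one string.
        "handle": handle.get_text(strip=True),
        # Collapse runs of whitespace left over from the markup.
        "text": " ".join(body.get_text().split()),
        "url": link.get("data-expanded-url") if link is not None else None,
    }


if __name__ == "__main__":
    tweet = parse_tweet(SAMPLE_TWEET)
    print('{name} {handle} "{text}"'.format(**tweet))
```

Matching on one class at a time should keep working if Twitter reorders or adds classes on those tags, whereas the full class string has to match exactly.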
