Python Beautiful Soup: 'ResultSet' object has no attribute 'get'


Question

I'm trying to grab some links from a website and write them to a file after they've been cleaned up. The links on the site look like this:

<a href="javascript:changeChannel('http://dr01-lh.akamaihd.net/i/dr01_0@147054/index_1700_av-b.m3u8', 20);">DR1</a><br>
<a href="javascript:changeChannel('http://dr02-lh.akamaihd.net/i/dr02_0@147055/index_1700_av-b.m3u8', 21);">DR2</a><br>
<a href="javascript:changeChannel('http://dr03-lh.akamaihd.net/i/dr03_0@147056/index_1700_av-b.m3u8', 701);">DR3</a><br>
<a href="javascript:changeChannel('http://dr06-lh.akamaihd.net/i/dr06_0@147059/index_1700_av-b.m3u8', 31);">DR Ultra</a><br>
<a href="javascript:changeChannel('http://dr04-lh.akamaihd.net/i/dr04_0@147057/index_1700_av-b.m3u8', 38);">DR K</a><br>
<a href="javascript:changeChannel('http://dr05-lh.akamaihd.net/i/dr05_0@147058/index_1700_av-b.m3u8', 50);">DR Ramasjang</a><br>

I can grab them using this:

links = soup.findAll(href=re.compile("javascript"))

which gives me this output:

[<a href="javascript:changeChannel('http://dr01-lh.akamaihd.net/i/dr01_0@147054/index_1700_av-b.m3u8', 20);">DR1</a>, <a href="javascript:changeChannel('http://dr02-lh.akamaihd.net/i/dr02_0@147055/index_1700_av-b.m3u8', 21);">DR2</a>, <a href="javascript:changeChannel('http://dr03-lh.akamaihd.net/i/dr03_0@147056/index_1700_av-b.m3u8', 701);">DR3</a>, <a href="javascript:changeChannel('http://dr06-lh.akamaihd.net/i/dr06_0@147059/index_1700_av-b.m3u8', 31);">DR Ultra</a>, <a href="javascript:changeChannel('http://dr04-lh.akamaihd.net/i/dr04_0@147057/index_1700_av-b.m3u8', 38);">DR K</a>, <a href="javascript:changeChannel('http://dr05-lh.akamaihd.net/i/dr05_0@147058/index_1700_av-b.m3u8', 50);">DR Ramasjang</a>]

Now I want to clean this up so that I only get the http:// part between the single quotes, and this is where it goes wrong.

I tried:

fullink = links.get('href')

which gives me the error:

'ResultSet' object has no attribute 'get'

So how do I get the links out of this?

Recommended answer

The Beautiful Soup documentation says:

AttributeError: 'ResultSet' object has no attribute 'foo' - This usually happens because you expected find_all() to return a single tag or string. But find_all() returns a list of tags and strings–a ResultSet object. You need to iterate over the list and look at the .foo of each one. Or, if you really only want one result, you need to use find() instead of find_all().
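The difference the documentation describes can be sketched as follows. This is only an illustration: the `html` string is a shortened copy of the links from the question, and the `html.parser` backend and the variable names `hrefs`, `first`, and `first_href` are assumptions made for the example.

```python
import re

from bs4 import BeautifulSoup

# Shortened sample of the markup from the question (assumed input).
html = """
<a href="javascript:changeChannel('http://dr01-lh.akamaihd.net/i/dr01_0@147054/index_1700_av-b.m3u8', 20);">DR1</a><br>
<a href="javascript:changeChannel('http://dr02-lh.akamaihd.net/i/dr02_0@147055/index_1700_av-b.m3u8', 21);">DR2</a><br>
"""

soup = BeautifulSoup(html, "html.parser")

# find_all() returns a ResultSet (a list of Tag objects), so .get()
# has to be called on each tag, not on the ResultSet itself.
links = soup.find_all(href=re.compile("javascript"))
hrefs = [tag.get("href") for tag in links]

# find() returns a single Tag (or None if nothing matches), so .get()
# can be called on it directly.
first = soup.find(href=re.compile("javascript"))
first_href = first.get("href") if first is not None else None
```

Here `find_all()` is the current name for the `findAll()` call used in the question; both return a ResultSet.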

So you probably want full_links = [x.get('href') for x in links]
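The question also asks for only the http:// part between the single quotes. That step is not covered by the answer, so the following is just one possible sketch using the standard `re` module. The pattern `URL_IN_QUOTES` and the helper `extract_url` are hypothetical names introduced here, and the pattern assumes each href contains exactly one single-quoted URL starting with http:// or https://.

```python
import re

# Assumed pattern: a single-quoted string that starts with http:// or https://.
URL_IN_QUOTES = re.compile(r"'(https?://[^']+)'")


def extract_url(href):
    """Return the URL between single quotes in a javascript: href, or None."""
    match = URL_IN_QUOTES.search(href)
    return match.group(1) if match else None


# href values as they would come out of [x.get('href') for x in links].
hrefs = [
    "javascript:changeChannel('http://dr01-lh.akamaihd.net/i/dr01_0@147054/index_1700_av-b.m3u8', 20);",
    "javascript:changeChannel('http://dr02-lh.akamaihd.net/i/dr02_0@147055/index_1700_av-b.m3u8', 21);",
]

# Keep only the hrefs where a URL was actually found.
urls = [url for url in (extract_url(h) for h in hrefs) if url is not None]
```

The resulting `urls` list holds plain strings, so it can be written to a file one URL per line with an ordinary `open(...)` and `write(...)`.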
