BeautifulSoup output to .txt file


Question

I am trying to export my data to a .txt file:

from bs4 import BeautifulSoup
import requests
import os

os.getcwd()  # '/home/folder'
os.mkdir("Probeersel6")
os.chdir("Probeersel6")
os.getcwd()  # '/home/Desktop/folder'
os.mkdir("img")  #now `folder`

url = "http://nos.nl/artikel/2093082-steeds-meer-nekklachten-bij-kinderen-door-gebruik-tablets.html"
r  = requests.get(url)
soup = BeautifulSoup(r.content)
data = soup.find_all("article", {"class": "article"})

with open("\"%s\".txt", "wb" %(url)) as file:
    for item in data:
        print item.contents[0].find_all("time", {"datetime": "2016-03-16T09:50:30+0100"})[0].text 
        print item.contents[0].find_all("a", {"class": "link-grey"})[0].text
        print "\n"
        print item.contents[0].find_all("img", {"class": "media-full"})[0]
        print "\n"
        print item.contents[1].find_all("div", {"class": "article_textwrap"})[0].text
        file.write()

What should I put inside:

file.write()

to make it work?

I am also trying to give the .txt file the same name as the URL. Should I do that with string formatting?

with open("\"%s\".txt", "wb" %(url)) as file:


url = "http://nos.nl/artikel/2093082-steeds-meer-nekklachten-bij-kinderen-door-gebruik-tablets.html"

Recommended answer

You should pass your content to file.write. I would probably do something like this:

#!/usr/bin/python3
#

from bs4 import BeautifulSoup
import requests

url = 'http://nos.nl/artikel/2093082-steeds-meer-nekklachten-bij-kinderen-door-gebruik-tablets.html'
# Use the last path segment of the URL, minus its extension, as the file name.
file_name = url.rsplit('/', 1)[1].rsplit('.')[0]

r  = requests.get(url)
soup = BeautifulSoup(r.content, 'lxml')
data = soup.find_all('article', {'class': 'article'})


# Build one text block per article: time, link text, image tag, body text.
content = ''.join(
    '{}\n{}\n\n{}\n{}'.format(
        item.contents[0].find_all('time', {'datetime': '2016-03-16T09:50:30+0100'})[0].text,
        item.contents[0].find_all('a', {'class': 'link-grey'})[0].text,
        item.contents[0].find_all('img', {'class': 'media-full'})[0],
        item.contents[1].find_all('div', {'class': 'article_textwrap'})[0].text,
    )
    for item in data
)

with open('./{}.txt'.format(file_name), mode='wt', encoding='utf-8') as file:
    file.write(content)
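
The two steps the answer relies on, deriving a file name from the URL and passing a single string to file.write, can be tried without any network access. The following is only a minimal sketch under stated assumptions: the helper names `file_name_from_url` and `save_text` are hypothetical, the placeholder strings stand in for the scraped text, and `urllib.parse` is used in place of the answer's `rsplit` call so that a query string, if present, would not end up in the file name.

```python
import os
import tempfile
from urllib.parse import urlparse


def file_name_from_url(url):
    """Return the last path segment of `url` without its extension (hypothetical helper)."""
    path = urlparse(url).path          # path component only, no scheme, host or query
    base = os.path.basename(path)      # last segment, e.g. 'page.html'
    return os.path.splitext(base)[0]   # drop the trailing extension


def save_text(directory, url, parts):
    """Join `parts` with newlines and write them to <directory>/<name>.txt (hypothetical helper)."""
    target = os.path.join(directory, file_name_from_url(url) + '.txt')
    # Text mode with an explicit encoding, as in the answer above.
    with open(target, mode='wt', encoding='utf-8') as handle:
        handle.write('\n'.join(parts))
    return target


url = 'http://nos.nl/artikel/2093082-steeds-meer-nekklachten-bij-kinderen-door-gebruik-tablets.html'
# Placeholder strings standing in for the scraped time, author and body text.
sample_parts = ['placeholder time', 'placeholder author', 'placeholder body']
out_dir = tempfile.mkdtemp()  # temporary directory, so nothing is written to the working directory
saved_path = save_text(out_dir, url, sample_parts)
print(saved_path)
```

In the real script, `sample_parts` would be replaced by the `.text` values extracted with BeautifulSoup. The key point is that file.write takes one string argument, so the pieces have to be joined first.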
