BeautifulSoup output to .txt file

Views: 2364 | Published: 2016/8/5 19:01:01 | Tags: python operating-system beautifulsoup python-requests bs4


Problem description

I am trying to export my data as a .txt file:

from bs4 import BeautifulSoup
import requests
import os

os.getcwd()              # '/home/folder'
os.mkdir("Probeersel6")
os.chdir("Probeersel6")
os.getcwd()              # '/home/Desktop/folder'
os.mkdir("img")          # now `folder`

url = "http://nos.nl/artikel/2093082-steeds-meer-nekklachten-bij-kinderen-door-gebruik-tablets.html"
r  = requests.get(url)
soup = BeautifulSoup(r.content)
data = soup.find_all("article", {"class": "article"})

with open(""%s".txt", "wb" %(url)) as file:
    for item in data:
        print item.contents[0].find_all("time", {"datetime": "2016-03-16T09:50:30+0100"})[0].text 
        print item.contents[0].find_all("a", {"class": "link-grey"})[0].text
        print "\n"
        print item.contents[0].find_all("img", {"class": "media-full"})[0]
        print "\n"
        print item.contents[1].find_all("div", {"class": "article_textwrap"})[0].text
        file.write()

What should be passed to file.write() to make this work?

I am also trying to give the .txt file the same name as the URL. Should I do that with a string?

with open(""%s".txt", "wb" %(url)) as file:

url = "http://nos.nl/artikel/2093082-steeds-meer-nekklachten-bij-kinderen-door-gebruik-tablets.html"

Solution

You should pass your content to file.write(). I would probably do something like this:

#!/usr/bin/python3

from bs4 import BeautifulSoup
import requests

url = 'http://nos.nl/artikel/2093082-steeds-meer-nekklachten-bij-kinderen-door-gebruik-tablets.html'
# Name the file after the last path segment of the URL, minus the extension.
file_name = url.rsplit('/', 1)[1].rsplit('.')[0]

r = requests.get(url)
soup = BeautifulSoup(r.content, 'lxml')
data = soup.find_all('article', {'class': 'article'})

# Build one text block per article: timestamp, link text, image tag, body text.
content = ''.join(
    '{}\n{}\n\n{}\n{}'.format(
        item.contents[0].find_all('time', {'datetime': '2016-03-16T09:50:30+0100'})[0].text,
        item.contents[0].find_all('a', {'class': 'link-grey'})[0].text,
        item.contents[0].find_all('img', {'class': 'media-full'})[0],
        item.contents[1].find_all('div', {'class': 'article_textwrap'})[0].text,
    )
    for item in data
)

# Text mode with an explicit encoding, so the str content is written as UTF-8.
with open('./{}.txt'.format(file_name), mode='wt', encoding='utf-8') as file:
    file.write(content)
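
On the file name question: in the original attempt, `"wb" %(url)` applies the `%` operator to the mode string rather than to the file name, and a full URL contains slashes, so it cannot be used as a file name directly. The following is a minimal sketch of one way to handle both points using only the standard library. The helper names `file_name_from_url` and `write_text` are hypothetical, the temporary directory is only there so the example does not touch the working directory, and the written text is a placeholder standing in for the scraped content.

```python
import os
import tempfile
from urllib.parse import urlparse


def file_name_from_url(url):
    """Return the last path segment of the URL without its extension."""
    last_segment = os.path.basename(urlparse(url).path)
    return os.path.splitext(last_segment)[0]


def write_text(directory, url, content):
    """Write content to <directory>/<name>.txt and return the path."""
    path = os.path.join(directory, '{}.txt'.format(file_name_from_url(url)))
    # Text mode with an explicit encoding: file.write() expects a str here.
    with open(path, mode='w', encoding='utf-8') as handle:
        handle.write(content)
    return path


url = 'http://nos.nl/artikel/2093082-steeds-meer-nekklachten-bij-kinderen-door-gebruik-tablets.html'
out_dir = tempfile.mkdtemp()  # assumption: any writable directory would do
saved_path = write_text(out_dir, url, 'placeholder article text\n')
print(saved_path)
```

Compared with the two `rsplit` calls, `urlparse` ignores any query string or fragment, and `os.path.splitext` strips only the final extension, which may matter for URLs whose last segment contains more than one dot.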
