parsing table with BeautifulSoup and write in text file
Question
I need the data from the table in a text file (output.txt) in this format: data1;data2;data3;data4;.....
Celkova podlahova plocha bytu;33m;Vytah;Ano;Nadzemne podlazie;Prizemne podlazie;.....;Forma vlastnictva;Osobne
All in "one line", separator is ";" (for a later export to a csv file).
I'm a beginner.. Help, thanks.
from BeautifulSoup import BeautifulSoup
import urllib2
import codecs
response = urllib2.urlopen('http://www.reality.sk/zakazka/0747-003578/predaj/1-izb-byt/kosice-mestska-cast-sever-sladkovicova-kosice-sever/art-real-1-izb-byt-sladkovicova-ul-kosice-sever')
html = response.read()
soup = BeautifulSoup(html)
tabulka = soup.find("table", {"class" : "detail-char"})
for row in tabulka.findAll('tr'):
    col = row.findAll('td')
    prvy = col[0].string.strip()
    druhy = col[1].string.strip()
    record = ([prvy], [druhy])
fl = codecs.open('output.txt', 'wb', 'utf8')
for rec in record:
    line = ''
    for val in rec:
        line += val + u';'
    fl.write(line + u'\r\n')
fl.close()
You are not keeping each record as you read it in. Try this, which stores the records in records:
from BeautifulSoup import BeautifulSoup
import urllib2
import codecs
response = urllib2.urlopen('http://www.reality.sk/zakazka/0747-003578/predaj/1-izb-byt/kosice-mestska-cast-sever-sladkovicova-kosice-sever/art-real-1-izb-byt-sladkovicova-ul-kosice-sever')
html = response.read()
soup = BeautifulSoup(html)
tabulka = soup.find("table", {"class" : "detail-char"})
records = [] # store all of the records in this list
for row in tabulka.findAll('tr'):
    col = row.findAll('td')
    prvy = col[0].string.strip()
    druhy = col[1].string.strip()
    record = '%s;%s' % (prvy, druhy) # store the record with a ';' between prvy and druhy
    records.append(record)
fl = codecs.open('output.txt', 'wb', 'utf8')
line = ';'.join(records)
fl.write(line + u'\r\n')
fl.close()
This could be cleaned up more, but I think it's what you are wanting.
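To see why the join step produces the single line the question asks for, here is a minimal sketch of just that step, with hypothetical (prvy, druhy) pairs standing in for the values BeautifulSoup would extract from the table rows:

```python
# Hypothetical parsed pairs; in the real script these come from the
# col[0] / col[1] cells of each table row.
pairs = [
    ('Celkova podlahova plocha bytu', '33m'),
    ('Vytah', 'Ano'),
    ('Forma vlastnictva', 'Osobne'),
]

# Format each (label, value) pair as 'label;value', then join all of
# them with ';' into one line, exactly as the answer's loop does.
line = ';'.join('%s;%s' % (prvy, druhy) for prvy, druhy in pairs)
print(line)
# Celkova podlahova plocha bytu;33m;Vytah;Ano;Forma vlastnictva;Osobne
```

For the csv export mentioned in the question, Python's built-in csv module with delimiter=';' would also handle quoting of values that themselves contain a semicolon.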