How to strip line breaks from the BeautifulSoup get_text() method

Question

After scraping a web page, I get the following output:

       text
Out[50]: 
['\nAbsolute FreeBSD, 2nd Edition\n',
'\nAbsolute OpenBSD, 2nd Edition\n',
'\nAndroid Security Internals\n',
'\nApple Confidential 2.0\n',
'\nArduino Playground\n',
'\nArduino Project Handbook\n',
'\nArduino Workshop\n',
'\nArt of Assembly Language, 2nd Edition\n',
'\nArt of Debugging\n',
'\nArt of Interactive Design\n',]

I need to strip the \n from each item in the list above while iterating over it. Here is my code:

text = []
for name in web_text:
    a = name.get_text()
    text.append(a)

Recommended answer

Rather than calling .strip() explicitly, use the strip argument of get_text():

a = name.get_text(strip=True)

This also removes extra whitespace and newline characters from the text of any child elements.
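To make the difference concrete, here is a minimal sketch. The HTML snippet, the "title" class, and the variable names raw, stripped, and spaced are assumptions made up for illustration; the original question does not show the page markup. It compares plain get_text() with get_text(strip=True), and also shows passing a separator, which can help when an element contains nested tags.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the scraped page (not from the question).
html = """
<ul>
  <li class="title">
Absolute FreeBSD, 2nd Edition
</li>
  <li class="title">
Arduino <b> Project </b> Handbook
</li>
</ul>
"""

soup = BeautifulSoup(html, "html.parser")
web_text = soup.find_all("li", class_="title")

# Plain get_text() keeps the surrounding newlines, as in the question.
raw = [name.get_text() for name in web_text]

# strip=True strips every text fragment before joining them.
stripped = [name.get_text(strip=True) for name in web_text]

# With nested tags the stripped fragments are joined with no separator,
# so pass one explicitly if the words should stay apart.
spaced = [name.get_text(" ", strip=True) for name in web_text]

print(raw[0])       # still wrapped in newlines
print(stripped[1])  # fragments joined without spaces
print(spaced[1])    # fragments joined with a single space
```

For elements that contain only text, as in the question, strip=True alone should be enough. The separator only matters when an element has child tags, because each fragment is stripped individually before the pieces are joined.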
