Removing newlines from messy strings in pandas dataframe cells?

Problem description

I've tried several ways of splitting and stripping the strings in my pandas dataframe to remove all the '\n' characters, but they fail to remove the ones that are attached to other words, even after I split the strings. The dataframe has a column that holds text captured from web pages with BeautifulSoup. BeautifulSoup has already cleaned the text a little, but it did not remove the newlines attached to other characters. My strings look a bit like this:

"hands-on\ndevelopment of games. We will study a variety of software technologies\nrelevant to games including programming languages, scripting\nlanguages, operating systems, file systems, networks, simulation\nengines, and multi-media design systems. We will also study some of\nthe underlying scientific concepts from computer science and related\nfields including"

Is there an easy Python way to remove these "\n" characters?

Thanks in advance!

Recommended answer

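The correct answer is:

df = df.replace(r'\\n',' ', regex=True)

As a regex, r'\\n' matches a literal backslash followed by the letter n, that is, a "\n" stored in the cell as two visible characters rather than as an actual newline character.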

I think you need replace:

df = df.replace('\n','', regex=True)

Or:

df = df.replace('\n',' ', regex=True)

Or:

df = df.replace(r'\\n',' ', regex=True)
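Which of these variants applies depends on what the cells actually contain. The sketch below is not from the original answer: it builds a hypothetical two-row frame (the column name A and the sample strings are assumptions) where one cell holds a real newline character and the other holds a backslash followed by n, then shows that each pattern only affects its own kind.

```python
import pandas as pd

# Hypothetical data: row 0 holds a real newline character,
# row 1 holds a backslash followed by the letter n (two characters).
df = pd.DataFrame({"A": ["hands-on\ndevelopment", "hands-on\\ndevelopment"]})

# Check which form each cell actually contains (plain substring search).
has_real_newline = df["A"].str.contains("\n", regex=False)
has_literal_backslash_n = df["A"].str.contains("\\n", regex=False)

# '\n' as a pattern only touches real newline characters.
real_fixed = df.replace("\n", " ", regex=True)

# r'\\n' as a regex only touches the literal backslash-n sequence.
literal_fixed = df.replace(r"\\n", " ", regex=True)

print(has_real_newline.tolist(), has_literal_backslash_n.tolist())
print(real_fixed["A"].tolist())
print(literal_fixed["A"].tolist())
```

If a replacement seems to do nothing, printing repr() of a single cell is a quick way to see whether the text holds a real newline or an escaped one.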

Sample:

import pandas as pd

# The \n escapes in this string literal are real newline characters.
text = '''hands-on\ndev nologies\nrelevant scripting\nlang
'''
df = pd.DataFrame({'A':[text]})
print (df)
                                                   A
0  hands-on\ndev nologies\nrelevant scripting\nla...

df = df.replace('\n',' ', regex=True)
print (df)
                                                A
0  hands-on dev nologies relevant scripting lang 
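When only one text column needs cleaning and it may contain both forms, a single pattern can cover them. This is a minimal sketch under assumptions rather than part of the original answer: it uses Series.str.replace instead of DataFrame.replace, and the column name text and the sample strings are hypothetical.

```python
import pandas as pd

# Hypothetical column mixing a real newline (row 0) with a literal
# backslash-n and a trailing real newline (row 1).
df = pd.DataFrame({
    "text": [
        "hands-on\ndevelopment of games",
        "scripting\\nlanguages, operating systems\n",
    ]
})

# Match either a literal backslash-n or a real newline, together with
# any whitespace around it, so the replacement leaves a single space.
pattern = r"\s*(?:\\n|\n)\s*"

# Replace within the one column, then trim leading/trailing spaces.
df["text"] = df["text"].str.replace(pattern, " ", regex=True).str.strip()

print(df["text"].tolist())
```

Restricting the replacement to one column leaves the other columns of the frame untouched, whereas df.replace(..., regex=True) applies the pattern to every string cell.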
