How to remove newlines in pandas DataFrame columns?


Problem description

I want to shorten and clean up a CSV file so I can use it in ElasticSearch. Some of the DataFrame cells contain line breaks, and because of them the CSV cannot be parsed into ElasticSearch. I have shortened the CSV with pandas and tried to remove the newlines, but that part is not working.

The code is as follows:

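import pandas as pd

# Load the original CSV export
f = pd.read_csv("test.csv")

# Columns to keep in the shortened file
keep_col = ["Plugin ID", "CVE", "CVSS", "Risk", "Host", "Protocol", "Port", "Name", "Synopsis", "Description", "Solution"]

# Replace newlines inside the cells with a space, then write the result
new_f = f[keep_col].replace('\\n', ' ', regex=True)
new_f.to_csv("newFile.csv", index=False)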

The shortening works, but I still have newlines in Description, Synopsis and Solution. Any idea how to solve this with Python/pandas? The CSV has about 100k entries, so the line breaks have to be removed from every entry.

Recommended answer

From what I've learnt, a count as the third parameter belongs to Python's built-in str.replace(), where it limits how many occurrences are replaced. DataFrame.replace() is a different method: regex=True tells pandas to treat the first argument as a pattern and replace matches inside each string cell, whereas without it only cells whose whole value equals the search string are replaced. So keep regex=True. If line breaks still remain, the cells may also contain carriage returns (\r), which a character class covers:

new_f = f[keep_col].replace(r'[\r\n]+', ' ', regex=True)

This should help.
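
To see how the regex argument of DataFrame.replace affects line breaks inside cells, here is a minimal sketch. The sample DataFrame is made up and only stands in for test.csv, and count_cells_with_line_breaks is a hypothetical helper written for this illustration, not a pandas function. It assumes the cells may hold either \n or \r\n.

```python
import pandas as pd

# Hypothetical sample data standing in for the real test.csv.
df = pd.DataFrame({
    "Name": ["plugin-a", "plugin-b"],
    "Description": ["first line\nsecond line", "windows\r\nline break"],
    "Port": [80, 443],
})

# Without regex=True, only cells whose whole value equals the search
# string are replaced, so line breaks inside longer text stay in place.
unchanged = df.replace("\n", " ")

# With regex=True the pattern is applied inside each string cell.
# [\r\n]+ turns any run of CR/LF characters into a single space.
cleaned = df.replace(r"[\r\n]+", " ", regex=True)


def count_cells_with_line_breaks(frame, columns):
    """Count cells in the given columns that still contain CR or LF."""
    return sum(
        int(frame[col].astype(str).str.contains(r"[\r\n]").sum())
        for col in columns
    )


text_columns = ["Name", "Description"]
print(count_cells_with_line_breaks(df, text_columns))
print(count_cells_with_line_breaks(cleaned, text_columns))
```

Running the same count on the real columns (Synopsis, Description, Solution) before and after the replacement is a cheap way to confirm that nothing was missed before writing the file with to_csv.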
