Removing space in dataframe python

Question

I am getting an error in my code because I tried to make a dataframe by calling an element from a csv. I have two columns I call from a file: CompanyName and QualityIssue. There are three types of Quality issues: Equipment Quality, User, and Neither. I run into problems trying to make a dataframe df.Equipment Quality, which obviously doesn't work because there is a space there. I want to take Equipment Quality from the original file and replace the space with an underscore.

input:

Top Calling Customers,         Equipment Quality,    User,    Neither,
Customer 3,                      2,           2,        0,
Customer 1,                      0,           2,        1,
Customer 2,                      0,           1,        0,
Customer 4,                      0,           1,        0,

Here is my code:

import numpy as np
import pandas as pd
import pandas.util.testing as tm; tm.N = 3

# Get the data.
data = pd.DataFrame.from_csv('MYDATA.csv')   
# Group the data by calling CompanyName and QualityIssue columns.
byqualityissue = data.groupby(["CompanyName", "QualityIssue"]).size() 
# Make a pandas dataframe of the grouped data.
df = pd.DataFrame(byqualityissue) 
# Change the formatting of the data to match what I want SpiderPlot to read.
formatted = df.unstack(level=-1)[0]  
# Replace NaN values with zero.
formatted[np.isnan(formatted)] = 0 
includingtotals = pd.concat([formatted,pd.DataFrame(formatted.sum(axis=1), 
                             columns=['Total'])], axis=1)
sortedtotal = includingtotals.sort_index(by=['Total'], ascending=[False])
sortedtotal.to_csv('byqualityissue.csv')

This seems to be a frequently asked question and I tried lots of the solutions but they didn't seem to work. Here is what I tried:

with open('byqualityissue.csv', 'r') as f:
    reader = csv.reader(f, delimiter=',', quoting=csv.QUOTE_NONE)
    return [[x.strip() for x in row] for row in reader]
    sentence.replace(" ", "_")

And

sortedtotal['QualityIssue'] = sortedtotal['QualityIssue'].map(lambda x: x.rstrip(' ')) 

And what I thought was the most promising from here http://pandas.pydata.org/pandas-docs/stable/text.html:

formatted.columns = formatted.columns.str.strip().str.replace(' ', '_')

but I got this error: AttributeError: 'Index' object has no attribute 'str'

Thanks for your help in advance!

Solution

Try:

formatted.columns = [x.strip().replace(' ', '_') for x in formatted.columns]
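As a rough illustration of the one-liner above, here is a minimal sketch on a small frame. The `formatted` frame below is a hypothetical stand-in built by hand from the question's sample input (it is not the asker's real grouped data), and the leading spaces in the column names are an assumption meant to mimic headers read from a CSV with padding after the commas. The list comprehension uses only plain Python string methods, so it should not depend on whether the installed pandas version provides `Index.str`.

```python
import pandas as pd

# Hypothetical stand-in for the question's `formatted` frame; the padded
# column names imitate headers read from a CSV with spaces after commas.
formatted = pd.DataFrame(
    {
        " Equipment Quality": [2, 0, 0, 0],
        " User": [2, 2, 1, 1],
        " Neither": [0, 1, 0, 0],
    },
    index=["Customer 3", "Customer 1", "Customer 2", "Customer 4"],
)

# Strip surrounding whitespace, then replace inner spaces with underscores.
formatted.columns = [x.strip().replace(" ", "_") for x in formatted.columns]

print(list(formatted.columns))

# Attribute access works once the name is a valid Python identifier.
print(formatted.Equipment_Quality.tolist())
```

Note that renaming is only needed for attribute-style access. Bracket indexing such as `formatted["Equipment Quality"]` accepts names containing spaces, provided the string matches the column name exactly, including any leading or trailing whitespace.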
