Remove comma from objects in a pandas dataframe column

Question

I have imported a CSV file using pandas.

My dataframe has multiple columns titled "Farm", "Total Apples" and "Good Apples".

The numerical data imported for "Total Apples" and "Good Apples" contains commas as thousands separators, e.g. 1,200. I want to remove the commas so the data looks like 1200.

The dtype of the "Total Apples" and "Good Apples" columns shows up as object.

I tried using df.str.replace and df.strip but have not been successful.

I also tried to change the variable type from object to string and from object to integer, but couldn't make it work.

Any help would be much appreciated.

****EDIT****

Excerpt of data from the CSV file imported using pd.read_csv:

Farm_Name   Total Apples    Good Apples
EM  18,327  14,176
EE  18,785  14,146
IW  635 486
L   33,929  24,586
NE  12,497  9,609
NW  30,756  23,765
SC  8,515   6,438
SE  22,896  17,914
SW  11,972  9,114
WM  27,251  20,931
Y   21,495  16,662

Recommended answer

I think you can add the parameter thousands to read_csv; the values in the columns Total Apples and Good Apples are then converted to integers:

Your separator may be different, so don't forget to change it. If the separator is whitespace, change it to sep='\s+'.

import pandas as pd
import io

temp=u"""Farm_Name;Total Apples;Good Apples
EM;18,327;14,176
EE;18,785;14,146
IW;635;486
L;33,929;24,586
NE;12,497;9,609
NW;30,756;23,765
SC;8,515;6,438
SE;22,896;17,914
SW;11,972;9,114
WM;27,251;20,931
Y;21,495;16,662"""
# After testing, replace io.StringIO(temp) with the filename
df = pd.read_csv(io.StringIO(temp), sep=";", thousands=',')
print(df)
   Farm_Name  Total Apples  Good Apples
0         EM         18327        14176
1         EE         18785        14146
2         IW           635          486
3          L         33929        24586
4         NE         12497         9609
5         NW         30756        23765
6         SC          8515         6438
7         SE         22896        17914
8         SW         11972         9114
9         WM         27251        20931
10         Y         21495        16662

print(df.info())
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 11 entries, 0 to 10
Data columns (total 3 columns):
Farm_Name       11 non-null object
Total Apples    11 non-null int64
Good Apples     11 non-null int64
dtypes: int64(2), object(1)
memory usage: 336.0+ bytes
None
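
If the data has already been loaded and the columns are of object dtype, the commas can also be stripped after the fact. The following is only a minimal sketch, not part of the original answer: it assumes a small sample in the shape of the excerpt above (the variable names raw and cols are made up for illustration) and uses Series.str.replace followed by astype.

```python
import io
import pandas as pd

# Hypothetical sample in the shape of the question's excerpt; it is read
# as strings (no `thousands` argument) to reproduce the object dtype.
raw = """Farm_Name;Total Apples;Good Apples
EM;18,327;14,176
IW;635;486
SC;8,515;6,438"""
df = pd.read_csv(io.StringIO(raw), sep=";", dtype=str)

cols = ["Total Apples", "Good Apples"]
for col in cols:
    # .str is an accessor on a Series (a single column), not on the DataFrame,
    # so the replacement is applied column by column.
    df[col] = df[col].str.replace(",", "", regex=False).astype("int64")

print(df.dtypes)
```

This also suggests why df.str.replace did not work in the question: a DataFrame has no .str accessor, so the call has to go through an individual column such as df["Total Apples"].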
