How to read a column of csv as dtype list using pandas?

Problem description

I have a csv file with 3 columns, where each row of Col3 holds a list of values, as the following table structure shows:

Col1,Col2,Col3
1,a1,"['Proj1', 'Proj2']"
2,a2,"['Proj3', 'Proj2']"
3,a3,"['Proj4', 'Proj1']"
4,a4,"['Proj3', 'Proj4']"
5,a5,"['Proj5', 'Proj2']"

Whenever I try to read this csv, Col3 is read as a str object rather than a list. I tried to change the dtype of that column to list, but got an AttributeError, as shown below:

import pandas as pd

df = pd.read_csv("inputfile.csv")
df.Col3.dtype = list

AttributeError                            Traceback (most recent call last)
<ipython-input-19-6f9ec76b1b30> in <module>()
----> 1 df.Col3.dtype = list

C:\Python27\lib\site-packages\pandas\core\generic.pyc in __setattr__(self, name, value)
   1953                     object.__setattr__(self, name, value)
   1954             except (AttributeError, TypeError):
-> 1955                 object.__setattr__(self, name, value)
   1956 
   1957     #----------------------------------------------------------------------

AttributeError: can't set attribute

It would be really great if you could guide me on how to go about this.

Recommended answer

You could use the ast lib:

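from ast import literal_eval

# df is the DataFrame already loaded with pd.read_csv in the question.
# literal_eval parses each string such as "['Proj1', 'Proj2']" into a real list.
df.Col3 = df.Col3.apply(literal_eval)
print(df.Col3[0][0])
# Output: Proj1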

You can also do it when you create the dataframe from the csv, using converters:

df = pd.read_csv("in.csv", converters={"Col3": literal_eval})
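
As a self-contained illustration of the converters approach, here is a minimal sketch that uses the first rows of the sample data from the question. The `io.StringIO` buffer and the `csv_text` name are assumptions made only so the snippet runs without a file on disk; with a real file you would pass its path instead.

```python
import io
from ast import literal_eval

import pandas as pd

# Sample rows copied from the question; StringIO stands in for a real file.
csv_text = """Col1,Col2,Col3
1,a1,"['Proj1', 'Proj2']"
2,a2,"['Proj3', 'Proj2']"
"""

# literal_eval runs on every Col3 cell while the file is being parsed.
df = pd.read_csv(io.StringIO(csv_text), converters={"Col3": literal_eval})

first = df.Col3[0]
print(type(first).__name__, first)  # list ['Proj1', 'Proj2']
```

Note that the column's dtype is still reported as object; it is the individual cells that are now Python lists rather than strings.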

If you are sure the format is the same for all strings, stripping and splitting will be a lot faster:

df = pd.read_csv("in.csv", converters={"Col3": lambda x: x.strip("[]").split(", ")})

But you will end up with the strings still wrapped in their quotes, for example "'Proj1'" instead of "Proj1".
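
To make the quoting issue concrete, the sketch below compares the plain strip-and-split converter with a variant that also strips the single quotes from each item. The helper names `split_keep_quotes` and `split_drop_quotes` are hypothetical and not part of the original answer, and both assume every cell looks exactly like `['a', 'b']`, with no empty lists and no commas or quotes inside the values.

```python
import io

import pandas as pd

# Sample rows copied from the question; StringIO stands in for a real file.
csv_text = """Col1,Col2,Col3
1,a1,"['Proj1', 'Proj2']"
2,a2,"['Proj3', 'Proj2']"
"""


def split_keep_quotes(cell):
    # Same logic as the lambda in the answer: the items keep their quotes.
    return cell.strip("[]").split(", ")


def split_drop_quotes(cell):
    # Hypothetical variant: also remove the single quotes around each item.
    return [item.strip("'") for item in cell.strip("[]").split(", ")]


quoted = pd.read_csv(io.StringIO(csv_text), converters={"Col3": split_keep_quotes})
clean = pd.read_csv(io.StringIO(csv_text), converters={"Col3": split_drop_quotes})

print(quoted.Col3[0])  # ["'Proj1'", "'Proj2'"]
print(clean.Col3[0])   # ['Proj1', 'Proj2']
```

If the data might contain anything beyond this exact format, literal_eval is the safer choice despite being slower.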
