Assign column names to a CSV dataset
Question
I'm currently working on a dataset that consists of the following columns:
paper_id, word_attributes, class_label
There are 3700 word_attributes columns in total, each holding a binary value. Is there a way in Python to assign the column headers? Thanks.
Recommended answer
You can perhaps read the CSV file using:
import numpy as np
a = np.genfromtxt(filename, delimiter=',', dtype=None, names=True)
This creates a NumPy structured array (viewable as a numpy.recarray) in which each column can be accessed by its field name, like a['paper_id']. When dtype=None, "the dtypes will be determined by the contents of each column, individually".
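As a minimal sketch of the call above, the following reads a tiny in-memory CSV that already has a header row. The sample rows, the column names w0 and w1, and the use of io.StringIO in place of a real file are assumptions made for illustration; encoding="utf-8" is added so that text columns come back as ordinary strings.

```python
import io
import numpy as np

# Hypothetical sample: the real file would have 3700 word columns.
csv_text = """paper_id,w0,w1,class_label
101,0,1,AI
102,1,0,DB
103,1,1,AI
"""

# names=True takes the field names from the first line of the input.
a = np.genfromtxt(io.StringIO(csv_text), delimiter=",", dtype=None,
                  names=True, encoding="utf-8")

print(a.dtype.names)   # field names read from the header row
print(a["paper_id"])   # one column, selected by name
```

Each field keeps its own inferred dtype, so the integer columns and the text label can live in the same array.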
As suggested by @askewchan, you have to pass names=True so that genfromtxt takes the field names from the header row of the CSV file.
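If the file has no header row at all, which the question may imply, the names parameter of genfromtxt also accepts an explicit list of names. The sketch below assumes a headerless file laid out as paper_id, then the word columns, then class_label; the word_0, word_1, ... naming scheme and the sample rows are hypothetical, and n_words is kept small here where the real dataset would use 3700.

```python
import io
import numpy as np

n_words = 5  # assumption for the demo; the dataset in the question has 3700

# Build the header programmatically instead of typing thousands of names.
names = ["paper_id"] + [f"word_{i}" for i in range(n_words)] + ["class_label"]

# Hypothetical headerless data with 1 + n_words + 1 values per row.
csv_text = "101,0,1,0,0,1,AI\n102,1,0,0,1,0,DB\n"

a = np.genfromtxt(io.StringIO(csv_text), delimiter=",", dtype=None,
                  names=names, encoding="utf-8")

print(a.dtype.names)   # the names supplied above
print(a["word_1"])     # one binary word column, selected by name
```

Generating the list keeps the column order explicit and avoids maintaining a separate header file.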