Adding my own description attribute to a Pandas DataFrame
Question
I am retrieving some web data, parsing it, and storing the output as a Pandas DataFrame into an HDF5 file. Right before I write the DataFrame
into the H5 file, I add my own description string to annotate some metadata about where the data came from and whether anything went wrong while parsing it.
In [1]: my_data_frame.desc = "Some string about the data"
In [2]: my_data_frame.desc
Out[2]: 'Some string about the data'
In [3]: print(type(my_data_frame))
<class 'pandas.core.frame.DataFrame'>
However, after loading the same data with pandas.io.pytables.HDFStore(), my added desc attribute is missing and I get the error AttributeError: 'DataFrame' object has no attribute 'desc', as if I had never added this new attribute.
How can I get my metadata descriptions to persist as an extra attribute of the DataFrame object? (Or is there some existing, recognized attribute of a DataFrame that I can hijack for my metadata purposes?)
Recommended answer
Adding DataFrame metadata or per-column metadata is on the roadmap but hasn't been implemented yet. I'm open to ideas about what the API should look like, though.
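In the meantime, one possible workaround is to attach the description to the stored HDF5 node rather than to the DataFrame object itself. The sketch below assumes PyTables is installed and uses the storer's attrs object that HDFStore.get_storer() exposes. The helper names save_with_desc and load_with_desc, the attribute name desc, and the key "web_data" are hypothetical and chosen only for illustration.

```python
import pandas as pd


def save_with_desc(path, key, df, desc):
    """Write df to an HDF5 file and attach desc to the stored node."""
    with pd.HDFStore(path) as store:
        store.put(key, df)
        # The storer's attrs object wraps the attributes of the HDF5 node,
        # so values set here are written into the file next to the data.
        store.get_storer(key).attrs.desc = desc


def load_with_desc(path, key):
    """Read df back together with its description (None if absent)."""
    with pd.HDFStore(path) as store:
        df = store[key]
        desc = getattr(store.get_storer(key).attrs, "desc", None)
    return df, desc
```

A call might look like save_with_desc("data.h5", "web_data", my_data_frame, "Some string about the data"), followed later by my_data_frame, desc = load_with_desc("data.h5", "web_data"). The description lives in the file, not on the DataFrame, so it has to be read back explicitly and will not follow the frame through further pandas operations.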