Setting columns for an empty pandas dataframe
Question
This is something that I'm confused about...
import pandas as pd
# this works fine
df1 = pd.DataFrame(columns=['A','B'])
# but let's say I have this
df2 = pd.DataFrame([])
# this doesn't work!
df2.columns = ['A','B']
# ValueError: Length mismatch: Expected axis has 0 elements, new values have 2 elements
Why doesn't this work? What can I do instead? Is something like this the only way?
if len(df2.index) == 0:
    df2 = pd.DataFrame(columns=['A','B'])
else:
    df2.columns = ['A','B']
There must be a more elegant way.
Thanks for your help!
Someone asked why I would do this at all:
df2 = pd.DataFrame([])
The reason is that I'm actually doing something like this:
df2 = pd.DataFrame(data)
... where data could be an empty list of lists, but in most cases it is not. So yes, I could do:
if len(data) > 0:
    df2 = pd.DataFrame(data, columns=['A','B'])
else:
    df2 = pd.DataFrame(columns=['A','B'])
... but this doesn't seem very DRY (and certainly not concise).
Let me know if you have any questions. Thanks!
Recommended answer
This looks like a bug in pandas. All of these work:
pd.DataFrame(columns=['A', 'B'])
pd.DataFrame({}, columns=['A', 'B'])
pd.DataFrame(None, columns=['A', 'B'])
But this does not:
pd.DataFrame([], columns=['A', 'B'])
Until it's fixed, I suggest something like this:
if len(data) == 0:
    data = None
df2 = pd.DataFrame(data, columns=['A','B'])
Or:
df2 = pd.DataFrame(data if len(data) > 0 else None, columns=['A', 'B'])
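If this pattern shows up in several places, the one-liner above can be wrapped in a small function. The following is only a sketch: `frame_with_columns` is a hypothetical helper name, not part of pandas, and it assumes `data` is a list of rows (possibly empty) as described in the question. Newer pandas releases may accept an empty list directly, but the helper does not rely on that.

```python
import pandas as pd


def frame_with_columns(data, columns):
    """Build a DataFrame with fixed column names, even when `data` is empty.

    Hypothetical helper for illustration; not a pandas API.
    """
    # Swap an empty list of rows for None, which the answer above reports
    # as working together with the `columns` argument.
    return pd.DataFrame(data if len(data) > 0 else None, columns=columns)


# Same call for the empty case and the populated case.
empty = frame_with_columns([], ['A', 'B'])
filled = frame_with_columns([[1, 2], [3, 4]], ['A', 'B'])

print(list(empty.columns), len(empty))
print(filled.shape)
```

Both calls go through the same code path, so the caller no longer needs the if/else branch from the question.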