Converting a list of dicts to a Pandas dataframe
Question
I have a list of Python dicts, each with the same keys,
    import numpy as np
    import pandas as pd

    dict_keys = ['k1', 'k2', 'k3', 'k4', 'k5', 'k6']  # More like 30 keys in practice
    data = []
    for i in range(20):  # More like 3000 in practice
        data.append({k: np.random.randint(100) for k in dict_keys})
and would like to use it to create a corresponding Pandas dataframe with a subset of the keys. My current approach is to take each dict from the list one at a time and append it to the dataframe using
    df = pd.DataFrame(columns=['k1', 'k2', 'k5', 'k6'])
    for d in data:
        df = df.append({k: d[k] for k in list(df.columns)}, ignore_index=True)
        # In practice, there are some calculations on some of the values here
but this is very slow (the actual list, and the dicts it contains, are both quite large).
Is there a better, faster (and more idiomatic) method for iterating through a list of dictionaries and adding them as rows to a Pandas dataframe?
Answer
Simply pass data to DataFrame's __init__, or to DataFrame.from_records (either would work).
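As a minimal sketch of both options, the block below builds sample data shaped like the question's and keeps only a subset of the keys. The seeded generator, the `wanted` list, and the name `df_alt` are assumptions made for illustration, not part of the original answer. Separately, the `DataFrame.append` method used in the question was deprecated in pandas 1.4 and removed in pandas 2.0, which is one more reason to build the frame in a single call.

```python
import numpy as np
import pandas as pd

# Sample data shaped like the question's: a list of dicts sharing the same keys.
dict_keys = ['k1', 'k2', 'k3', 'k4', 'k5', 'k6']
rng = np.random.default_rng(0)  # seeded only so the sketch is reproducible
data = [{k: int(rng.integers(100)) for k in dict_keys} for _ in range(20)]

wanted = ['k1', 'k2', 'k5', 'k6']  # the subset of keys to keep as columns

# Option 1: the constructor; `columns` selects (and orders) the keys to keep.
df = pd.DataFrame(data, columns=wanted)

# Option 2: from_records builds every column, then the subset is selected.
df_alt = pd.DataFrame.from_records(data)[wanted]
```

Either way the frame is built in one call, which avoids copying the whole frame once per row as the append loop does.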
You might also want to set an index, e.g. DataFrame.from_records(data, index='k1').
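A small sketch of setting the index while building the frame, using hand-written sample rows (the values and the names `indexed` and `same` are illustrative assumptions). It also shows what should be an equivalent route through the constructor followed by `set_index`.

```python
import pandas as pd

# Three hand-written rows standing in for the real list of dicts.
data = [
    {'k1': 10, 'k2': 1, 'k5': 7, 'k6': 3},
    {'k1': 20, 'k2': 4, 'k5': 8, 'k6': 5},
    {'k1': 30, 'k2': 9, 'k5': 2, 'k6': 6},
]

# 'k1' becomes the index and is no longer a regular column.
indexed = pd.DataFrame.from_records(data, index='k1')

# The same result via the constructor plus set_index.
same = pd.DataFrame(data).set_index('k1')
```

With `k1` as the index, a row can then be looked up by its `k1` value, for example `indexed.loc[20]`.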
If you also need to perform some calculations, it is usually easier and more convenient to do them on the DataFrame after creating it. Leverage pandas!
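The question does not say which calculations are needed, so the sketch below uses two made-up ones (a sum of two columns and a rescaled column) purely to illustrate working on whole columns after the frame exists. The column names `k1_plus_k2` and `k5_fraction` are hypothetical.

```python
import pandas as pd

# Hand-written rows standing in for the real list of dicts.
data = [
    {'k1': 10, 'k2': 1, 'k5': 50, 'k6': 3},
    {'k1': 20, 'k2': 4, 'k5': 25, 'k6': 5},
    {'k1': 30, 'k2': 9, 'k5': 75, 'k6': 6},
]

df = pd.DataFrame(data, columns=['k1', 'k2', 'k5', 'k6'])

# Hypothetical calculations, applied to entire columns at once
# instead of row by row inside a Python loop.
df['k1_plus_k2'] = df['k1'] + df['k2']
df['k5_fraction'] = df['k5'] / 100
```

Column-wise operations like these run in vectorized form, which is typically much faster than computing values per dict before each row is added.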