Pandas: Adding a new column to a dataframe that is a copy of the index column

Question

I have a dataframe that I want to plot with matplotlib, but the time values are in the index rather than in a regular column, so I cannot select them by column name.

The dataframe is named df3. When I try the following:

plt.plot(df3['magnetic_mag mean'], df3['YYYY-MO-DD HH-MI-SS_SSS'], label='FDI')

I get an error, as expected:

KeyError: 'YYYY-MO-DD HH-MI-SS_SSS'

So what I want to do is add an extra column to my dataframe (named 'Time') that is just a copy of the index column.

How can I do this?

This is the complete code:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

#Importing the csv file into df
df = pd.read_csv('university2.csv', sep=";", skiprows=1)

#Changing datetime
df['YYYY-MO-DD HH-MI-SS_SSS'] = pd.to_datetime(df['YYYY-MO-DD HH-MI-SS_SSS'], 
                                               format='%Y-%m-%d %H:%M:%S:%f')

#Set index from column
df = df.set_index('YYYY-MO-DD HH-MI-SS_SSS')

#Add Magnetic Magnitude Column
df['magnetic_mag'] = np.sqrt(df['MAGNETIC FIELD X (μT)']**2 + df['MAGNETIC FIELD Y (μT)']**2 + df['MAGNETIC FIELD Z (μT)']**2)

#Subtract Earth's Average Magnetic Field from 'magnetic_mag'
df['magnetic_mag'] = df['magnetic_mag'] - 30

#Copy interesting values
df2 = df[[ 'ATMOSPHERIC PRESSURE (hPa)',
          'TEMPERATURE (C)', 'magnetic_mag']].copy()

#Hourly Average and Standard Deviation for interesting values 
df3 = df2.resample('H').agg(['mean','std'])
df3.columns = [' '.join(col) for col in df3.columns]

df3.reset_index()
plt.plot(df3['magnetic_mag mean'], df3['YYYY-MO-DD HH-MI-SS_SSS'], label='FDI')  

Thanks!!

Recommended answer

I think you need reset_index, and you need to assign its result back, because it returns a new dataframe instead of modifying df3:

df3 = df3.reset_index()

Another possible solution is to pass inplace=True, but I don't think inplace is good practice:

df3.reset_index(inplace=True)
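
The difference between the three calls can be checked on a small example. This is only a sketch: the two-row dataframe below is a hypothetical stand-in for df3, and the variable names unchanged, moved, and result are made up for illustration.

```python
import pandas as pd

# Hypothetical stand-in for df3: two hourly rows with a named DatetimeIndex.
time_col = "YYYY-MO-DD HH-MI-SS_SSS"
idx = pd.date_range("2017-01-01", periods=2, freq="60min", name=time_col)
df3 = pd.DataFrame({"magnetic_mag mean": [1.5, 2.5]}, index=idx)

# Calling reset_index() without assigning the result leaves df3 unchanged,
# so the time values are still only in the index.
df3.reset_index()
unchanged = time_col in df3.columns

# Assigning the result gives a dataframe where the index became a column.
df3_flat = df3.reset_index()
moved = time_col in df3_flat.columns

# With inplace=True the dataframe is modified and the call returns None.
df3_copy = df3.copy()
result = df3_copy.reset_index(inplace=True)
```

Here unchanged is False and moved is True, which matches the KeyError in the question: the unassigned call on its own never adds the column.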

But if you need a new column, use:

df3['new'] = df3.index
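
As a minimal sketch of this approach, assuming a small made-up dataframe in place of df3 and using the column name 'Time' from the question:

```python
import pandas as pd

# Hypothetical stand-in for df3: three hourly rows with a named DatetimeIndex.
idx = pd.date_range("2017-01-01", periods=3, freq="60min",
                    name="YYYY-MO-DD HH-MI-SS_SSS")
df3 = pd.DataFrame({"magnetic_mag mean": [1.5, 2.5, 3.5]}, index=idx)

# Copy the index into a regular column; the index itself stays in place.
df3["Time"] = df3.index
```

With the column in place, a call such as plt.plot(df3['Time'], df3['magnetic_mag mean'], label='FDI') can select the time values by name. Passing df3.index directly as the x values is another option that avoids the copy.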

I think you can also simplify the read_csv call:

df = pd.read_csv('university2.csv', 
                 sep=";", 
                 skiprows=1,
                 index_col='YYYY-MO-DD HH-MI-SS_SSS',
                 parse_dates=['YYYY-MO-DD HH-MI-SS_SSS']) #parse_dates takes a list of columns; if the dates are not parsed, use pd.to_datetime

Then omit:

#Changing datetime
df['YYYY-MO-DD HH-MI-SS_SSS'] = pd.to_datetime(df['YYYY-MO-DD HH-MI-SS_SSS'], 
                                               format='%Y-%m-%d %H:%M:%S:%f')
#Set index from column
df = df.set_index('YYYY-MO-DD HH-MI-SS_SSS')
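
If automatic date parsing does not handle the colon before the fractional seconds, one fallback is to load the time column as the index and convert it afterwards with the explicit format from the question. The sketch below assumes a file layout inferred from the question's read_csv arguments; the sample rows and the first line to skip are invented for illustration.

```python
import io
import pandas as pd

# Hypothetical sample mimicking the file: one line to skip, then a ';'-separated table.
raw = (
    "exported by some logger\n"
    "YYYY-MO-DD HH-MI-SS_SSS;TEMPERATURE (C)\n"
    "2017-01-01 10:00:00:123;21.5\n"
    "2017-01-01 10:00:01:456;21.7\n"
)

time_col = "YYYY-MO-DD HH-MI-SS_SSS"

# Read the time column straight into the index, still as plain strings.
df = pd.read_csv(io.StringIO(raw), sep=";", skiprows=1, index_col=time_col)

# Convert the index using the explicit format from the question.
df.index = pd.to_datetime(df.index, format="%Y-%m-%d %H:%M:%S:%f")
```

This keeps the set_index step out of the script while still producing a DatetimeIndex that resample can use.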
