Creating a Pandas Series with a period in the name

Question

I ran the following Python code, which creates a Pandas DataFrame with two Series (a and b), and then attempts to create two new Series (c and d):

import pandas as pd
df = pd.DataFrame({'a':[1, 2, 3], 'b':[4, 5, 6]})
df['c'] = df.a + df.b
df.d = df.a + df.b

My understanding is that if a Pandas Series is part of a DataFrame, and the Series name does not have any spaces (and does not collide with an existing attribute or method), the Series can be accessed as an attribute of the DataFrame. As such, I expected that line 3 would work (since that's how you create a new Pandas Series), and I expected that line 4 would fail (since the d attribute does not exist for the DataFrame until after you execute that line of code).

To my surprise, line 4 did not result in an error. Instead, the DataFrame now contains three Series:

>>> df
   a  b  c
0  1  4  5
1  2  5  7
2  3  6  9

And there is a new object, df.d, which is a Pandas Series:

>>> df.d
0    5
1    7
2    9
dtype: int64

>>> type(df.d)
pandas.core.series.Series

My questions are as follows:

  • Why did line 4 not result in an error?
  • Is df.d now a "normal" Pandas Series with all of the regular Series functionality?
  • Is df.d in any way "connected" to the df DataFrame, or is it a completely independent object?

My motivation in asking this question is simply that I want to better understand Pandas, and not because there is a particular use case for line 4.

My Python version is 2.7.11, and my Pandas version is 0.17.1.

Recommended answer

To create a new column by assignment, you need to use bracket notation, e.g. df['d'] = ...

d is now an attribute of the DataFrame object df, not a column. As with any Python object, you can assign arbitrary attributes to it, which is why no error was raised. It just didn't behave as you expected...

>>> df.some_property = 'What?'
>>> df.some_property
'What?'

This is a common area of misunderstanding for beginners to Pandas. Always use bracket notation for assignment. The dot notation is for convenience when referencing the dataframe/series. To be safe, you could always use bracket notation.
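To make the difference concrete, here is a minimal sketch of the three cases, reusing the DataFrame from the question. Two points are my own additions rather than the answerer's: the dot-assignment to the existing column a, and the warnings block. The latter assumes a newer pandas than the asker's 0.17.1, since later versions emit a UserWarning when a list-like value is assigned to a new attribute name.

```python
import warnings

import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6]})

# Bracket notation creates a real column.
df['c'] = df['a'] + df['b']

# Dot notation with a new name only sets a Python attribute on the object.
# Newer pandas versions emit a UserWarning here; it is silenced for the demo.
with warnings.catch_warnings():
    warnings.simplefilter('ignore')
    df.d = df['a'] + df['b']

# Dot notation with an existing column name does update that column.
df.a = [10, 20, 30]

print(list(df.columns))   # ['a', 'b', 'c'] -- 'd' is not a column
print(df['a'].tolist())   # [10, 20, 30]
```

Dot assignment only happens to work when the column already exists, which is why bracket notation is the safer habit.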

And yes, df.d per your example is a normal series that is now an unexpected property of the dataframe. This series is its own object, connected by the reference you created when you assigned it to df.
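One way to check how df.d is "connected" to df is sketched below. The variable names (total, same_object, and so on) are illustrative only, and the warnings block again assumes a newer pandas than the one in the question.

```python
import warnings

import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6]})
total = df['a'] + df['b']

with warnings.catch_warnings():
    warnings.simplefilter('ignore')  # newer pandas warns on this assignment
    df.d = total

same_object = df.d is total          # the attribute is just a reference
is_column = 'd' in df.columns        # it never became a column

df['a'] = [100, 200, 300]            # changing df does not touch df.d
unchanged = df.d.tolist()

# Plain instance attributes are not carried over to a copy.
survives_copy = hasattr(df.copy(), 'd')

print(same_object, is_column, unchanged, survives_copy)
```

In this sketch the attribute behaves like any other Python reference: it points at the same Series object, it is unaffected by later changes to the DataFrame's columns, and pandas operations that build a new DataFrame should not be expected to preserve it.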
