Pandas vs. Numpy Dataframes

Views: 48 Published: 2020/5/24 2:34:31 Tags: python pandas numpy multidimensional-array dataframe


Question

Look at the following lines of code:

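df2 = df.copy()
# Divide each row by the previous row (by position), then subtract 1
df2[1:] = df[1:] / df[:-1].values - 1
# .ix was removed in pandas 1.0; .iloc selects the first row by position
df2.iloc[0, :] = 0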

Our instructor said we need to use the .values attribute to access the underlying numpy array; otherwise our code would not work.

I understand that a pandas DataFrame has an underlying representation as a numpy array, but I don't understand why we cannot operate directly on the pandas DataFrame using just slicing.

Could you explain this to me?

Recommended answer

pandas focuses on tabular data structures, and when it performs operations (addition, subtraction, etc.) it matches values by their labels, not by their positions.

Consider the following DataFrame:

df = pd.DataFrame(np.random.randn(5, 3), index=list('abcde'), columns=list('xyz'))

Here, df[1:] is:

df[1:]
Out: 
          x         y         z
b  1.003035  0.172960  1.160033
c  0.117608 -1.114294 -0.557413
d -1.312315  1.171520 -1.034012
e -0.380719 -0.422896  1.073535

And df[:-1] is:

df[:-1]
Out: 
          x         y         z
a  1.367916  1.087607 -0.625777
b  1.003035  0.172960  1.160033
c  0.117608 -1.114294 -0.557413
d -1.312315  1.171520 -1.034012

If you do df[1:] / df[:-1], pandas divides row b by row b, row c by row c, and row d by row d. Rows a and e each appear in only one of the two DataFrames, so there is no matching row to pair them with and the result for them is NaN:

df[1:] / df[:-1]
Out: 
     x    y    z
a  NaN  NaN  NaN
b  1.0  1.0  1.0
c  1.0  1.0  1.0
d  1.0  1.0  1.0
e  NaN  NaN  NaN
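
The same alignment behaviour can be reproduced with fixed numbers instead of random ones. This is only a minimal sketch: the frame `small` and its values are made up for illustration and are not part of the original answer.

```python
import numpy as np
import pandas as pd

# Hypothetical data, chosen so the result is easy to check by hand
small = pd.DataFrame({"x": [1.0, 2.0, 4.0]}, index=list("abc"))

top = small[1:]      # rows b and c
bottom = small[:-1]  # rows a and b

# Both operands are DataFrames, so pandas pairs rows by index label.
# Only label 'b' exists on both sides; 'a' and 'c' have no partner.
aligned = top / bottom

print(aligned)
```

Row b is divided by itself, while rows a and c come back as NaN because each exists in only one operand.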

If you just want element-wise division that ignores the labels, accessing the underlying numpy array of one of the frames through .values is a way of telling pandas to do so. Since numpy arrays have no labels, pandas simply operates element by element, by position:

df[1:]/df[:-1].values
Out: 
           x         y         z
b   0.733258  0.159028 -1.853749
c   0.117252 -6.442482 -0.480515
d -11.158359 -1.051357  1.855018
e   0.290112 -0.360981 -1.038223
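
As a small check of the positional behaviour, the sketch below repeats the division on a made-up frame (`prices` is a hypothetical name, not from the original answer). It also shows two variations that go beyond the original answer and are offered only as options: `.to_numpy()`, which the pandas documentation recommends over `.values`, and `shift`, which keeps the labels and moves the data instead.

```python
import numpy as np
import pandas as pd

# Hypothetical data, chosen so the result is easy to check by hand
prices = pd.DataFrame({"x": [1.0, 2.0, 4.0]}, index=list("abc"))

# The right operand is a plain ndarray, so there are no labels to align on:
# the division is done by position (b / a, c / b) and keeps the left index.
ratio = prices[1:] / prices[:-1].values

# Same computation with .to_numpy() instead of .values
ratio_alt = prices[1:] / prices[:-1].to_numpy()

# Label-preserving alternative: shift the rows down by one, divide,
# then drop the first row, which has no previous row to divide by.
ratio_shift = (prices / prices.shift(1))[1:]

print(ratio)
```

All three results hold the same values under the labels b and c; which spelling to use is mostly a matter of taste and pandas version.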
