In Python pandas, start row index from 1 instead of zero without creating additional column


Question

I know that I can reset the index like so:

df.reset_index(inplace=True)

But this starts the index from 0, and I want it to start from 1. How do I do that without creating any extra columns, while keeping the index/reset_index functionality and options? I do not want to create a new dataframe, so inplace=True should still apply.

Recommended answer

Just assign a new index array directly:

df.index = np.arange(1, len(df) + 1)

Example:

In [151]:

df = pd.DataFrame({'a':np.random.randn(5)})
df
Out[151]:
          a
0  0.443638
1  0.037882
2 -0.210275
3 -0.344092
4  0.997045
In [152]:

df.index = np.arange(1,len(df)+1)
df
Out[152]:
          a
1  0.443638
2  0.037882
3 -0.210275
4 -0.344092
5  0.997045

Or simply:

df.index = df.index + 1

if the index is already 0-based.
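
The `df.index + 1` shortcut only yields 1..n when the current index is the default 0..n-1. The following is a minimal sketch of how the two assignments behave on a frame whose index is not 0-based; the sample data and the variable names `df` and `df2` are made up for illustration. It assumes `reset_index(drop=True, inplace=True)` is an acceptable first step, since `drop=True` discards the old index rather than adding it as a column.

```python
import numpy as np
import pandas as pd

# Hypothetical sample frame with a non-default index (e.g. left over from filtering).
df = pd.DataFrame({"a": [10.0, 20.0, 30.0]}, index=[7, 3, 9])

# Restore the default 0-based index in place without adding a column,
# then shift every label by one.
df.reset_index(drop=True, inplace=True)
df.index = df.index + 1

# Direct assignment works regardless of what the current index looks like.
df2 = pd.DataFrame({"a": [10.0, 20.0, 30.0]}, index=[7, 3, 9])
df2.index = np.arange(1, len(df2) + 1)
```

Both frames end up with the labels 1, 2, 3 and a single column `a`; neither approach creates a new DataFrame object.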

Timings

For some reason I can't get timings for reset_index, but the following are timings on a 100,000-row df:

In [160]:

%timeit df.index = df.index + 1
The slowest run took 6.45 times longer than the fastest. This could mean that an intermediate result is being cached 
10000 loops, best of 3: 107 µs per loop


In [161]:

%timeit df.index = np.arange(1, len(df) + 1)
10000 loops, best of 3: 154 µs per loop

So without a timing for reset_index I can't say definitively, but it looks like just adding 1 to each index value is faster if the index is already 0-based.
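
One possible way to fill in the missing reset_index number is to time all three variants with the standard-library `timeit` clock, rebuilding the frame before each run so that in-place changes do not carry over between runs. This is only a rough sketch: the function names and `N_ROWS` are illustrative, the frame construction mirrors the example above, and the measured values will vary by machine and pandas version.

```python
import timeit

import numpy as np
import pandas as pd

N_ROWS = 100_000  # same row count as the timings above


def make_df():
    # Fresh frame with the default 0-based index.
    return pd.DataFrame({"a": np.random.randn(N_ROWS)})


def shift_index(df):
    df.index = df.index + 1


def assign_arange(df):
    df.index = np.arange(1, len(df) + 1)


def reset_then_shift(df):
    # drop=True discards the old index instead of adding it as a column.
    df.reset_index(drop=True, inplace=True)
    df.index = df.index + 1


def best_time(func, repeats=5):
    # Best wall-clock time over several runs, each on a new frame.
    times = []
    for _ in range(repeats):
        df = make_df()
        start = timeit.default_timer()
        func(df)
        times.append(timeit.default_timer() - start)
    return min(times)


timings = {
    func.__name__: best_time(func)
    for func in (shift_index, assign_arange, reset_then_shift)
}
for name, seconds in timings.items():
    print(f"{name}: {seconds * 1e6:.1f} µs")
```

Frame construction happens outside the timed region, so only the index assignment itself is measured.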
