In Python pandas, start row index from 1 instead of zero without creating additional column
Question
I know that I can reset the index like so:
df.reset_index(inplace=True)
but this will start the index from 0. I want to start it from 1. How do I do that without creating any extra columns, while keeping the index/reset_index functionality and options? I do not want to create a new dataframe, so inplace=True should still apply.
Recommended answer
Just assign a new index array directly:
df.index = np.arange(1, len(df) + 1)
Example:
In [151]:
df = pd.DataFrame({'a':np.random.randn(5)})
df
Out[151]:
a
0 0.443638
1 0.037882
2 -0.210275
3 -0.344092
4 0.997045
In [152]:
df.index = np.arange(1,len(df)+1)
df
Out[152]:
a
1 0.443638
2 0.037882
3 -0.210275
4 -0.344092
5 0.997045
Or simply:
df.index = df.index + 1
if the index is already 0-based.
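The two assignments above can be sketched side by side as follows. This is a minimal illustration, assuming a small frame with the default 0-based index; the column name `a` and its values are made up for the example. Note that `df.index + 1` shifts whatever labels are already present, so it only yields 1..n when the existing index is 0..n-1.

```python
import numpy as np
import pandas as pd

# Illustrative frame with the default 0-based index (values are arbitrary).
df = pd.DataFrame({"a": [10.0, 20.0, 30.0]})

# Option 1: assign a fresh 1..n index, regardless of the current labels.
df.index = np.arange(1, len(df) + 1)
first = list(df.index)

# Option 2: shift the existing labels by one; this frame starts at 0..n-1.
df2 = pd.DataFrame({"a": [10.0, 20.0, 30.0]})
df2.index = df2.index + 1
second = list(df2.index)

# Neither assignment adds a column.
columns_after = list(df.columns)
```

Both assignments modify the existing frame rather than returning a new one, which matches the in-place requirement in the question.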
Timings
For some reason I can't take timings on reset_index, but the following are timings on a 100,000-row df:
In [160]:
%timeit df.index = df.index + 1
The slowest run took 6.45 times longer than the fastest. This could mean that an intermediate result is being cached
10000 loops, best of 3: 107 µs per loop
In [161]:
%timeit df.index = np.arange(1, len(df) + 1)
10000 loops, best of 3: 154 µs per loop
So without the timing for reset_index I can't say definitively; however, it looks like just adding 1 to each index value will be faster if the index is already 0-based.
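For anyone who wants to repeat the comparison outside IPython, here is a rough sketch using the standard-library `timeit` module. The helper names (`shift_existing`, `assign_arange`, `best_time`) are hypothetical, and the absolute numbers will depend on the pandas version and the machine, so treat any result as indicative only.

```python
import timeit

import numpy as np
import pandas as pd

N_ROWS = 100_000  # same frame size as in the timings above


def shift_existing(df):
    # Add 1 to every existing label (meaningful for a 0-based index).
    df.index = df.index + 1


def assign_arange(df):
    # Build a fresh 1..n index from scratch.
    df.index = np.arange(1, len(df) + 1)


def best_time(func, repeats=3, number=100):
    # Best-of-N timing, in seconds per call. Repeated calls keep
    # reassigning the index of the same frame, as %timeit does.
    df = pd.DataFrame({"a": np.random.randn(N_ROWS)})
    runs = timeit.repeat(lambda: func(df), repeat=repeats, number=number)
    return min(runs) / number


t_shift = best_time(shift_existing)
t_arange = best_time(assign_arange)
print(f"df.index + 1         : {t_shift * 1e6:.1f} µs per call")
print(f"np.arange(1, n + 1)  : {t_arange * 1e6:.1f} µs per call")
```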