如何在数据框中取出列索引名称 [英] How to take out the column index name in dataframe

查看：63 发布时间：2020/5/24 3:56:19 python pandas

本文介绍了如何在数据框中取出列索引名称的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

                     Open     High    Low    Close   Volume      Adj Close
Date                        
1990-01-02 00:00:00  35.25   37.50   35.00   37.25   6555600     8.70
1990-01-03 00:00:00  38.00   38.00   37.50   37.50   7444400     8.76
1990-01-04 00:00:00  38.25   38.75   37.25   37.63   7928800     8.79
1990-01-05 00:00:00  37.75   38.25   37.00   37.75   4406400     8.82
1990-01-08 00:00:00  37.50   38.00   37.00   38.00   3643200     8.88

如何摆脱上面dataframe中的日期索引名称?它应该与其他列名称位于同一行，但不是，这会引起问题.

How can I get rid of the Date index name in the above dataframe? It should be in the same row as the other column names but its not which is causing problems.

谢谢

推荐答案

简短的回答:您不能，而且不清楚为什么会引起问题". "Date"名称用于命名DataFrame的索引，该索引与任何列均不同.它专门为此偏移量打印，因此您不会将其与框架的列混淆.您不会使用DataFrame['Date']如下所示分割日期:

Short answer: you can't and it's not clear why this could ever "cause problems". The 'Date' name is naming the Index of the DataFrame, which is different from any of the columns. It gets printed with this offset specifically so you will not confuse it with a column of the frame. You would not slice into the date with DataFrame['Date'] as per below:

>>> import numpy as np; import pandas; import datetime

>>> dfrm = pandas.DataFrame(np.random.rand(10,3), 
... columns=['A','B','C'], 
... index = pandas.Index(
... [datetime.date(2012,6,elem) for elem in range(1,11)],
... name="Date"))

>>> dfrm
                   A         B         C
Date                                    
2012-06-01  0.283724  0.863012  0.798891
2012-06-02  0.097231  0.277564  0.872306
2012-06-03  0.821461  0.499485  0.126441
2012-06-04  0.887782  0.389486  0.374118
2012-06-05  0.248065  0.032287  0.850939
2012-06-06  0.101917  0.121171  0.577643
2012-06-07  0.225278  0.161301  0.708996
2012-06-08  0.906042  0.828814  0.247564
2012-06-09  0.733363  0.924076  0.393353
2012-06-10  0.273837  0.318013  0.754807

>>> dfrm['Date']
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/frame.py", line 1458, in __getitem__
    return self._get_item_cache(key)
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/generic.py", line 294, in _get_item_cache
    values = self._data.get(item)
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/internals.py", line 625, in get
    _, block = self._find_block(item)
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/internals.py", line 715, in _find_block
    self._check_have(item)
  File "/usr/local/lib/python2.7/dist-packages/pandas/core/internals.py", line 722, in _check_have
    raise KeyError('no item named %s' % str(item))
KeyError: 'no item named Date'

更长的答案:

您可以通过将索引添加到自己的列中来更改DataFrame(如果您希望它以这种方式打印).例如:

You can change your DataFrame by adding the index into its own column if you'd like it to print that way. For example:

>>> dfrm['Date'] = dfrm.index

>>> dfrm
                   A         B         C        Date
Date                                                
2012-06-01  0.283724  0.863012  0.798891  2012-06-01
2012-06-02  0.097231  0.277564  0.872306  2012-06-02
2012-06-03  0.821461  0.499485  0.126441  2012-06-03
2012-06-04  0.887782  0.389486  0.374118  2012-06-04
2012-06-05  0.248065  0.032287  0.850939  2012-06-05
2012-06-06  0.101917  0.121171  0.577643  2012-06-06
2012-06-07  0.225278  0.161301  0.708996  2012-06-07
2012-06-08  0.906042  0.828814  0.247564  2012-06-08
2012-06-09  0.733363  0.924076  0.393353  2012-06-09
2012-06-10  0.273837  0.318013  0.754807  2012-06-10

在此之后，您可以简单地更改索引的名称，以便不打印任何内容:

After this, you could simply change the name of the index so that nothing prints:

>>> dfrm.reindex(pandas.Series(dfrm.index.values, name=''))
                   A         B         C        Date

2012-06-01  0.283724  0.863012  0.798891  2012-06-01
2012-06-02  0.097231  0.277564  0.872306  2012-06-02
2012-06-03  0.821461  0.499485  0.126441  2012-06-03
2012-06-04  0.887782  0.389486  0.374118  2012-06-04
2012-06-05  0.248065  0.032287  0.850939  2012-06-05
2012-06-06  0.101917  0.121171  0.577643  2012-06-06
2012-06-07  0.225278  0.161301  0.708996  2012-06-07
2012-06-08  0.906042  0.828814  0.247564  2012-06-08
2012-06-09  0.733363  0.924076  0.393353  2012-06-09
2012-06-10  0.273837  0.318013  0.754807  2012-06-10

这似乎有些矫kill过正.另一种选择是在将日期"添加为列之后，将索引更改为整数或其他形式:

This seems a bit overkill. Another option is to just change the index to integers or something after adding the Date as a column:

>>> dfrm.reset_index()

或者如果您已经将索引手动移动到列中，那么

or if you already moved the index into a column manually, then just

>>> dfrm.index = range(len(dfrm))

>>> dfrm
          A         B         C        Date
0  0.283724  0.863012  0.798891  2012-06-01
1  0.097231  0.277564  0.872306  2012-06-02
2  0.821461  0.499485  0.126441  2012-06-03
3  0.887782  0.389486  0.374118  2012-06-04
4  0.248065  0.032287  0.850939  2012-06-05
5  0.101917  0.121171  0.577643  2012-06-06
6  0.225278  0.161301  0.708996  2012-06-07
7  0.906042  0.828814  0.247564  2012-06-08
8  0.733363  0.924076  0.393353  2012-06-09
9  0.273837  0.318013  0.754807  2012-06-10

或者，如果您关心列的显示顺序，则执行以下操作:

Or the following if you care about the order the columns appear:

>>> dfrm.ix[:,[-1]+range(len(dfrm.columns)-1)]
         Date         A         B         C
0  2012-06-01  0.283724  0.863012  0.798891
1  2012-06-02  0.097231  0.277564  0.872306
2  2012-06-03  0.821461  0.499485  0.126441
3  2012-06-04  0.887782  0.389486  0.374118
4  2012-06-05  0.248065  0.032287  0.850939
5  2012-06-06  0.101917  0.121171  0.577643
6  2012-06-07  0.225278  0.161301  0.708996
7  2012-06-08  0.906042  0.828814  0.247564
8  2012-06-09  0.733363  0.924076  0.393353
9  2012-06-10  0.273837  0.318013  0.754807

已添加

以下是一些有用的函数，它们可以包含在iPython配置脚本中(以便在启动时加载它们)，或者放在可以在Python中工作时轻松加载的模块中.

Here are a few helpful functions to include in an iPython configuration script (so that they are loaded upon startup), or to put in a module you can easily load when working in Python.

###########
# Imports #
###########
import pandas
import datetime
import numpy as np
from dateutil import relativedelta
from pandas.io import data as pdata


############################################
# Functions to retrieve Yahoo finance data #
############################################

# Utility to get generic stock symbol data from Yahoo finance.
# Starts two days prior to present (or most recent business day)
# and goes back a specified number of days.
def getStockSymbolData(sym_list, end_date=datetime.date.today()+relativedelta.relativedelta(days=-1), num_dates = 30):

    dReader = pdata.DataReader
    start_date = end_date + relativedelta.relativedelta(days=-num_dates)
    return dict( (sym, dReader(sym, "yahoo", start=start_date, end=end_date)) for sym in sym_list )                     
###

# Utility function to get some AAPL data when needed
# for testing.
def getAAPL(end_date=datetime.date.today()+relativedelta.relativedelta(days=-1), num_dates = 30):

    dReader = pdata.DataReader
    return getStockSymbolData(['AAPL'], end_date=end_date, num_dates=num_dates)
###

我还在下面开设了一个班级，以保存一些普通股数据:

I also made a class below to hold some data for common stocks:

#####
# Define a 'Stock' class that can hold simple info
# about a security, like SEDOL and CUSIP info. This
# is mainly for debugging things and quickly getting
# info for a single security.
class MyStock():

    def __init__(self, ticker='None', sedol='None', country='None'):
        self.ticker = ticker
        self.sedol=sedol
        self.country = country
    ###


    def getData(self, end_date=datetime.date.today()+relativedelta.relativedelta(days=-1), num_dates = 30):
        return pandas.DataFrame(getStockSymbolData([self.ticker], end_date=end_date, num_dates=num_dates)[self.ticker])
    ###


#####
# Make some default stock objects for common stocks.
AAPL = MyStock(ticker='AAPL', sedol='03783310', country='US')
SAP  = MyStock(ticker='SAP',  sedol='484628',   country='DE')

这篇关于如何在数据框中取出列索引名称的文章就介绍到这了，希望我们推荐的答案对大家有所帮助，也希望大家多多支持IT屋！

查看全文

如何在数据框中取出列索引名称 [英] How to take out the column index name in dataframe

问题描述

推荐答案

相关文章

Python最新文章

热门教程

热门工具

登录关闭

如何在数据框中取出列索引名称 [英] How to take out the column index name in dataframe

问题描述

推荐答案

相关文章

Python最新文章

热门教程

热门工具

登录 关闭

登录关闭