Return only the last day of the year with pandas?

Problem description

I made an API GET request to the financialmodelingprep API for the historical close prices of a specified company's stock. It returns every recorded date for the stock. The problem is that I only need the last recorded date of each of the last 5 years, in order to compare it with the financial statements. Does anyone know how to filter the dataset to get the last date of each year without specifying the exact date? The goal is to export the table to CSV format and later combine it with the data for other companies.

Is there a better way to get the result that I need?

import pandas as pd

symbols = ["MMM", "ABT", "ABBV", "ABMD", "ACN"]

API_KEY = 'my_key'
api_stock_price_url = "https://financialmodelingprep.com/api/v3/historical-price-full/" + symbols[0] + "?serietype=line&apikey=" + API_KEY
company_stock_price = pd.read_json(api_stock_price_url)
# each entry of "historical" is a record with "date" and "close"; expand them into columns
date_and_close = pd.json_normalize(company_stock_price["historical"])
company_stock_price["close"] = date_and_close["close"]

# convert the date strings to datetimes before using them as the index
company_stock_price.index = pd.DatetimeIndex(date_and_close["date"], name="date")

# delete the unwanted column
del company_stock_price['historical']

The resulting company_stock_price:

            symbol       close
date
2020-12-04     MMM  172.460007
2020-12-03     MMM  171.830002
2020-12-02     MMM  171.850006
2020-12-01     MMM  170.520004
2020-11-30     MMM  172.729996
...            ...         ...
1970-09-14     MMM    0.322600
1970-09-11     MMM    0.321700
1970-09-10     MMM    0.323500
1970-09-09     MMM    0.324000
1970-09-08     MMM    0.318800
12675 rows × 2 columns

The desired output would look something like this:

            symbol       close
date
2020-12-31     MMM  172.460007
2019-12-31     MMM  131.112123
2018-12-31     MMM  123.123123
2017-12-31     MMM  111.111111
2016-12-31     MMM  101.111111

The problem is that I cannot specify the exact date, because some of the S&P 500 companies (which I intend to loop over) are missing the stock price for that date in the returned API responses.

Recommended answer

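# df is company_stock_price with "date" as a regular column, e.g. company_stock_price.reset_index()
dates = pd.to_datetime(df['date'])
df2 = df.loc[dates.groupby(dates.dt.year).idxmax()].reset_index(drop=True)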

        date symbol       close
0 1970-09-14    MMM    0.322600
1 2020-12-04    MMM  172.460007

Here the dataframe is grouped by the year of the date column, and the row with the latest date in each year is returned. You can then sort the result by date and take the last five rows:

df2.sort_values('date').iloc[-5:]
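
Since the frame in the question keeps the dates in its index, a variant that groups on the index may be closer to what is needed there. The following is only a minimal sketch, not the answerer's code: the helper name last_close_per_year, the tickers AAA and BBB, and all prices are made up for illustration, and it assumes each frame has a DatetimeIndex.

```python
import pandas as pd


def last_close_per_year(prices: pd.DataFrame, n_years: int = 5) -> pd.DataFrame:
    """Return the last recorded row of each calendar year, latest n_years only.

    Hypothetical helper: `prices` is assumed to have a DatetimeIndex.
    """
    ordered = prices.sort_index()  # oldest first
    # tail(1) keeps the final row of every year group, whatever its exact date
    year_end = ordered.groupby(ordered.index.year).tail(1)
    return year_end.tail(n_years)


# Made-up data: AAA has no row for 2019-12-31, so an exact-date lookup
# would miss that year, while the per-year grouping still finds 2019-12-30.
frames = {
    "AAA": pd.DataFrame(
        {"symbol": "AAA", "close": [10.0, 11.0, 12.0, 13.0]},
        index=pd.DatetimeIndex(
            ["2019-12-27", "2019-12-30", "2020-06-15", "2020-12-04"], name="date"
        ),
    ),
    "BBB": pd.DataFrame(
        {"symbol": "BBB", "close": [20.0, 21.0, 22.0]},
        index=pd.DatetimeIndex(
            ["2019-12-31", "2020-12-03", "2020-12-04"], name="date"
        ),
    ),
}

# One block of year-end rows per company, stacked into a single table.
combined = pd.concat([last_close_per_year(df) for df in frames.values()])

# With no path, to_csv returns the CSV text; pass a filename to write a file.
csv_text = combined.to_csv()
print(combined)
```

In a loop over the real symbols, each API response would take the place of one of the made-up frames, and the symbol column keeps the companies apart in the combined table.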
