R tbats 模型季节性客户标记 无结果 [英] R tbats Model Seasonal Customer Flag No Results

查看:58
本文介绍了R tbats 模型季节性客户标记 无结果的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在关注此博客以识别我的时间序列数据中的季节性客户:

导入pandas作为pa将 numpy 导入为 np将 pyodbc 导入为 pycnxn = py.connect('DRIVER='+driver+';SERVER='+server+';PORT=1433;DATABASE='+database+';UID='+用户名+';PWD='+密码)original = pa.read_sql_query('SELECT s.customer_id, s.yr, s.mnth, Case when s.usage<0 then 0 else s.usage end as usage FROM dbo.Seasonal s Join ( Select Top 2000 customer_id, SUM(用法)作为用法来自 dbo.Seasonal where Yr!=2018 Group by customer_id ) t ON s.customer_id = t.customer_id Where yr!= 2018 Order by customer_id, yr, mnth', cnxn)grouped = original.groupby(by='customer_id')def yearmonth_to_justmonth(年,月):返回年 * 12 + 月 - 1def fillInForOwner(group):min = group.head(1).iloc[0]max = group.tail(1).iloc[0]minMonths = yearmonth_to_justmonth(min.yr, min.mnth)maxMonths = yearmonth_to_justmonth(max.yr, max.mnth)fill_index = pa.Index(np.arange(minMonths, maxMonths, 1), name="filled_months")group['months'] = group.yr * 12 + group.mnth - 1group = group.set_index('月')group = group.reindex(fill_index)group.customer_id = min.customer_idgroup.yr = group.index//12group.mnth = group.index % 12 + 1group.usage = np.where(group.usage.isnull(), 0, group.usage).astype(int)返回组fillIn = grouped.apply(fillInForOwner)newIndex = pa.Index(np.arange(fillIn.customer_id.count()))将 rpy2 导入为 r从 rpy2.robjects.packages 导入导入器从 rpy2.robjects 导入 r、pandas2ri、globalenvpandas2ri.activate()base = importr('base')colorspace = importr('colorspace')预测 = 进口商('预测')times = importr('timeSeries')统计=进口商('统计')输出文件 = 'results.csv'df_list = []对于 customerid,fillIn.groupby(by=['customer_id']) 中的 dataForCustomer:startYear = dataForCustomer.head(1).iloc[0].yrstartMonth = dataForCustomer.head(1).iloc[0].mnthendYear = dataForCustomer.tail(1).iloc[0].yrendMonth = dataForCustomer.tail(1).iloc[0].mnthcustomerTS = stats.ts(dataForCustomer.usage.astype(int),start=base.c(startYear,startMonth),end=base.c(endYear, endMonth),频率=12)r.assign('customerTS', customerTS)尝试:季节性 = r('''fit<-tbats(customerTS,seasonal.periods = 12,use.parallel = TRUE)适合$季节性''')除了:季节性 = 1df_list.append({'customer_id': customerid, '季节性': 季节性})打印(f' {customerid} | {seasonal} ')季节性输出 = pa.DataFrame(df_list)打印(季节性输出)季节性输出.to_csv(输出文件)

解决方案

回过头来回答我是如何让它工作的,只是将原始"数据帧传递到 for 循环中.我的数据已经有空的 $0 月份,所以我不需要运行那部分代码.谢谢大家的帮助

I am following this blog to identify seasonal customers in my time series data: https://www.kristenkehrer.com/seasonality-code

My code is shamelessly nearly identical to the blog, with some small tweaks, code is below. I am able to run the code entirely, for 2000 customers. A few hours later, 0 customers were flagged as seasonal in my results.

Manually looking at customers data over time, I do believe I have many examples of seasonal customers that should have been picked up. Below is a sample of the data I am using.

Am I missing something stupid? am I in way over my head to even try this, being very new to python?

Note that I am adding the "0 months" in my data source, but I don't think it would hurt anything for that function to check again. I'm also not including the data source credentials step.

Thank you

import pandas as pa
import numpy as np
import pyodbc as py

cnxn = py.connect('DRIVER='+driver+';SERVER='+server+';PORT=1433;DATABASE='+database+';UID='+username+';PWD='+ password)

original = pa.read_sql_query('SELECT s.customer_id, s.yr, s.mnth, Case when s.usage<0 then 0 else s.usage end as usage  FROM dbo.Seasonal  s   Join   ( Select Top 2000  customer_id, SUM(usage) as usage   From dbo.Seasonal where Yr!=2018    Group by customer_id ) t ON s.customer_id = t.customer_id Where yr!= 2018 Order by customer_id, yr, mnth', cnxn)

grouped = original.groupby(by='customer_id')

def yearmonth_to_justmonth(year, month):

    return year * 12 + month - 1

def fillInForOwner(group):
    min = group.head(1).iloc[0]
    max = group.tail(1).iloc[0]
    minMonths = yearmonth_to_justmonth(min.yr, min.mnth)
    maxMonths = yearmonth_to_justmonth(max.yr, max.mnth)
    filled_index = pa.Index(np.arange(minMonths, maxMonths, 1), name="filled_months")
    group['months'] = group.yr * 12 + group.mnth - 1
    group = group.set_index('months')
    group = group.reindex(filled_index)
    group.customer_id = min.customer_id
    group.yr = group.index // 12
    group.mnth = group.index % 12 + 1
    group.usage = np.where(group.usage.isnull(), 0, group.usage).astype(int)
    return group

filledIn = grouped.apply(fillInForOwner)
newIndex = pa.Index(np.arange(filledIn.customer_id.count()))

import rpy2 as r
from rpy2.robjects.packages import importr
from rpy2.robjects import r, pandas2ri, globalenv
pandas2ri.activate()

base = importr('base')
colorspace = importr('colorspace')
forecast = importr('forecast')
times = importr('timeSeries')
stats = importr('stats')

outfile = 'results.csv'
df_list = []

for customerid, dataForCustomer in filledIn.groupby(by=['customer_id']):
    startYear = dataForCustomer.head(1).iloc[0].yr
    startMonth = dataForCustomer.head(1).iloc[0].mnth
    endYear = dataForCustomer.tail(1).iloc[0].yr
    endMonth = dataForCustomer.tail(1).iloc[0].mnth


    customerTS = stats.ts(dataForCustomer.usage.astype(int),
                      start=base.c(startYear,startMonth),
                      end=base.c(endYear, endMonth), 
                      frequency=12)
    r.assign('customerTS', customerTS)

    try:
        seasonal = r('''
                        fit<-tbats(customerTS, seasonal.periods = 12, 
                        use.parallel = TRUE)
                        fit$seasonal
                     ''')
    except: 
        seasonal = 1
    df_list.append({'customer_id': customerid, 'seasonal': seasonal})
    print(f' {customerid} | {seasonal} ')

seasonal_output = pa.DataFrame(df_list)
print(seasonal_output)
seasonal_output.to_csv(outfile)

解决方案

Circling back to answer how I got this to work was by just passing the "original" dataframe into the for loop. My data already had the empty $0 months so I didn't need that part of the code to run. Thank you all for your help

这篇关于R tbats 模型季节性客户标记 无结果的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆