Plot the count of purchases per day and month in R


Problem description

The dataset represents which client (Cstid = Customer id) has made a purchase on which day.

I am having difficulty finding a way to plot the number of purchases per day and per month.

Below is an example of the dataset. In total I have 7505 observations.

  "Cstid"  "Date"
1  4195     19/08/17
2  3937     16/08/17
3  2163     07/09/17
4  3407     08/10/16
5  4576     04/11/16
6  3164     16/12/16
7  3174     18/08/15
8  1670     18/08/15
9  1671     18/08/15
10 4199     19/07/14
11 4196     19/08/14
12 6725     14/09/14
13 3471     14/09/13

I started by converting the Date column:

 df$Date <- as.Date(df$Date, '%d/%m/%Y')

Then I counted the number of observations per date using:

library(data.table)
dt <- as.data.table(df)
dt[,days:=format(Date,"%d.%m.%Y")]
dt1 <- data.frame(dt[,.N,by=days])

And tried to plot with:

plot(dt1$days, dt1$N,type="l")

But I get the following error message:

Error in plot.window(...) : need finite 'xlim' values
In addition: Warning messages:
1: In xy.coords(x, y, xlabel, ylabel, log) : NAs introduced by coercion
2: In min(x) : no non-missing arguments to min; returning Inf
3: In max(x) : no non-missing arguments to max; returning -Inf

Could someone please tell me how I should proceed?

Solution

You need to specify a 2-digit year using %y (lower case) in order to convert the Date column from character to class Date.
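To illustrate the difference between the two format codes, here is a minimal sketch using one value from the question's sample data. The variable names `x`, `wrong`, and `right` are made up for this example.

```r
# One date string taken from the question's sample data
x <- "19/08/17"

# %Y expects a year with century, so "17" is not read as 2017
wrong <- as.Date(x, format = "%d/%m/%Y")

# %y reads a 2-digit year; R maps 00-68 to 20xx and 69-99 to 19xx
right <- as.Date(x, format = "%d/%m/%y")

# Inspect the year component of each parsed date
format(wrong, "%Y")
format(right, "%Y")
```

Comparing the two printed years should make it clear which conversion matches the intended dates.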

If ggplot2 is used for plotting, it will also do the aggregation. geom_bar() uses the count statistic by default, which spares us from computing the aggregates (counts) beforehand.
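For comparison, the counting that geom_bar() does can also be done by hand. The following is only a sketch in base R, assuming a small subset of the question's rows; the `daily` data frame and its column `N` are names chosen for this example. The point is that the x values stay of class Date rather than being formatted back to character.

```r
# A few rows from the question's sample data
df <- data.frame(
  Cstid = c(3174, 1670, 1671, 4199, 4196),
  Date  = c("18/08/15", "18/08/15", "18/08/15", "19/07/14", "19/08/14")
)
df$Date <- as.Date(df$Date, format = "%d/%m/%y")

# Count purchases per day; table() names the counts by date string
counts <- table(df$Date)
daily <- data.frame(
  Date = as.Date(names(counts)),
  N    = as.integer(counts)
)

# The x values are real Dates, so base plot() can compute axis limits
plot(daily$Date, daily$N, type = "h", xlab = "Date", ylab = "Purchases")
```

With ggplot2 this intermediate data frame is not needed, which is why the answer skips it.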

For aggregation by month, I recommend mapping all dates to the first day of their month, e.g., using lubridate::floor_date(). This keeps a continuous scale on the x-axis.
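To show what "mapping to the first day of the month" means, here is a small sketch that does the same thing in base R, as a stand-in for lubridate::floor_date() when lubridate is not installed. The names `dates`, `month_start`, and `monthly` are assumptions for this example.

```r
# A few dates from the question's sample data
dates <- as.Date(c("18/08/15", "18/08/15", "19/07/14", "19/08/14"),
                 format = "%d/%m/%y")

# Replace the day with 01 to get the first day of each date's month
# (base R stand-in for lubridate::floor_date(dates, "month"))
month_start <- as.Date(format(dates, "%Y-%m-01"))

# Purchases per month, keyed by a Date so the axis stays continuous
monthly <- as.data.frame(table(month_start), stringsAsFactors = FALSE)
monthly$month_start <- as.Date(monthly$month_start)
monthly
```

Because `month_start` is still a Date, months without purchases leave a visible gap on the axis instead of being silently skipped.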

So, the complete code would be:

# convert Date from character to class Date using a 2 digit year
df$Date <- as.Date(df$Date, '%d/%m/%y')

library(ggplot2)
# aggregate by day
ggplot(df) + aes(x = Date) + 
  geom_bar()

# aggregate by month
ggplot(df) + aes(x = lubridate::floor_date(Date, "month")) + 
  geom_bar()

Alternatively, the dates can be mapped to a character month label, e.g., "2015-08". But this turns the x-axis into a discrete scale, which no longer shows the elapsed time between purchases:

# aggregate by month using format() to create discrete scale
ggplot(df) + aes(x = format(Date, "%Y-%m")) + 
  geom_bar()
