Julia DataFrames equivalent of pandas pct_change()

Question

Currently, I have written the following function to calculate percent change:

function pct_change(input::AbstractVector{<:Number})::AbstractVector{Number}
    result = [NaN]
    for i in 2:length(input)
        push!(result, (input[i] - input[i-1])/abs(input[i-1]))
    end
    return result
end

This works as expected. Is there a built-in function for Julia DataFrames, similar to pandas pct_change, that I can use directly? Alternatively, is there a better approach, or any improvement I could make to the function above?

Recommended answer

This is a very specific function. It is not provided by DataFrames.jl, but TimeSeries.jl has it. Here is an example:

julia> using TimeSeries, Dates

julia> ta = TimeArray(Date(2018, 1, 1):Day(1):Date(2018, 12, 31), 1:365);

julia> percentchange(ta);

(percentchange has some further options that control what is calculated.)

The drawback is that it accepts only TimeArray objects, and that it drops the periods for which a percent change cannot be calculated, whereas pandas retains them.

If you want your own definition, consider denoting the first value as missing rather than NaN. Also, your function does not preserve the most accurate representation of the numbers: if you wanted to use BigFloat, or exact calculations with the Rational type, the results would be converted to Float64. Here are two alternative implementations that avoid these problems:

function pct_change(input::AbstractVector{<:Number})
    res = @view(input[2:end]) ./ @view(input[1:end-1]) .- 1
    [missing; res]
end

function pct_change(input::AbstractVector{<:Number})
    [i == 1 ? missing : (input[i]-input[i-1])/input[i-1] for i in eachindex(input)]
end

With either implementation you now get:

julia> pct_change(1:10)
10-element Array{Union{Missing, Float64},1}:
  missing
 1.0
 0.5
 0.33333333333333326
 0.25
 0.19999999999999996
 0.16666666666666674
 0.1428571428571428
 0.125
 0.11111111111111116

julia> pct_change(big(1):10)
10-element Array{Union{Missing, BigFloat},1}:
  missing
 1.0
 0.50
 0.3333333333333333333333333333333333333333333333333333333333333333333333333333391
 0.25
 0.2000000000000000000000000000000000000000000000000000000000000000000000000000069
 0.1666666666666666666666666666666666666666666666666666666666666666666666666666609
 0.1428571428571428571428571428571428571428571428571428571428571428571428571428547
 0.125
 0.111111111111111111111111111111111111111111111111111111111111111111111111111113

julia> pct_change(1//1:10)
10-element Array{Union{Missing, Rational{Int64}},1}:
   missing
 1//1
 1//2
 1//3
 1//4
 1//5
 1//6
 1//7
 1//8
 1//9

The correct values are returned in each case, and the element type of the input is preserved.
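pandas pct_change also takes a periods argument that sets how many rows back the comparison reaches. The custom approach above can be extended in the same direction. The following is only a sketch under stated assumptions: pct_change_lag and its periods keyword are hypothetical names chosen for illustration and are not part of DataFrames.jl or TimeSeries.jl.

```julia
# Hypothetical helper (not part of DataFrames.jl or TimeSeries.jl):
# percent change relative to the element `periods` positions earlier.
# The leading positions, where no earlier element exists, are `missing`.
function pct_change_lag(input::AbstractVector{<:Number}; periods::Integer = 1)
    periods >= 1 || throw(ArgumentError("periods must be at least 1"))
    # First index that has an element `periods` positions before it.
    first_valid = firstindex(input) + periods
    map(eachindex(input)) do i
        i < first_valid ? missing : (input[i] - input[i - periods]) / input[i - periods]
    end
end

# Exact arithmetic is kept for Rational input.
pct_change_lag([1//1, 2//1, 3//1])

# Compare each value with the one two positions earlier.
pct_change_lag([1, 2, 4, 8]; periods = 2)
```

Because the result is an ordinary vector, it can be assigned to a data frame column like any other vector, and skipmissing can be used when the leading missing entries should be ignored in later computations.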
