Julia DataFrames equivalent of pandas pct_change()
Question
Currently, I have written the below function for percent change calculation:
function pct_change(input::AbstractVector{<:Number})::AbstractVector{Number}
    result = [NaN]
    for i in 2:length(input)
        push!(result, (input[i] - input[i-1])/abs(input[i-1]))
    end
    return result
end
This works as expected, but I would like to know whether Julia DataFrames has a built-in function similar to pandas pct_change that I can use directly. Alternatively, is there a better approach, or are there improvements I can make to the function above?
Recommended answer
This is a very specific function. It is not provided in DataFrames.jl, but it is available in TimeSeries.jl. Here is an example:
julia> using TimeSeries, Dates
julia> ta = TimeArray(Date(2018, 1, 1):Day(1):Date(2018, 12, 31), 1:365);
julia> percentchange(ta);
(percentchange has some further options that control what is calculated)
The drawback is that it accepts only TimeArray objects, and that it drops the periods for which a percent change cannot be calculated (pandas retains them).
If you want to keep a custom definition, consider denoting the first value as missing rather than NaN. Also, your function will not preserve the most accurate representation of the numbers (e.g. if you wanted to use BigFloat, or exact calculations with the Rational type, the results would be converted to Float64). Here are example alternative implementations that avoid these problems:
function pct_change(input::AbstractVector{<:Number})
    res = @view(input[2:end]) ./ @view(input[1:end-1]) .- 1
    [missing; res]
end
or
function pct_change(input::AbstractVector{<:Number})
    [i == 1 ? missing : (input[i]-input[i-1])/input[i-1] for i in eachindex(input)]
end
And now you have in both cases:
julia> pct_change(1:10)
10-element Array{Union{Missing, Float64},1}:
missing
1.0
0.5
0.33333333333333326
0.25
0.19999999999999996
0.16666666666666674
0.1428571428571428
0.125
0.11111111111111116
julia> pct_change(big(1):10)
10-element Array{Union{Missing, BigFloat},1}:
missing
1.0
0.50
0.3333333333333333333333333333333333333333333333333333333333333333333333333333391
0.25
0.2000000000000000000000000000000000000000000000000000000000000000000000000000069
0.1666666666666666666666666666666666666666666666666666666666666666666666666666609
0.1428571428571428571428571428571428571428571428571428571428571428571428571428547
0.125
0.111111111111111111111111111111111111111111111111111111111111111111111111111113
julia> pct_change(1//1:10)
10-element Array{Union{Missing, Rational{Int64}},1}:
missing
1//1
1//2
1//3
1//4
1//5
1//6
1//7
1//8
1//9
The correct values are returned in each case.
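As a follow-up to the suggestion of using missing instead of NaN, here is a minimal sketch of how the missing first element behaves in later computations. It reuses the comprehension-based pct_change from above. The prices vector and the names r, total, valid and filled are made up for illustration; sum, skipmissing and coalesce come from Julia Base.

```julia
# Same definition as the comprehension-based version in the answer.
function pct_change(input::AbstractVector{<:Number})
    [i == 1 ? missing : (input[i] - input[i-1]) / input[i-1] for i in eachindex(input)]
end

prices = [100.0, 110.0, 99.0]   # illustrative data, not from the question
r = pct_change(prices)          # first element is missing

# missing propagates through reductions, so the gap cannot go unnoticed.
total = sum(r)                  # evaluates to missing

# skipmissing drops the missing entry explicitly before reducing or collecting.
valid = collect(skipmissing(r))

# coalesce substitutes a default value where an entry is missing.
filled = coalesce.(r, 0.0)
```

Whether to skip the first period or fill it with a default depends on the analysis. The point of the sketch is only that missing forces that choice to be made explicitly.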