Julia | DataFrame | Replacing missing Values
Question
How can we replace missing values with 0.0 for a column in a DataFrame?
Recommended answer
There are a few different approaches to this problem (valid for Julia 1.x):
Base.replace
Probably the easiest approach is to use replace! or replace from Base Julia. Here is an example with replace!:
julia> using DataFrames

julia> df = DataFrame(x = [1, missing, 3])
3×1 DataFrame
│ Row │ x       │
│     │ Int64⍰  │
├─────┼─────────┤
│ 1   │ 1       │
│ 2   │ missing │
│ 3   │ 3       │

julia> replace!(df.x, missing => 0);

julia> df
3×1 DataFrame
│ Row │ x      │
│     │ Int64⍰ │
├─────┼────────┤
│ 1   │ 1      │
│ 2   │ 0      │
│ 3   │ 3      │
However, note that at this point the element type of column x still allows missing values:
julia> typeof(df.x)
Array{Union{Missing, Int64},1}
This is also indicated by the question mark following Int64 in the header of column x when the data frame is printed. You can change this with disallowmissing! (from the DataFrames.jl package):
julia> disallowmissing!(df, :x)
3×1 DataFrame
│ Row │ x     │
│     │ Int64 │
├─────┼───────┤
│ 1   │ 1     │
│ 2   │ 0     │
│ 3   │ 3     │
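When several columns contain missing values, the same two steps can be applied in a loop. The following is only a sketch: the column names and data are made up, and the `df[!, name]` indexing form assumes DataFrames.jl 0.19 or later.

```julia
using DataFrames

# Hypothetical data: two numeric columns that both contain missing values.
df = DataFrame(a = [1, missing, 3], b = [missing, 2.5, missing])

# Replace in place, column by column. df[!, name] returns the stored
# column itself (no copy), so replace! modifies the data frame.
for name in names(df)
    replace!(df[!, name], missing => 0)
end

# Called without a column argument, disallowmissing! narrows every column.
disallowmissing!(df)

println(eltype(df.a))  # prints Int64
println(eltype(df.b))  # prints Float64
```

Because replace! writes into the existing vector, the fill value 0 is converted to each column's own number type, so column b ends up holding 0.0.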
Alternatively, if you use replace (without the exclamation mark) as follows, the resulting column already disallows missing values:
julia> df = DataFrame(x = [1, missing, 3]);

julia> df.x = replace(df.x, missing => 0);

julia> df
3×1 DataFrame
│ Row │ x     │
│     │ Int64 │
├─────┼───────┤
│ 1   │ 1     │
│ 2   │ 0     │
│ 3   │ 3     │
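The question asks for 0.0 rather than 0. A minimal sketch of the same replace approach on a floating-point column, with sample data made up for illustration:

```julia
using DataFrames

# Hypothetical sample data: a Float64 column with one missing entry.
df = DataFrame(x = [1.5, missing, 3.0])

# replace returns a new vector whose element type no longer includes
# Missing; assigning it back swaps out the stored column.
df.x = replace(df.x, missing => 0.0)

println(eltype(df.x))  # prints Float64
```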
Base.ismissing with logical indexing
You can use ismissing with logical indexing to assign a new value to all missing entries of an array. As with replace!, the column's element type still allows missing values afterwards:
julia> df = DataFrame(x = [1, missing, 3]);

julia> df.x[ismissing.(df.x)] .= 0;

julia> df
3×1 DataFrame
│ Row │ x      │
│     │ Int64⍰ │
├─────┼────────┤
│ 1   │ 1      │
│ 2   │ 0      │
│ 3   │ 3      │
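One reason to prefer logical indexing is that the mask can be kept, for example to record which entries were filled in. A small sketch under that assumption; the was_missing column name is invented for this example:

```julia
using DataFrames

df = DataFrame(x = [1, missing, 3])

# Boolean mask marking the missing entries of x.
mask = ismissing.(df.x)

# Overwrite only the masked entries.
df.x[mask] .= 0

# Hypothetical extra column that remembers which rows were filled in.
df.was_missing = mask

println(count(mask))  # prints 1
```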
Base.coalesce
Another approach is to use coalesce:
julia> df = DataFrame(x = [1, missing, 3]);

julia> df.x = coalesce.(df.x, 0);

julia> df
3×1 DataFrame
│ Row │ x     │
│     │ Int64 │
├─────┼───────┤
│ 1   │ 1     │
│ 2   │ 0     │
│ 3   │ 3     │
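With broadcasted coalesce, the fill value should have the same type as the column; filling a Float64 column with the integer 0, for instance, would widen the result beyond Float64. The sketch below derives the fill value from each column's own element type and applies it to every column with mapcols. The helper name fill_zero and the sample data are assumptions made for illustration.

```julia
using DataFrames

# Hypothetical data: an integer column and a floating-point column.
df = DataFrame(a = [1, missing, 3], b = [missing, 2.5, missing])

# Hypothetical helper: fill with a zero of the column's non-missing type,
# so an Int column gets 0 and a Float64 column gets 0.0.
fill_zero(col) = coalesce.(col, zero(nonmissingtype(eltype(col))))

# mapcols returns a new data frame; df itself is left unchanged.
df2 = mapcols(fill_zero, df)

println(eltype(df2.a))  # prints Int64
println(eltype(df2.b))  # prints Float64
```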
DataFramesMeta
Both replace and coalesce can be used with the @transform macro from the DataFramesMeta.jl package:
julia> using DataFramesMeta

julia> df = DataFrame(x = [1, missing, 3]);

julia> @transform(df, x = replace(:x, missing => 0))
3×1 DataFrame
│ Row │ x     │
│     │ Int64 │
├─────┼───────┤
│ 1   │ 1     │
│ 2   │ 0     │
│ 3   │ 3     │

julia> df = DataFrame(x = [1, missing, 3]);

julia> @transform(df, x = coalesce.(:x, 0))
3×1 DataFrame
│ Row │ x     │
│     │ Int64 │
├─────┼───────┤
│ 1   │ 1     │
│ 2   │ 0     │
│ 3   │ 3     │
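Depending on the installed version, DataFramesMeta.jl may expect the target column written as a symbol (`:x = ...`) rather than `x = ...`. If the macro package is not wanted at all, a similar result can probably be had with the transform function from DataFrames.jl itself. This is a sketch that assumes a DataFrames.jl release providing transform (0.21 or later, as far as I know):

```julia
using DataFrames

df = DataFrame(x = [1, missing, 3])

# source column => function applied to the whole column => target column.
# Reusing :x as the target replaces the column in the returned copy.
df2 = transform(df, :x => (col -> coalesce.(col, 0)) => :x)

println(df2.x)
```

Like @transform, transform returns a new data frame and leaves df untouched; transform! would modify df in place.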
Additional documentation
- Julia manual
- Julia manual: function reference
- DataFrames.jl manual