R help converting a non-numeric column to numeric
Problem description
I'm trying to help my friend, a Director of Sales, make sense of his logged call data. There is one column in particular in which he is interested, "Disposition". This column has string values and I'm trying to convert them to numeric values (e.g. "Not Answered" converted to 1, "Answered" converted to 2, and so on) and remove any row with no value entered. I've created data frames, used as.numeric, created and deleted columns/rows, etc. to no avail. I'm just trying to run simple R code to give him some insight. Any and all help is much appreciated. Thanks in advance!
P.S. I'm unsure as to whether I should provide some code due to the fact that there is a lot of delicate information (personal phone numbers and emails).
Recommended answer
Alternatively, you can create a new column and fill it with the numeric values using nested ifelse calls. To illustrate, let's assume this is your data frame:
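# Example data: three dispositions plus a missing value, repeated three times
df <- data.frame(
  Disposition = rep(c("answer", "no answer", "whatever", NA), 3),
  Anything = rnorm(12)  # random values, so yours will differ from the output shown
)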
df
   Disposition    Anything
1       answer  2.54721951
2    no answer  1.07409803
3     whatever  0.60482744
4         <NA>  2.08405038
5       answer  0.31799860
6    no answer -1.17558239
7     whatever  0.94206106
8         <NA>  0.45355501
9       answer  0.01787330
10   no answer -0.07629330
11    whatever  0.83109679
12        <NA> -0.06937357
Now you define a new column, say df$Analysis, and assign to it numbers based on the information in df$Disposition:
# "no answer" -> 1, "answer" -> 2, anything else -> 3; rows where Disposition is NA stay NA
df$Analysis <- ifelse(df$Disposition == "no answer", 1,
                      ifelse(df$Disposition == "answer", 2, 3))
df
   Disposition    Anything Analysis
1       answer  2.54721951        2
2    no answer  1.07409803        1
3     whatever  0.60482744        3
4         <NA>  2.08405038       NA
5       answer  0.31799860        2
6    no answer -1.17558239        1
7     whatever  0.94206106        3
8         <NA>  0.45355501       NA
9       answer  0.01787330        2
10   no answer -0.07629330        1
11    whatever  0.83109679        3
12        <NA> -0.06937357       NA
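With more than a handful of categories, nested ifelse calls get hard to read. A minimal sketch of the same mapping using a named lookup vector follows. The name codes is an assumption introduced here for illustration, and the labels match the example data rather than the real call log:

```r
# Example data in the same shape as above (values are illustrative only)
df <- data.frame(
  Disposition = rep(c("answer", "no answer", "whatever", NA), 3),
  Anything = rnorm(12)
)

# Hypothetical lookup table: names are the labels, values are the numeric codes
codes <- c("no answer" = 1, "answer" = 2, "whatever" = 3)

# as.character() guards against Disposition being a factor (the default for
# strings in data.frame() before R 4.0). Labels that are NA, or that do not
# appear in codes, come out as NA.
df$Analysis <- unname(codes[as.character(df$Disposition)])

df$Analysis
# 2 1 3 NA 2 1 3 NA 2 1 3 NA
```

Unlike the nested ifelse version, which sends every unlisted label to 3, this sketch leaves unlisted labels as NA, so typos in the logged data show up instead of being silently coded.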
The advantage of this method is that you keep the original information unchanged. If you now want to remove the NA values from the data frame, use na.omit. NB: this removes not only the rows where df$Disposition is NA but every row that has an NA in any column:
df_clean <- na.omit(df)
df_clean
   Disposition   Anything Analysis
1       answer  2.5472195        2
2    no answer  1.0740980        1
3     whatever  0.6048274        3
5       answer  0.3179986        2
6    no answer -1.1755824        1
7     whatever  0.9420611        3
9       answer  0.0178733        2
10   no answer -0.0762933        1
11    whatever  0.8310968        3
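If other columns may contain missing values that should not cause a row to be dropped, one option is to filter on Disposition alone. A minimal sketch, assuming the same example data; the extra NA placed in Anything and the name df_disp are added here only to show the difference:

```r
# Example data in the same shape as above (values are illustrative only)
df <- data.frame(
  Disposition = rep(c("answer", "no answer", "whatever", NA), 3),
  Anything = rnorm(12)
)
df$Anything[1] <- NA  # illustrative: a missing value in another column

# Keep every row where Disposition is present, regardless of other columns
df_disp <- df[!is.na(df$Disposition), ]

nrow(df_disp)      # 9: only the three rows with NA in Disposition are dropped
nrow(na.omit(df))  # 8: row 1 is dropped as well, because Anything is NA there
```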