Check if all values are numeric over multiple columns and convert them to numeric

Question

I have a data frame in which every column is of type character, like this:

ID <- c("A","A","A","A","A","A","A","A","B","B","B","B","B","B","B","B")
ToolID <- c("CCP_A","CCP_A","CCQ_A","CCQ_A","IOT_B","CCP_B","CCQ_B","IOT_B",
            "CCP_A","CCP_A","CCQ_A","CCQ_A","IOT_B","CCP_B","CCQ_B","IOT_B")
Step <- c("Step_A","Step_A","Step_B","Step_C","Step_D","Step_D","Step_E","Step_F",
          "Step_A","Step_A","Step_B","Step_C","Step_D","Step_D","Step_E","Step_F")
Measurement <- c("Length","Breadth","Width","Height",NA,NA,NA,NA,
                 "Length","Breadth","Width","Height",NA,NA,NA,NA)
Passfail <- c("Pass","Pass","Fail","Fail","Pass","Pass","Pass","Pass",
              "Pass","Pass","Fail","Fail","Pass","Pass","Pass","Pass")
Points <- as.character(c(7,5,3,4,0,0,0,0,17,15,13,14,0,0,0,0))
Average <- as.character(c(7.5,6.5,7.1,6.6,NA,NA,NA,NA,17.5,16.5,17.1,16.6,NA,NA,NA,NA))
Sigma <- as.character(c(2.5,2.5,2.1,2.6,NA,NA,NA,NA,12.5,12.5,12.1,12.6,NA,NA,NA,NA))
Tool <- c("ABC_1","ABC_2","ABD_1","ABD_2","COB_1","COB_2","COB_1","COB_2",
          "ABC_1","ABC_2","ABD_1","ABD_2","COB_1","COB_2","COB_1","COB_2")
Dose <- as.character(c(NA,NA,NA,NA,17.1,NA,NA,17.3,NA,NA,NA,NA,117.1,NA,NA,117.3))
Machine <- c("CO2","CO6","CO3","CO6","CO2,CO6","CO2,CO3,CO4","CO2,CO3","CO2",
             "CO2","CO6","CO3","CO6","CO2,CO6","CO2,CO3,CO4","CO2,CO3","CO2")

df <- data.frame(ID,ToolID,Step,Measurement,Passfail,Points,Average,Sigma,Tool,Dose,Machine)

I am trying to check which of these character vectors contain only numeric values, and then convert those vectors to numeric. I use the varhandle package in R to do it:

library(varhandle)

if(all(check.numeric(df$Machine, na.rm=TRUE))){
  # convert the vector to numeric
  df$Machine <- as.numeric(df$Machine)
}

This works, but it does not scale because I have to type each column name by hand as above. How can I do this more efficiently in a loop, or vectorize it over multiple columns? My actual dataset has around 350 columns. Can someone point me in the right direction?

Recommended answer

We can use the parse_guess function from the readr package, which tries to guess the type of each column from its values.

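library(readr)
library(dplyr)

# Apply parse_guess to every column; across() is the current replacement
# for the superseded mutate_all() (dplyr >= 1.0.0).
df1 <- df %>% mutate(across(everything(), parse_guess))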


str(df1)
#'data.frame':  16 obs. of  11 variables:
# $ ID         : chr  "A" "A" "A" "A" ...
# $ ToolID     : chr  "CCP_A" "CCP_A" "CCQ_A" "CCQ_A" ...
# $ Step       : chr  "Step_A" "Step_A" "Step_B" "Step_C" ...
# $ Measurement: chr  "Length" "Breadth" "Width" "Height" ...
# $ Passfail   : chr  "Pass" "Pass" "Fail" "Fail" ...
# $ Points     : int  7 5 3 4 0 0 0 0 17 15 ...
# $ Average    : num  7.5 6.5 7.1 6.6 NA NA NA NA 17.5 16.5 ...
# $ Sigma      : num  2.5 2.5 2.1 2.6 NA NA NA NA 12.5 12.5 ...
# $ Tool       : chr  "ABC_1" "ABC_2" "ABD_1" "ABD_2" ...
# $ Dose       : num  NA NA NA NA 17.1 NA NA 17.3 NA NA ...
# $ Machine    : chr  "CO2" "CO6" "CO3" "CO6" ...
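
If adding readr is not an option, the check-then-convert loop from the question can also be sketched in base R. The helper name `is_all_numeric` and the small `demo` data frame below are made up for illustration and are not part of varhandle or readr. The sketch assumes that "numeric" means every non-missing value survives `as.numeric()` without becoming `NA`, which may not match `check.numeric()` in every edge case (for example, `as.numeric()` also accepts hexadecimal strings such as "0x1A").

```r
# Hypothetical helper (not from varhandle or readr): TRUE when every
# non-missing value in a character vector parses as a number.
is_all_numeric <- function(x) {
  x <- x[!is.na(x)]
  # An all-NA column gives no evidence of being numeric, so leave it alone.
  length(x) > 0 && !anyNA(suppressWarnings(as.numeric(x)))
}

# Small stand-in for the data frame in the question.
demo <- data.frame(
  Tool   = c("ABC_1", "ABC_2", "COB_1"),
  Points = c("7", "5", "0"),
  Dose   = c(NA, "17.1", "17.3"),
  stringsAsFactors = FALSE
)

# Find the convertible columns once, then convert only those.
numeric_cols <- names(demo)[vapply(demo, is_all_numeric, logical(1))]
demo[numeric_cols] <- lapply(demo[numeric_cols], as.numeric)

str(demo)
```

Unlike parse_guess, this converts whole-number columns such as Points to double rather than integer, and it never touches columns that contain any non-numeric text.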
