Levels in R Dataframe

Question

I imported data from a .csv file, and attached the dataset.
My problem: one variable is in integer form and has 295 levels. I need to use this variable to create others, but I don't know how to deal with the levels.

What are these, and how do I deal with them?

Recommended answer

When you read in the data with read.table (or read.csv? - you didn't specify), add the argument stringsAsFactors = FALSE. Then you will get character data instead.

If you are expecting integers for the column then you must have data that is not interpretable as integers, so convert to numeric after you've read it.

txt <- c("x,y,z", "1,2,3", "a,b,c")

## before R 4.0.0, read.csv turns strings into factors by default
d <- read.csv(textConnection(txt))
sapply(d, class)

#        x        y        z 
# "factor" "factor" "factor" 

## we don't want factors, but characters
d <- read.csv(textConnection(txt), stringsAsFactors = FALSE)
sapply(d, class)

#           x           y           z 
# "character" "character" "character" 

## convert x to numeric; non-numeric values become NA
as.numeric(d$x)

# [1]  1 NA
# Warning message:
# NAs introduced by coercion 

Finally, if you want to ignore these input details and extract the integer values from the factor, use e.g. as.numeric(levels(d$x))[d$x], as per the "Warning" section in ?factor.
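
To see why the levels have to be involved, here is a small sketch on an illustrative factor (the data are invented, not the asker's). Calling as.numeric() directly on a factor returns its internal integer codes rather than the numbers shown in its labels.

```r
# A factor whose labels look like numbers (illustrative data)
f <- factor(c("10", "20", "10", "35"))

codes  <- as.numeric(f)                # internal codes, not the values
values <- as.numeric(levels(f))[f]     # idiom from the "Warning" in ?factor
same   <- as.numeric(as.character(f))  # equivalent, converts every element

print(codes)
print(values)
print(same)
```

Here codes is 1 2 1 3, while values and same both give 10 20 10 35.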
