Reading Tab-Delimited Data into R


Problem description




I am trying to read a large tab-delimited file into R.

First I tried this:

data <- read.table("data.csv", sep="\t")

But it reads some of the numeric variables in as factors.

So I tried specifying the type I want for each variable, like this:

data <- read.table("data.csv", sep="\t", colClasses=c("character","numeric","numeric","character","boolean","numeric"))

But when I try this it gives me an error:

Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings, : scan() expected 'a real', got '"4"'

I think it might be that there are quotes around some of the numeric values in the original raw file, but I'm not sure.

Solution

Without seeing your data, the cause is likely one of a few things: the fields are not all separated by tabs; there are embedded tabs within single observations; or a litany of others.
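One way to check for inconsistent separators or embedded tabs is to count the fields on each line. The sketch below is only an illustration: the sample file and its contents are made up, and `count.fields()` from the `utils` package is used here as one possible diagnostic, not something the original answer prescribes.

```r
# Hypothetical sample file: the third line has an embedded tab,
# so it splits into 4 fields instead of 3.
path <- tempfile(fileext = ".tsv")
writeLines(c("id\tvalue\tlabel",
             "a\t1.5\tfoo",
             "b\t2.5\tbar\tbaz",
             "c\t3.5\tqux"), path)

# Count tab-separated fields per line. quote = "" and comment.char = ""
# turn off quote and comment handling so every tab is counted literally.
n_fields <- count.fields(path, sep = "\t", quote = "", comment.char = "")

# Lines whose field count differs from the first line deserve a closer look.
bad_lines <- which(n_fields != n_fields[1])
print(bad_lines)
```

Any line number reported this way can then be inspected in the raw file with `readLines()`.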

To sort this out, set options(stringsAsFactors=FALSE) and then rerun your first read.table call, so that problem columns come in as plain character strings rather than factors.
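As a rough sketch of that step, the example below passes `stringsAsFactors = FALSE` directly to `read.table()` instead of setting the global option, which has the same effect for a single call. The sample file, its column names, and the `"n/a"` entry are assumptions made for illustration. Note that from R 4.0.0 onward `stringsAsFactors` already defaults to `FALSE`.

```r
# Hypothetical sample file: the "score" column is meant to be numeric,
# but one entry ("n/a") cannot be parsed as a number.
path <- tempfile(fileext = ".tsv")
writeLines(c("name\tscore",
             "a\t1.5",
             "b\tn/a",
             "c\t3"), path)

# Read without converting strings to factors; header = TRUE is specific
# to this sample file, which has a header row.
data <- read.table(path, sep = "\t", header = TRUE, stringsAsFactors = FALSE)

# The problem column shows up as "chr" rather than "num" or "Factor".
str(data)
```

A column that was expected to be numeric but is listed as `chr` is the one to investigate.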

Check str(data) and try to figure out which rows are the culprits. Some of the numeric columns are read as factors because at least one value in that column cannot be parsed as a number, so R treats the whole column as character, and it is then converted to a factor when stringsAsFactors is TRUE. It usually takes some digging, but the problem is almost surely in your input file.
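The digging can be partly automated. The following is a minimal sketch, assuming a made-up file with a `score` column that should be numeric: it converts the column with `as.numeric()` and reports the rows where the conversion fails.

```r
# Hypothetical sample file with one unparseable entry in "score".
path <- tempfile(fileext = ".tsv")
writeLines(c("name\tscore",
             "a\t1.5",
             "b\tn/a",
             "c\t3"), path)

data <- read.table(path, sep = "\t", header = TRUE, stringsAsFactors = FALSE)

# Attempt the numeric conversion; values that fail become NA
# (the coercion warning is expected here, so it is suppressed).
parsed <- suppressWarnings(as.numeric(data$score))

# Rows where the original value was present but did not parse as a number.
culprits <- which(is.na(parsed) & !is.na(data$score))
print(data[culprits, ])
```

Once the offending values are known, they can be fixed in the input file or, if they are missing-value markers, passed to `read.table()` through its `na.strings` argument.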

This is a common data munging issue. Good luck!
