Appending information lines to the beginning of a data.frame when writing to txt in R

Problem description

I just received some great advice on a previous question regarding appending text lines to the beginning of a .txt output. This worked well for my example data, but I now realize that my actual data has a variable of the POSIXlt class, which contains a space between the date and time values (e.g. "2001-01-01 10:00:01"). This seems to make it hard for R to work out how many columns of data there are. I have tried variations on both suggestions given to the previous question, but nothing seems to work. I even tried writing a .csv file to better define the separators, but this also failed.

Any help would be greatly appreciated. Am I perhaps doing something unorthodox here? Should I just make a separate "readme.txt" file to contain the variable descriptions and avoid all of this frustration? I want the data sets to be logical and self-explanatory to future users.

Examples:

###Example dataset
Head <- 
"#variables:
#sal - Salinity [PSU]
#temp - Temperature [degrees Celsius]
#datetime - Date [yyyy-mm-dd hh:mm:ss]

"

n <- 10
df <- data.frame(sal=runif(n, 30, 37), temp=runif(n, 15, 17), datetime=as.POSIXlt("2001-01-01 10:00:01"))
df

###Create .txt (or .csv?)
#option 1
fn <- "data.txt"
sink(fn)
cat(Head)
df
sink()
read.table(fn)
#Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  : 
#  line 1 did not have 5 elements

#option 2
fn <- "data.txt"
writeLines(Head, fn)
write.table(df, fn, append=TRUE, quote=FALSE)
#Warning message:
#In write.table(df, fn, append = TRUE, quote = FALSE) :
#  appending column names to file
read.table(fn)
#Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  : 
#  line 1 did not have 5 elements

#option 3
fn <- "data.csv"
sink(fn)
cat(Head)
write.csv(df)
sink()
read.csv(fn, header=TRUE)
#Error in read.table(file = file, header = header, sep = sep, quote = quote,  : 
#  more columns than column names

Solution

You can make this work by using commas (for example) instead of whitespace to separate the data columns. You'll of course then need to specify the sep="," argument to both write.table() and read.table().
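One way to see why the separator matters is to count the fields on each line with count.fields(). The following is only a sketch under stated assumptions: the helper write_with_header() and the temporary files are hypothetical names introduced for illustration, and the header and table are written through a single open connection rather than with append=TRUE.

```r
# Hypothetical helper (not from the original question): write the comment
# header and then the table through one open connection.
write_with_header <- function(df, header, fn, sep) {
  con <- file(fn, open = "w")
  on.exit(close(con))
  writeLines(header, con)
  write.table(df, con, quote = FALSE, sep = sep)
}

Head <- c("#variables:",
          "#sal - Salinity [PSU]",
          "#temp - Temperature [degrees Celsius]",
          "#datetime - Date [yyyy-mm-dd hh:mm:ss]")

n <- 10
df <- data.frame(sal = runif(n, 30, 37), temp = runif(n, 15, 17),
                 datetime = as.POSIXct("2001-01-01 10:00:01", tz = "UTC"))

# Temporary files are used here only to keep the sketch self-contained.
fn_space <- tempfile(fileext = ".txt")
fn_comma <- tempfile(fileext = ".txt")
write_with_header(df, Head, fn_space, sep = " ")
write_with_header(df, Head, fn_comma, sep = ",")

# Splitting on whitespace: the date and the time are counted as two
# separate fields, so each data row has more fields than intended.
fields_space <- count.fields(fn_space, sep = "")

# Splitting on commas: the datetime value stays in one field.
fields_comma <- count.fields(fn_comma, sep = ",")

print(fields_space)
print(fields_comma)
```

With whitespace as the separator, each data row should show up as five fields (row name, sal, temp, date, time) against three column names, which matches the "did not have 5 elements" error in the question. With commas, each data row has four fields, one more than the header, which read.table() interprets as a row-name column.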

(Incidentally, the extra control provided by the many possible arguments to write.table() is one reason to generally prefer write.table(df, ..., append=TRUE) over sink(fn); df; sink(). With sink(), the data.frame gets written to the file in the same way it would be printed to the console, giving you much less control over the details of its representation.)

fn <- "data.txt"
writeLines(Head, fn)
write.table(df, fn, append=TRUE, quote=TRUE, sep=",")

## Reading data from the file now works fine 
dd <- read.table(fn, header=TRUE, sep=",")
head(dd, 4)
#        sal     temp            datetime
# 1 35.28238 16.48981 2001-01-01 10:00:01
# 2 31.80891 16.68704 2001-01-01 10:00:01
# 3 32.22510 15.87365 2001-01-01 10:00:01
# 4 33.13408 16.60193 2001-01-01 10:00:01
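Note that after reading the file back, the datetime column is plain text (or a factor in older R versions) rather than a date-time object. A possible round trip is sketched below. It rests on assumptions that are not in the original answer: the UTC time zone, the temporary file, the explicit format() call, row.names=FALSE, and writing through an open connection instead of append=TRUE.

```r
Head <- c("#variables:",
          "#sal - Salinity [PSU]",
          "#temp - Temperature [degrees Celsius]",
          "#datetime - Date [yyyy-mm-dd hh:mm:ss]")

n <- 10
df <- data.frame(sal = runif(n, 30, 37), temp = runif(n, 15, 17),
                 datetime = as.POSIXct("2001-01-01 10:00:01", tz = "UTC"))

# Format the datetime explicitly so the text in the file does not depend
# on how write.table() converts date-time objects (an assumption made
# here for safety, not something the original answer does).
out <- df
out$datetime <- format(out$datetime, "%Y-%m-%d %H:%M:%S", tz = "UTC")

# Write header and table through one connection; a temporary file keeps
# the sketch self-contained.
fn <- tempfile(fileext = ".txt")
con <- file(fn, open = "w")
writeLines(Head, con)
write.table(out, con, quote = TRUE, sep = ",", row.names = FALSE)
close(con)

# Lines starting with "#" are skipped by read.table()'s default
# comment.char, so only the table itself is parsed.
dd <- read.table(fn, header = TRUE, sep = ",", stringsAsFactors = FALSE)

# Convert the text column back to a date-time class.
dd$datetime <- as.POSIXct(dd$datetime, format = "%Y-%m-%d %H:%M:%S",
                          tz = "UTC")

# The variable descriptions can still be recovered from the same file.
meta <- grep("^#", readLines(fn), value = TRUE)

str(dd)
print(meta)
```

Keeping the descriptions as "#" comment lines in the data file means a future user gets the documentation and the data together, while read.table() can still parse the file without a skip argument.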
