Appending information lines to the beginning of a data.frame when writing to txt in R


Question

I just received some great advice on a previous question about appending text lines to the beginning of a .txt output. This worked well for my example data, but I now realize that my actual data has a variable of class POSIXlt, which contains a space between the date and time values (e.g. "2001-01-01 10:00:01"). This seems to prevent R from working out how many columns of data there are. I have tried variations on both suggestions given to the previous question, but nothing seems to work. I even tried writing a .csv file to better define the separators, but this also failed.

Any help would be greatly appreciated. Am I perhaps doing something unorthodox here? Should I just make a separate "readme.txt" file to hold the variable descriptions and avoid all of this frustration? I want the data sets to be logical and self-explanatory to future users.

###Example dataset
Head <- 
"#variables:
#sal - Salinity [PSU]
#temp - Temperature [degrees Celsius]
#datetime - Date [yyyy-mm-dd hh:mm:ss]

"

n <- 10
df <- data.frame(sal=runif(n, 30, 37), temp=runif(n, 15, 17), datetime=as.POSIXlt("2001-01-01 10:00:01"))
df

###Create .txt (or .csv?)
#option 1
fn <- "data.txt"
sink(fn)
cat(Head)
df
sink()
read.table(fn)
#Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  : 
#  line 1 did not have 5 elements

#option 2
fn <- "data.txt"
writeLines(Head, fn)
write.table(df, fn, append=TRUE, quote=FALSE)
#Warning message:
#In write.table(df, fn, append = TRUE, quote = FALSE) :
#  appending column names to file
read.table(fn)
#Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  : 
#  line 1 did not have 5 elements

#option 3
fn <- "data.csv"
sink(fn)
cat(Head)
write.csv(df)
sink()
read.csv(fn, header=TRUE)
#Error in read.table(file = file, header = header, sep = sep, quote = quote,  : 
#  more columns than column names

Recommended answer

You can make this work by using commas (for example) instead of whitespace to separate the data columns. With whitespace as the separator and quote=FALSE, the space inside each datetime value is read as an extra column break. You'll of course then need to specify the sep="," argument to both write.table() and read.table().

(Incidentally, the extra control provided by the many possible arguments to write.table() is one reason to generally prefer write.table(df, ..., append=TRUE) over sink(fn); df; sink(). With sink(), the data.frame gets written to the file in the same way it would be printed to the console, giving you much less control over the details of its representation.)

fn <- "data.txt"
writeLines(Head, fn)
write.table(df, fn, append=TRUE, quote=TRUE, sep=",")

## Reading data from the file now works fine 
dd <- read.table(fn, header=TRUE, sep=",")
head(dd, 4)
#        sal     temp            datetime
# 1 35.28238 16.48981 2001-01-01 10:00:01
# 2 31.80891 16.68704 2001-01-01 10:00:01
# 3 32.22510 15.87365 2001-01-01 10:00:01
# 4 33.13408 16.60193 2001-01-01 10:00:01
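
As a variation on the approach above, here is a minimal sketch that writes the header and the data through a single open file connection, so append=TRUE is not needed, and that leaves out the row names. It then reads the data back, converts the datetime column to a date-time class again, and recovers the description lines separately. The temporary file, the explicit tz = "UTC", the use of POSIXct instead of POSIXlt, and the names con and meta are assumptions made for illustration; they are not part of the original answer.

```r
# Header lines, each starting with "#" so they can be skipped as comments
Head <- c("#variables:",
          "#sal - Salinity [PSU]",
          "#temp - Temperature [degrees Celsius]",
          "#datetime - Date [yyyy-mm-dd hh:mm:ss]")

n <- 10
df <- data.frame(sal = runif(n, 30, 37),
                 temp = runif(n, 15, 17),
                 # assumption: a fixed time zone keeps the round trip reproducible
                 datetime = as.POSIXct("2001-01-01 10:00:01", tz = "UTC"))

# assumption: a temporary file stands in for "data.txt"
fn <- tempfile(fileext = ".txt")

# Write header and data through one open connection
con <- file(fn, open = "w")
writeLines(Head, con)
write.table(df, con, sep = ",", row.names = FALSE, quote = TRUE)
close(con)

# Read the data back; lines starting with "#" are treated as comments
dd <- read.table(fn, header = TRUE, sep = ",", comment.char = "#",
                 stringsAsFactors = FALSE)

# The datetime column comes back as text, so convert it explicitly
dd$datetime <- as.POSIXct(dd$datetime, tz = "UTC")

# Recover the variable descriptions from the same file
meta <- grep("^#", readLines(fn), value = TRUE)
```

Note that read.table() skips the "#" lines because its default comment.char is "#", whereas read.csv() defaults to comment.char = "", so a call to read.csv() on such a file needs comment.char = "#" set explicitly.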
