R - Reading lines from a .txt-file after a specific line


Question

I have a bunch of output .txt-files that consist of a large parameter list and an X-Y-coordinate set. I need to extract these coordinates from all files so that only those lines are imported to a vector. This would work fine with

impcoord<-read.table("file.txt",skip= ,nrow= ,...)

but the files print the coordinate sets after different lengths of supporting parameters.

Luckily the coordinates always start after a line containing certain words.

Thus my question is, how do I start reading the .txt-file after these words? Let's say they are:

coordinatesXY

Thanks a lot for your time and help!

-Olli

--EDIT--

Sorry for the confusion.

The relevant part of the files looks like this:

##XYDATA= (X++(Y..Y))
131071    -2065
131070    -4137
131069    -6408
131068    -8043 
...       ...
...       ...

The first line is where skip should end, and the coordinates that follow need to be imported into a vector. As you can see, the X-coordinates start at 131071 and end at 0.

Recommended answer

1) read.pattern. read.pattern in the gsubfn package can be used to read only the lines that match a specific pattern. In this example we match: beginning of line, optional space(s), 1 or more digits, 1 or more spaces, an optional minus sign followed by 1 or more digits, optional space(s), end of line. The parts of each line that match the parenthesized portions of the regular expression are returned as columns in a data.frame. In this self-contained example, text = Lines can be replaced with "myfile.txt", say, if the data is coming from a file. Modify the pattern to suit.

Lines <- "junk
junk
##XYDATA= (X++(Y..Y))
131071    -2065
131070    -4137
131069    -6408
131068    -8043"

library(gsubfn)
# Backslashes must be doubled inside an R string literal
DF <- read.pattern(text = Lines, pattern = "^ *(\\d+) +(-?\\d+) *$")

giving:

> DF
      V1    V2
1 131071 -2065
2 131070 -4137
3 131069 -6408
4 131068 -8043
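
To see which lines that pattern keeps, the same regular expression can be tried with base R's grepl and sub. This is only an illustrative check and not part of the original answer; the sample vector x and the names pat, keep and xy are made up for the demonstration.

```r
# Sample lines (made-up test input): two that are not coordinates, two that are.
x <- c("junk", "##XYDATA= (X++(Y..Y))", "131071    -2065", " 131070 -4137 ")

# Same pattern as above; the backslashes are doubled inside an R string.
pat <- "^ *(\\d+) +(-?\\d+) *$"

# TRUE only for lines made of two integers separated by spaces.
keep <- grepl(pat, x)

# Pull the two captured groups out of the matching lines.
xy <- data.frame(
  V1 = as.numeric(sub(pat, "\\1", x[keep])),
  V2 = as.numeric(sub(pat, "\\2", x[keep]))
)
```

The header line starting with ## and the junk line are rejected because they do not consist of exactly two integer fields.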

2) Read twice. Another possibility, using only base R, is simply to read the file once to determine the value of skip= and then a second time to do the actual read using that value. To read from a file myfile.txt, replace both text = Lines and textConnection(Lines) with "myfile.txt".

read.table(text = Lines, 
    skip = grep("##XYDATA=", readLines(textConnection(Lines))))
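
Since the question mentions a whole set of output files, the read-twice idea could be wrapped in a small helper and applied to each file. The sketch below is one possible way to do that, under stated assumptions: the function name read_xy_after, the default marker "##XYDATA=" and the temporary demo file are all hypothetical, and the marker is assumed to occur in every file.

```r
# Hypothetical helper: read the table that follows the first line containing `marker`.
read_xy_after <- function(path, marker = "##XYDATA=") {
  lines <- readLines(path)
  hit <- grep(marker, lines, fixed = TRUE)  # line numbers that contain the marker
  if (length(hit) == 0) stop("marker not found in ", path)
  # Skip everything up to and including the first marker line.
  read.table(path, skip = hit[1])
}

# Demo on a temporary file so the example is self-contained.
tmp <- tempfile(fileext = ".txt")
writeLines(c("junk", "junk", "##XYDATA= (X++(Y..Y))",
             "131071    -2065", "131070    -4137"), tmp)

xy <- read_xy_after(tmp)

# For many files, lapply over a character vector of paths (here just one).
all_xy <- lapply(c(tmp), read_xy_after)
```

For real data, the vector of paths could come from something like list.files(pattern = "\\.txt$"), and the resulting list of data frames can then be combined or processed per file as needed.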

Added: some revisions and a second approach.
