R - Reading lines from a .txt file after a specific line

Question

I have a bunch of output .txt files, each consisting of a large parameter list followed by a set of X-Y coordinates. I need to extract these coordinates from all the files so that only those lines are imported into a vector. This would work fine with

impcoord<-read.table("file.txt",skip= ,nrow= ,...)

but the files print the coordinate set after a varying number of supporting parameter lines.

Luckily, the coordinates always start after a line containing certain words.

So my question is: how do I start reading the .txt file after these words? Let's say they are:

coordinatesXY

Thanks a lot for your time and help!

-Olli

--Edit--

Sorry for the confusion.

The relevant part of the file looks like this:

##XYDATA= (X++(Y..Y))
131071    -2065
131070    -4137
131069    -6408
131068    -8043 
...       ...
...       ...

The first line is where skip should end, and the coordinates that follow need to be imported into a vector. As you can see, the X-coordinates start at 131071 and count down to 0.

Recommended answer

1) read.pattern: read.pattern in the gsubfn package can be used to read only the lines matching a specific pattern. In this example the pattern matches: beginning of line, optional space(s), 1 or more digits, 1 or more spaces, an optional minus sign followed by 1 or more digits, optional space(s), end of line. The parts of each line that match the parenthesized groups of the regexp are returned as columns of a data.frame. In this self-contained example, text = Lines can be replaced with "myfile.txt", say, if the data is coming from a file. Modify the pattern to suit.

Lines <- "junk
junk
##XYDATA= (X++(Y..Y))
131071    -2065
131070    -4137
131069    -6408
131068    -8043"

library(gsubfn)
DF <- read.pattern(text = Lines, pattern = "^ *(\\d+) +(-?\\d+) *$")

giving:

> DF
      V1    V2
1 131071 -2065
2 131070 -4137
3 131069 -6408
4 131068 -8043

2) read twice: Another possibility, using only base R, is to read the file once to determine the value of skip= and a second time to do the actual read using that value. The number of the line matching ##XYDATA= is exactly the number of lines to skip, so reading starts on the line after it. To read from a file myfile.txt, replace both text = Lines and textConnection(Lines) with "myfile.txt".

read.table(text = Lines, 
    skip = grep("##XYDATA=", readLines(textConnection(Lines))))
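A variation on approach 2, not part of the original answer: read the lines once with readLines and hand everything after the marker to read.table, so the file is only touched once. This is a minimal sketch; the helper name read_after_marker and the column names x and y are made up for illustration, and it assumes the marker occurs at least once and that only coordinate rows follow it.

```r
# Hypothetical helper: parse the X-Y rows that follow the first marker line.
read_after_marker <- function(lines, marker = "##XYDATA=") {
  hit <- grep(marker, lines, fixed = TRUE)   # line numbers containing the marker
  if (length(hit) == 0) stop("marker line not found: ", marker)
  start <- hit[1] + 1                        # first line after the marker
  if (start > length(lines)) stop("no lines after the marker")
  # read.table accepts a character vector via text=, one element per line
  read.table(text = lines[start:length(lines)], col.names = c("x", "y"))
}

Lines <- "junk
junk
##XYDATA= (X++(Y..Y))
131071    -2065
131070    -4137
131069    -6408
131068    -8043"

# strsplit stands in for readLines("myfile.txt") in this self-contained sketch
DF <- read_after_marker(strsplit(Lines, "\n", fixed = TRUE)[[1]])
```

With real files you would pass readLines("myfile.txt") instead, and for a whole directory something like lapply(list.files(pattern = "\\.txt$"), function(f) read_after_marker(readLines(f))) should give one data.frame per file.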

Added: Some revisions and a second approach.
