Reading the last n lines from a huge text file

Problem description

I've tried something like this:

file_in <- file("myfile.log","r")
x <- readLines(file_in, n=-100)

but I'm still waiting...

Any help would be greatly appreciated.

Solution

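I'd use scan for this if you know how many lines the log has. The skip argument is the number of lines to pass over from the start of the file, so for the last n lines of an N-line file you would set it to N - n. For example, skipping the first 100 lines:

scan("foo.txt", sep="\n", what=character(), skip=100)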

If you have no clue how many lines you need to skip, you have no choice but to do one of the following:

  • read in everything and take the last n lines (if that's feasible),
  • use scan("foo.txt",sep="\n",what=list(NULL)) to figure out how many records there are, or
  • use some algorithm that goes through the file and keeps only the last n lines at each step
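
The counting option can be sketched roughly as follows. This is only an illustration under a few assumptions: count_lines and last_lines_by_skip are hypothetical helper names (not part of R or of this answer), the count is done with readLines in chunks rather than with the scan call from the list, and blank.lines.skip=FALSE is set so that scan sees the same lines that were counted.

```r
# Hypothetical helper: count lines by reading fixed-size chunks,
# so that at most `chunk` lines are held in memory at once.
count_lines <- function(path, chunk = 10000L) {
  con <- file(path, open = "r")
  on.exit(close(con))
  total <- 0
  repeat {
    got <- length(readLines(con, n = chunk, warn = FALSE))
    if (got == 0L) break
    total <- total + got
  }
  total
}

# Hypothetical helper: skip everything except the last n lines.
last_lines_by_skip <- function(path, n) {
  total <- count_lines(path)
  scan(path, what = character(), sep = "\n",
       skip = max(total - n, 0),
       quiet = TRUE, blank.lines.skip = FALSE)
}

# Small demonstration on a temporary file (made-up content).
tmp <- tempfile(fileext = ".log")
writeLines(sprintf("line %d", 1:25), tmp)
last_lines_by_skip(tmp, 3)
```

This reads the file twice, once to count and once to skip, so it trades extra I/O for low memory use.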

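The last option could look like this:

ReadLastLines <- function(x, n, ...){
  con <- file(x)
  open(con)
  # Read the first n lines; extra arguments such as skip are passed on to scan
  out <- scan(con, what=character(), nmax=n, sep="\n", quiet=TRUE, ...)

  # Then read one line at a time, dropping the oldest line for each new one
  while(TRUE){
    tmp <- scan(con, what=character(), nmax=1, sep="\n", quiet=TRUE)
    # No more input: close the connection and stop
    if(length(tmp)==0) {close(con) ; break }
    out <- c(out[-1], tmp)
  }
  out
}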

This allows:

ReadLastLines("foo.txt",100)

or

ReadLastLines("foo.txt",100,skip=1e+7)

if you know the file has more than 10 million lines. Skipping lines this way can save reading time once your logs get extremely big.
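
Since the loop in ReadLastLines calls scan once per line, a chunked variant may be worth trying on very large files. The following is a minimal sketch of the same "keep only the last n lines" idea, not a drop-in replacement: read_last_lines_chunked and its chunk argument are hypothetical, it uses readLines instead of scan, and it does not support skip.

```r
# Hypothetical variant: read the file in blocks and keep only the
# last n lines seen so far after each block.
read_last_lines_chunked <- function(path, n, chunk = 10000L) {
  con <- file(path, open = "r")
  on.exit(close(con))
  out <- character()
  repeat {
    block <- readLines(con, n = chunk, warn = FALSE)
    if (length(block) == 0L) break   # end of file reached
    out <- utils::tail(c(out, block), n)
  }
  out
}

# Small demonstration on a temporary file (made-up content).
demo_file <- tempfile(fileext = ".log")
writeLines(sprintf("line %d", 1:25), demo_file)
read_last_lines_chunked(demo_file, 3, chunk = 4L)
```

Memory use stays bounded by roughly n + chunk lines, whatever the size of the file.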


EDIT: In fact, given the size of your file, I wouldn't even use R for this. On Unix, you can use the tail command. There is a Windows version of it as well, somewhere in a toolkit, though I haven't tried it out yet.
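
If the result is still needed inside R, tail can be called through system2. This is only a sketch: tail_lines is a hypothetical wrapper, and it assumes a Unix-like system with tail available on the PATH.

```r
# Hypothetical wrapper around the Unix `tail` utility.
# Assumes `tail` is installed and on the PATH.
tail_lines <- function(path, n = 100L) {
  # as.integer avoids scientific notation such as "1e+05" in the argument
  system2("tail",
          args = c("-n", as.integer(n), shQuote(path)),
          stdout = TRUE)   # capture the output as a character vector
}

# Small demonstration on a temporary file (made-up content).
tail_demo <- tempfile(fileext = ".log")
writeLines(sprintf("line %d", 1:25), tail_demo)
tail_lines(tail_demo, 2)
```

Unlike the R-only approaches, tail does not generally need to read the whole file, which is why it tends to be the faster choice for huge logs.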
