Get last 10 lines of very large text file > 10GB


Question

What is the most efficient way to display the last 10 lines of a very large text file (this particular file is over 10GB)? I was thinking of just writing a simple C# app, but I'm not sure how to do this efficiently.

Recommended answer

Seek to the end of the file, then step backwards until you have found ten newlines, and then read forward to the end, taking the encoding into account. Be sure to handle the case where the file has fewer than ten lines. Below is an implementation (in C#, as you tagged this), generalized to find the last numberOfTokens tokens in the file located at path, encoded in encoding, where the token separator is given by tokenSeparator; the result is returned as a string (this could be improved by returning an IEnumerable<string> that enumerates the tokens).

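// Requires: using System; using System.IO; using System.Text;
public static string ReadEndTokens(string path, Int64 numberOfTokens, Encoding encoding, string tokenSeparator) {

    // Step size in bytes. This assumes a fixed-width encoding, or one such as
    // UTF-8 in which the separator's bytes never occur inside another character.
    int sizeOfChar = encoding.GetByteCount("\n");
    byte[] buffer = encoding.GetBytes(tokenSeparator);

    // Open read-only so that files without write permission can still be read.
    using (FileStream fs = new FileStream(path, FileMode.Open, FileAccess.Read)) {
        Int64 tokenCount = 0;
        Int64 endPosition = fs.Length; // scan limit in bytes, not characters

        // Walk backwards from the end one character at a time, starting far
        // enough back that a whole separator fits in the buffer.
        // Note: a separator at the very end of the file counts as a token.
        for (Int64 position = buffer.Length; position <= endPosition; position += sizeOfChar) {
            fs.Seek(-position, SeekOrigin.End);
            fs.Read(buffer, 0, buffer.Length);

            if (encoding.GetString(buffer) == tokenSeparator) {
                tokenCount++;
                if (tokenCount == numberOfTokens) {
                    // The stream is now positioned just after the separator;
                    // everything from here to the end is the result.
                    byte[] returnBuffer = new byte[fs.Length - fs.Position];
                    fs.Read(returnBuffer, 0, returnBuffer.Length);
                    return encoding.GetString(returnBuffer);
                }
            }
        }

        // handle case where number of tokens in file is less than numberOfTokens
        // (this reads the whole file into memory, so it only suits files that fit in one array)
        fs.Seek(0, SeekOrigin.Begin);
        buffer = new byte[fs.Length];
        fs.Read(buffer, 0, buffer.Length);
        return encoding.GetString(buffer);
    }
}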
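
The loop above issues one Seek and one Read per character, which can be slow when the last lines are long. A common variation, which is not part of the original answer, is to read fixed-size chunks backwards and scan each chunk in memory. The following is a minimal sketch of that idea. The names TailSketch, ReadLastLines and chunkSize are hypothetical, and it assumes the newline is the single byte 0x0A (true for ASCII and UTF-8, not for UTF-16) and that the requested tail fits in memory.

```csharp
using System;
using System.IO;
using System.Text;

public static class TailSketch
{
    // Returns up to lineCount trailing lines of the file at path.
    // Assumption: '\n' is the single byte 0x0A and never occurs inside
    // another character (holds for ASCII and UTF-8, not for UTF-16).
    public static string[] ReadLastLines(string path, int lineCount, Encoding encoding, int chunkSize = 64 * 1024)
    {
        if (lineCount <= 0) return new string[0];

        using (var fs = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
        {
            long end = fs.Length;
            if (end == 0) return new string[0];

            // Ignore one trailing newline so it is not counted as an extra empty line.
            fs.Seek(end - 1, SeekOrigin.Begin);
            if (fs.ReadByte() == '\n') end--;

            long start = 0;        // byte offset where the requested tail begins
            int newlines = 0;
            bool found = false;
            long chunkEnd = end;
            byte[] chunk = new byte[chunkSize];

            // Scan backwards, one chunk at a time, counting newline bytes.
            while (chunkEnd > 0 && !found)
            {
                int toRead = (int)Math.Min(chunkSize, chunkEnd);
                long chunkStart = chunkEnd - toRead;
                fs.Seek(chunkStart, SeekOrigin.Begin);
                int read = ReadFully(fs, chunk, toRead);

                for (int i = read - 1; i >= 0; i--)
                {
                    if (chunk[i] == (byte)'\n' && ++newlines == lineCount)
                    {
                        start = chunkStart + i + 1; // first byte after this newline
                        found = true;
                        break;
                    }
                }
                chunkEnd = chunkStart;
            }

            // Read forward from the start of the tail to the end and decode once.
            // If the file has fewer lines than requested, start stays 0 and the
            // whole file is read, which fails for files too large for one array.
            byte[] tail = new byte[end - start];
            fs.Seek(start, SeekOrigin.Begin);
            int got = ReadFully(fs, tail, tail.Length);

            string[] lines = encoding.GetString(tail, 0, got).Split('\n');
            for (int i = 0; i < lines.Length; i++)
                lines[i] = lines[i].TrimEnd('\r'); // tolerate CRLF line endings
            return lines;
        }
    }

    // Stream.Read may return fewer bytes than requested, so loop until done.
    private static int ReadFully(Stream s, byte[] buffer, int count)
    {
        int total = 0;
        while (total < count)
        {
            int n = s.Read(buffer, total, count - total);
            if (n == 0) break; // end of stream reached early
            total += n;
        }
        return total;
    }
}
```

A call such as TailSketch.ReadLastLines(path, 10, Encoding.UTF8) would then return the last ten lines as an array. The chunk size trades memory for fewer system calls; 64 KB is an arbitrary starting point, not a measured optimum.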
