What is the best way to read last lines (i.e. "tail") from a file using PHP?


Question

In my PHP application I need to read multiple lines starting from the end of many files (mostly logs). Sometimes I need only the last one, sometimes I need tens or hundreds. Basically, I want something as flexible as the Unix tail command.

There are questions here about how to get the single last line from a file (but I need N lines), and different solutions were given. I'm not sure which one is best or which performs better.

Answer

Methods overview

Searching the internet, I came across different solutions. They fall into three groups:

• naive ones that use the file() PHP function;
• cheating ones that run the tail command on the system;
• mighty ones that happily jump around an opened file using fseek().

I ended up choosing (or writing) five solutions: a naive one, a cheating one and three mighty ones.

1. The most concise naive solution, using built-in array functions.
2. The only possible solution based on the tail command, which has one not-so-little problem: it does not run if tail is not available, e.g. on non-Unix systems (Windows) or in restricted environments that don't allow system functions.
3. The solution in which single bytes are read from the end of the file, searching for (and counting) new-line characters, found here.
4. The multi-byte buffered solution optimized for large files, found here.
5. A slightly modified version of solution #4 in which the buffer length is dynamic, decided according to the number of lines to retrieve.
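To make the naive approach (solution #1) concrete, here is a minimal sketch. The function name tail_naive() and the PHP 7+ type declarations are my own assumptions for illustration, not the code that was benchmarked.

```php
<?php
// Illustrative sketch of the naive approach; tail_naive() is a hypothetical name.
function tail_naive(string $path, int $lines = 1): string
{
    if ($lines < 1) {
        return '';
    }
    // file() loads the ENTIRE file into memory as an array of lines,
    // which is why this approach struggles with large files.
    $all = file($path, FILE_IGNORE_NEW_LINES);
    if ($all === false) {
        return '';
    }
    // A negative offset makes array_slice() count from the end of the array.
    return implode("\n", array_slice($all, -$lines));
}
```

Calling tail_naive('/var/log/syslog', 10) would return the last ten lines joined by "\n", provided the whole file fits within the PHP memory limit.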

All solutions work, in the sense that they return the expected result from any file and for any number of lines requested (except for solution #1, which can exceed the PHP memory limit on large files and return nothing). But which one is better?

Performance tests

To answer the question I ran tests. That's how these things are done, isn't it?

I prepared a sample 100 KB file by joining together different files found in my /var/log directory. Then I wrote a PHP script that uses each of the five solutions to retrieve 1, 2, ..., 10, 20, ..., 100, 200, ..., 1000 lines from the end of the file. Each single test is repeated ten times (that's 5 × 28 × 10 = 1400 tests), measuring the average elapsed time in microseconds.
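The test procedure described above could be sketched roughly as follows. This is not the original test script: line_counts() and benchmark() are hypothetical names, and the callable is assumed to accept a file path and a line count.

```php
<?php
// Hypothetical harness illustrating the procedure; not the original test script.
function line_counts(): array
{
    // 1..10, then 20..100 in steps of 10, then 200..1000 in steps of 100:
    // 10 + 9 + 9 = 28 different line counts.
    return array_merge(range(1, 10), range(20, 100, 10), range(200, 1000, 100));
}

// $tail is assumed to be callable as $tail($path, $lines).
function benchmark(callable $tail, string $path, int $repeat = 10): array
{
    $results = [];
    foreach (line_counts() as $n) {
        $start = microtime(true);
        for ($i = 0; $i < $repeat; $i++) {
            $tail($path, $n);
        }
        // Average elapsed time per call, converted from seconds to microseconds.
        $results[$n] = (microtime(true) - $start) / $repeat * 1e6;
    }
    return $results;
}
```

Running benchmark() once for each of the five solutions gives the 5 × 28 × 10 = 1400 individual calls mentioned above.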

I ran the script on my local development machine (Xubuntu 12.04, PHP 5.3.10, 2.70 GHz dual core CPU, 2 GB RAM) using the PHP command line interpreter. Here are the results:

Solutions #1 and #2 seem to be the worst ones. Solution #3 is good only when we need to read a few lines. Solutions #4 and #5 seem to be the best ones. Note how the dynamic buffer size can optimize the algorithm: execution time is a little smaller for a few lines, because of the reduced buffer.
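The dynamic-buffer idea can be illustrated with a short sketch. This is my own simplified version under stated assumptions, not the benchmarked code: tail_buffered() is a hypothetical name, the 64/512/4096 byte thresholds are assumed values, lines are assumed to end in "\n", and trailing blank lines are trimmed.

```php
<?php
// Simplified sketch of a buffered backwards read with a dynamic buffer size.
// tail_buffered() and the buffer thresholds are illustrative assumptions.
function tail_buffered(string $path, int $lines = 1): string
{
    if ($lines < 1) {
        return '';
    }
    $f = fopen($path, 'rb');
    if ($f === false) {
        return '';
    }

    // Small buffer when few lines are wanted, larger buffer otherwise.
    $buffer = $lines < 2 ? 64 : ($lines < 10 ? 512 : 4096);

    fseek($f, 0, SEEK_END);
    $pos = ftell($f);
    $output = '';
    $newlines = 0;

    // Read chunks backwards until the collected text holds more than
    // $lines newlines (enough for $lines complete lines) or the start is hit.
    while ($pos > 0 && $newlines <= $lines) {
        $read = min($pos, $buffer);
        $pos -= $read;
        fseek($f, $pos, SEEK_SET);
        $chunk = (string) fread($f, $read);
        $output = $chunk . $output;
        $newlines += substr_count($chunk, "\n");
    }
    fclose($f);

    // Drop the trailing newline(s), then keep only the last $lines rows.
    $rows = explode("\n", rtrim($output, "\r\n"));
    return implode("\n", array_slice($rows, -$lines));
}
```

With a single requested line, only about 64 bytes are read per step instead of 4096, which is the kind of saving the reduced buffer is meant to provide.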

Let's try with a bigger file. What if we have to read a 10 MB log file?

Now solution #1 is by far the worst one: in fact, loading the whole 10 MB file into memory is not a great idea. I also ran the tests on 1 MB and 100 MB files, and the situation was practically the same.

And for tiny log files? Here is the graph for a 10 KB file:

Solution #1 is the best one now! Loading 10 KB into memory isn't a big deal for PHP. Solutions #4 and #5 also perform well. However, this is an edge case: a 10 KB log means something like 150 to 200 lines...

You can download all my test files, sources and results here.

Final thoughts

Solution #5 is strongly recommended for the general use case: it works well with every file size and performs particularly well when reading a few lines.

Avoid solution #1 if you need to read files bigger than 10 KB.

Solutions #2 and #3 were not the best in any test I ran: #2 never ran in less than 2 ms, and #3 is heavily influenced by the number of lines you ask for (it works quite well only with 1 or 2 lines).
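Why solution #3 depends so strongly on the number of lines is easier to see in code. The following is a minimal sketch of the byte-by-byte idea, assuming "\n" line endings; tail_bytewise() is a hypothetical name and this is not the code that was benchmarked.

```php
<?php
// Sketch of the byte-by-byte approach; tail_bytewise() is a hypothetical name.
function tail_bytewise(string $path, int $lines = 1): string
{
    if ($lines < 1) {
        return '';
    }
    $f = fopen($path, 'rb');
    if ($f === false) {
        return '';
    }
    fseek($f, 0, SEEK_END);
    $size = ftell($f);
    $pos = $size;
    $output = '';
    $found = 0;

    // One fseek() and one fgetc() per byte: the cost grows with every
    // additional byte (and therefore every additional line) requested.
    while ($pos > 0) {
        $pos--;
        fseek($f, $pos, SEEK_SET);
        $char = fgetc($f);
        if ($char === "\n") {
            if ($pos === $size - 1) {
                continue; // ignore the newline that terminates the file
            }
            if (++$found === $lines) {
                break; // reached the newline before the first wanted line
            }
        }
        $output = $char . $output;
    }
    fclose($f);
    return $output;
}
```

Each extra line requested adds one seek and one read for every byte in that line, whereas a buffered approach covers the same distance in far fewer reads.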
