In Unix, is it better to sort multiple small files or one big file?
Question
I have multiple files that I am working on, and they need to be combined and sorted. Would it be more efficient to sort each file first and then use the sort -m option when combining them, or to combine them first and then sort?
Or is it the same? My understanding is that Unix sort uses merge sort, so in essence, wouldn't the one big file be split, sorted, and recombined anyway? In that case it shouldn't make a difference.
Recommended answer
Sort the files together:
sort file1 file2 file3 file4
Unless you have a lot of time to investigate, the sort command will do a better job than you would at breaking the files into appropriately sized chunks, sorting them independently, and recombining them.
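To see that the two approaches from the question produce the same result, here is a minimal sketch. The file names and contents are made up for illustration, `mktemp -d` is assumed to be available (it is common but not required by POSIX), and `LC_ALL=C` is set only to make the ordering predictable. It checks that the outputs match; it does not measure speed.

```shell
#!/bin/sh
# Hypothetical demo files; names and contents are illustrative only.
tmp=$(mktemp -d)
printf 'pear\napple\n'  > "$tmp/file1"
printf 'fig\nbanana\n'  > "$tmp/file2"
printf 'cherry\ndate\n' > "$tmp/file3"

# Approach 1: hand every input to a single sort invocation.
LC_ALL=C sort "$tmp/file1" "$tmp/file2" "$tmp/file3" > "$tmp/together"

# Approach 2: sort each file on its own, then merge the
# already-sorted results with -m (merge only, no re-sorting).
for f in file1 file2 file3; do
    LC_ALL=C sort "$tmp/$f" > "$tmp/$f.sorted"
done
LC_ALL=C sort -m "$tmp/file1.sorted" "$tmp/file2.sorted" "$tmp/file3.sorted" > "$tmp/merged"

# cmp -s is silent and exits 0 when the two files are byte-identical.
if cmp -s "$tmp/together" "$tmp/merged"; then
    result=identical
else
    result=different
fi
echo "$result"

# Remove the temporary directory.
rm -rf "$tmp"
```

Since the output is the same either way, the only open question is speed, and that can be checked on your own data by wrapping each approach in `time`.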