Sorting big file (10G)

Published: 2020/4/23 10:36:19. Tags: linux, sorting, bigdata, large-files


Question

I'm trying to sort a big table stored in a file. Each line of the file has the format (ID, intValue).

The data is currently sorted by ID, but I need it sorted by intValue, in descending order.

For example, this table:

ID  | IntValue
1   | 3
2   | 24
3   | 44
4   | 2

should become this table:

ID  | IntValue
3   | 44
2   | 24
1   | 3
4   | 2

How can I use the Linux sort command to do this? Or do you recommend another way?

Recommended answer

"How can I use the Linux sort command to do the operation? Or do you recommend another way?"

As others have already pointed out, see man sort for the -k and -t command line options, which let you sort by a specific field of each line: -t sets the field separator and -k selects the field to use as the sort key.
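As a minimal sketch of those options, the commands below sort the four sample rows from the question by the second field, numerically and in descending order. The temporary directory and the file names sample.txt and sample-sorted.txt are made up for illustration, and the header line is left out on purpose.

```shell
# Scratch directory for the demo (hypothetical location and file names).
demo=$(mktemp -d)

# The sample rows from the question, without the header line.
printf '%s\n' '1   | 3' '2   | 24' '3   | 44' '4   | 2' > "$demo/sample.txt"

# -t'|' : use '|' as the field separator
# -k2   : the sort key starts at the second field (the IntValue column)
# -n    : compare numerically instead of as text
# -r    : reverse the result, i.e. descending order
sort -t'|' -k2 -nr "$demo/sample.txt" > "$demo/sample-sorted.txt"

cat "$demo/sample-sorted.txt"
# 3   | 44
# 2   | 24
# 1   | 3
# 4   | 2
```

Without -n the values would be compared as text, so "3" would sort after "24".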

sort can also help with huge files that may not fit into RAM: its -m command line option merges files that are already sorted into a single sorted file. (See merge sort for the concept.) The overall process is fairly straightforward:

Split the big file into small chunks, for example with the split tool and its -l option (lines per chunk):

split -l 1000000 huge-file small-chunk

Sort each of the smaller files, e.g.:

for X in small-chunk*; do sort -t'|' -k2 -nr < "$X" > "sorted-$X"; done

Merge the sorted smaller files, using the same key options as in the previous step, e.g.:

sort -t'|' -k2 -nr -m sorted-small-chunk* > sorted-huge-file

Clean up: rm small-chunk* sorted-small-chunk*
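The four steps above can be combined into one small shell function. This is only a sketch under a few assumptions: the function name sort_by_value, its arguments and the chunk-/sorted-chunk- file names are invented here, the input is non-empty and has no header line, and the chunks are kept in a temporary directory created with mktemp -d.

```shell
# Hypothetical helper: sort_by_value INPUT OUTPUT [LINES_PER_CHUNK]
# Sorts INPUT by its second '|'-separated field, descending, via split + merge.
sort_by_value() {
    input=$1
    output=$2
    lines=${3:-1000000}              # chunk size, same default as in the answer
    workdir=$(mktemp -d) || return 1

    # Step 1: split into chunks named chunk-aa, chunk-ab, ... inside workdir.
    split -l "$lines" "$input" "$workdir/chunk-"

    # Step 2: sort every chunk on its own; each one should fit in memory.
    for chunk in "$workdir"/chunk-*; do
        sort -t'|' -k2 -nr "$chunk" > "$workdir/sorted-${chunk##*/}"
    done

    # Step 3: merge the sorted chunks; -m requires the same ordering options.
    sort -t'|' -k2 -nr -m "$workdir"/sorted-chunk-* > "$output"

    # Step 4: remove the intermediate files.
    rm -r "$workdir"
}
```

A call such as sort_by_value huge-file sorted-huge-file would then reproduce the manual steps. Keeping the chunks in their own directory avoids the rm globs accidentally matching unrelated files.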

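The only thing you have to take special care of is the column header: split and sort treat it as an ordinary data line, so remove it before splitting and put it back at the top of the final file.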
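One possible way to handle the header is to copy the first line through unchanged and sort only the remaining lines. The sketch below shows the idea on the small sample table; the temporary directory and the file names table.txt and table-sorted.txt are hypothetical.

```shell
# Scratch directory and sample table including its header (hypothetical names).
hdr=$(mktemp -d)
printf '%s\n' 'ID  | IntValue' '1   | 3' '2   | 24' '3   | 44' '4   | 2' \
    > "$hdr/table.txt"

{
    # Emit the header line as-is ...
    head -n 1 "$hdr/table.txt"
    # ... then everything from line 2 onward, sorted by IntValue, descending.
    tail -n +2 "$hdr/table.txt" | sort -t'|' -k2 -nr
} > "$hdr/table-sorted.txt"

cat "$hdr/table-sorted.txt"
```

For the 10G file the same idea applies to the split step: feeding tail -n +2 huge-file into split (which reads standard input when the file name is -) keeps the header out of every chunk, and the header can be written to the output file before the merged result is appended.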
