Split File - Java/Linux

Views: 125 Published: 2018/12/21 20:31:44 java linux


Problem description

I have a large file containing nearly 250 million characters. I want to split it into parts of 30 million characters each (so the first 8 parts will contain 30 million characters and the last part will contain about 10 million). I also want the last 1000 characters of each part to be included at the beginning of the next part (that is, part 1's last 1000 characters are prepended to part 2, so part 2 contains 30 million plus 1000 characters, and so on). Can anybody help me do this programmatically (using Java) or with Linux commands, in a fast way?
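The sizes described above can be checked with a little shell arithmetic. This is only a sketch: it assumes the file is exactly 250,000,000 bytes (the question says "nearly"), and the variable names are made up for illustration.

```shell
# Assumed sizes; the real file is only "nearly" 250 million characters.
total=250000000
chunk=30000000
overlap=1000

# Number of parts, rounding up so the shorter last part is counted.
parts=$(( (total + chunk - 1) / chunk ))

# Size of the last part before the overlap is prepended.
last=$(( total - (parts - 1) * chunk ))

echo "parts: $parts"
echo "first part: $chunk bytes"
echo "parts in between: $(( chunk + overlap )) bytes"
echo "last part: $(( last + overlap )) bytes"
```

Under these assumptions there are 9 parts: the first keeps its 30,000,000 bytes, and every later part grows by the 1000 bytes taken from its predecessor.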

Recommended answer

One way is to use regular Unix commands to split the file and then prepend the last 1000 bytes of the previous part to each part. These commands count bytes, so the result matches the requested character counts only if the file uses a single-byte encoding.

First split the file:

split -b 30000000 inputfile part.
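To see what the split step produces, the same command can be tried on a tiny sample. The sketch below is a scaled-down illustration, not the real run: the 25-byte sample and the 10-byte part size are stand-ins for the 250-million-character file and the 30,000,000-byte parts, and it works in a scratch directory.

```shell
# Work in a scratch directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical 25-byte sample standing in for the large input file.
printf 'abcdefghijklmnopqrstuvwxy' > inputfile

# 10-byte parts stand in for the 30,000,000-byte parts.
split -b 10 inputfile part.

# Print each part's name and its size in bytes.
for f in part.*
do
  printf '%s %s\n' "$f" "$(wc -c < "$f" | tr -d ' ')"
done
```

The parts are named part.aa, part.ab, part.ac, so the shell glob part.* lists them in their original order, which the later loops rely on.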

Then, for each part (ignoring the first), make a new file starting with the last 1000 bytes from the previous part:

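unset prev
for i in part.*
do
  if [ -n "${prev}" ]
  then
    # Start with the last 1000 bytes of the previous part,
    # then append the current part's own content.
    tail -c 1000 "${prev}" > part.temp
    cat "${i}" >> part.temp
    mv part.temp "${i}"
  fi
  prev="${i}"
done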
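The effect of the loop is easier to see on a small sample. The following sketch repeats the split and the prepend step with scaled-down, hypothetical sizes (10-byte parts, a 3-byte overlap held in a variable named overlap) in a scratch directory.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample data and sizes, chosen only for illustration.
printf 'abcdefghijklmnopqrstuvwxy' > inputfile
split -b 10 inputfile part.
overlap=3

prev=""
for i in part.*
do
  if [ -n "$prev" ]
  then
    # Last bytes of the previous part, followed by the current part.
    tail -c "$overlap" "$prev" > part.temp
    cat "$i" >> part.temp
    mv part.temp "$i"
  fi
  prev="$i"
done

# Show the content of every part after the overlap has been added.
for f in part.*
do
  printf '%s: %s\n' "$f" "$(cat "$f")"
done
```

Prepending does not change the tail of a part, so each part still hands its own original last bytes to the next one even though it has already been rewritten.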

Before reassembling, we again iterate over the files, ignoring the first, and throw away the first 1000 bytes of each:

unset prev
for i in part.*
do
  if [ -n "${prev}" ]
  then
    # Keep everything from byte 1001 onward, dropping the 1000-byte overlap.
    tail -c +1001 "${i}" > part.temp
    mv part.temp "${i}"
  fi
  prev="${i}"
done

The last step is to reassemble the files:

cat part.* >> newfile
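One way to gain confidence in the whole sequence is a round trip on a small sample, comparing the reassembled file with the input using cmp. This is a sketch with hypothetical scaled-down sizes (10-byte parts, 3-byte overlap), run in a scratch directory.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample data and sizes, chosen only for illustration.
printf 'abcdefghijklmnopqrstuvwxy' > inputfile
split -b 10 inputfile part.
overlap=3

# Step 1: prepend the tail of the previous part to every part but the first.
prev=""
for i in part.*
do
  if [ -n "$prev" ]
  then
    { tail -c "$overlap" "$prev"; cat "$i"; } > part.temp
    mv part.temp "$i"
  fi
  prev="$i"
done

# Step 2: drop the overlap again from every part but the first.
prev=""
for i in part.*
do
  if [ -n "$prev" ]
  then
    tail -c +"$(( overlap + 1 ))" "$i" > part.temp
    mv part.temp "$i"
  fi
  prev="$i"
done

# Step 3: reassemble and compare byte for byte with the input.
cat part.* > newfile
if cmp -s inputfile newfile
then
  result=identical
else
  result=different
fi
echo "$result"
```

The sketch writes newfile with a plain redirect so that a leftover file from an earlier run cannot skew the comparison.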

Since there was no explanation of why the overlap was needed, I just created it and then threw it away.
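If the overlapping parts themselves are the goal, they could also be written directly from the input, without the intermediate split. The sketch below is an alternative to the answer's method, not part of it. It assumes a head that supports the -c option (not required by POSIX, though GNU and BSD provide it), and the output names overlap.00, overlap.01 and so on are made up. Sizes are scaled down to 10-byte parts with a 3-byte overlap.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample data and sizes, chosen only for illustration.
printf 'abcdefghijklmnopqrstuvwxy' > inputfile
chunk=10
overlap=3

total=$(wc -c < inputfile | tr -d ' ')
start=0   # byte offset where the current part's own data begins
n=0       # running part number

while [ "$start" -lt "$total" ]
do
  # Begin 'overlap' bytes earlier, except for the very first part.
  from=$(( start - overlap ))
  if [ "$from" -lt 0 ]
  then
    from=0
  fi
  out=$(printf 'overlap.%02d' "$n")
  # tail -c +N starts at byte N (1-based); head -c limits the length.
  tail -c +"$(( from + 1 ))" inputfile | head -c "$(( start + chunk - from ))" > "$out"
  start=$(( start + chunk ))
  n=$(( n + 1 ))
done

for f in overlap.*
do
  printf '%s: %s\n' "$f" "$(cat "$f")"
done
```

Each part is read straight from the input, so no part has to be rewritten afterwards. Whether this is faster than split on a 250-million-character file would need to be measured.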
