Read a very big single-line txt file and split it

Question

I have the following problem: I have a file that is nearly 500 MB in size. It is text, all on one line. The text is separated by a virtual line ending called ROW_DEL, which appears in the text like this:

this is a line ROW_DEL and this is a line

Now I want to split this file into its lines, so that I get a file like this:

this is a line
and this is a line

The problem is that even when I open it with the Windows text editor, the editor breaks because the file is too big.

Is it possible to split this file as described with C#, Java, or Python? What would be the best solution that doesn't overload my CPU?

Recommended answer

Actually, 500 MB of text is not that big; it's just that Notepad handles large files badly. You probably don't have sed available since you're on Windows, but at least try the naive solution in Python. I think it will work fine:

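# Read the whole file into memory, then turn each delimiter into a newline.
with open('infile.txt') as f_in, open('outfile.txt', 'w') as f_out:
    # Write '\n', not os.linesep: text mode already translates '\n' to the
    # platform line ending, so os.linesep would become '\r\r\n' on Windows.
    f_out.write(f_in.read().replace('ROW_DEL ', '\n'))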
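
If holding the whole 500 MB string in memory turns out to be a problem, a chunked variant is one option. The following is only a sketch and not part of the original answer: the function name split_on_delimiter and its chunk_size parameter are made up for illustration, and it assumes the delimiter itself never contains a newline. It reads the file a piece at a time and holds back a short tail of each piece so that a ROW_DEL split across two reads is still found.

```python
def split_on_delimiter(src_path, dst_path, delimiter="ROW_DEL ", chunk_size=1 << 20):
    """Copy src_path to dst_path, replacing each delimiter with a newline.

    Hypothetical helper: reads chunk_size characters at a time instead of
    loading the whole file. Assumes the delimiter contains no newline.
    """
    # A delimiter cut by a chunk boundary leaves at most len(delimiter) - 1
    # of its characters at the end of the data read so far.
    keep = len(delimiter) - 1
    with open(src_path) as f_in, open(dst_path, "w") as f_out:
        buffer = ""
        while True:
            chunk = f_in.read(chunk_size)
            if not chunk:
                break
            buffer = (buffer + chunk).replace(delimiter, "\n")
            if keep:
                # Write everything except the tail that might be the start
                # of a delimiter continued in the next chunk.
                f_out.write(buffer[:-keep])
                buffer = buffer[-keep:]
            else:
                f_out.write(buffer)
                buffer = ""
        # Whatever is left cannot be part of a delimiter any more.
        f_out.write(buffer)
```

Calling split_on_delimiter('infile.txt', 'outfile.txt') would then do the same job as the naive version, with memory use bounded by roughly chunk_size characters.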
