python: ascii read


Problem Description


Hello,

I tried to read in some large ascii files (200MB-2GB) in Python using
scipy.io.read_array, but it did not work as I expected. The whole idea
was to find a fast Python routine to read in arbitrary ascii files, to
replace Yorick (which I use right now and which is really fast, but not
as general as Python). The problem with scipy.io.read_array is that it
is really slow, returns errors when trying to process large files, and it
also changes (truncates) the files (after scipy.io.read_array processed a
2GB file, its size was only 64MB).

Can someone give me a hint on how to use Python to do this job correctly
and fast? (Maybe with another read-in routine.)

Thanks.

Greetings,
Sebastian

Solution

Sebastian Krause <ca*****@gmx.net> wrote:

Hello,

I tried to read in some large ascii files (200MB-2GB) in Python using
scipy.io.read_array, but it did not work as I expected. The whole idea
was to find a fast Python routine to read in arbitrary ascii files, to
replace Yorick (which I use right now and which is really fast, but not
as general as Python). The problem with scipy.io.read_array is that it
is really slow, returns errors when trying to process large files, and it
also changes (truncates) the files (after scipy.io.read_array processed a
2GB file, its size was only 64MB).

Can someone give me a hint on how to use Python to do this job correctly
and fast? (Maybe with another read-in routine.)



If all you need is what you say -- read a huge amount of ASCII data into
memory -- it's hard to beat

data = open('thefile.txt').read()

mmap may in fact be preferable for many uses, but it doesn't actually
read (it _maps_ the file into memory instead).

Alex
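The two options Alex mentions can be sketched side by side. The file name and its contents below are placeholders for illustration; with the 200MB-2GB files from the thread you would of course open an existing file instead:

```python
import mmap

# Create a small placeholder file so the sketch is self-contained.
with open("thefile.txt", "w") as f:
    f.write("1 2 3\n4 5 6\n")

# Option 1: slurp the whole file into one string.
# Fast, but the entire file must fit in memory at once.
with open("thefile.txt") as f:
    data = f.read()

# Option 2: mmap maps the file into the address space instead of reading it.
# Pages are loaded lazily as they are touched, so a huge file can be
# addressed without pulling it all into memory up front.
with open("thefile.txt", "rb") as f:
    mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    first_line = mm.readline()
    mm.close()
```

Note that the mmap object is read byte-wise, so `readline()` returns bytes, not a str.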


Sebastian Krause wrote:

Hello,

I tried to read in some large ascii files (200MB-2GB) in Python using
scipy.io.read_array, but it did not work as I expected. The whole idea
was to find a fast Python routine to read in arbitrary ascii files, to
replace Yorick (which I use right now and which is really fast, but not
as general as Python). The problem with scipy.io.read_array is that it
is really slow, returns errors when trying to process large files, and it
also changes (truncates) the files (after scipy.io.read_array processed a
2GB file, its size was only 64MB).

Can someone give me a hint on how to use Python to do this job correctly
and fast? (Maybe with another read-in routine.)



What kind of data is it? What operations do you want to perform on the
data? What platform are you on?

Some of the scipy.io.read_array behaviors that you see look like bugs. We
would greatly appreciate it if you were to send a complete bug report to
the scipy-dev mailing list. Thank you.

--
Robert Kern
rk***@ucsd.edu

"In the fields of hell where the grass grows high
Are the graves of dreams allowed to die."
-- Richard Harter


I did not explicitly mention that the ascii file should be read in as an
array of numbers (either integer or float).
Using open() and read() is very fast, but it only reads in the data as a
string, and it also does not work with large files.

Sebastian
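Since the goal is a numeric array rather than one big string, the missing piece is a bulk text-to-number parser. As a hedged sketch of that direction, NumPy's fromfile with a separator parses whitespace-delimited ASCII numbers straight into an array in C, avoiding a per-line Python loop (the file name and sample values here are made up for illustration):

```python
import numpy as np

# Placeholder data file standing in for the large ascii files in the thread.
with open("numbers.txt", "w") as f:
    f.write("1.0 2.0 3.0\n4.0 5.0 6.0\n")

# A non-empty sep switches np.fromfile into text mode: whitespace-separated
# ASCII numbers are parsed directly into a flat float64 array (a " " in the
# separator matches any run of whitespace, so newlines are handled too).
arr = np.fromfile("numbers.txt", sep=" ")

# Restore the row structure if the column count is known.
arr = arr.reshape(-1, 3)
```

For files larger than memory, the same idea can be applied chunk by chunk via the count argument of np.fromfile.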

Alex Martelli wrote:

Sebastian Krause <ca*****@gmx.net> wrote:

Hello,

I tried to read in some large ascii files (200MB-2GB) in Python using
scipy.io.read_array, but it did not work as I expected. The whole idea
was to find a fast Python routine to read in arbitrary ascii files, to
replace Yorick (which I use right now and which is really fast, but not
as general as Python). The problem with scipy.io.read_array is that it
is really slow, returns errors when trying to process large files, and it
also changes (truncates) the files (after scipy.io.read_array processed a
2GB file, its size was only 64MB).

Can someone give me a hint on how to use Python to do this job correctly
and fast? (Maybe with another read-in routine.)


If all you need is what you say -- read a huge amount of ASCII data into
memory -- it's hard to beat

data = open('thefile.txt').read()

mmap may in fact be preferable for many uses, but it doesn't actually
read (it _maps_ the file into memory instead).

Alex


