Reading a file with strings and floats using loadtxt

Question

I need to read the data set available at this page with Python.

They define the data type of each column very precisely. How can I use loadtxt (a NumPy function) to read this data set? I tried passing the data types through the dtype option, but it did not work.

Recommended answer

The tables on the site you linked differ considerably from one another, and each has different types in different columns.

You need to define a record type (a NumPy structured dtype) for each table.
A record type lets you declare strings, integers and floats in the same array. It is defined and used as in this example:

>>> import numpy as np
>>> recordtype = np.dtype([('name', np.str_, 20), ('age', np.int32), ('weight', np.float32)])
>>> people = np.array([('Joaquin', 51, 60.0), ('Cat', 18, 8.6)], dtype=recordtype)
>>> people
array([('Joaquin', 51, 60.0), ('Cat', 18, 8.600000381469727)], dtype=[('name', '<U20'), ('age', '<i4'), ('weight', '<f4')])

On the other hand, some rows contain entries such as '...' that break the consistency of the data in a column. So if you need to read directly from the file, you would have to supply converter functions through the converters parameter of loadtxt.
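A minimal sketch of the converters approach, assuming a whitespace-separated table in which '...' marks a missing number. The sample rows, the field names and the to_float helper are hypothetical and not taken from the linked data set. The age column is read as a float here because an integer column cannot hold NaN.

```python
import io
import numpy as np

# Hypothetical sample: a name column followed by two numeric columns,
# where '...' stands for a missing value.
sample = io.StringIO(
    "Alice 51 60.0\n"
    "Bob ... 8.6\n"
)

def to_float(field):
    """Turn one text field into a float, mapping '...' to NaN."""
    # Depending on the NumPy version, loadtxt may hand over bytes or str.
    if isinstance(field, bytes):
        field = field.decode()
    field = field.strip()
    return np.nan if field == "..." else float(field)

# Structured dtype: one string field and two float fields.
record_type = np.dtype([("name", "U20"), ("age", "f8"), ("weight", "f8")])

# The converters dict maps a column index to the function applied to it.
data = np.loadtxt(sample, dtype=record_type,
                  converters={1: to_float, 2: to_float})
```

Each field can then be accessed by name, for example data["weight"].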

Alternatively, since loadtxt also accepts a generator as input, you could clean the lines in a generator and feed the cleaned lines to loadtxt.
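The generator approach could look roughly like the sketch below. The cleaned_lines helper, the sample rows and the choice to replace '...' with the text nan are all assumptions made for illustration.

```python
import numpy as np

def cleaned_lines(lines, placeholder="...", replacement="nan"):
    """Yield each non-empty line with placeholder fields replaced."""
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        yield " ".join(replacement if f == placeholder else f for f in fields)

# Hypothetical raw rows; an open file object could be passed instead.
raw_lines = ["Alice 51 60.0", "Bob ... 8.6"]

record_type = np.dtype([("name", "U20"), ("age", "f8"), ("weight", "f8")])

# loadtxt consumes the generator line by line, as it would a file.
people = np.loadtxt(cleaned_lines(raw_lines), dtype=record_type)
```

With a real file, the same generator can wrap the file object inside a with open(...) block, so the cleaning happens while loadtxt reads.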

Finally, you should also set the skiprows parameter to skip the table headings.
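As an illustration of skiprows, here is a sketch that assumes a table with two heading lines before the data. The sample content and field names are invented for the example; the real tables will need their own row count and dtype.

```python
import io
import numpy as np

# Hypothetical table: a title line and a column-name line precede the data.
sample = io.StringIO(
    "Table 1: hypothetical example\n"
    "name age weight\n"
    "Alice 51 60.0\n"
    "Bob 18 8.6\n"
)

record_type = np.dtype([("name", "U20"), ("age", "i4"), ("weight", "f4")])

# skiprows=2 discards the two heading lines before parsing starts.
table = np.loadtxt(sample, dtype=record_type, skiprows=2)
```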
