Python Numpy.loadtxt with varied string entries but known line format


Problem description

I am looking into the limits of loadtxt specifically. I have a text file that I want to load as a multi-dimensional array:

# Sample header for python loadtxt
Very random text:¤mixed with¤strings¤numbers
300057¤9989¤34956¤1
110087¤9189¤24466¤4
# EOF

I can read all of this in as strings (of unknown length) and convert them to integers and floats later. This is what I have so far:

import numpy as np
txtdata = np.loadtxt('Mytxtfile.txt',delimiter=chr(164),comments="#",dtype='str')

However, I would like to know whether it is possible to extract the data directly into a multidimensional array, such as:

>>> 
[['Very random text:','mixed with','strings','numbers']
 [300057,9989,34956,1]
 [110087, 9189, 24466, 4]]

I tried this dtype command with no success:

dtype=[('a', 'str'),('b','int'),('c','int')]

Solution

txtdata = np.loadtxt(
    'Mytxtfile.txt', delimiter=chr(164), comments="#", skiprows=1,
    dtype=[('a', '|S6'), ('b', '<i4'), ('c', '<i4'), ('d', '<i4')])

Your sample data shows 4 columns, so to specify the dtype explicitly, you would need something like:

dtype=[('a', '|S6'), ('b', '<i4'), ('c', '<i4'), ('d', '<i4')]
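As a rough, self-contained check of the structured dtype above, the sketch below feeds the sample from the question to loadtxt through an in-memory io.StringIO buffer instead of 'Mytxtfile.txt' (the buffer and the variable names are illustrative assumptions). It uses skiprows=2 rather than 1, on the assumption that the file really starts with the "# Sample header" comment line: loadtxt counts comment lines when skipping rows, so both that line and the text header row have to be skipped before the numeric rows are parsed.

```python
import io
import numpy as np

# In-memory stand-in for 'Mytxtfile.txt' (assumption: same layout as the sample).
# "\u00a4" is the same character as chr(164).
sample = io.StringIO(
    "# Sample header for python loadtxt\n"
    "Very random text:\u00a4mixed with\u00a4strings\u00a4numbers\n"
    "300057\u00a49989\u00a434956\u00a41\n"
    "110087\u00a49189\u00a424466\u00a44\n"
    "# EOF\n"
)

# One 6-byte string field followed by three 32-bit integer fields.
row_dtype = [("a", "|S6"), ("b", "<i4"), ("c", "<i4"), ("d", "<i4")]

# skiprows=2 skips the leading comment line and the text header row.
txtdata = np.loadtxt(sample, delimiter=chr(164), comments="#",
                     skiprows=2, dtype=row_dtype)

print(txtdata.shape)   # one record per data row
print(txtdata["a"])    # first column, stored as 6-byte strings
print(txtdata["b"])    # second column, stored as integers
```

The result is a one-dimensional structured array with one record per row, so columns are selected by field name rather than by a second index.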

Note that NumPy does not have a variable-width 'str' dtype. You have to specify the number of bytes in advance. For example, '|S6' specifies a 6-byte string dtype.
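To illustrate what a fixed-width string dtype means in practice, here is a small sketch (the array contents are made up for illustration). Values longer than the declared width are cut off without any warning, which is why the width has to be chosen with the longest expected entry in mind.

```python
import numpy as np

# '|S6' reserves exactly 6 bytes per element.
names = np.array([b"abc", b"strings"], dtype="|S6")
print(names)                 # the 7-byte entry loses its last byte
print(names.dtype.itemsize)  # bytes reserved per element

# In a structured dtype, each record is the sum of its field widths.
row_dtype = np.dtype([("a", "|S6"), ("b", "<i4"), ("c", "<i4"), ("d", "<i4")])
print(row_dtype.itemsize)    # 6 bytes for 'a' plus 3 * 4 bytes for the integers
```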

If you do not know in advance how many bytes may be in the string column(s), then it may be more convenient to use numpy.genfromtxt:

txtdata = np.genfromtxt('Mytxtfile.txt', delimiter=chr(164), comments="#",
                        names=True, dtype=None)

dtype=None tells genfromtxt to make an intelligent guess for the dtype.
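A minimal sketch of the genfromtxt route, again using an in-memory io.StringIO buffer as a stand-in for the file (an assumption for illustration). Two details here are my additions rather than part of the answer: skip_header=1, because, as far as I understand, with names=True genfromtxt would otherwise try to take the field names from the leading "# Sample header" line, and encoding="utf-8", which only matters if the file contains string columns.

```python
import io
import numpy as np

# In-memory stand-in for 'Mytxtfile.txt' (assumption: same layout as the sample).
sample = io.StringIO(
    "# Sample header for python loadtxt\n"
    "Very random text:\u00a4mixed with\u00a4strings\u00a4numbers\n"
    "300057\u00a49989\u00a434956\u00a41\n"
    "110087\u00a49189\u00a424466\u00a44\n"
    "# EOF\n"
)

# skip_header=1 drops the comment line so the text row supplies the field names;
# dtype=None lets genfromtxt infer a type for each column.
txtdata = np.genfromtxt(sample, delimiter=chr(164), comments="#",
                        skip_header=1, names=True, dtype=None,
                        encoding="utf-8")

print(txtdata.dtype.names)  # field names derived from the header row
print(txtdata["numbers"])   # last column, inferred as integers
```

genfromtxt cleans up the header text when it builds field names (spaces and some punctuation are changed or dropped), so it is worth printing txtdata.dtype.names before indexing by name.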
