Write numpy unicode array to a text file

Views: 94 · Posted: 2020/5/18 20:27:06 · Tags: python, numpy, unicode

Question

I'm trying to export a numpy array that contains unicode elements to a text file.

So far I have got the following to work, but the array doesn't contain any non-ASCII characters:

import numpy as np

array_unicode=np.array([u'maca', u'banana',u'morango'])

with open('array_unicode.txt','wb') as f:
    np.savetxt(f,array_unicode,fmt='%s')

If I change the 'c' in 'maca' to 'ç', I get an error:

import numpy as np

array_unicode=np.array([u'maça', u'banana',u'morango'])

with open('array_unicode.txt','wb') as f:
    np.savetxt(f,array_unicode,fmt='%s')

Traceback:

Traceback (most recent call last):
  File "<ipython-input-48-24ff7992bd4c>", line 8, in <module>
    np.savetxt(f,array_unicode,fmt='%s')
  File "C:\Anaconda2\lib\site-packages\numpy\lib\npyio.py", line 1158, in savetxt
    fh.write(asbytes(format % tuple(row) + newline))
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe7' in position 2: ordinal not in range(128)

How can I get numpy's savetxt to write unicode characters?

Recommended answer

In Python 3 (ipython-qt terminal) I can do:

In [12]: b=[u'maça', u'banana',u'morango']

In [13]: np.savetxt('test.txt',b,fmt='%s')

In [14]: cat test.txt
ma�a
banana
morango

In [15]: with open('test1.txt','w') as f:
    ...:     for l in b:
    ...:         f.write('%s\n'%l)
    ...:         

In [16]: cat test1.txt
maça
banana
morango

savetxt in both Py2 and Py3 insists on saving in 'wb' (byte) mode. The failing line in your traceback is the one that calls the asbytes function.
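
For readers on a newer NumPy than the one in the traceback: as far as I know, np.savetxt gained an encoding keyword in NumPy 1.14, which applies when a file name (rather than an already open file object) is passed. A minimal sketch, assuming Python 3 and NumPy 1.14 or later; the temporary path is only there for illustration:

```python
import os
import tempfile

import numpy as np

arr = np.array(['maça', 'banana', 'morango'])

# Hypothetical output location, used only for this sketch.
path = os.path.join(tempfile.mkdtemp(), 'array_unicode.txt')

# Pass a file name (not a file opened in 'wb' mode) so that the
# encoding argument is honoured. Assumes NumPy >= 1.14.
np.savetxt(path, arr, fmt='%s', encoding='utf-8')

# Read the file back as UTF-8 to check what was written.
with open(path, encoding='utf-8') as f:
    lines = f.read().splitlines()

print(lines)
```

On the older NumPy shown in the traceback this keyword is not available, so the manual approaches below still apply there.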

In my example b is a list, but that doesn't matter.

In [17]: c=np.array(['maça', 'banana','morango'])

In [18]: c
Out[18]: 
array(['maça', 'banana', 'morango'], 
      dtype='<U7')

This array writes the same. In Py3 the default string type is unicode, so the u prefix isn't needed, though it is allowed.

In Python 2 I get your error with a plain write:

>>> b=[u'maça', u'banana',u'morango']
>>> with open('test.txt','w') as f:
...    for l in b:
...        f.write('%s\n'%l)
... 
Traceback (most recent call last):
  File "<stdin>", line 3, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\xe7' in position 2: ordinal not in range(128)

Adding the encode call gives the expected output:

>>> b=[u'maça', u'banana',u'morango']
>>> with open('test.txt','w') as f:
...    for l in b:
...        f.write('%s\n'%l.encode('utf-8'))
0729:~/mypy$ cat test.txt
maça
banana
morango

encode is a string method, so it has to be applied to the individual elements of an array (or list).
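
An alternative to encoding each element by hand is to let the file object do the encoding. The sketch below uses io.open, which exists in both Py2 and Py3 and accepts an encoding argument. The helper name write_lines_utf8 and the temporary path are made up for illustration:

```python
# -*- coding: utf-8 -*-
# (The coding line above only matters on Py2, for the non-ASCII literal.)
import io
import os
import tempfile


def write_lines_utf8(path, items):
    """Write one item per line; the file object encodes to UTF-8."""
    with io.open(path, 'w', encoding='utf-8') as f:
        for item in items:
            # io.open in text mode expects unicode text, hence the u'' format.
            f.write(u'%s\n' % item)


b = [u'maça', u'banana', u'morango']

# Hypothetical output location, used only for this sketch.
path = os.path.join(tempfile.mkdtemp(), 'test.txt')
write_lines_utf8(path, b)

# Read the file back as UTF-8 to check the result.
with io.open(path, encoding='utf-8') as f:
    read_back = f.read().splitlines()

print(read_back)
```

The same loop should work on the elements of a numpy unicode array, since they format like ordinary strings.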

Back on the Py3 side, if I use encode I get:

In [26]: c1=np.array([l.encode('utf-8') for l in b])

In [27]: c1
Out[27]: 
array([b'ma\xc3\xa7a', b'banana', b'morango'], 
      dtype='|S7')

In [28]: np.savetxt('test.txt',c1,fmt='%s')

In [29]: cat test.txt
b'ma\xc3\xa7a'
b'banana'
b'morango'

But with a bytes format string, the plain write works:

In [33]: with open('test1.txt','wb') as f:
    ...:     for l in c1:
    ...:         f.write(b'%s\n'%l)
    ...:         

In [34]: cat test1.txt
maça
banana
morango
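
The difference between the savetxt output and the plain write comes down to how Py3 formats bytes. A small sketch in plain Python (no numpy needed); the b'%s' form assumes Python 3.5 or later:

```python
raw = 'maça'.encode('utf-8')

# Formatting bytes into a str template inserts the repr-like text,
# including the b'' wrapper and the escaped bytes.
as_text = '%s' % raw

# Formatting bytes into a bytes template keeps the raw bytes.
as_bytes = b'%s\n' % raw

print(as_text)
print(as_bytes)
```

This is why a str format such as fmt='%s' produces lines like b'banana' for a bytes array, while a bytes template written to a file opened in 'wb' mode does not.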

Such are the joys of mixing unicode and the two Python generations.

In case it helps, here's the code for the np.lib.npyio.asbytes function that np.savetxt uses (along with the wb file mode):

def asbytes(s):    # py3?
    if isinstance(s, bytes):
        return s
    return str(s).encode('latin1')

(Note that the encoding is fixed as 'latin1'.)
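
That fixed latin1 encoding would explain the � in the savetxt output earlier: ç becomes the single byte 0xe7, which is not valid UTF-8 on its own, so a UTF-8 terminal cannot display it. A short sketch of that reasoning in plain Py3 (it does not call numpy's asbytes, only the same str.encode step):

```python
word = 'maça'

latin1_bytes = word.encode('latin1')   # one byte for ç
utf8_bytes = word.encode('utf-8')      # two bytes for ç

# What a UTF-8 reader makes of the latin1 bytes: the lone 0xe7 byte
# is invalid, so it is shown as the replacement character U+FFFD.
shown = latin1_bytes.decode('utf-8', errors='replace')

# Characters outside latin1 cannot be encoded this way at all.
try:
    '€'.encode('latin1')
    euro_fits_latin1 = True
except UnicodeEncodeError:
    euro_fits_latin1 = False

print(latin1_bytes, utf8_bytes, shown, euro_fits_latin1)
```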

The np.char library applies a variety of string methods to the elements of a numpy array, so the np.array([x.encode...]) can be expressed as:

In [50]: np.char.encode(b,'utf-8')
Out[50]: 
array([b'ma\xc3\xa7a', b'banana', b'morango'], 
      dtype='|S7')

This can be convenient, though past testing indicates that it is not a time saver. It still has to apply the Python method to each element.
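
For completeness, a sketch of the np.char round trip on an array, assuming Py3; np.char.decode is the matching inverse of np.char.encode:

```python
import numpy as np

c = np.array(['maça', 'banana', 'morango'])

# Element-wise str.encode: the result is a bytes ('S') array.
encoded = np.char.encode(c, 'utf-8')

# Element-wise bytes.decode: back to a unicode ('U') array.
decoded = np.char.decode(encoded, 'utf-8')

print(encoded.dtype, decoded.dtype)
print(encoded.tolist())
```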
