Prevent strings being truncated when replacing values in a numpy array

Question

Let's say I have the arrays a and b:

a = np.array([1,2,3])
b = np.array(['red','red','red'])

If I apply fancy indexing like this to these arrays:

b[a<3]="blue"

the output I get is:

array(['blu', 'blu', 'red'], dtype='<U3')

I understand that the issue is that numpy initially allocates space for only 3 characters per element, so it can't fit the whole word "blue" into the array. What workaround can I use?
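The fixed width can be checked directly on the array. This is a minimal sketch for illustration, not part of the original question; it assumes a NumPy version where assignment into a fixed-width Unicode array silently truncates, which matches the behavior shown above.

```python
import numpy as np

b = np.array(['red', 'red', 'red'])
print(b.dtype)   # <U3 on little-endian machines: 3 Unicode characters per element

b[0] = "blue"    # the value is cast to the array's fixed width on assignment
print(b[0])      # blu
```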

What I'm doing now is:

b = np.array([" "*100 for i in range(3)])
b[a>2] = "red"
b[a<3] = "blue"

But that's just a workaround. Is this a fault in my code, or is it an issue with numpy? How can I fix it?

Recommended answer

You can handle variable-length strings by setting the dtype of b to "object":

import numpy as np
a = np.array([1,2,3])
b = np.array(['red','red','red'], dtype="object")

b[a<3] = "blue"

print(b)

This outputs:

['blue' 'blue' 'red']

The object dtype can hold strings of any length, as well as other general Python objects. It also means that under the hood you have a numpy array of pointers to Python objects, so don't expect the performance you get with a primitive datatype.
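If you would rather keep a fixed-width string dtype, one alternative (not from the original answer, offered here as a sketch) is to widen the array before assigning. The names `new_value` and `width` are illustrative, and the division by 4 assumes NumPy's 4-bytes-per-character storage for `U` dtypes.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array(['red', 'red', 'red'])

new_value = "blue"
# Current width in characters: NumPy stores each Unicode character in 4 bytes.
current_width = b.dtype.itemsize // 4
# Widen only if the new value would not fit.
width = max(current_width, len(new_value))
b = b.astype(f"U{width}")

b[a < 3] = new_value
print(b)  # ['blue' 'blue' 'red']
```

This keeps the data in a contiguous fixed-width buffer, at the cost of copying the array and having to know the longest string before assigning.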
