Python: Split NumPy array based on values in the array


Question

I have one big array:

[(1.0, 3.0, 1, 427338.4297000002, 4848489.4332)
 (1.0, 3.0, 2, 427344.7937000003, 4848482.0692)
 (1.0, 3.0, 3, 427346.4297000002, 4848472.7469) ...,
 (1.0, 1.0, 7084, 427345.2709999997, 4848796.592)
 (1.0, 1.0, 7085, 427352.9277999997, 4848790.9351)
 (1.0, 1.0, 7086, 427359.16060000006, 4848787.4332)]

I want to split this array into multiple arrays based on the 2nd value in each row (3.0, 3.0, 3.0 ... 1.0, 1.0, 1.0).

Every time the 2nd value changes, I want a new array, so each new array contains rows that share the same 2nd value. I've looked this up on Stack Overflow and know of the command:

np.split(array, number)

but I'm not trying to split the array into a fixed number of arrays; I want to split it by value. How can I split the array in the way described above? Any help would be appreciated!

Recommended answer

You can find the indices where the values change by using numpy.where and numpy.diff on the second column (index 1), and then pass those indices to numpy.split:

>>> import numpy as np
>>> arr = np.array([(1.0, 3.0, 1, 427338.4297000002, 4848489.4332),
...                 (1.0, 3.0, 2, 427344.7937000003, 4848482.0692),
...                 (1.0, 3.0, 3, 427346.4297000002, 4848472.7469),
...                 (1.0, 1.0, 7084, 427345.2709999997, 4848796.592),
...                 (1.0, 1.0, 7085, 427352.9277999997, 4848790.9351),
...                 (1.0, 1.0, 7086, 427359.16060000006, 4848787.4332)])
>>> np.split(arr, np.where(np.diff(arr[:,1]))[0]+1)
[array([[  1.00000000e+00,   3.00000000e+00,   1.00000000e+00,
          4.27338430e+05,   4.84848943e+06],
       [  1.00000000e+00,   3.00000000e+00,   2.00000000e+00,
          4.27344794e+05,   4.84848207e+06],
       [  1.00000000e+00,   3.00000000e+00,   3.00000000e+00,
          4.27346430e+05,   4.84847275e+06]]),
 array([[  1.00000000e+00,   1.00000000e+00,   7.08400000e+03,
          4.27345271e+05,   4.84879659e+06],
       [  1.00000000e+00,   1.00000000e+00,   7.08500000e+03,
          4.27352928e+05,   4.84879094e+06],
       [  1.00000000e+00,   1.00000000e+00,   7.08600000e+03,
          4.27359161e+05,   4.84878743e+06]])]
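
The call above works because numpy.split accepts either a section count or a sequence of row indices at which to cut. A minimal sketch of the difference, using a small made-up array (`demo` is purely illustrative and not from the question):

```python
import numpy as np

# Hypothetical 6x2 array, used only to illustrate the two calling styles.
demo = np.arange(12).reshape(6, 2)

# Integer argument: split into that many equal-sized parts along axis 0.
equal_parts = np.split(demo, 3)

# Sequence argument: cut *before* each listed row index,
# giving rows 0:1, 1:4 and 4:6.
index_parts = np.split(demo, [1, 4])

print([p.shape for p in equal_parts])
print([p.shape for p in index_parts])
```

The second form is the one the answer uses: the indices are computed from the data instead of being written by hand.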

Explanation:

First, we fetch the items in the second column:

>>> arr[:,1]
array([ 3.,  3.,  3.,  1.,  1.,  1.])

Now, to find out where the items actually change, we can use numpy.diff:

>>> np.diff(arr[:,1])
array([ 0.,  0., -2.,  0.,  0.])

Any non-zero entry means that the next item differs from the current one. We can use numpy.where to find the indices of the non-zero entries and then add 1 to each, because the row that starts a new group sits one position after the returned index:

>>> np.where(np.diff(arr[:,1]))[0]+1
array([3])
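
The three steps can be wrapped into one helper. The following is only a sketch under a few assumptions: the function name `split_on_change` and the sample `data` are invented for illustration, and the input is assumed to be a regular 2-D array. It compares neighbouring values with `!=` instead of numpy.diff, which gives the same boundaries for numeric columns and may also suit columns where subtraction is not meaningful.

```python
import numpy as np

def split_on_change(arr, col):
    """Split a 2-D array into runs of consecutive rows sharing a value in `col`.

    Hypothetical helper, not part of NumPy.
    """
    keys = arr[:, col]
    # True wherever a row's key differs from the previous row's key.
    changed = keys[1:] != keys[:-1]
    # +1 turns a position in the comparison into the index of the row
    # that starts a new run.
    boundaries = np.flatnonzero(changed) + 1
    return np.split(arr, boundaries)

# Made-up data: the key column (index 1) reads 3, 3, 1, 2, 2, 3.
data = np.array([
    [1.0, 3.0, 1],
    [1.0, 3.0, 2],
    [1.0, 1.0, 3],
    [1.0, 2.0, 4],
    [1.0, 2.0, 5],
    [1.0, 3.0, 6],
])

groups = split_on_change(data, 1)
for g in groups:
    print(g[0, 1], len(g))
```

Note that this splits on consecutive runs: the value 3.0 appears in two separate groups here because it recurs after other values. If all rows with the same value should end up together, the array would need to be sorted on that column first.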
