Python: Split NumPy array based on values in the array
I have one big array:
[(1.0, 3.0, 1, 427338.4297000002, 4848489.4332)
(1.0, 3.0, 2, 427344.7937000003, 4848482.0692)
(1.0, 3.0, 3, 427346.4297000002, 4848472.7469) ...,
(1.0, 1.0, 7084, 427345.2709999997, 4848796.592)
(1.0, 1.0, 7085, 427352.9277999997, 4848790.9351)
(1.0, 1.0, 7086, 427359.16060000006, 4848787.4332)]
I want to split this array into multiple arrays based on the 2nd value in each row (3.0, 3.0, 3.0, ..., 1.0, 1.0, 1.0).
Every time the 2nd value changes, I want a new array, so basically each new array has the same 2nd value. I've looked this up on Stack Overflow and know of the command
np.split(array, number)
but I'm not trying to split the array into a certain number of arrays, but rather by a value. How would I be able to split the array in the way specified above? Any help would be appreciated!
You can find the indices where the values differ by using numpy.where and numpy.diff on the second column (index 1):
>>> import numpy as np
>>> arr = np.array([(1.0, 3.0, 1, 427338.4297000002, 4848489.4332),
...                 (1.0, 3.0, 2, 427344.7937000003, 4848482.0692),
...                 (1.0, 3.0, 3, 427346.4297000002, 4848472.7469),
...                 (1.0, 1.0, 7084, 427345.2709999997, 4848796.592),
...                 (1.0, 1.0, 7085, 427352.9277999997, 4848790.9351),
...                 (1.0, 1.0, 7086, 427359.16060000006, 4848787.4332)])
>>> # Split before every row whose second-column value differs from the previous row's
>>> np.split(arr, np.where(np.diff(arr[:,1]))[0]+1)
[array([[ 1.00000000e+00, 3.00000000e+00, 1.00000000e+00,
4.27338430e+05, 4.84848943e+06],
[ 1.00000000e+00, 3.00000000e+00, 2.00000000e+00,
4.27344794e+05, 4.84848207e+06],
[ 1.00000000e+00, 3.00000000e+00, 3.00000000e+00,
4.27346430e+05, 4.84847275e+06]]),
array([[ 1.00000000e+00, 1.00000000e+00, 7.08400000e+03,
4.27345271e+05, 4.84879659e+06],
[ 1.00000000e+00, 1.00000000e+00, 7.08500000e+03,
4.27352928e+05, 4.84879094e+06],
[ 1.00000000e+00, 1.00000000e+00, 7.08600000e+03,
4.27359161e+05, 4.84878743e+06]])]
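This works because the second argument of np.split can be either an integer (that many equal sections, as in the question) or a sequence of indices at which to cut. A small sketch on a toy array shows the difference; the name `a` and the values are only for illustration:

```python
import numpy as np

a = np.arange(6)  # toy data: [0 1 2 3 4 5]

# An integer asks for that many equal-sized sections.
equal_parts = np.split(a, 3)

# A sequence of indices asks for cuts *before* those positions,
# so the pieces may have different lengths.
cut_parts = np.split(a, [2, 5])

print([p.tolist() for p in equal_parts])
print([p.tolist() for p in cut_parts])
```

The answer above uses the second form, with the cut positions computed from the data.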
Explanation:
First, we fetch the items in the second column:
>>> arr[:,1]
array([ 3., 3., 3., 1., 1., 1.])
Now, to find out where the items actually change, we can use numpy.diff:
>>> np.diff(arr[:,1])
array([ 0., 0., -2., 0., 0.])
Any non-zero entry means that the next item is different from the current one. We can use numpy.where to find the indices of the non-zero entries and then add 1, because the item that actually changed sits one position after the returned index:
>>> np.where(np.diff(arr[:,1]))[0]+1
array([3])
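Putting the steps together, a reusable helper might look like the sketch below. The function name `split_on_change`, its `col` parameter, and the toy data are hypothetical, and it assumes a regular 2-D numeric array rather than a structured array. It compares neighbouring values with `!=` instead of calling numpy.diff, which gives the same boundaries for numeric data.

```python
import numpy as np

def split_on_change(arr, col):
    """Split a 2-D array into blocks wherever the value in column `col` changes."""
    keys = arr[:, col]
    # Indices i where keys[i + 1] differs from keys[i]; adding 1 gives
    # the first row of each new block.
    boundaries = np.flatnonzero(keys[1:] != keys[:-1]) + 1
    return np.split(arr, boundaries)

# Toy data (hypothetical): the second column goes 3, 3, 1, 1, 3.
data = np.array([
    (1.0, 3.0, 1),
    (1.0, 3.0, 2),
    (1.0, 1.0, 3),
    (1.0, 1.0, 4),
    (1.0, 3.0, 5),
])

blocks = split_on_change(data, col=1)
for block in blocks:
    print(block[:, 1])
```

Note that a value that reappears later (3.0 in the last row here) starts a new block instead of being merged with the earlier one, which matches the "every time the 2nd value changes" requirement in the question.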