Set diagonal triangle in pandas DataFrame to NaN
Question
Given the following DataFrame:
import pandas as pd
import numpy as np
a = np.arange(16).reshape(4, 4)
df = pd.DataFrame(data=a, columns=['a','b','c','d'])
I'd like to produce the following result:
df([[ NaN, 1, 2, 3],
[ NaN, NaN, 6, 7],
[ NaN, NaN, NaN, 11],
[ NaN, NaN, NaN, NaN]])
So far I've tried using np.tril_indices, but it only works once the df has been turned back into a numpy array, and it only works for integer assignments (not np.nan):
il1 = np.tril_indices(4)
a[il1] = 0
gives:
array([[ 0, 1, 2, 3],
[ 0, 0, 6, 7],
[ 0, 0, 0, 11],
[ 0, 0, 0, 0]])
...which is almost what I'm looking for, but it fails when assigning NaN:
ValueError: cannot convert float NaN to integer
Meanwhile:
df[il1] = 0
gives:
TypeError: unhashable type: 'numpy.ndarray'
So if I want to fill the bottom triangle of a DataFrame with NaN: 1) does it have to be a numpy array, or can I do this with pandas directly? And 2) is there a way to fill the bottom triangle with NaN other than using numpy.fill_diagonal and incrementing the offset row by row down the whole DataFrame?
Another failed solution: filling the diagonal of the numpy array with zeros, then masking on zero and reassigning to np.nan. This also converts the zero values above the diagonal to NaN, when they should be preserved as zero!
Recommended answer
You need to cast the array a to float, because the type of NaN is float:
import numpy as np
import pandas as pd
a = np.arange(16).reshape(4, 4).astype(float)
print(a)
[[ 0. 1. 2. 3.]
[ 4. 5. 6. 7.]
[ 8. 9. 10. 11.]
[ 12. 13. 14. 15.]]
il1 = np.tril_indices(4)
a[il1] = np.nan
print(a)
[[ nan 1. 2. 3.]
[ nan nan 6. 7.]
[ nan nan nan 11.]
[ nan nan nan nan]]
df = pd.DataFrame(data=a, columns=['a','b','c','d'])
print(df)
a b c d
0 NaN 1.0 2.0 3.0
1 NaN NaN 6.0 7.0
2 NaN NaN NaN 11.0
3 NaN NaN NaN NaN
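On the first part of the question (whether this can be done on the DataFrame directly), one possible approach is to build a boolean mask with numpy and pass it to DataFrame.where. This is only a sketch and not part of the original answer; the names keep and result are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(16).reshape(4, 4), columns=['a', 'b', 'c', 'd'])

# Boolean mask that is True strictly above the main diagonal (k=1),
# i.e. for the cells that should keep their values.
keep = np.triu(np.ones(df.shape, dtype=bool), k=1)

# where() keeps values where the mask is True and puts NaN elsewhere.
# The integer columns are upcast to float so that they can hold NaN.
result = df.where(keep)
print(result)
```

Because where returns a new DataFrame, df itself is left unchanged, and no explicit astype(float) call is needed beforehand.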