Remove rows and ValueError: Arrays were different lengths
Question
My dataframe has subcategories: under each category (cat, dog, bird), stats information is presented. I need to remove the rows whose stats value is count or freq, and keep only the rows with sd and mean values. Some values are NaN.
My code raises a ValueError.
df:
var stats A B C
cat mean 2 3 4
NaN sd 2 1 3
NaN count 5 2 6
NaN freq 3 1 19
dog mean 8 1 2
NaN sd 2 1 3
NaN count 4 6 1
NaN freq 3 1 19
bird mean 2 3 4
NaN sd 2 1 3
NaN count 5 2 6
NaN freq NaN NaN NaN
My code:
rows = ['count', 'freq']
df = [df.stats != rows]
Expected result:
var stats A B C
cat mean 2 3 4
NaN sd 2 1 3
dog mean 8 1 2
NaN sd 2 1 3
bird mean 2 3 4
NaN sd 2 1 3
Error:
File "pandas/_libs/lib.pyx", line 805, in pandas._libs.lib.vec_compare (pandas/_libs/lib.c:14288)
ValueError: Arrays were different lengths: 819 vs 9
I am not sure how to check the array length, but in my Excel spreadsheet all columns and rows have the same length. Is this error caused by NaN/empty cells in my data?
Thanks!
Recommended answer
!= will not work here: comparing a Series with a list attempts an element-wise comparison, which requires both to have the same length. Use pd.Series.isin to obtain a mask, then use it to filter your dataframe.
m = ~df.stats.isin(['count', 'freq'])
print(m)
0 True
1 True
2 False
3 False
4 True
5 True
6 False
7 False
8 True
9 True
10 False
11 False
Name: stats, dtype: bool
print(df[m])
var stats A B C
0 cat mean 2.0 3.0 4.0
1 NaN sd 2.0 1.0 3.0
4 dog mean 8.0 1.0 2.0
5 NaN sd 2.0 1.0 3.0
8 bird mean 2.0 3.0 4.0
9 NaN sd 2.0 1.0 3.0
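For anyone who wants to reproduce this end to end, here is a minimal, self-contained sketch. The dataframe is a hand-built stand-in for the first two categories of the question's data (the real file is not available), and the names mask, filtered and comparison_failed are only illustrative. It also prints the two lengths involved, since the question asks how to check them.

```python
import numpy as np
import pandas as pd

# Hand-built stand-in for the question's data (cat and dog rows only).
df = pd.DataFrame({
    "var":   ["cat", np.nan, np.nan, np.nan, "dog", np.nan, np.nan, np.nan],
    "stats": ["mean", "sd", "count", "freq", "mean", "sd", "count", "freq"],
    "A": [2, 2, 5, 3, 8, 2, 4, 3],
    "B": [3, 1, 2, 1, 1, 1, 6, 1],
    "C": [4, 3, 6, 19, 2, 3, 1, 19],
})

rows = ["count", "freq"]

# Checking the lengths: the column has one entry per row, the list has two.
print(len(df["stats"]), len(rows))

# != against a list is element-wise, so the differing lengths are rejected.
# The exact message depends on the pandas version.
try:
    df["stats"] != rows
    comparison_failed = False
except ValueError as exc:
    comparison_failed = True
    print("ValueError:", exc)

# isin tests membership per row instead; ~ inverts the mask so that
# only rows whose stats value is NOT in the list are kept.
mask = ~df["stats"].isin(rows)
filtered = df[mask]
print(filtered)
```

Note that the NaN cells are not involved in the error: the mismatch is between the length of the stats column and the length of the list. The original index labels are kept after filtering; filtered.reset_index(drop=True) renumbers them if that is preferred.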