How to select the rows that contain a specific value in at least one of the elements in a row?
Question
I have a DataFrame DF and a list, say List1. List1 is created from DF and contains the elements present in DF, without repetitions. I need to do the following:
1. Select the rows of DF that contain a specific element from List1 (for instance, while iterating over all the elements in List1).
2. Re-index the selected rows from 0 upward, because they may have non-contiguous indices.
Sample input:
List1=['Apple','Orange','Banana','Pineapple','Pear','Tomato','Potato']
Sample DF
EQ1 EQ2 EQ3
0 Apple Orange NaN
1 Banana Potato NaN
2 Pear Tomato Pineapple
3 Apple Tomato Pear
4 Tomato Potato Banana
Now, if I want the rows that contain Apple, those would be rows 0 and 3, but I'd like them renumbered as 0 and 1 (re-indexing). After Apple is searched, the next element from List1 should be taken and the same steps carried out. I have other operations to perform after this, so I need to loop the whole process over List1. I hope I have explained it well. Here is my code for this, which is not working:
for eq in List1:
    MCS = DF.loc[MCS_Simp_green[:] == eq]
    MCS = MCS.reset_index(drop=True)
    <Remaining operations>
Recommended answer
I think you need isin with any:
List1=['Apple','Orange','Banana','Pineapple','Pear','Tomato','Potato']
for eq in List1:
    #print(df.isin([eq]).any(axis=1))
    #print(df[df.isin([eq]).any(axis=1)])
    df1 = df[df.isin([eq]).any(axis=1)].reset_index(drop=True)
    print(df1)
EQ1 EQ2 EQ3
0 Apple Orange NaN
1 Apple Tomato Pear
EQ1 EQ2 EQ3
0 Apple Orange NaN
EQ1 EQ2 EQ3
0 Banana Potato NaN
1 Tomato Potato Banana
EQ1 EQ2 EQ3
0 Pear Tomato Pineapple
EQ1 EQ2 EQ3
0 Pear Tomato Pineapple
1 Apple Tomato Pear
EQ1 EQ2 EQ3
0 Pear Tomato Pineapple
1 Apple Tomato Pear
2 Tomato Potato Banana
EQ1 EQ2 EQ3
0 Banana Potato NaN
1 Tomato Potato Banana
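To see why this works, it may help to look at the intermediate masks for a single value. The following is only a minimal sketch: the DataFrame constructor call is my reconstruction of the sample data from the question, and the names cell_mask, row_mask and apple_rows are made up for illustration. The axis is passed by keyword because, as far as I know, newer pandas releases no longer accept the positional form any(1).

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt from the question (this constructor call is an assumption).
df = pd.DataFrame({
    "EQ1": ["Apple", "Banana", "Pear", "Apple", "Tomato"],
    "EQ2": ["Orange", "Potato", "Tomato", "Tomato", "Potato"],
    "EQ3": [np.nan, np.nan, "Pineapple", "Pear", "Banana"],
})

# Step 1: boolean DataFrame with the same shape as df, True where a cell equals 'Apple'.
cell_mask = df.isin(["Apple"])

# Step 2: collapse each row to a single boolean, True if any cell in that row matched.
row_mask = cell_mask.any(axis=1)

# Step 3: keep the matching rows and renumber them from 0.
apple_rows = df[row_mask].reset_index(drop=True)

print(row_mask.tolist())
print(apple_rows)
```

Here row_mask is True only for the original rows 0 and 3, and reset_index(drop=True) discards those old labels instead of keeping them as an extra column.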
To store the results, you can use a dict comprehension:
dfs = {eq: df[df.isin([eq]).any(axis=1)].reset_index(drop=True) for eq in List1}
print(dfs['Apple'])
EQ1 EQ2 EQ3
0 Apple Orange NaN
1 Apple Tomato Pear
print(dfs['Orange'])
EQ1 EQ2 EQ3
0 Apple Orange NaN
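Once the sub-frames are stored in the dict, the remaining operations can run in a second loop over its items. Below is a rough end-to-end sketch under the same assumptions as before: the DataFrame is rebuilt by hand from the sample in the question, and row_counts is just a hypothetical stand-in for whatever processing comes next.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question (the constructor call is an assumption).
df = pd.DataFrame({
    "EQ1": ["Apple", "Banana", "Pear", "Apple", "Tomato"],
    "EQ2": ["Orange", "Potato", "Tomato", "Tomato", "Potato"],
    "EQ3": [np.nan, np.nan, "Pineapple", "Pear", "Banana"],
})
List1 = ['Apple', 'Orange', 'Banana', 'Pineapple', 'Pear', 'Tomato', 'Potato']

# One re-indexed sub-frame per element of List1.
dfs = {eq: df[df.isin([eq]).any(axis=1)].reset_index(drop=True) for eq in List1}

# Hypothetical placeholder for the "remaining operations": count rows per element.
row_counts = {eq: len(sub) for eq, sub in dfs.items()}
print(row_counts)
```

Keeping the frames in a dict means each one can be looked up by its element name later, rather than being overwritten on every pass of the loop.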