Selecting rows - based on a list - from a DF with duplicated columns

Views: 66 Published: 2020/5/24 4:26:51 Tags: python pandas

Question

I have the following data frame:

import pandas as pd
# DataFrame.from_items was removed in pandas 1.0; rows plus an explicit column list build the same frame.
rep = pd.DataFrame([['x', 'foo', 'x', 1.00], ['y', 'bar', 'y', 2.33], ['z', 'qux', 'z', 4.50]],
                   columns=['Probe', 'Gene', 'Probe', 'RP'])

Which produces:

In [11]: rep
Out[11]:
  Probe Gene Probe    RP
0     x  foo     x  1.00
1     y  bar     y  2.33
2     z  qux     z  4.50

Note that there is a duplicated column there. What I want to do is select rows based on a list:

ls = ["x", "z", "i"]

Yielding this:

  Probe Gene Probe    RP
0     x  foo     x  1.00
2     z  qux     z  4.50

Note that we'd like to preserve the columns of the original DF above.

Why does this fail?

In [9]: rep[rep[[0]].isin(ls)]
ValueError: cannot reindex from a duplicate axis

What's the right way to do it? Is there an alternative to isin?

Recommended answer

You should use iloc here:

In [11]: rep.iloc[rep.iloc[:, 0].isin(ls).values]
Out[11]:
  Probe Gene Probe   RP
0     x  foo     x  1.0
2     z  qux     z  4.5

This first creates the boolean vector (as a one-dimensional array rather than a DataFrame), which you can then use as a mask:

In [12]: rep.iloc[:, 0].isin(ls).values
Out[12]: array([ True, False,  True])
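
For anyone who wants to run the whole thing end to end, here is a minimal sketch of the mask approach. It assumes a recent pandas, so the frame is built from rows rather than with from_items, and it selects the first column by position with iloc[:, 0]. The names first_col, mask and selected are only illustrative and do not appear in the original post.

```python
import pandas as pd

# Same data as in the question, built from rows with an explicit column list
# so that the duplicated "Probe" label is preserved.
rep = pd.DataFrame(
    [["x", "foo", "x", 1.00],
     ["y", "bar", "y", 2.33],
     ["z", "qux", "z", 4.50]],
    columns=["Probe", "Gene", "Probe", "RP"],
)
ls = ["x", "z", "i"]

# Take the first column by position, so the ambiguous "Probe" label is never looked up.
first_col = rep.iloc[:, 0]

# One boolean per row, converted to a plain NumPy array.
mask = first_col.isin(ls).to_numpy()

# A boolean array passed to iloc filters rows and leaves all columns untouched.
selected = rep.iloc[mask]
print(selected)
```

Because the mask has one entry per row, it must be built from a column (iloc[:, 0]) and not from a row (iloc[0]), which would give one entry per column.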
