Filtering data with pandas

Views: 195 Posted: 2017/2/25 0:46:52 Tags: python csv filter pandas

Question

I'm a newbie to pandas and I'm trying to apply it to a script that I have already written. I have a csv file from which I extract the data, and I use the columns 'candidate', 'final track' and 'status' for my data frame.

My problem is that I would like to filter the data, perhaps using the method shown in Wes McKinney's 10-minute tutorial ('http://nbviewer.ipython.org/urls/gist.github.com/wesm/4757075/raw/a72d3450ad4924d0e74fb57c9f62d1d895ea4574/PandasTour.ipynb'). In cell In [80] he uses aapl_bars.close_price['2009-10-15'].

I would like to use a similar method to select all the rows that have * as their status. If a row has no * in status, its values in the other columns should be dropped as well.

My code:

    import pandas as pd

    def establish_current_tacks(filename):
        df = pd.read_csv(filename)
        # Columns 0, 10 and 11 hold candidate, final track and status
        cols = [df.iloc[:, 0], df.iloc[:, 10], df.iloc[:, 11]]
        current_tracks = pd.concat(cols, axis=1)
        return current_tracks

My DataFrame:

    >>> current_tracks
    <class 'pandas.core.frame.DataFrame'>
    Int64Index: 707 entries, 0 to 706
    Data columns (total 3 columns):
    candidate      695  non-null values
    final track    670  non-null values
    status         670  non-null values
    dtypes: float64(1), object(2)

I would like to use current_tracks.status['*'], but it does not work.

Recommended answer

Since the data you want to filter on is not part of the data frame's index but is a regular column, you need to do something like this:

current_tracks[current_tracks.status == '*']

Full example:

    import pandas as pd
    current_tracks = pd.DataFrame({'candidate': ['Bob', 'Jim', 'Alice'],
                                   'final_track': [10, 15, 13],
                                   'status': ['*', '.', '*']})
    current_tracks
    Out[3]:
      candidate  final_track status
    0       Bob           10      *
    1       Jim           15      .
    2     Alice           13      *

    current_tracks[current_tracks.status == '*']
    Out[4]:
      candidate  final_track status
    0       Bob           10      *
    2     Alice           13      *
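To make the difference concrete, here is a minimal sketch of why current_tracks.status['*'] fails on a default integer index and how the boolean mask treats missing values. The extra 'Eve' row and the names lookup_failed, mask and selected are assumptions added for illustration; they are not part of the original question or answer.

```python
import numpy as np
import pandas as pd

# Hypothetical data modelled on the question; the last row has no status.
current_tracks = pd.DataFrame({
    'candidate': ['Bob', 'Jim', 'Alice', 'Eve'],
    'final_track': [10.0, 15.0, 13.0, np.nan],
    'status': ['*', '.', '*', np.nan],
})

# Indexing a Series with a string looks the label up in the index
# (0..3 here), so '*' is not found and pandas raises KeyError.
try:
    current_tracks.status['*']
    lookup_failed = False
except KeyError:
    lookup_failed = True

# The comparison builds a boolean Series. NaN == '*' evaluates to False,
# so rows with a missing status are dropped along with the '.' rows.
mask = current_tracks['status'] == '*'
selected = current_tracks[mask]
print(selected)
```

Using current_tracks['status'] instead of current_tracks.status behaves the same way here; the bracket form is simply safer for column names that contain spaces, such as 'final track' in the question.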

If status were part of your data frame's index, your original syntax would have worked:

    current_tracks = current_tracks.set_index('status')
    current_tracks.candidate['*']
    Out[8]:
    status
    *      Bob
    *    Alice
    Name: candidate, dtype: object
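The lookup above returns only the candidate column. As a rough sketch, assuming the same toy data, whole rows can be selected from a status-indexed frame with .loc; the variable names tracks, indexed, starred and restored are illustrative only.

```python
import pandas as pd

tracks = pd.DataFrame({
    'candidate': ['Bob', 'Jim', 'Alice'],
    'final_track': [10, 15, 13],
    'status': ['*', '.', '*'],
})

# Move status into the index so that label-based lookups work on it.
indexed = tracks.set_index('status')

# Passing a list of labels to .loc returns a DataFrame containing
# every row whose index label is '*', with all remaining columns.
starred = indexed.loc[['*']]

# reset_index turns status back into an ordinary column if needed.
restored = starred.reset_index()
print(restored)
```

Whether to index by status or keep it as a regular column and use a boolean mask is largely a matter of how often the lookup is repeated; for a one-off filter the mask is usually the simpler choice.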
