Pandas: find column whose name contains a specific string

Problem description

So, I have a dataframe with column names, and I want to find the one that contains a certain string, but does not exactly match it. I'm searching for 'spike' in column names like 'spike-2', 'hey spike', 'spiked-in' (the 'spike' part is always continuous).

I want the column name to be returned as a string or a variable, so I can access the column later with df['name'] or df[name] as normal. I've tried to find ways to do this, to no avail. Any tips?

Solution

Just iterate over DataFrame.columns. Here is an example that ends up with a list of the column names that match:

import pandas as pd

data = {'spike-2': [1,2,3], 'hey spke': [4,5,6], 'spiked-in': [7,8,9], 'no': [10,11,12]}
df = pd.DataFrame(data)

spike_cols = [col for col in df.columns if 'spike' in col]
print(list(df.columns))
print(spike_cols)

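Output (column order follows the dict's insertion order on Python 3.7+ with a recent pandas; older versions sorted the keys alphabetically):

['spike-2', 'hey spke', 'spiked-in', 'no']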
['spike-2', 'spiked-in']

Explanation:

  1. df.columns returns the column labels as a pandas Index, which you can iterate over like a list.
  2. [col for col in df.columns if 'spike' in col] iterates over df.columns with the variable col and adds col to the resulting list if it contains 'spike'. This syntax is called a list comprehension.
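Since the question asks for the name as a string that can be used with df[name], here is a minimal sketch of that step, reusing the example data above. The variable names first_spike and mask are only illustrative, and the vectorized df.columns.str.contains variant is an alternative to the list comprehension, not part of the original answer.

```python
import pandas as pd

# Same example data as in the answer above.
data = {'spike-2': [1, 2, 3], 'hey spke': [4, 5, 6],
        'spiked-in': [7, 8, 9], 'no': [10, 11, 12]}
df = pd.DataFrame(data)

# All matching names, as plain Python strings.
spike_cols = [col for col in df.columns if 'spike' in col]

# First matching name, or None if no column matches
# (first_spike is a hypothetical variable name).
first_spike = next((col for col in df.columns if 'spike' in col), None)

if first_spike is not None:
    # The name is an ordinary string, so normal indexing works.
    print(first_spike)
    print(df[first_spike].tolist())

# Vectorized alternative: build a boolean mask over the column Index.
# regex=False treats 'spike' as a literal substring, not a pattern.
mask = df.columns.str.contains('spike', regex=False)
print(df.columns[mask].tolist())
```

Using next(...) with a default avoids an IndexError when nothing matches, which spike_cols[0] would raise on an empty list.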

If you only want the resulting data set with the columns that match, you can do this:

df2 = df.filter(regex='spike')
print(df2)

Output:

   spike-2  spiked-in
0        1          7
1        2          8
2        3          9
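One thing to keep in mind is that regex= interprets the string as a regular expression, so characters such as '.' have a special meaning. For a plain substring match, DataFrame.filter also accepts like=. The sketch below uses hypothetical column names chosen only to show the difference.

```python
import pandas as pd

# Hypothetical columns; 'a.b' contains the regex metacharacter '.'.
df = pd.DataFrame({'a.b': [1, 2], 'axb': [3, 4], 'other': [5, 6]})

# regex= searches each label with a regular expression,
# so '.' matches any single character and 'axb' is kept too.
by_regex = df.filter(regex='a.b')
print(list(by_regex.columns))

# like= keeps only labels that contain the literal substring 'a.b'.
by_like = df.filter(like='a.b')
print(list(by_like.columns))
```

For a search string like 'spike', which has no special characters, both forms select the same columns.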
