Pandas str.contains - Search for multiple values in a string and print the values in a new column

Views: 594 Published: 2020/5/24 1:53:46 Tags: python string pandas


Problem description

I just started coding in Python and want to build a solution that searches a string to see whether it contains any of a given set of values.

I found a similar solution in R that uses the stringr library.

The following code seems to work, but I want to output whichever of the three values I am looking for, and this solution only ever outputs one value:

# Insert the new column
df.insert(5, "New_Column", np.nan)

# Search the existing text column
df['New_Column'] = np.where(df['Column_with_text'].str.contains('value1|value2|value3', case=False, na=False), 'value', 'NaN')

------ Edit ------

I realised my explanation was not very good, sorry about that.

Below is an example where I match fruit names in a string. Depending on whether any match is found, it prints True or False in a new column. Here is my question: instead of True or False, I want the new column to hold the name that was found in the string, e.g. apples, oranges, etc.

import pandas as pd
import numpy as np

text = [('I want to buy some apples.', 0),
         ('Oranges are good for the health.', 0),
         ('John is eating some grapes.', 0),
         ('This line does not contain any fruit names.', 0),
         ('I bought 2 blueberries yesterday.', 0)]
labels = ['Text','Random Column']

df = pd.DataFrame.from_records(text, columns=labels)

df.insert(2, "MatchedValues", np.nan)

foods = ['apples', 'oranges', 'grapes', 'blueberries']

pattern = '|'.join(foods)

df['MatchedValues'] = df['Text'].str.contains(pattern, case=False)

print(df)

Result

                                          Text  Random Column  MatchedValues
0                   I want to buy some apples.              0           True
1             Oranges are good for the health.              0           True
2                  John is eating some grapes.              0           True
3  This line does not contain any fruit names.              0          False
4            I bought 2 blueberries yesterday.              0           True

Wanted result

                                          Text  Random Column MatchedValues
0                   I want to buy some apples.              0        apples
1             Oranges are good for the health.              0       oranges
2                  John is eating some grapes.              0        grapes
3  This line does not contain any fruit names.              0           NaN
4            I bought 2 blueberries yesterday.              0   blueberries

Solution

Here is one way:

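foods = ['apples', 'oranges', 'grapes', 'blueberries']

def matcher(x):
    # Return the first entry of `foods` found in x (case-insensitive).
    for i in foods:
        if i.lower() in x.lower():
            return i
    # No entry matched, so mark the row as missing.
    return np.nan

df['Match'] = df['Text'].apply(matcher)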

#                                           Text        Match
# 0                   I want to buy some apples.       apples
# 1             Oranges are good for the health.      oranges
# 2                  John is eating some grapes.       grapes
# 3  This line does not contain any fruit names.          NaN
# 4            I bought 2 blueberries yesterday.  blueberries
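
As a possible alternative to the `apply` loop above, pandas' own string methods can return the matched text directly. This is only a sketch, not part of the original answer: the column names `FirstMatch` and `AllMatches` and the extra sixth row are made up for illustration. `str.extract` keeps the first match by position in the text (whereas `matcher` returns the first hit in `foods` order), and `str.findall` collects every match in a row.

```python
import re

import pandas as pd

df = pd.DataFrame({
    "Text": [
        "I want to buy some apples.",
        "Oranges are good for the health.",
        "John is eating some grapes.",
        "This line does not contain any fruit names.",
        "I bought 2 blueberries yesterday.",
        "Apples and grapes go well together.",  # hypothetical extra row
    ]
})

foods = ["apples", "oranges", "grapes", "blueberries"]

# Escape each term so regex metacharacters are matched literally,
# then wrap the alternation in one capture group for str.extract.
pattern = "(" + "|".join(re.escape(food) for food in foods) + ")"

# First match per row, lowercased to mirror the spelling in `foods`;
# rows without a match stay NaN.
df["FirstMatch"] = (
    df["Text"].str.extract(pattern, flags=re.IGNORECASE, expand=False).str.lower()
)

# Every match per row, joined into one comma-separated string;
# rows without a match end up as an empty string.
all_matches = df["Text"].str.findall(pattern, flags=re.IGNORECASE)
df["AllMatches"] = all_matches.apply(lambda found: ", ".join(m.lower() for m in found))

print(df[["FirstMatch", "AllMatches"]])
```

If the text column can contain missing values, the `findall` result would also contain NaN for those rows, so the `apply` step would need a guard (or a preceding `fillna("")`).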
