Column contains column 2
Problem description
I have a dataframe. On each row, I would like to test whether the number in column B appears in the string in column A, and store the result in a new column C.
import numpy as np
import pandas as pd
df = pd.DataFrame({'A': ["me 123", "me-123", "1234", "me 12", "123 and"],
                   'B': [123, 123, 123, 123, 6]})
I would like to get:
A B C
0 me 123 123 1
1 me-123 123 1
2 1234 123 0
3 me 12 123 0
4 123 and 6 0
Various approaches nearly manage this (1):
df['C'] = [str(y) in x for x, y in zip(df.A.str.split(' '), df.B)]
A B C
0 me 123 123 True
1 me-123 123 False
2 1234 123 False
3 me 12 123 False
4 123 and 6 False
or (2):
df['C'] = [str(y) in x for x, y in zip(df.A, df.B)]
A B C
0 me 123 123 True
1 me-123 123 True
2 1234 123 True
3 me 12 123 False
4 123 and 6 False
or (3):
df['C']=df.A.str.contains(r'\b(?:{})\b'.format('|'.join(df.B.astype(str)))).astype(int)
A B C
0 me 123 123 1
1 me-123 123 1
2 1234 123 0
3 me 12 123 0
4 123 and 6 1
or (4):
def fun(A, B):
    return str(B) in str(A)
f = np.vectorize(fun, otypes=[int])
df["C"] = f(df['A'], df['B'])
A B C
0 me 123 123 1
1 me-123 123 1
2 1234 123 1
3 me 12 123 0
4 123 and 6 0
or (5):
df['A1'] = df['A'].apply(word_tokenize)
This doesn't recognise - as a space. How can I get the result at the top, please?
Recommended answer
Using extract:
df.A.str.extract(r'(\d+)', expand=False).astype(int).eq(df.B, 0).astype(int)
Out[347]:
0
0 1
1 1
2 0
3 0
4 0
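Note that extract with (\d+) only picks up the first run of digits in each string, so it assumes every value in A contains exactly one number. As a minimal sketch of an alternative that does not rely on that assumption, the number in B can be searched for row by row, using lookarounds so that it is not matched inside a longer run of digits. The helper name contains_number is hypothetical and not part of the original answer, and the choice of digit lookarounds rather than \b is an assumption about what should count as a match.

```python
import re

import pandas as pd

df = pd.DataFrame({'A': ["me 123", "me-123", "1234", "me 12", "123 and"],
                   'B': [123, 123, 123, 123, 6]})


def contains_number(text, number):
    """Return 1 if `number` occurs in `text` as a standalone run of digits, else 0.

    Hypothetical helper for illustration only.
    """
    # (?<!\d) and (?!\d) reject matches that are part of a longer number,
    # so 123 is not found inside "1234".
    pattern = r'(?<!\d){}(?!\d)'.format(re.escape(str(number)))
    return int(re.search(pattern, text) is not None)


# Each row is checked against its own value of B only.
df['C'] = [contains_number(a, b) for a, b in zip(df['A'], df['B'])]
print(df)
```

On the sample data this should give 1, 1, 0, 0, 0 for column C, which is the result asked for. If a number glued to letters (for example "me123") should not count as a match, the lookarounds could be swapped for \b on both sides.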