Blocking seven digit numbers in string pandas
Problem description
Background
I have the following sample df
import pandas as pd
df = pd.DataFrame({'Text':['This person num is 111-888-8888 and other',
'dont block 23 here',
'two numbers: 001-002-1234 and some other 123-123-1234 here',
'block this 666-666-6666',
'1-510-999-9999 is one more'],
'P_ID': [1,2,3,4,5],
'N_ID' : ['A1', 'A2', 'A3','A4', 'A5']})
N_ID P_ID Text
0 A1 1 This person num is 111-888-8888 and other
1 A2 2 dont block 23 here
2 A3 3 two numbers: 001-002-1234 and some other 123-1...
3 A4 4 block this 666-666-6666
4 A5 5 1-510-999-9999 is one more
Goal
1) Block all seven digit numbers, e.g. 111-888-8888 becomes **Block**
2) Avoid blocking non-seven digit numbers, e.g. 23
3) Create a new column
Tried
I have tried the following
df['New_Text'] = df['Text'].str.replace(r'\d+','**Block**', regex=True)
but it blocks all numbers
Also tried
I have also tried replacing \d+ with many other patterns, e.g. /^\d{7}$/ (taken from "Regexp exactly seven digits"), ^[0-9]{7} (taken from "Regex to match "<seven digits> - <filename>" with only one set of seven digits"), and \b[0-9]{7}(?![0-9]) (taken from "REGEX To get seven numbers in a row?"), but none of them work.
Desired output
N_ID P_ID Text New_Text
0 This person num is **Block** and other
1 dont block 23 here
2 two numbers: **Block** and some other **Block**
3 block this **Block**
4 1-**Block** is one more
Question
How do I tweak my code to achieve my desired output?
Recommended answer
You can try this regular expression: ((?:[\d]-?){7,})
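To see what this pattern matches, here is a small sketch using Python's built-in re module, whose regex syntax is what pandas' str.replace uses in regex mode. The sample strings come from the question; the variable names are only illustrative. The pattern reads as "a digit optionally followed by a hyphen, repeated at least seven times", so it picks up any hyphen-separated run containing seven or more digits and ignores short numbers like 23.

```python
import re

# Pattern from the answer: (digit, optional hyphen) repeated 7 or more times.
pattern = re.compile(r'((?:[\d]-?){7,})')

# Sample strings taken from the question's DataFrame.
samples = [
    'This person num is 111-888-8888 and other',
    'dont block 23 here',
    '1-510-999-9999 is one more',
]

# Collect the full text of every match in each sample.
matches = [[m.group(0) for m in pattern.finditer(s)] for s in samples]

for s, found in zip(samples, matches):
    print(repr(s), '->', found)
```

Note that in the last string the match starts at the leading 1-, because that digit and hyphen also fit the repeated group.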
The final code looks like this
df['New_Text'] = df['Text'].str.replace(r'((?:[\d]-?){7,})','**Block**', regex=True)
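One caveat: on the last row the pattern above also consumes the leading 1-, so the result is "**Block** is one more" rather than the "1-**Block** is one more" shown in the desired output. If only the 3-3-4 digit shape should be blocked, a stricter pattern is one option. The following is a sketch under that assumption; the pattern (?<!\d)\d{3}-\d{3}-\d{4}(?!\d) is not part of the original answer, and regex=True is passed explicitly because recent pandas versions treat the pattern as a literal string by default.

```python
import pandas as pd

# Sample data from the question (only the Text column is needed here).
df = pd.DataFrame({'Text': [
    'This person num is 111-888-8888 and other',
    'dont block 23 here',
    'two numbers: 001-002-1234 and some other 123-123-1234 here',
    'block this 666-666-6666',
    '1-510-999-9999 is one more',
]})

# Assumed stricter pattern (not from the answer): exactly 3-3-4 digits,
# with lookarounds so the match is not part of a longer digit run.
strict = r'(?<!\d)\d{3}-\d{3}-\d{4}(?!\d)'

# regex=True makes pandas interpret the pattern as a regular expression.
df['New_Text'] = df['Text'].str.replace(strict, '**Block**', regex=True)

print(df['New_Text'].tolist())
```

With this variant the country-code prefix 1- is left in place, and the number 23 is still untouched.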