Is there a rule-based matching method in spaCy to match patterns?
Problem description
text1= "it_PRON is_AUX a_DET beautiful_ADJ apple_NOUN"
text2= "it_PRON is_AUX a_DET beautiful_ADJ and_CCONJ big_ADJ apple_NOUN"
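So, if we have an ADJ followed by a NOUN, or an ADJ followed by (PUNCT or CCONJ) followed by an ADJ followed by a NOUN, I would like to create a rule-based matcher that extracts it. The output should therefore contain: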
text1 = [beautiful_ADJ apple_NOUN]
text2= [beautiful_ADJ and_CCONJ big_ADJ apple_NOUN]
I tried the following, but I could not find the right pattern to do this:
import spacy
from spacy.matcher import Matcher

# Load the model before building the matcher, since the matcher needs nlp.vocab.
nlp = spacy.load("en_core_web_sm")
matchers = {"first_processing": Matcher(nlp.vocab, validate=True)}

pattern = [{}, {}, {}]  # placeholder: we must find the right pattern
matchers["first_processing"].add("process_1", None, pattern)

doc = nlp("it_PRON is_AUX a_DET beautiful_ADJ and_CCONJ big_ADJ apple_NOUN")
a = matchers["first_processing"](doc)
for match_id, start, end in a:
    text = doc[start:end].text
    print(text)
Recommended answer
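I understand you have texts = ["it is a beautiful apple", "it is a beautiful and big apple"] and plan to define several Matcher patterns to extract certain POS patterns from those texts.
You can define a list of lists with the patterns you need and pass them as the third and subsequent arguments to matcher.add: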
import spacy
from spacy.matcher import Matcher

nlp = spacy.load("en_core_web_sm")
matcher = Matcher(nlp.vocab, validate=True)

# One pattern per allowed sequence of part-of-speech tags.
patterns = [
    [{'POS': 'ADJ'}, {'POS': 'NOUN'}],
    [{'POS': 'ADJ'}, {'POS': 'CCONJ'}, {'POS': 'ADJ'}, {'POS': 'NOUN'}],
    [{'POS': 'ADJ'}, {'POS': 'PUNCT'}, {'POS': 'ADJ'}, {'POS': 'NOUN'}]
]
# spaCy v2 signature; in spaCy v3 use matcher.add("process_1", patterns) instead.
matcher.add("process_1", None, *patterns)

texts = ["it is a beautiful apple", "it is a beautiful and big apple"]
for text in texts:
    doc = nlp(text)
    matches = matcher(doc)
    for _, start, end in matches:
        print(doc[start:end].text)
# => beautiful apple
#    beautiful and big apple
#    big apple
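Note that the output also lists "big apple", because the shorter ADJ NOUN pattern matches inside the longer span, while the question asks only for the full phrase. For Span objects, spaCy provides spacy.util.filter_spans, which prefers the longest span when spans overlap. The following is a minimal pure-Python sketch of the same idea applied to (start, end) token offsets; the helper name keep_longest and the hard-coded offsets are assumptions made for illustration, not part of spaCy.

```python
def keep_longest(matches):
    """Greedy filter: prefer longer spans, then earlier ones; drop overlaps."""
    # Sort by length (descending), then by start offset (ascending).
    ordered = sorted(matches, key=lambda m: (-(m[1] - m[0]), m[0]))
    taken = set()  # token indices already covered by a kept span
    kept = []
    for start, end in ordered:
        if not any(i in taken for i in range(start, end)):
            kept.append((start, end))
            taken.update(range(start, end))
    return sorted(kept)


# Hypothetical offsets, as the matcher might report them for the second text.
tokens = ["it", "is", "a", "beautiful", "and", "big", "apple"]
matches = [(3, 7), (5, 7)]  # "beautiful and big apple", "big apple"

for start, end in keep_longest(matches):
    print(" ".join(tokens[start:end]))
# => beautiful and big apple
```

With real matcher output, the (start, end) pairs would come from the (match_id, start, end) tuples returned by matcher(doc).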
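If the input really is a pre-tagged string in the word_TAG format shown in the question, the same rule can also be expressed without spaCy. The sketch below swaps in a plain regular expression over the tagged string instead of the Matcher; it assumes tokens are separated by single spaces and that each token ends in an underscore followed by its tag. The function name extract is hypothetical.

```python
import re

# One tagged token looks like "word_TAG"; \S+ keeps the sketch permissive.
ADJ = r"\S+_ADJ"
LINK = r"\S+_(?:CCONJ|PUNCT)"
NOUN = r"\S+_NOUN"

# ADJ, optionally followed by (CCONJ or PUNCT) and another ADJ, then NOUN.
# The lookarounds anchor the match at token boundaries.
PATTERN = re.compile(rf"(?<!\S){ADJ}(?: {LINK} {ADJ})? {NOUN}(?!\S)")


def extract(tagged):
    """Return every substring of the tagged text that fits the rule."""
    return PATTERN.findall(tagged)


text1 = "it_PRON is_AUX a_DET beautiful_ADJ apple_NOUN"
text2 = "it_PRON is_AUX a_DET beautiful_ADJ and_CCONJ big_ADJ apple_NOUN"

print(extract(text1))  # => ['beautiful_ADJ apple_NOUN']
print(extract(text2))  # => ['beautiful_ADJ and_CCONJ big_ADJ apple_NOUN']
```

Because re.findall returns non-overlapping matches scanned left to right, the shorter "big_ADJ apple_NOUN" is not reported separately here.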