Sentence structure analysis
Problem description
I am trying to compare the structural similarity of sentences, specifically the positions of verbs, adjectives, and nouns. For instance, I have three (or more) sentences that look as follows:
I ate an apple pie, yesterday.
I ate an orange, yesterday.
I eat a lemon, today.
All of them start with a pronoun (I), followed by a verb (ate/eat) and a noun (apple pie, orange, lemon), and finally an adverb (yesterday/today).
I would like to know if there is a way to identify this structure, i.e. PRONOUN VERB NOUN ADVERB, in each sentence.
If I think of it as a pandas DataFrame:
SENTENCE
I ate an apple pie, yesterday.
I ate an orange, yesterday.
I eat a lemon, today.
I need something like the following:
SENTENCE                          STRUCTURE
I ate an apple pie, yesterday.    PRONOUN VERB NOUN ADVERB
I ate an orange, yesterday.       PRONOUN VERB NOUN ADVERB
I eat a lemon, today.             PRONOUN VERB NOUN ADVERB
Do you know how I can get this (or a similar) result?
Recommended answer
Here is a simple example using spaCy:
import spacy
import pandas as pd

# Load the English language model; skip components not needed for POS tagging
nlp = spacy.load('en_core_web_sm', disable=['ner', 'textcat'])

text = "I ate an apple pie, yesterday."

# Run the spaCy pipeline on the text
doc = nlp(text)

# Join the coarse-grained part-of-speech tag of every token
pos = " ".join(token.pos_ for token in doc)

# Create the DataFrame with one row per sentence
df = pd.DataFrame([[text, pos]], columns=['Sentence', 'Structure'])
print(df)
The output is:
Sentence Structure
0 I ate an apple pie, yesterday. PRON VERB DET NOUN NOUN PUNCT NOUN PUNCT
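The tag string above is more detailed than the PRONOUN VERB NOUN ADVERB pattern asked for in the question, because it also lists determiners and punctuation. One possible way to get closer is to post-process the tags, as in this minimal sketch. The helper name `simplify_tags` and the set `IGNORED_TAGS` are hypothetical and not part of spaCy. The sketch assumes that runs of identical tags (such as NOUN NOUN for "apple pie") should count as one unit.

```python
from itertools import groupby

# Tags treated as noise when summarising structure.
# This choice is an assumption; adjust it to your needs.
IGNORED_TAGS = {"DET", "PUNCT", "SPACE"}


def simplify_tags(tags, ignored=IGNORED_TAGS):
    """Collapse runs of identical tags, then drop the ignored ones.

    Collapsing happens first, so tags separated by punctuation
    (e.g. NOUN , NOUN) stay separate units.
    """
    collapsed = [tag for tag, _ in groupby(tags)]
    return " ".join(tag for tag in collapsed if tag not in ignored)


# Tags taken from the output shown above
tags = "PRON VERB DET NOUN NOUN PUNCT NOUN PUNCT".split()
print(simplify_tags(tags))  # PRON VERB NOUN NOUN
```

With the `doc` object from the answer, the input list could be built as `[token.pos_ for token in doc]`. Note that in the output shown above "yesterday" is tagged NOUN rather than ADV, so the simplified structure ends in NOUN; the result depends on what the model predicts for each token.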