Stanford-NER customization to classify software programming keywords

Views: 265  Published: 2018/12/11 22:34:54  Tags: java nlp classification stanford-nlp

Question

I am new to NLP. I used the Stanford NER tool to classify some random text in order to extract special keywords used in software programming.

The problem is that I don't know how to change the classifiers and text annotators in Stanford NER so that they recognize software programming keywords. For example:

today Java used in different operating systems (Windows, Linux, ..)

The classification result should look like this:

Java "Programming_Language"
Windows "Operating_System"
Linux "Operating_System"

Could you please help me customize the Stanford NER classifiers to satisfy my needs?

Recommended answer

I think this is quite well documented in the Stanford NER FAQ: http://nlp.stanford.edu/software/crf-faq.shtml#a.

Here are the steps:

In the properties file, change the map to specify how your training data is annotated (or structured):

map = word=0,myfeature=1,answer=2
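
To make the role of the map line concrete, here is a minimal sketch of how such a column spec lines up with one tab-separated training line. It is an illustration only and not Stanford NER's own parser: the class ColumnMapDemo, its methods, and the sample line (including the made-up feature value CAP) are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class ColumnMapDemo {

    // Parse a spec such as "word=0,myfeature=1,answer=2" into name -> column index.
    public static Map<String, Integer> parseMap(String spec) {
        Map<String, Integer> columns = new LinkedHashMap<>();
        for (String entry : spec.split(",")) {
            String[] parts = entry.trim().split("=");
            columns.put(parts[0].trim(), Integer.parseInt(parts[1].trim()));
        }
        return columns;
    }

    // Pick the named fields out of one tab-separated training line.
    public static Map<String, String> readLine(String line, Map<String, Integer> columns) {
        String[] cells = line.split("\t");
        Map<String, String> token = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : columns.entrySet()) {
            token.put(e.getKey(), cells[e.getValue()]);
        }
        return token;
    }

    public static void main(String[] args) {
        Map<String, Integer> columns = parseMap("word=0,myfeature=1,answer=2");
        // Hypothetical training line: token, custom feature value, gold label.
        Map<String, String> token = readLine("Java\tCAP\tProgramming_Language", columns);
        System.out.println(token);
    }
}
```

With this layout, every line of the training file would carry three columns, and the extra middle column is what the remaining steps expose to the classifier.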

In src\edu\stanford\nlp\sequences\SeqClassifierFlags.java, add a flag stating that you want to use your new feature; let's call it useMyFeature. Below the line public boolean useLabelSource = false; add:

public boolean useMyFeature = true;

In the same file, in the setProperties(Properties props, boolean printProps) method, after the else if (key.equalsIgnoreCase("useTrainLexicon")) { ..} branch, tell the tool whether this flag is on or off:

else if (key.equalsIgnoreCase("useMyFeature")) {
    useMyFeature = Boolean.parseBoolean(val);
}
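
The effect of that branch can be tried in isolation with a small stand-in. The sketch below is not the real SeqClassifierFlags class: FlagsSketch is a hypothetical, stripped-down imitation that keeps only the two keys mentioned here, and it assumes useTrainLexicon is a plain boolean field.

```java
import java.util.Properties;

class FlagsSketch {
    // Mirrors the new field; the answer declares it with a default of true.
    public boolean useMyFeature = true;
    // Assumed boolean field, kept so the else-if chain has a neighbouring branch.
    public boolean useTrainLexicon = false;

    // Simplified stand-in for setProperties: visit every key and set the matching field.
    public void setProperties(Properties props) {
        for (String key : props.stringPropertyNames()) {
            String val = props.getProperty(key);
            if (key.equalsIgnoreCase("useTrainLexicon")) {
                useTrainLexicon = Boolean.parseBoolean(val);
            } else if (key.equalsIgnoreCase("useMyFeature")) {
                useMyFeature = Boolean.parseBoolean(val);
            }
        }
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        // Same effect as writing useMyFeature=false in the properties file.
        props.setProperty("useMyFeature", "false");
        FlagsSketch flags = new FlagsSketch();
        flags.setProperties(props);
        System.out.println("useMyFeature = " + flags.useMyFeature);
    }
}
```

Because the comparison uses equalsIgnoreCase, the key in the properties file matches regardless of capitalization.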

In src/edu/stanford/nlp/ling/CoreAnnotations.java, add the following section:

public static class myfeature implements CoreAnnotation<String> {
  public Class<String> getType() {
    return String.class;
  }
}

In src/edu/stanford/nlp/ling/AnnotationLookup.java, at the bottom of public enum KeyLookup {..}, add:

MY_TAG(CoreAnnotations.myfeature.class,"myfeature")
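
Why both an annotation class and a lookup entry are needed may be easier to see in a self-contained imitation. The sketch below uses none of the Stanford classes: TypedKey, MyFeature, Lookup, and forName are hypothetical stand-ins for CoreAnnotation, the new annotation class, the KeyLookup enum, and its name lookup. The idea it illustrates is that the class acts as a typed key, and the enum ties the column name used in the map property to that key.

```java
import java.util.HashMap;
import java.util.Map;

class KeyLookupSketch {
    // Stand-in for CoreAnnotation<V>: the class itself is the key, getType() names the value type.
    interface TypedKey<V> {
        Class<V> getType();
    }

    static class MyFeature implements TypedKey<String> {
        public Class<String> getType() {
            return String.class;
        }
    }

    // Stand-in for the KeyLookup enum: ties a column name to a key class.
    enum Lookup {
        MY_TAG(MyFeature.class, "myfeature");

        final Class<? extends TypedKey<?>> keyClass;
        final String columnName;

        Lookup(Class<? extends TypedKey<?>> keyClass, String columnName) {
            this.keyClass = keyClass;
            this.columnName = columnName;
        }

        // Return the entry registered under the given column name, or null if none.
        static Lookup forName(String name) {
            for (Lookup l : values()) {
                if (l.columnName.equals(name)) {
                    return l;
                }
            }
            return null;
        }
    }

    public static void main(String[] args) {
        Map<Class<?>, Object> token = new HashMap<>();
        Lookup entry = Lookup.forName("myfeature");
        // Hypothetical feature value, as if read from column 1 of the training file.
        token.put(entry.keyClass, "CAP");
        System.out.println(token.get(MyFeature.class));
    }
}
```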

In src\edu\stanford\nlp\ie\NERFeatureFactory.java, depending on the "type" of feature it is, add the feature inside the corresponding method, for example inside:

protected Collection<String> featuresC(PaddedList<IN> cInfo, int loc)

if (flags.useMyFeature) {
    featuresC.add(c.get(CoreAnnotations.myfeature.class)+"-my_tag");
}
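
What this step contributes is simply one more feature string per token. The sketch below shows that in isolation and makes no claim about the internals of NERFeatureFactory: FeatureSketch and its simplified featuresC signature are hypothetical, the two lists stand in for the word and myfeature columns, and the word-identity feature is included only for contrast.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

class FeatureSketch {
    // Build the features for the token at position loc from parallel word/myfeature columns.
    public static Collection<String> featuresC(List<String> words, List<String> myFeatures,
                                               int loc, boolean useMyFeature) {
        Collection<String> features = new ArrayList<>();
        // Illustrative word-identity feature, present whether or not the flag is set.
        features.add(words.get(loc) + "-WORD");
        if (useMyFeature) {
            // Same string shape as in the answer: annotation value plus a fixed suffix.
            features.add(myFeatures.get(loc) + "-my_tag");
        }
        return features;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("Java", "used", "in");
        // Hypothetical values of the custom feature column.
        List<String> tags = Arrays.asList("CAP", "LOW", "LOW");
        System.out.println(featuresC(words, tags, 0, true));
    }
}
```

The fixed suffix keeps values of the custom column from colliding with feature strings produced by other feature types.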

Debugging: In addition to this, there are methods that dump the features to a file; use them to see how things work under the hood. You will probably also have to spend some time with a debugger :P
