Stanford Parser questions


Question

I am writing a project that works with NLP (natural language processing). I am using the Stanford Parser.

I create a thread pool that takes sentences and runs the parser on them. With one thread everything works fine, but with more threads I get errors. The test method looks for words that are connected by some relation. Marking it synchronized should make it behave as if there were a single thread, but I still get errors. The errors come from this code:

public synchronized String test(String s, LexicalizedParser lp)
{
    if (s.isEmpty()) return "";
    if (s.length() > 80) return "";
    System.out.println(s);
    String[] sent = s.split(" ");
    Tree parse = (Tree) lp.apply(Arrays.asList(sent));
    TreebankLanguagePack tlp = new PennTreebankLanguagePack();
    GrammaticalStructureFactory gsf = tlp.grammaticalStructureFactory();
    GrammaticalStructure gs = gsf.newGrammaticalStructure(parse);
    Collection tdl = gs.typedDependenciesCollapsed();
    List list = new ArrayList(tdl);

    //for (int i=0;i<list.size();i++)
    //System.out.println(list.get(1).toString());

    // strip the parentheses and indices, e.g. sbj(screen-4,good-6) -> screen good
    Pattern p = Pattern.compile(".*\\((.*?)\\-\\d+,(.*?)\\-\\d+\\).*");

    if (list.size() > 2) {
        // match the second dependency against the pattern
        Matcher m = p.matcher(list.get(1).toString());
        // check that the pattern matched and defines both groups
        if (m.find() && m.groupCount() > 1) {
            System.out.println(list);
            return m.group(1) + m.group(2);
        }
    }
    return "";
}

My errors are:

    at blogsOpinions.ParserText.(ParserText.java:47)
    at blogsOpinions.ThreadPoolTest$1.run(ThreadPoolTest.java:50)
    at blogsOpinions.ThreadPool$PooledThread.run(ThreadPoolTest.java:196)
Recovering using fall through strategy: will construct an (X ...) tree.
Exception in thread "PooledThread-21" java.lang.ClassCastException: java.lang.String cannot be cast to edu.stanford.nlp.ling.HasWord
    at edu.stanford.nlp.parser.lexparser.LexicalizedParser.apply(LexicalizedParser.java:289)
    at blogsOpinions.ParserText.test(ParserText.java:174)
    at blogsOpinions.ParserText.insertDb(ParserText.java:76)
    at blogsOpinions.ParserText.(ParserText.java:47)
    at blogsOpinions.ThreadPoolTest$1.run(ThreadPoolTest.java:50)
    at blogsOpinions.ThreadPool$PooledThread.run(ThreadPoolTest.java:196)

Also, how can I get the description of the subject? For example, for "the screen is very good" I want to get "screen good" from the list I get back, without relying on list.get(1).

Recommended answer

You can't call LexicalizedParser.parse on a List of Strings; it expects a list of HasWord objects. It's much easier to call the apply method on your input string. This will also run a proper tokenizer on your input (instead of your simple split on spaces).
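
The ClassCastException in the stack trace can be reproduced without the parser. The following is a minimal sketch, not the library's code: HasWord and Token here are local, simplified stand-ins defined inside the example (hypothetical names that only mirror the idea of the library's interface), and countWords stands in for any method that accepts an untyped list and casts each element.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class HasWordSketch {
    // Local stand-in for the library's HasWord interface (hypothetical, simplified).
    interface HasWord { String word(); }

    // Local stand-in for a token class that implements HasWord.
    static final class Token implements HasWord {
        private final String text;
        Token(String text) { this.text = text; }
        public String word() { return text; }
    }

    // Stand-in for an entry point that takes an untyped list and casts each
    // element; this cast is where a List of Strings fails at runtime.
    static int countWords(List<?> sentence) {
        int n = 0;
        for (Object o : sentence) {
            HasWord w = (HasWord) o; // throws ClassCastException for a String
            if (!w.word().isEmpty()) n++;
        }
        return n;
    }

    // Wrap each raw string in an object that implements HasWord.
    static List<HasWord> wrap(String[] tokens) {
        List<HasWord> words = new ArrayList<>();
        for (String t : tokens) words.add(new Token(t));
        return words;
    }

    public static void main(String[] args) {
        String[] sent = "The screen is very good".split(" ");
        try {
            countWords(Arrays.asList(sent)); // same shape as the failing call
        } catch (ClassCastException e) {
            System.out.println("raw strings rejected");
        }
        System.out.println(countWords(wrap(sent)));
    }
}
```

Because the failure is a type mismatch on the argument, it happens regardless of how many threads run or whether the method is synchronized. Wrapping tokens by hand still relies on the naive split on spaces, which is why passing the whole string and letting the parser tokenize it is the simpler route.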

To get relations such as subjectness out of the returned Tree, call its dependencies member.
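
Once the dependencies are available, the subject can be selected by relation name instead of by position in the list. The sketch below assumes the dependencies have been turned into strings of the form relation(governor-index, dependent-index), as in the question's comment; the sample strings are hand-written for illustration, not parser output, and SubjectPairs and subjectPair are hypothetical names. If the dependency objects expose the relation and the two words directly, using those accessors would be more robust than string matching.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SubjectPairs {
    // Matches strings such as "nsubj(good-5, screen-2)":
    // group 1 = relation name, group 2 = governor word, group 3 = dependent word.
    private static final Pattern DEP =
        Pattern.compile("(\\w+)\\((.+)-\\d+,\\s*(.+)-\\d+\\)");

    // Returns "dependent governor" for the first nsubj relation, if there is one.
    static Optional<String> subjectPair(List<String> dependencies) {
        for (String dep : dependencies) {
            Matcher m = DEP.matcher(dep);
            if (m.matches() && m.group(1).equals("nsubj")) {
                return Optional.of(m.group(3) + " " + m.group(2));
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Illustrative dependency strings for "The screen is very good".
        List<String> deps = Arrays.asList(
            "det(screen-2, The-1)",
            "nsubj(good-5, screen-2)",
            "cop(good-5, is-3)",
            "advmod(good-5, very-4)");
        System.out.println(subjectPair(deps).orElse(""));
    }
}
```

Filtering on the relation name means the result no longer depends on where the subject relation happens to fall in the list, which is the weakness of list.get(1).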
