Java or Python for Natural Language Processing

Question

I would like to know which programming language is better for natural language processing: Java or Python? I have found lots of questions and answers about it, but I am still unsure which one to choose.

I also want to know which NLP library to use for Java, since there are many of them (LingPipe, GATE, OpenNLP, Stanford NLP). For Python, most programmers recommend NLTK.

But if I need to do some text processing or information extraction on unstructured data (just free-form plain English text) to get some useful information, what is the best option? Java or Python? Which library is suitable?

Updated

What I want to do is extract useful product information from unstructured data (e.g. users write advertisements for mobiles or laptops in many different forms, in not very standard English).
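As a rough illustration of the task described above (not part of the original question), the sketch below pulls a brand, a price, and a storage size out of a free-form ad using only Python's standard library. The brand list, the field names, and the function name `extract_product_info` are all hypothetical; real ads in non-standard English would likely need a much larger lexicon or one of the NLP libraries discussed in the answer.

```python
import re

# Hypothetical brand lexicon, for illustration only.
BRANDS = ["samsung", "apple", "iphone", "nokia", "dell", "lenovo", "hp"]

# A price is assumed to be written as "$250" or "usd 250".
PRICE_RE = re.compile(r"(?:\$|usd\s*)(\d+(?:[.,]\d+)?)", re.IGNORECASE)
# A storage size is assumed to be a number followed by GB or TB.
STORAGE_RE = re.compile(r"(\d+)\s*(gb|tb)\b", re.IGNORECASE)


def extract_product_info(ad: str) -> dict:
    """Pull brand, price and storage size out of a free-form ad, if present."""
    text = ad.lower()
    tokens = re.findall(r"[a-z0-9]+", text)
    # First brand from the lexicon that appears as a whole token.
    brand = next((b for b in BRANDS if b in tokens), None)
    price = PRICE_RE.search(text)
    storage = STORAGE_RE.search(text)
    return {
        "brand": brand,
        # A comma is treated as a thousands separator (an assumption).
        "price": float(price.group(1).replace(",", "")) if price else None,
        "storage": f"{storage.group(1)}{storage.group(2).upper()}" if storage else None,
    }


if __name__ == "__main__":
    print(extract_product_info("selling my SAMSUNG galaxy s5 16 gb, gud condition, only $250 !!"))
```

Hand-written patterns like these break quickly on unusual spellings, which is the main reason to look at the tokenizers, taggers, and named-entity recognizers in the libraries listed in the answer.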

Recommended answer

Java vs Python for NLP is very much a matter of preference or necessity. Depending on the company or project, you'll need to use one or the other, and often there isn't much of a choice unless you're heading the project.

Other than NLTK (www.nltk.org), there are actually other libraries for text processing in Python:

  • TextBlob: http://textblob.readthedocs.org/en/dev/
  • Gensim: http://radimrehurek.com/gensim/
  • Pattern: http://www.clips.ua.ac.be/pattern
  • spaCy: http://spacy.io
  • Orange: http://orange.biolab.si/features/
  • PyNLPl: https://github.com/proycon/pynlpl

(for more, see https://pypi.python.org/pypi?%3Aaction=search&term=natural+language+processing&submit=search)

For Java, there are tons of others, but here's another list:

  • Freeling: http://nlp.lsi.upc.edu/freeling/
  • OpenNLP: http://opennlp.apache.org/
  • LingPipe: http://alias-i.com/lingpipe/
  • Stanford CoreNLP: http://stanfordnlp.github.io/CoreNLP/ (comes with wrappers for other languages, python included)
  • CogComp NLP: https://github.com/CogComp/cogcomp-nlp

For a nice comparison of basic string processing, see http://nltk.googlecode.com/svn/trunk/doc/howto/nlp-python.html

For a useful comparison of GATE vs UIMA vs OpenNLP, see https://www.assembla.com/spaces/extraction-of-cost-data/wiki/Gate-vs-UIMA-vs-OpenNLP?version=4

If you're uncertain which language to go for in NLP, personally I'd say "any language that will give you the desired analysis/output"; see "Which language or tools to learn for natural language processing?"

Here's a fairly recent (2017) list of NLP tools: https://github.com/alvations/awesome-community-curated-nlp

An older list of NLP tools (2013): http://web.archive.org/web/20130703190201/http://yauhenklimovich.wordpress.com/2013/05/20/tools-nlp

Other than language processing tools, you would very much need machine learning tools to incorporate into NLP pipelines.

There's a whole range in Python and Java, and once again it's up to preference and whether the libraries are user-friendly enough:

Machine Learning libraries in Python:

  • Sklearn (Scikit-learn): http://scikit-learn.org/stable/
  • Milk: http://luispedro.org/software/milk
  • Scipy: http://www.scipy.org/
  • Theano: http://deeplearning.net/software/theano/
  • PyML: http://pyml.sourceforge.net/
  • pyBrain: http://pybrain.org/
  • Graphlab Create (Commercial tool but free academic license for 1 year): https://dato.com/products/create/

(for more, see https://pypi.python.org/pypi?%3Aaction=search&term=machine+learning&submit=search)
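To make "incorporating machine learning into an NLP pipeline" a bit more concrete, here is a minimal sketch: a tokenization step feeding a tiny multinomial Naive Bayes classifier. It is written against the standard library only so that it runs anywhere; in practice one of the libraries listed above (scikit-learn, for example) would normally replace the hand-written classifier. The class name, the labels, and the training sentences are made up for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Language processing step: lowercase and split into word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.word_counts = defaultdict(Counter)  # label -> token counts
        self.label_counts = Counter(labels)      # label -> number of documents
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        tokens = tokenize(text)
        total_docs = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.label_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            # Log prior plus smoothed log likelihood of each token.
            score = math.log(n_docs / total_docs)
            for tok in tokens:
                score += math.log((counts[tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy training data, invented for this example.
train_texts = [
    "selling iphone 6 64gb good condition",
    "samsung galaxy phone for sale unlocked",
    "dell laptop 15 inch i5 8gb ram",
    "lenovo thinkpad laptop with charger",
]
train_labels = ["mobile", "mobile", "laptop", "laptop"]
model = NaiveBayes().fit(train_texts, train_labels)

if __name__ == "__main__":
    print(model.predict("unlocked galaxy phone"))
```

The point of the sketch is the shape of the pipeline (text, then tokens, then a statistical model), not the classifier itself; four training sentences are far too few for anything beyond a demonstration.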

Machine Learning libraries in Java:

  • Weka: http://www.cs.waikato.ac.nz/ml/weka/index.html
  • Mallet: http://mallet.cs.umass.edu/
  • Mahout: https://mahout.apache.org/

With the recent (2015) deep learning tsunami in NLP (http://www.mitpressjournals.org/doi/pdf/10.1162/COLI_a_00239), you could also consider: https://en.wikipedia.org/wiki/Comparison_of_deep_learning_software

I'll avoid listing deep learning tools out of non-favoritism / neutrality.

Other Stack Overflow questions that also ask about NLP/ML tools:

  • Machine Learning and Natural Language Processing
  • What are good starting points for someone interested in natural language processing?
  • Natural language processing
  • Natural Language Processing in Java (NLP)
  • Is there a good natural language processing library
  • Simple Natural Language Processing Startup for Java
  • What libraries offer basic or advanced NLP methods?
  • Latest good languages and books for Natural Language Processing, the basics
  • (For NER) Entity Extraction/Recognition with free tools while feeding Lucene Index
  • (With PHP) NLP programming tools using PHP?
  • (With Ruby) https://stackoverflow.com/questions/3776361/ruby-nlp-libraries
