Parsing algorithms for analyzing a particular paragraph or text

Problem description

Suppose we have a text document and we input a string. What sort of algorithms and parsing techniques should we use to find the text that is relevant to the input string?

Recommended answer

There are a number of possible answers here, most of them beyond the scope of a single question.

1. As Guirec says, if you're looking for a word or phrase in the text, use String.Contains(), or whatever your particular language offers for finding substrings.

Edit: that will only tell you whether the text contains the string. To locate it, you'll need something more like String.IndexOf(), which returns the position of the substring within the larger string.


2. If you need to match things like numbers, words, etc., look up regular expressions. Wikipedia has a reasonable description.
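
As a rough illustration of both points, here is a minimal sketch in Python. Python is an assumption on my part, since the answer names the .NET methods: the `in` operator stands in for String.Contains(), `str.find()` stands in for String.IndexOf(), and the `re` module provides regular expressions. The sample text and the patterns are made up for the example.

```python
import re

# Hypothetical sample document, invented for this example.
text = "Order 66 shipped on 2024-03-01; order 67 is still pending."

# Substring test: the Python analogue of String.Contains().
has_word = "shipped" in text

# Substring position: str.find() plays the role of String.IndexOf()
# and returns -1 when the substring is absent.
position = text.find("shipped")
missing = text.find("cancelled")

# Regular expressions match patterns rather than fixed strings.
# \d{4}-\d{2}-\d{2} matches ISO-style dates; [A-Za-z]+ matches words.
dates = re.findall(r"\b\d{4}-\d{2}-\d{2}\b", text)
words = re.findall(r"[A-Za-z]+", text)

print(has_word, position, missing)
print(dates)
print(words)
```

A plain substring search is enough when you know the exact string you want. Reach for a regular expression only when the thing you are looking for is a shape (a date, a number, a word) rather than a literal.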

If your needs are more complex than that, and you want to parse text written in some computer language, you'll need to do some serious reading. Parsing languages is a heavy task. The standard text on the subject is the Dragon Book (Compilers: Principles, Techniques, and Tools, by Aho, Lam, Sethi and Ullman), which a Google search will lead you to, but there are many alternatives.
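
To give a feel for what parsing a language involves, here is a small sketch of a recursive-descent parser, one of the standard techniques covered in compiler texts. It is only an illustration under my own assumptions: the toy grammar (integers, `+ - * /`, parentheses) and the names `tokenize`, `Parser` and `evaluate` are invented for this example, and a real language needs far more than this.

```python
import re

# A token is either a run of digits or any other single character.
TOKEN = re.compile(r"\s*(?:(\d+)|(.))")


def tokenize(source):
    """Split the input into ("NUM", int) and ("OP", char) tokens."""
    tokens = []
    for number, op in TOKEN.findall(source):
        if number:
            tokens.append(("NUM", int(number)))
        elif op.strip():  # skip stray whitespace matches
            tokens.append(("OP", op))
    return tokens


class Parser:
    """One method per grammar rule; each consumes the tokens it recognises."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        if self.pos < len(self.tokens):
            return self.tokens[self.pos]
        return ("EOF", None)

    def take(self):
        token = self.peek()
        self.pos += 1
        return token

    def expr(self):
        # expr := term (("+" | "-") term)*
        value = self.term()
        while self.peek() in (("OP", "+"), ("OP", "-")):
            _, op = self.take()
            right = self.term()
            value = value + right if op == "+" else value - right
        return value

    def term(self):
        # term := factor (("*" | "/") factor)*
        value = self.factor()
        while self.peek() in (("OP", "*"), ("OP", "/")):
            _, op = self.take()
            right = self.factor()
            value = value * right if op == "*" else value / right
        return value

    def factor(self):
        # factor := NUM | "(" expr ")"
        kind, value = self.take()
        if kind == "NUM":
            return value
        if (kind, value) == ("OP", "("):
            inner = self.expr()
            if self.take() != ("OP", ")"):
                raise SyntaxError("expected ')'")
            return inner
        raise SyntaxError(f"unexpected token: {value!r}")


def evaluate(source):
    """Parse and evaluate an arithmetic expression in one pass."""
    parser = Parser(tokenize(source))
    result = parser.expr()
    if parser.peek()[0] != "EOF":
        raise SyntaxError("unexpected trailing input")
    return result


print(evaluate("2 + 3 * (4 - 1)"))
```

Operator precedence falls out of the grammar itself: `term` binds tighter than `expr`, so multiplication is applied before addition without any special-case code. Even this toy needs a tokenizer, a grammar and error handling, which is why parsing a full language is a heavy task.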

If you're trying to parse natural language text, you're attempting a task that stumps even the best computer scientists. It is not solvable by a conventional algorithm; it needs advanced artificial intelligence approaches that I don't even begin to understand, and probably a hell of a lot of computing power. IBM's Watson managed a passable attempt, but most of us don't have that kind of budget.

