What's the difference between an inverted index and a plain old index?

Question

In software engineering we create indexes all the time (e.g., in databases) but I also hear a lot of people talk about inverted indices. Is there something fundamentally different between the two? They sound like the same thing.

Recommended answer

A common use is ... to allow fast full-text search.

The two types denote directionality. One takes you forward through the index, and the other takes you backward (the inverse) through the index. That's it. There's no mystery to uncover here. Otherwise the two types are identical; it's just a question of what information you have and, as a result, what information you're trying to find.

To address your inquiry, I don't think there's actually a way to know why the usage is what it is today. The only reason it's important to define which is forward and which is inverted is so that we can all have a conversation about them, and everyone knows which direction we're talking about. Think about the terms "left" and "right": they are relative. Which is which doesn't matter, except that everyone needs to agree which one is "left" and which one is "right" in order for the words to have meaning. If, as a culture, we decided to flip left and right, then you'd have the same issue figuring out what a "right turn" vs a "left turn" is, since the agreed-upon meaning would have changed. However, the naming is arbitrary, so which one is which (in and of itself) doesn't matter - what matters is that we all agree on the meaning.

In your comment where you ask, "please don't just define the terms", you're missing the point, and I think you're just getting hung up on the wording when there is no difference between them beyond direction.

If you're thinking that the inverse of an index is something like the inverse of a function in mathematics, where the inverse is a special thing that has a different form, then you're mistaken: that's not the case here.

For the benefit of future readers, I will now provide several "forward" and "inverted" index examples:

In a search engine you have a list of documents (pages on web sites); you enter some keywords and get results back.

A forward index (or just index) is the list of documents, and which words appear in them. In the web search example, Google crawls the web, building the list of documents, figuring out which words appear in each page.

The inverted index is the list of words, and the documents in which they appear. In the web search example, you provide the list of words (your search query), and Google produces the documents (search result links).

They are both indexes - it's just a question of which direction you're going. Forward goes from documents to words; inverted goes from words to documents.
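To make the two directions concrete, here is a minimal Python sketch. The three-page corpus and the function names (build_forward_index, invert, search) are made up for illustration, and a real search engine also handles tokenization, ranking, and scale, none of which is shown here. The point is only that the inverted index is the same data as the forward index, keyed the other way around.

```python
from collections import defaultdict

# Hypothetical mini-corpus: document ID -> text.
documents = {
    "page1": "the quick brown fox",
    "page2": "the lazy dog",
    "page3": "quick brown dog",
}

def build_forward_index(docs):
    """Forward index: document -> set of words it contains."""
    return {doc_id: set(text.split()) for doc_id, text in docs.items()}

def invert(forward_index):
    """Inverted index: word -> set of documents containing it."""
    inverted = defaultdict(set)
    for doc_id, words in forward_index.items():
        for word in words:
            inverted[word].add(doc_id)
    return dict(inverted)

def search(inverted_index, query):
    """Return the documents that contain every word in the query."""
    word_sets = [inverted_index.get(word, set()) for word in query.split()]
    return set.intersection(*word_sets) if word_sets else set()

forward = build_forward_index(documents)
inverted = invert(forward)

print(sorted(forward["page2"]))                 # ['dog', 'lazy', 'the']
print(sorted(inverted["quick"]))                # ['page1', 'page3']
print(sorted(search(inverted, "quick brown")))  # ['page1', 'page3']
```

Answering the query from the forward index alone would mean scanning every document; with the inverted index it is one lookup per query word followed by a set intersection.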

Another example is a DNS lookup (which takes a host name, and returns an IP address) and a reverse lookup (which takes an IP address, and gives you the host name).

The index in the back of a book is actually an inverted index, as defined by the examples above - a list of words, and where to find them in the book. In a book, the table of contents is like a forward index: it's a list of documents (chapters) which the book contains, except that instead of listing the words in those chapters, the table of contents just gives a name/general description of what's contained in them.

The forward index in your cell phone is your list of contacts, and which phone numbers (cell, home, work) are associated with those contacts. The inverted index is what allows you to manually enter a phone number, and when you hit "dial" you see the person's name, rather than the number, because your phone has taken the phone number and found you the contact associated with it.
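The phone example can be sketched the same way. The contact data and the helper names (invert_contacts, display_name) below are hypothetical, and the sketch assumes each number belongs to exactly one contact; how an actual phone stores its contacts is not something this answer covers.

```python
# Hypothetical contact list (forward direction): name -> phone numbers.
contacts = {
    "Alice": ["555-0100", "555-0101"],
    "Bob": ["555-0199"],
}

def invert_contacts(contacts):
    """Build the inverted direction: phone number -> name."""
    by_number = {}
    for name, numbers in contacts.items():
        for number in numbers:
            # Assumption: a number is listed under only one contact.
            by_number[number] = name
    return by_number

def display_name(by_number, dialed):
    """Show the contact's name if the number is known, else the number itself."""
    return by_number.get(dialed, dialed)

by_number = invert_contacts(contacts)

print(contacts["Alice"])                    # ['555-0100', '555-0101']
print(display_name(by_number, "555-0199"))  # Bob
print(display_name(by_number, "555-0123"))  # 555-0123
```

Both dictionaries hold the same associations; which one you consult depends on whether you start with a name or with a number.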
