Intersection of Two Lists of Strings


Problem description


I had an interview question along these lines:

Given two lists of unordered customers, return a list of the intersection of the two lists. That is, return a list of the customers that appear in both lists.

Some things I established:

  • Assume each customer has a unique name
  • If the name is the same in both lists, it's the same customer
  • The names are of the form first name last name
  • There's no trickery of II's, Jr's, weird characters, etc.

I think the point was to find an efficient algorithm/use of data structures to do this as efficiently as possible.

My progress went like this:

  • Read one list into memory, then read the other list one item at a time to see if there is a match
  • Alphabetize both lists, then start at the top of one list and see if each item appears in the other list
  • Put both lists into ordered lists, then use the shorter list to check item by item (that way, if one list has 2 items, you only check those 2 items)
  • Put one list into a hash, and check for the existence of keys from the other list
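
For reference, the sorted-list and hash ideas above could be sketched roughly as follows. This is only an illustration, not necessarily what the interviewer had in mind; the function names `intersect_sorted` and `intersect_hashed` are made up for this example, and both assume names are unique within each list, as established above.

```python
def intersect_sorted(list_a, list_b):
    """Sort both lists, then walk them in lockstep (a merge-style scan)."""
    a, b = sorted(list_a), sorted(list_b)
    i = j = 0
    common = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            common.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1  # a[i] cannot appear later in b, so skip it
        else:
            j += 1  # b[j] cannot appear later in a, so skip it
    return common


def intersect_hashed(list_a, list_b):
    """Hash the shorter list, then probe it with each item of the longer one."""
    small, large = (list_a, list_b) if len(list_a) <= len(list_b) else (list_b, list_a)
    seen = dict.fromkeys(small)  # dict used purely as a hash table of names
    return [name for name in large if name in seen]
```

The sorted version costs O(n log n + m log m) for the sorts plus a linear scan; the hashed version is O(n + m) on average but needs extra memory proportional to the shorter list.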

The interviewer kept asking, "What next?", so I assume I'm missing something else.

Any other tricks to do this efficiently?

Side note: this question was in Python, and I just read about sets, which seem to do this as efficiently as possible. Any idea what data structure/algorithm sets use?
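
For context, a minimal sketch of the set-based version is below. In CPython, `set` is implemented as a hash table, so membership tests are O(1) on average, and intersecting two sets takes time roughly proportional to the smaller one. The helper name `common_customers` is hypothetical.

```python
def common_customers(list_a, list_b):
    """Return the names present in both lists, using built-in sets."""
    # Building each set is O(len(list)) on average; `&` then iterates
    # over the smaller set and probes the larger one.
    return set(list_a) & set(list_b)
```

Note that the result is an unordered set; wrap it in `sorted(...)` or `list(...)` if a list is required.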

Solution

  1. Put one list into a Bloom filter and use that to filter the second list.
  2. Put the filtered second list into a Bloom filter and use that to filter the first list.
  3. Sort the two filtered lists and find the intersection by one of the methods above.

The benefit of this approach (besides letting you use a semi-obscure data structure correctly in an interview) is that it doesn't require holding a full list in memory (a Bloom filter needs only a few bits per item) until after you have, with high probability, reduced the problem size.
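
The three steps could be sketched roughly as below. This is a toy illustration under several assumptions: the `BloomFilter` class is a hand-rolled stand-in (not the API of any particular library), the bit-array size and hash count are arbitrary, and the lists are passed in memory for brevity, whereas in practice they would be streamed from disk.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter: no false negatives, occasional false positives."""

    def __init__(self, num_bits=1 << 16, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive num_hashes bit positions from salted SHA-256 digests.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


def bloom_intersection(list_a, list_b):
    # Step 1: filter the second list through a Bloom filter of the first.
    filter_a = BloomFilter()
    for name in list_a:
        filter_a.add(name)
    candidates_b = [name for name in list_b if name in filter_a]

    # Step 2: filter the first list through a Bloom filter of the survivors.
    filter_b = BloomFilter()
    for name in candidates_b:
        filter_b.add(name)
    candidates_a = [name for name in list_a if name in filter_b]

    # Step 3: exact intersection of the (hopefully much smaller) candidate
    # lists. A set is used here for brevity; a sorted merge would also work.
    return sorted(set(candidates_a) & set(candidates_b))
```

Because a Bloom filter never reports a false negative, no true match is lost in steps 1 and 2; the exact check in step 3 removes any false positives that slipped through.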


The interviewer kept asking, "What next?", so I assume I'm missing something else.

Maybe they would just keep asking that until you run out of answers.


http://code.google.com/p/python-bloom-filter/ is a Python implementation of Bloom filters.
