PowerShell, kind of set intersection built-in?

Question

For a game that involves finding anagrams in a bunch of loose letters, I ended up implementing a permutation algorithm to find all possible anagrams and, where needed, filter them by known letter positions (-match is great, by the way). For longer words this proved very error-prone, though, as skimming a large list of gibberish doesn't really reveal the proper words hidden within.

So I thought that if I had a large list of English words (should be obtainable somewhere), I could just intersect my list of permutations with the list of proper words and (hopefully) get all real words from the permutation list.

Since many operators in PS work differently with collections I thought I could just do something like

$wordlist -contains $permlist

and get the intersection back. Unfortunately it's not that easy. Other options I have thought of would be to iterate over one list and do a -contains for each item:

$permlist | ? { $wordlist -contains $_ }

This probably would work but is also very slow, I think (especially when $wordlist is the result of a gc wordlist.txt). Or I could build a gigantic regular expression:

$wordlist -match (($permlist | %{ "^$_`$" }) -join "|")

But that would probably not be very fast either. I could maybe also use findstr with above gigantic regex but that feels just wrong.

Are there any built-in solutions I could use and that are better than my attempts so far? Otherwise I'd probably put the word list into a hashtable and use the iterative -contains approach which should be fast enough then.

Recommended answer

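# New-HashSet is a helper function, not a built-in cmdlet (see the note below)
$left = New-HashSet string
$left.Add("foo")
$left.Add("bar")
$right = New-HashSet string
$right.Add("bar")
$right.Add("baz")

# Both methods modify $left in place
$left.IntersectWith($right)   # $left is now { bar }
$left.UnionWith($right)       # $left is now { bar, baz }
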

(borrowing New-HashSet from Josh Einstein)
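
Applied to the anagram case from the question, a minimal sketch might look like the following. It assumes PowerShell 5 or later (for the `::new()` syntax) and builds the set directly from the .NET type rather than through New-HashSet. `$wordlist` and `$permlist` are the names used in the question; the sample values are made up for illustration.

```powershell
# Stand-ins for the real data. In practice $wordlist might come from
# Get-Content wordlist.txt and $permlist from the permutation code.
[string[]]$wordlist = 'tale', 'late', 'teal', 'bar'
[string[]]$permlist = 'aelt', 'late', 'tael', 'teal', 'tale'

# Build a set from the permutations (duplicates collapse automatically).
$found = [System.Collections.Generic.HashSet[string]]::new($permlist)

# In-place intersection: $found keeps only entries that also occur in $wordlist.
$found.IntersectWith($wordlist)

# Emit the surviving words in a stable order.
$found | Sort-Object
```

The default comparer is case-sensitive, so a word list with different casing than the permutations would probably need a second constructor argument such as `[System.StringComparer]::OrdinalIgnoreCase`.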

Warning: those methods on HashSet are in-place algorithms that modify the original collection. If you want functional-style transform on immutable objects, you'll need to bring LINQ to the party:

Add-Type -AssemblyName System.Core

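# Pick the generic AsQueryable<T>(IEnumerable<T>) overload, not the non-generic one
$asqueryable = [system.linq.queryable].getmethods() | ? { $_.name -eq "AsQueryable" -and $_.IsGenericMethodDefinition } | select -first 1
$asqueryable = $asqueryable.MakeGenericMethod([string])
$leftAsQueryable = $asqueryable.Invoke($null, (,$left))

# Pick the two-parameter Intersect overload (the other one also takes a comparer)
$intersect = [system.linq.queryable].getmethods() | ? { $_.name -eq "Intersect" -and $_.GetParameters().Length -eq 2 } | select -first 1
$intersect = $intersect.MakeGenericMethod([string])
$result = $intersect.Invoke($null, ($leftAsQueryable, $right))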

Clearly, someone needs to wrap this static-generic-reflection crap into a friendly cmdlet! Don't worry, I'm working on it...
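
Until such a cmdlet exists, two shorter routes to a non-mutating intersection may be worth trying. This is only a sketch, not the reflection approach from the answer: it assumes PowerShell 5 or later, where `[System.Linq.Enumerable]::Intersect` can usually be called directly as long as both arguments are strongly typed. The variable names `$viaLinq`, `$leftSet` and `$copy` are illustrative.

```powershell
[string[]]$left  = 'foo', 'bar'
[string[]]$right = 'bar', 'baz'

# Option 1: LINQ to Objects. Neither input is modified. The result is lazy,
# so @(...) is used to enumerate it into an array.
$viaLinq = @([System.Linq.Enumerable]::Intersect($left, $right))

# Option 2: copy the set first, then run the in-place method on the copy.
$leftSet = [System.Collections.Generic.HashSet[string]]::new($left)
$copy    = [System.Collections.Generic.HashSet[string]]::new($leftSet)
$copy.IntersectWith($right)

$viaLinq          # bar
$leftSet.Count    # 2, the original set is unchanged
$copy.Count       # 1
```

The explicit `[string[]]` typing matters for the first option: with untyped `object[]` arrays PowerShell may not be able to infer the generic type argument.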
