How can I generate a list of words from a group of letters using Perl?


Question

I am looking for a module, a regex, or anything else that might apply to this problem.

How can I programmatically parse the string and produce known English and/or Spanish words, given that I have a dictionary table against which I can check each permutation of the letters for a match?

Given a group of characters: EBLAIDL KDIOIDSI ADHFWB

The program should return: BLADE AID KID KIDS FIDDLE HOLA etc.

I also want to be able to define the minimum and maximum word length, as well as the number of syllables.

The input length doesn't matter. Only letters are significant; punctuation doesn't matter.

Thanks for your help.

EDIT
Letters in the input string can be reused.

For example, if the input is: ABLED then the output may contain: BALL or BLEED

Recommended answer

You haven't specified, so I'm assuming each letter in the input can only be used once.

[You have since specified that letters in the input can be used more than once, but I'm going to leave this post here in case someone finds it useful.]
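For the reuse case from the question's edit, a minimal sketch might look like the following. It is not part of the original answer: `reusable_re` is a hypothetical helper, and it assumes the input is limited to ASCII letters. When letters may be reused, a word qualifies as soon as every one of its letters appears somewhere in the input, which a character class expresses directly.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original answer): build a regex that
# accepts words made only of letters present in $letters, with each
# letter usable any number of times. Assumes ASCII letters only.
sub reusable_re {
    my ($letters) = @_;
    my %seen;
    my $class = join '', grep { /^[A-Z]\z/ && !$seen{$_}++ } split //, uc $letters;
    return qr/(?!)/ unless length $class;   # no letters given: match nothing
    return qr/^[$class]+\z/;
}

my $re = reusable_re('ABLED');
for my $word (qw( ball bleed blade kid )) {
    printf "%-6s %s\n", $word, (uc($word) =~ $re ? 'match' : 'no match');
}
```

With the input ABLED, this accepts "ball", "bleed" and "blade" and rejects "kid". The rest of this answer covers the case where each letter is used once.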

The key to doing this efficiently is to sort the letters in the words.

abracadabra => AAAAABBCDRR
abroad      => AABDOR
drab        => ABDR

Then it becomes clear that "drab" is in "abracadabra".

abracadabra => AAAAABBCDRR
drab        => A    B  DR

And that "abroad" isn't.

abracadabra => AAAAABBCD RR
abroad      => AA   B  DOR
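The two alignments above amount to comparing letter counts: a word fits if it never needs more copies of a letter than the input provides. As a cross-check, here is a rough sketch that counts letters instead of using the regex approach this answer goes on to describe. The names `letter_counts` and `fits_in` are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: count each letter of a string, ignoring case
# and anything that is not a letter.
sub letter_counts {
    my ($text) = @_;
    my %count;
    $count{$_}++ for grep { /^\pL\z/ } split //, uc $text;
    return \%count;
}

# True (1) if $word can be spelled from the letters of $pool,
# using each letter of $pool at most once; false (0) otherwise.
sub fits_in {
    my ($word, $pool) = @_;
    my $have = letter_counts($pool);
    my $need = letter_counts($word);
    for my $letter (keys %$need) {
        return 0 if $need->{$letter} > ($have->{$letter} // 0);
    }
    return 1;
}

print "drab:   ", fits_in('drab',   'abracadabra'), "\n";
print "abroad: ", fits_in('abroad', 'abracadabra'), "\n";
```

This prints 1 for "drab" and 0 for "abroad", because "abracadabra" has no O.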

Let's call the sorted letters the "signature". Word "B" is in word "A" if you can remove letters from the signature of "A" to get the signature of "B". That's easy to check using a regex pattern.

sig('drab') =~ /^A?A?A?A?A?B?B?C?D?R?R?\z/

Or, if we eliminate needless backtracking for efficiency, we get

sig('drab') =~ /^A?+A?+A?+A?+A?+B?+B?+C?+D?+R?+R?+\z/
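To see both hand-written patterns at work, here is a small self-contained check. It reuses the `sig` function from this answer's script. The test words are just examples, and the possessive `?+` quantifier needs Perl 5.10 or later.

```perl
use strict;
use warnings;

# Same signature function as in the answer's script:
# uppercase, keep only letters, sort.
sub sig { join '', sort grep /^\pL\z/, split //, uc $_[0] }

# The two patterns for "abracadabra", written out by hand.
my $plain      = qr/^A?A?A?A?A?B?B?C?D?R?R?\z/;
my $possessive = qr/^A?+A?+A?+A?+A?+B?+B?+C?+D?+R?+R?+\z/;

for my $word (qw( drab abroad cadabra )) {
    my $s = sig($word);
    printf "%-8s %-8s plain=%d possessive=%d\n", $word, $s,
        ($s =~ $plain ? 1 : 0), ($s =~ $possessive ? 1 : 0);
}
```

Both patterns should agree on every word. Because the signature and the pattern are sorted the same way, taking each letter greedily never rules out a match, so the possessive form only saves backtracking.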

Now that we know what pattern we want, it's just a matter of building it.

use strict;
use warnings;
use feature qw( say );

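# Signature: uppercase the word, keep only letters, and sort them.
sub sig { join '', sort grep /^\pL\z/, split //, uc $_[0] }

my $key = shift(@ARGV);

# Build the pattern: follow every letter of the key's signature with "?+".
my $pat = sig($key);
$pat =~ s/.\K/?+/sg;
my $re = qr/^(?:$pat)\z/s;

# Start $shortest at infinity (9**9**9 overflows to inf) so any word is shorter.
my $shortest = 9**9**9;
my $longest  = 0;
my $count    = 0;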
while (my $word = <>) {
   chomp($word);
   next if !length($word);  # My dictionary starts with a blank line!! 
   next if sig($word) !~ /$re/;
   say $word;
   ++$count;
   $shortest = length($word) if length($word) < $shortest;
   $longest  = length($word) if length($word) > $longest;
}

say "Words:    $count";
if ($count) {
   say "Shortest: $shortest";
   say "Longest:  $longest";
}

Example:

$ perl script.pl EBLAIDL /usr/share/dict/words
A
Abe
Abel
Al
...
libel
lid
lie
lied

Words:    117
Shortest: 1
Longest:  6
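
The question also asks for minimum and maximum word lengths, which the script above reports but does not filter on. A minimal sketch of such a filter follows. It assumes a hypothetical helper `within_length` and example bounds of 3 and 4, and it does not attempt syllable counting.

```perl
use strict;
use warnings;

# Hypothetical filter (not in the original script): keep words whose
# length lies between $min and $max, inclusive.
sub within_length {
    my ($word, $min, $max) = @_;
    my $len = length $word;
    return $len >= $min && $len <= $max;
}

# A few words taken from the sample output above.
my @words = qw( A Abe Abel Al libel lid lie lied );
my @kept  = grep { within_length($_, 3, 4) } @words;
print "@kept\n";
```

With bounds of 3 and 4 this keeps Abe, Abel, lid, lie and lied. In the script, the same test could sit next to the existing `next if` lines inside the `while` loop.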
