穿越一个特技以获取所有的话 [英] Traversing a trie to get all words

查看:177
本文介绍了穿越一个特技以获取所有的话的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我已经编写了Perl代码来实际创建一个 Trie 数据结构,给出了一组数组中的单词。现在我在遍历和打印单词时遇到问题。

I have written Perl code to actually create a Trie datastructure given a set of words in an array. Now I have problems traversing and printing the words.

还粘贴了创建的Datastructure的Dumper输出。

Also pasted the Dumper output of the Datastructure created.

遍历后的最后一组单词似乎不正确,因为遍历逻辑肯定缺少某些东西。但是,这个创作很好,工作速度很快。有人可以帮助我吗?

The final set of words after traversal doesn't seem to be right since the traversal logic is certainly missing something. But the trie creation is fine and works fast. Can someone help me here?

这个特技的顶级是哈希


    <每个哈希项都有一个
    字母的键,每个散列指向一个
    数组ref。

数组ref再次包含
哈希表,每个哈希项为
与1相同

Array ref again contains a list of hashes and each hash item is same as 1

如果您看到输出中的第一个单词。

If you see the first word in the output. It comes up as archtopriumwe.

我们应该获得 arc,arch,atop,atrium,awe

代码


use Data::Dumper;
my %mainhash;

## Subroutine
sub storeword   {
    my $type = shift;
    my $fc = shift;
    my $word = shift;
    return if ((not defined $word) or (length($word) == 0));    
    my @letters =  split (//,$word);
    my $len = scalar(@letters) - 1;
    my ($arr_ref,$pass_ref,$flag ,$i,$hashitem,$newitem);
    $pass_ref = $hashitem = $new_item = undef;
    $arr_ref = $type;
    $setstop = 1 if (length($word) == 1);   
    $flag =0;
    for($i = 0;$i {$letters[0]}) {
            $flag =1;
            $pass_ref = $hashitem->{$letters[0]};
            last;
        }
    }
    if ($flag == 0) {
        $newitem->{$letters[0]} = [];   
        push(@$arr_ref,$newitem);
        $pass_ref = $newitem->{$letters[0]};
    }

    storeword($pass_ref,$letters[0],join ('',@letters[ 1..$len]));
}

## Subroutine
sub process {
    my ($prefix,$trie) = @_;
    for my $letter (sort keys %$trie) {
        if ( @{ $trie->{$letter} } ) {
            for my $branch (@{ $trie->{$letter} }) {
                process("$prefix$letter", $branch);
            }
        }
        else {
            print "$prefix$letter\n";
        }
    }
}

##main

##list of words
my @wd =  qw (arc atop awe blob boil fame tub arch atrium);

##inserting each word into the datastructure
foreach my $w (@wd) {
    my @letters = split (//,$w);
    my $len = scalar(@letters) - 1;
    if (not exists $mainhash{$letters[0]})  {
        $mainhash{$letters[0]} = [];
    }
    storeword($mainhash{$letters[0]},$letters[0],join ('',@letters[ 1..$len])); 
}
    print Dumper(%mainhash);
        ## Trying to print each word from trie.
    print("\n List of words\n");    
    process('',\%mainhash);


输出:


$VAR1 = 'a';
$VAR2 = [
          {
            'r' => [
                     {
                       'c' => [
                                {
                                  'h' => []
                                }
                              ]
                     }
                   ]
          },
          {
            't' => [
                     {
                       'o' => [
                                {
                                  'p' => []
                                }
                              ]
                     },
                     {
                       'r' => [
                                {
                                  'i' => [
                                           {
                                             'u' => [
                                                      {
                                                        'm' => []
                                                      }
                                                    ]
                                           }
                                         ]
                                }
                              ]
                     }
                   ]
          },
          {
            'w' => [
                     {
                       'e' => []
                     }
                   ]
          }
        ];
$VAR3 = 'b';
$VAR4 = [
          {
            'l' => [
                     {
                       'o' => [
                                {
                                  'b' => []
                                }
                              ]
                     }
                   ]
          },
          {
            'o' => [
                     {
                       'i' => [
                                {
                                  'l' => []
                                }
                              ]
                     }
                   ]
          }
        ];
$VAR5 = 'f';
$VAR6 = [
          {
            'a' => [
                     {
                       'm' => [
                                {
                                  'e' => []
                                }
                              ]
                     }
                   ]
          }
        ];
$VAR7 = 't';
$VAR8 = [
          {
            'u' => [
                     {
                       'b' => []
                     }
                   ]
          }
        ];

List of words
archtopriumwe
bloboil
fame
tub


推荐答案

你看到你的代码是只打印数据结构中的每个字母一次,而不是每个字一次?并且只为树中的每个顶级字母打印一次换行符,而不是一个字。

Do you see that your code as is is only printing each letter in the datastructure once, instead of once per word it is in? And only printing a newline once for each top-level letter in the tree, not one per word?

要解决这个问题,您需要将一些更多的上下文传递给您的递归子这样的东西:

To fix this, you need to pass some more context into your recursive sub. Something like this:

sub process {
    my ($prefix, $trie) = @_;
    for my $letter (sort keys %$trie) {
        if ( @{ $trie->{$letter} } ) {
            for my $branch (@{ $trie->{$letter} }) {
                process("$prefix$letter", $branch);
            }
        }
        else {
            print "$prefix$letter\n";
        }
    }
}

print("\n List of words\n");
process('', \%mainhash);

这不打印圆弧,因为你没有办法告诉你的数据结构中的弧是字,但例如boi不是。每个字母的值需要提供两件事情:一个布尔指标,这是一个单词的结尾,以及可能的以下字母及其子代码列表。

This doesn't print arc, because you provide no way to tell in your datastructure that arc is a word but e.g. boi is not. The value for each letter needs to provide two things: a boolean indicator that this is the end of a word, and a list of possible following letters and their sub-trie.

这篇关于穿越一个特技以获取所有的话的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆