不区分大小写的三元搜索树 [英] Case Insensitive Ternary Search Tree

查看:149
本文介绍了不区分大小写的三元搜索树的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

一段时间以来,我一直在使用三元搜索树作为要实现的数据结构自动完成下拉组合框。这意味着,当用户键入 fo时,下拉组合框将显示

I had been using Ternary Search Tree for a while, as the data structure to implement a auto complete drop down combo box. Which means, when user type "fo", the drop down combo box will display

foo
食物
足球

foo food football

问题是,我当前使用的三元搜索树区分大小写。我的实现如下。它已被现实世界使用了大约1 ++年。因此,我认为它非常可靠。

The problem is, my current used of Ternary Search Tree is case sensitive. My implementation is as follow. It had been used by real world for around 1++ yeas. Hence, I consider it as quite reliable.

我的三元搜索树代码

但是,我是寻找不区分大小写的三元搜索树,这意味着,当我键入 fo时,下拉组合框将向我显示

However, I am looking for a case insensitive Ternary Search Tree, which means, when I type "fo", the drop down combo box will show me

foO
Food
fooTBall

foO Food fooTBall

以下是TST的一些关键接口,我希望新的区分大小写的TST可能也有类似的接口。

Here are some key interface for TST, where I hope the new case insentive TST may have similar interface too.

    /** 
 * Stores value in the TernarySearchTree. The value may be retrieved using key.
 * @param key A string that indexes the object to be stored.
 * @param value The object to be stored in the tree.
 */    
public void put(String key, E value) {
    getOrCreateNode(key).data = value;
}

/**
 * Retrieve the object indexed by key.
 * @param key A String index.
 * @return Object The object retrieved from the TernarySearchTree.
 */
public E get(String key) {
    TSTNode<E> node = getNode(key);
    if(node==null) return null;
    return node.data;
}

用法示例如下。 TSTSearchEngine使用TernarySearchTree作为核心骨干。

An example of usage is as follow. TSTSearchEngine is using TernarySearchTree as the core backbone.

三元搜索树的用法示例

// There is stock named microsoft and MICROChip inside stocks ArrayList.
TSTSearchEngine<Stock> engine = TSTSearchEngine<Stock>(stocks);
// I wish it would return microsoft and MICROCHIP. Currently, it just return microsoft.
List<Stock> results = engine.searchAll("micro"); 


推荐答案

使我目前进行三元搜索的关键因素之一Tree难于支持不区分大小写的搜索是,我的底层数据结构是一对一映射。请查看以下测试代码:

One of the key factor which make my current Ternary Search Tree difficult to support case insensitive search is that, my underlying data structure is one-to-one mapping. Please look at the following test code :

public void testPut() {
    System.out.println("put");
    Name name0 = new Name("abc");
    Name name1 = new Name("abc");
    TernarySearchTree<Name> instance = new TernarySearchTree<Name>();
    instance.put(name0.toString(), name0);
    instance.put(name1.toString(), name1);
    assertEquals(2, instance.matchPrefix("a").size()); // fail here. Result is 1
}

我当前使用的短期解决方案是TSTSearchEngine封装了整个TernarySearchTree。 TSTSearchEngine包含

What my current short-term solution is that, I am using TSTSearchEngine to wrap up the whole TernarySearchTree. TSTSearchEngine is comprised of

(1)TernarySearchTree,提供要映射的大写键。


(2)一个String-To-ArrayList映射。

(1) A TernarySearchTree, providing UPPER-CASE key to map.

(2) A String-To-ArrayList map.

当我执行以下操作时会发生以下情况:

Here is what happen when I perform :

TSTSearchEngine<Name> engine = TSTSearchEngine<Name>();
engine.put(name0); // name0 is new Name("Abc");
engine.put(name1); // name0 is new Name("aBc");

(1)name0.toString()将转换为大写( ABC)。它将插入到TernarySearchTree中。对于TernarySearchTree, ABC既是键又是值。


(2) ABC将用作映射的键​​,以将name0插入到数组列表中。


(3)name1.toString()将转换为大写( ABC)。它将插入到TernarySearchTree中。 S1将是TernarySearchTree的键和值。


(4) ABC将用作映射的键​​,以将name1插入数组列表。

(1) name0.toString() will be converted to UPPER-CASE ("ABC"). It will be inserted to TernarySearchTree. "ABC" will be both key and value for TernarySearchTree.

(2) "ABC" will use as the key for map, to insert name0 into an array list.

(3) name1.toString() will be converted to UPPER-CASE ("ABC"). It will be inserted to TernarySearchTree. S1 will be both key and value for TernarySearchTree.

(4) "ABC" will use as the key for map, to insert name1 into an array list.

当我尝试

engine.searchAll("a");

(1)TernarySearchTree将返回 ABC。


(2) ABC将用作访问地图的钥匙。 Map将返回一个数组列表,其中包含name0和name1。

(1) TernarySearchTree will return "ABC".

(2) "ABC" will be used as the key to access map. Map will return an array list, which is containing name0 and name1.

此解决方案有效。可以参考以下示例代码新TSTSearchEngine的示例代码

This solution works. The sample code can be referred to Sample Code for New TSTSearchEngine

但是,这可能不是一个有效的解决方案,因为它需要两次搜索。我发现在C ++中有一个实现案例的C ++实现不敏感的三元搜索树。因此,有机会将C ++代码移植到Java。

However, this may not be an effective solution, as it requires two pass of search. I find out there is an implementation in C++ C++ Implementation of Case Insensitive Ternary Search Tree. Hence, there is an opportunity that C++ code can be ported over to Java.

这篇关于不区分大小写的三元搜索树的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆