Given a list of words, what would be a good algorithm for word completion in Java? Tradeoffs: speed/efficiency/memory footprint


Question

I'm exploring the hardware/software requirements for a potential free/paid application (the ultimate goal is a mobile Java app).

The application will start with a simple goal: given a list of relevant words in a database, do word completion on a single string input.

In other words, I already know the contents of the database, but the memory footprint, speed, and search efficiency of the algorithm will determine how much data can be supported.

I have been starting at the beginning with suffix-based tree searches, but I am wondering whether anyone has experience with the speed/memory tradeoffs of this simple approach versus the more complex ones being discussed at conferences.

Honestly, the initial application probably has fewer than 500 words in context, so it might not matter, but ultimately the application could expand to tens of thousands or hundreds of thousands of records, hence the question about speed vs. memory footprint.

I suppose I could start with something simple and switch over later, but I hope to understand the tradeoff earlier!

Recommended answer

Word completion suggests that you want to find all the words that start with a given prefix.

Tries are good for this, and particularly good if you're adding or removing elements: other nodes do not need to be reallocated.
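As a rough illustration of the trie idea, here is a minimal sketch in Java. The class name `PrefixTrie` and its methods `insert` and `complete` are hypothetical names chosen for this example, not anything from the question. It stores one node per character, and uses a `TreeMap` per node so that completions come back in sorted order; that choice costs memory, which is exactly the tradeoff being asked about.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration class; names are assumptions for this sketch.
class PrefixTrie {
    // One node per character on a path from the root.
    private static final class Node {
        final Map<Character, Node> children = new TreeMap<>();
        boolean isWord; // true if the path from the root to here spells a stored word
    }

    private final Node root = new Node();

    // Adding a word only creates the nodes that are missing; existing nodes are untouched.
    void insert(String word) {
        Node node = root;
        for (int i = 0; i < word.length(); i++) {
            node = node.children.computeIfAbsent(word.charAt(i), c -> new Node());
        }
        node.isWord = true;
    }

    // Returns every stored word that starts with the given prefix, in sorted order.
    List<String> complete(String prefix) {
        List<String> results = new ArrayList<>();
        Node node = root;
        for (int i = 0; i < prefix.length(); i++) {
            node = node.children.get(prefix.charAt(i));
            if (node == null) {
                return results; // no stored word has this prefix
            }
        }
        collect(node, new StringBuilder(prefix), results);
        return results;
    }

    // Depth-first walk of the subtree below the prefix node.
    private void collect(Node node, StringBuilder path, List<String> out) {
        if (node.isWord) {
            out.add(path.toString());
        }
        for (Map.Entry<Character, Node> e : node.children.entrySet()) {
            path.append(e.getKey());
            collect(e.getValue(), path, out);
            path.deleteCharAt(path.length() - 1);
        }
    }
}
```

Each node here is a separate object holding its own map, so the per-word memory overhead is noticeably higher than storing the plain strings. For a few hundred words that is unlikely to matter; for hundreds of thousands on a mobile device it may.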

If the dictionary is fairly static, and retrieval is important, consider a far simpler data structure: put your words in an ordered vector! You can binary-search to find a candidate that starts with the correct prefix, then do a linear search on each side of it to find all the other candidates.
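The sorted-vector approach can be sketched in a few lines of Java. This is a slight variant of what is described above rather than a literal transcription: it binary-searches for the position where the prefix itself would be inserted, which is the first possible match, so the linear scan only has to go in one direction. The class name `SortedWordList` and method `complete` are hypothetical, and the sketch assumes plain `String.compareTo` ordering (no locale-aware collation or case folding).

```java
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

// Hypothetical illustration class; names are assumptions for this sketch.
class SortedWordList {
    private final String[] words; // sorted, without duplicates

    SortedWordList(List<String> input) {
        // TreeSet sorts by natural String order and drops duplicate words.
        words = new TreeSet<>(input).toArray(new String[0]);
    }

    // Returns every stored word that starts with the given prefix, in sorted order.
    List<String> complete(String prefix) {
        int pos = Arrays.binarySearch(words, prefix);
        // A negative result encodes the insertion point as -(insertionPoint) - 1.
        int start = pos >= 0 ? pos : -(pos + 1);

        // All words sharing the prefix sit next to each other, starting at 'start'.
        int end = start;
        while (end < words.length && words[end].startsWith(prefix)) {
            end++;
        }
        return Arrays.asList(Arrays.copyOfRange(words, start, end));
    }
}
```

The lookup costs O(log n) comparisons plus one step per match, and the only storage beyond the strings themselves is a single array of references. The price is that inserting or deleting a word means shifting or rebuilding the array, which is why this fits a fairly static dictionary.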
