Easiest Way to Sort a List of Words by Occurrence
Question
What is the best/easiest way in Java to sort a large list of words (10,000-20,000) by the number of times each one occurs in the list? I tried a basic implementation, but I get an out-of-memory runtime error, so I need a more efficient approach. What would you suggest?
ArrayList<String> occuringWords = new ArrayList<String>();
ArrayList<Integer> numberOccur = new ArrayList<Integer>();
String temp;
int count;
for(int i = 0; i < finalWords.size(); i++){
    temp = finalWords.get(i);
    count = 0;
    for(int j = 0; j < finalWords.size(); j++){
        if(temp.equals(finalWords.get(j))){
            count++;
            finalWords.remove(j);
            j--;
        }
    }
    if(numberOccur.size() == 0){
        numberOccur.add(count);
        occuringWords.add(temp);
    }else{
        for(int j = 0; j < numberOccur.size(); j++){
            if(count>numberOccur.get(j)){
                numberOccur.add(j, count);
                occuringWords.add(j, temp);
            }
        }
    }
}
Here finalWords is the list of all of the Strings. I had to store the number of times each word occurred in a separate ArrayList because I couldn't think of a better way to keep them paired without making each word a separate object.
Recommended answer
The Multiset from Google Collections (now part of Guava) is what you are looking for. That data structure is built to support exactly this use case. All you need to do is populate it with your words, and it will maintain the frequencies for you.
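For readers who cannot add Guava, the same counting idea can be sketched with the standard library alone: a HashMap keeps each word paired with its count, and the entries are then sorted by count. This is only an illustration, not the answerer's code. The class name WordFrequency, the helper sortByFrequency, the sample words, and the alphabetical tie-break are all assumptions made for the example. It also sidesteps what appears to be the cause of the out-of-memory error in the question: the insertion loop there never breaks after inserting, so the same count keeps being inserted ahead of the smaller value and the list grows without bound.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class WordFrequency {

    // Hypothetical helper: returns (word, count) pairs, most frequent first.
    static List<Map.Entry<String, Integer>> sortByFrequency(List<String> words) {
        // Single pass over the input: each word maps to its running count.
        Map<String, Integer> counts = new HashMap<>();
        for (String word : words) {
            counts.merge(word, 1, Integer::sum);
        }

        // Copy the entries into a list so they can be sorted.
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((a, b) -> {
            // Higher counts first; ties broken alphabetically (an assumption).
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        return entries;
    }

    public static void main(String[] args) {
        // Sample data standing in for the question's finalWords list.
        List<String> finalWords =
                Arrays.asList("pear", "apple", "pear", "fig", "apple", "pear");
        for (Map.Entry<String, Integer> e : sortByFrequency(finalWords)) {
            System.out.println(e.getKey() + " " + e.getValue());
        }
        // prints:
        // pear 3
        // apple 2
        // fig 1
    }
}
```

Counting takes one pass over the list and sorting is done once over the distinct words, so 10,000-20,000 words should be handled comfortably. With Guava on the classpath, HashMultiset.create(finalWords) followed by Multisets.copyHighestCountFirst(...) should give a comparable result with less code.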