Sorting vector of strings with leading numbers


Problem description

I'm working on a homework problem which requires me to read in words from an input file, and an integer k. The solution needs to print out a list of words and their frequencies, ranging from the most frequent to the k-th most frequent. If the number of unique words is smaller than k then only output that number of words.

This would have been a piece of cake with containers like map, but the problem restricts me to vectors and strings; no other STL containers are allowed.

I'm stuck at the point where I have a list of all the words in a file and their corresponding frequencies. Now I need to sort them according to their frequencies and output k words.

The problem is that sorting is difficult. The frequencies can have different numbers of digits. If I sort the strings with sort() after padding the counts with zeros, I won't know how many zeros to pad, since the input is not known in advance.

Here's my code for the function:

void word_frequencies(ifstream& inf, int k)
{
    vector <string> input;
    string w;
    while (inf >> w)
    {
        remove_punc(w);
        input.push_back(w);
    }
    sort(input.begin(), input.end());

    // initialize frequency vector
    vector <int> freq;
    for (size_t i = 0; i < input.size(); ++i) freq.push_back(1);

    // count actual frequencies
    int count = 0;
    for (size_t i = 0; i < input.size()-1; ++i)
    {
        if (input[i] == input[i+1])
        {
            ++count;
        } else
        {
            freq[i] += count;
            count = 0;
        }
    }

    // words+frequencies
    vector <string> wf;
    for (size_t i = 0; i < freq.size()-1; ++i)
    {

        if (freq[i] > 1 || is_unique(input, input[i]))
        {
            string s = to_string(freq[i]) + " " + input[i];
            wf.push_back(s);
        }
    }
}

Also, should I even couple the frequency with the word in the first place? I know this is messy so I'm looking for a more elegant solution.

Thanks!

Solution

If I understand you correctly, your problem is that you want to sort your frequency vector, but then you lose track of the corresponding words. As suggested, using a struct with a custom comparison function is probably desirable:

struct word_freq {
    int freq;
    std::string word;
};

bool operator<(const word_freq& a, const word_freq& b) {
    return a.freq < b.freq;
}

Now, given a std::vector<word_freq> wf, calling std::sort(wf.begin(), wf.end()) should order the entries from lowest to highest frequency. To print the k words with the highest frequency, you would print from the back of wf.
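To make this concrete, here is a minimal sketch of how the pieces might fit together using only vectors and strings. The helper names count_words and top_k are hypothetical (they are not from the question), and the sketch assumes the words have already been read and had their punctuation stripped. Because the frequency stays an int inside the struct, no zero padding is needed.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

struct word_freq {
    int freq;
    std::string word;
};

// Same ordering as above: the entry with the lower frequency sorts first.
bool operator<(const word_freq& a, const word_freq& b) {
    return a.freq < b.freq;
}

// Hypothetical helper: sort the words so equal words are adjacent, then
// collapse each run of equal words into one (count, word) entry.
std::vector<word_freq> count_words(std::vector<std::string> words) {
    std::sort(words.begin(), words.end());
    std::vector<word_freq> wf;
    for (std::size_t i = 0; i < words.size(); ) {
        std::size_t j = i;
        // Advance j to one past the end of the run that starts at i.
        while (j < words.size() && words[j] == words[i]) ++j;
        wf.push_back(word_freq{static_cast<int>(j - i), words[i]});
        i = j;  // the final run is handled like every other run
    }
    return wf;
}

// Hypothetical helper: sort ascending by frequency, then walk from the back
// to collect at most k entries, most frequent first.
std::vector<word_freq> top_k(std::vector<word_freq> wf, std::size_t k) {
    std::sort(wf.begin(), wf.end());  // uses operator< defined above
    std::vector<word_freq> out;
    for (auto it = wf.rbegin(); it != wf.rend() && out.size() < k; ++it) {
        out.push_back(*it);
    }
    return out;
}
```

For the words the, cat, the, dog, the, cat and k = 2, top_k(count_words(words), 2) yields the entries (3, the) and (2, cat). If k exceeds the number of unique words, the loop simply stops at the front of the vector, so only the available entries are returned. Note that std::sort does not guarantee any particular order among words with equal frequency.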
