How to Get Shortest/Longest Posting Lists


Question

I wrote the class InvertedIndexTable shown here:

public interface IInvertedIndex
{
    int IndexSize(string path);
    void Load(string path);
}
class InvertedIndexTable : IInvertedIndex
{
     Dictionary<string, List<string>> index = new Dictionary<string, List<string>>();
     CreateMatrix r = new CreateMatrix(); // an object of another class contains stopwords{A,AN,...}
                                          // and also contains RemoveStopword() method
     public HashSet<string> DistincTerms = new HashSet<string>();
     public List<string> filesCollection = new List<string>();
     public int IndexSize(string pa)
     {
         Load(pa);
         return index.Count;
     }
     public void Load(string path)
      {
          string[] filePaths = Directory.GetFiles(Path.GetFullPath(path));
          foreach (string file in filePaths)
          {
              string contents = File.ReadAllText(file);
              contents = RemoveNonAlphaChars(contents);
              String[] tokensCollection = r.RemoveStopsWords(contents.ToUpper().Split(' '));
              foreach (string token in tokensCollection)
              {
                  if (!r.booleanOperator.Contains(token) && !DistincTerms.Contains(token))
                  {
                      DistincTerms.Add(token);
                  }
              }
          }
          Frequenty(filePaths);
      }
     public void Frequenty(string[] path1)
      {
        foreach (string d in DistincTerms)
        {
            foreach (string f in path1)
            {
                if (File.ReadAllText(f).Contains(d))
                {
                    filesCollection.Add(f);
                }

            }
            index.Add(d, filesCollection);
          }
      }
     private string RemoveNonAlphaChars(string content)
      {
          StringBuilder sb = new StringBuilder();

          foreach (char c in content.ToCharArray())
          {
              if (char.IsLetter(c) || char.IsSeparator(c))
              {
                  sb.Append(c);
              }
          }
          return sb.ToString();
      }
     public  string GetSmallestPosting(string p)
      {
          List<int> numbers = new List<int>();
          if (index != null)
          {
              foreach( KeyValuePair<string,List<string>> i in index)
              {
                  string content= i.Value.ToString();
                  String[] itemsList = content.ToUpper().Split(' ');
                  numbers.Add(itemsList.Length); 
              }

              return numbers.Min().ToString();
          }
          return null;
      }
     public string GetLongestPosting(string p)
      {
          List<int> numbers = new List<int>();
          if (index != null)
          {
              foreach (KeyValuePair<string, List<string>> i in index)
              {

                  string content = i.Value.ToString();
                  String[] itemsList = content.ToUpper().Split(' ');
                  numbers.Add(itemsList.Count());
              }
              return numbers.Max().ToString(); 
          }
          return null;
      }
}

I am setting up button6 to show the shortest and longest posting lists of InvertedIndexTable, along with the number of key/value pairs in the Dictionary<string,List<string>> index. It runs without any errors or exceptions, but there is a problem: the value returned for DictionaryPairsNumbers is correct, while the values returned for MinSizePosting and MaxSizePosting are wrong. The code always returns "1" for both of them. Why? What is wrong?

Here is the code I wrote for button6:

    InvertedIndexTable i = new InvertedIndexTable();
    private void button5_Click(object sender, EventArgs e)
    {
        MessageBox.Show("DictionaryPairsNumbers: " + i.IndexSize(textBox1.Text)
            + "\n\rMaxSizePosting: " + i.GetLongestPosting(textBox1.Text)
            + "\n\rMinSizePosting: " + i.GetSmallestPosting(textBox1.Text));
    }

Please let me know if there is a way to get the result I expect. What I need is the size of the shortest and the longest List<string> in the Dictionary index. I thought I had written correct code for the GetSmallestPosting() and GetLongestPosting() methods, but it seems I was wrong. What is wrong with these two methods? Why do they always return the same value, and why is that value always "1"?

By the way, GetSmallestPosting() is meant to find the shortest List<string> in the Dictionary<string,List<string>> index, and GetLongestPosting() is meant to find the longest one.

Thank you for your time.

Recommended Answer

You can use LINQ to do this.

Add two new methods to your InvertedIndexTable class.

Min goes through all key/value pairs in your dictionary (the value being the list) and finds the smallest Count of items; the method then returns the first list with that Count. Max does the exact opposite.

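// Requires: using System.Linq;
// Returns a copy of the shortest posting list, or null if the index is unset or empty.
public List<string> GetSmallestPosting()
{
    if (index == null || index.Count == 0)
        return null;

    // Compute the minimum once rather than once per candidate list.
    int min = index.Values.Min(v => v.Count);
    return index.Values.First(v => v.Count == min).ToList();
}

// Returns a copy of the longest posting list, or null if the index is unset or empty.
public List<string> GetLongestPosting()
{
    if (index == null || index.Count == 0)
        return null;

    // Compute the maximum once rather than once per candidate list.
    int max = index.Values.Max(v => v.Count);
    return index.Values.First(v => v.Count == max).ToList();
}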
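
As for why the methods in the question always return "1": calling ToString() on a List<string> returns the type name (System.Collections.Generic.List`1[System.String]), not the list's contents. That string contains no spaces, so Split(' ') always produces exactly one element. Use the list's Count instead. Separately, Frequenty() adds the same filesCollection instance for every term, so every dictionary entry shares one list, and the shortest and longest lists would still be identical even with Count. The following is a minimal sketch of both points, not the poster's code: PostingListDemo, CountViaToString, and BuildIndex are hypothetical names, and in-memory strings stand in for the files.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PostingListDemo
{
    // Mimics the approach from the question: ToString() on a List<string>
    // yields the type name, which has no spaces, so Split(' ') gives one element.
    public static int CountViaToString(List<string> posting)
    {
        return posting.ToString().Split(' ').Length;
    }

    // Hypothetical helper: maps each term to the documents containing it,
    // creating a NEW list per term instead of sharing one list.
    public static Dictionary<string, List<string>> BuildIndex(
        IDictionary<string, string> documents, IEnumerable<string> terms)
    {
        var index = new Dictionary<string, List<string>>();
        foreach (string term in terms)
        {
            var posting = new List<string>(); // fresh list for each term
            foreach (KeyValuePair<string, string> doc in documents)
            {
                // Substring match, as in the question's Frequenty() method.
                if (doc.Value.Contains(term))
                    posting.Add(doc.Key);
            }
            index[term] = posting;
        }
        return index;
    }

    public static void Main()
    {
        // Assumed sample data: document name -> upper-cased contents.
        var documents = new Dictionary<string, string>
        {
            { "a.txt", "CAT DOG" },
            { "b.txt", "CAT BIRD" },
            { "c.txt", "CAT" },
        };
        var index = BuildIndex(documents, new[] { "CAT", "DOG", "BIRD" });

        Console.WriteLine(CountViaToString(index["CAT"]));  // 1, whatever the list holds
        Console.WriteLine(index.Values.Min(v => v.Count));   // 1 (DOG or BIRD)
        Console.WriteLine(index.Values.Max(v => v.Count));   // 3 (CAT)
    }
}
```

Note also that the question's Load() upper-cases the tokens while Frequenty() searches the raw file text, so with mixed-case files the case-sensitive Contains() call may miss matches; upper-casing the file contents before the check is one way to keep the two consistent.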
