Error in Inverse Document Frequency (IDF) method in C#
Problem description
IDF is a popular measure of a word's importance. It invariably appears in a host of heuristic measures used in information retrieval. However, so far the IDF has itself been a heuristic.
Mathematically, IDF is
IDF(t, D) = log(total number of documents / number of documents containing the term)
I have developed an application for document clustering. In it I have
an IDF method like this:
private static float FindInverseDocumentFrequency(string term)
{
// DocumentVector dv = new DocumentVector();
// find the number of documents in the whole collection that contain the term
int count = documentCollection.ToArray().Where(s => r.Split(s.ToUpper()).ToArray().Contains(term.ToUpper())).Count();
/*
 * Log of the ratio of the total number of documents in the collection to the number of documents containing the term.
 * To avoid division by zero, Math.Log((float)documentCollection.Count / (1 + count)) can be used instead.
 */
return (float)Math.Log((float)documentCollection.Count / (float)count);
}
The method relies on the following declarations.
documentCollection is assigned like this:
documentCollection = collection.DocumentList[dv.content] as Hashtable;
DocumentList is declared like this:
private DocumentCollection docCollection= new DocumentCollection() { DocumentList = new Hashtable() };
s is a string, as in:
List<string> removeList = new List<string>(){"\"","\r","\n","(",")","[","]","{","}","","."," ",","};
foreach (string s in removeList)
{
distinctTerms.Remove(s);
}
r is the regular expression:
private static Regex r = new Regex("([ \\t{}()\",:;. \n])");
The IDF method produces this error:
documentCollection.ToArray() fails with
"'System.Collections.Hashtable' does not contain a definition for 'ToArray' and no extension method 'ToArray' accepting a first argument of type 'System.Collections.Hashtable' could be found (are you missing a using directive or an assembly reference?)"
Please help me solve this error.
Thank you.
Don't use non-generic collection types (except specialized ones, which are not applicable here). They were rendered obsolete as early as .NET v.2.0, when generics were introduced. Look at what you are doing: using a dynamic cast with the as operator. The whole point of generics (plus classic OOP) was to avoid it.
Use the type System.Collections.Generic.HashSet<T> instead, with the ToArray<T>() extension method:
https://msdn.microsoft.com/en-us/library/bb359438%28v=vs.110%29.aspx,
https://msdn.microsoft.com/en-us/library/bb298736(v=vs.110).aspx.
—SA
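To illustrate the advice above, here is a minimal sketch of the IDF method over a generic collection. It assumes the documents are plain strings; the class name IdfSketch and the three sample documents are hypothetical, and a List<string> is used for brevity (a HashSet<string> would work with the same LINQ calls). The regular expression is copied from the question. Returning 0 when no document contains the term is one possible way to handle the divide-by-zero case, not the only one.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class IdfSketch
{
    // Separator pattern taken verbatim from the question.
    private static readonly Regex r = new Regex("([ \\t{}()\",:;. \n])");

    // Hypothetical stand-in for the question's Hashtable-backed collection:
    // a generic list of document texts, so LINQ works without any casts.
    private static readonly List<string> documentCollection = new List<string>
    {
        "the quick brown fox",
        "the lazy dog",
        "a quick test"
    };

    public static float FindInverseDocumentFrequency(string term)
    {
        // Number of documents whose tokens include the term (case-insensitive).
        int count = documentCollection.Count(doc =>
            r.Split(doc).Contains(term, StringComparer.OrdinalIgnoreCase));

        // Assumption: a term that appears in no document gets an IDF of 0.
        if (count == 0)
        {
            return 0f;
        }

        // log(total number of documents / number of documents containing the term)
        return (float)Math.Log((double)documentCollection.Count / count);
    }

    public static void Main()
    {
        Console.WriteLine(FindInverseDocumentFrequency("quick")); // 2 of 3 documents
        Console.WriteLine(FindInverseDocumentFrequency("dog"));   // 1 of 3 documents
        Console.WriteLine(FindInverseDocumentFrequency("cat"));   // no document
    }
}
```

No ToArray() call is needed here, because List<string> already implements IEnumerable<string>. If the Hashtable cannot be replaced, calling Enumerable.Cast<string>() on its Values property would be one way to reach the LINQ methods, assuming every value stored in it is a string.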