How do I extract text from a sentence based on a condition
Please find more details below.
I have the text "1200005 # 28102016_0612 Inv.ACHExtractDetail POC.DiscountDetail Scanner Test1" and also a list of keywords:
keywords.Add(new KeywordCTIBO() { Keyword = "Scanner" });
keywords.Add(new KeywordCTIBO() { Keyword = "Inv.ACHExtractDetail" });
keywords.Add(new KeywordCTIBO() { Keyword = "POC.DiscountDetail" });
I need to extract "Inv.ACHExtractDetail" from the text, since it is present in the list. In other words, I want the first occurrence of a keyword in the text, if and only if that keyword exists in the list.
The code below works, but I am looking for something better, if possible without using loops.
List<KeywordCTIBO> keywords = new List<KeywordCTIBO>();
KeywordCTIBO matchedKeyword = null;
keywords.Add(new KeywordCTIBO() { Keyword = "Scanner" });
keywords.Add(new KeywordCTIBO() { Keyword = "Inv.ACHExtractDetail" });
keywords.Add(new KeywordCTIBO() { Keyword = "POC.DiscountDetail" });
string text = "1200005 # 28102016_0612 Inv.ACHExtractDetail POC.DiscountDetail Scanner Test1";
//var matchedKeyWords = keywords.Where(f => text.ToLower().Split(',', ' ', ';').Contains(f.Keyword.ToLower())).Select(p => new { KeyWordBO = p, Index = text.ToLower().IndexOf(p.Keyword.ToLower()) });
string[] arr = text.ToLower().Split(',', ' ', ';');
string s = string.Empty;
foreach (var r in arr)
{
    var matchedKeyWords = keywords.Where(f => r.Contains(f.Keyword.ToLower())).Select(p => new { KeyWordBO = p, Index = text.ToLower().IndexOf(p.Keyword.ToLower()) });
    foreach (var f in matchedKeyWords)
    {
        // Note: s is overwritten on every match, so after the loops it holds the last match found.
        s = f.KeyWordBO.Keyword;
    }
}

public class KeywordCTIBO
{
    public string Keyword { get; set; }
}
What I have tried:
The code shown above.
Your concept/approach is fine. If you just want to optimise it, I'd look at the following points:
- r.Contains(...) in your Linq: is Contains necessary here? Were you thinking of arr.Contains? If so, it may be worthwhile putting arr into a HashSet and using a case-insensitive comparer, rather than converting everything with ToLower() on each iteration.
- As briefly mentioned above, you are converting everything with ToLower() on each iteration. Either convert things to lower case before the Linq statement, or use case-insensitive comparers.
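To make the two points above concrete, here is a minimal sketch rather than a definitive implementation. It assumes whole-token matching was intended (the arr.Contains reading, not the substring behaviour of r.Contains), and it swaps the suggested HashSet of tokens for a case-insensitive dictionary of keywords so that the tokens can be scanned in text order. KeywordMatcher and FirstMatch are hypothetical names introduced only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class KeywordCTIBO
{
    public string Keyword { get; set; }
}

// Hypothetical helper, not part of the original post.
public static class KeywordMatcher
{
    // Returns the keyword object for the first token in the text that
    // matches a keyword (ignoring case), or null when no token matches.
    public static KeywordCTIBO FirstMatch(string text, IEnumerable<KeywordCTIBO> keywords)
    {
        // Build the lookup once with a case-insensitive comparer, so no
        // ToLower() calls are needed. Assumption: keywords are unique
        // ignoring case; ToDictionary throws on duplicate keys.
        var lookup = keywords.ToDictionary(k => k.Keyword, StringComparer.OrdinalIgnoreCase);

        return text
            .Split(new[] { ',', ' ', ';' }, StringSplitOptions.RemoveEmptyEntries)
            .Where(token => lookup.ContainsKey(token)) // keep tokens that are keywords
            .Select(token => lookup[token])            // map back to the keyword object
            .FirstOrDefault();                         // tokens are in text order
    }
}
```

Called with the text and the three keywords from the question, FirstMatch returns the entry whose Keyword is "Inv.ACHExtractDetail", because that token comes before "POC.DiscountDetail" and "Scanner" in the text. If substring matching really is required, the dictionary lookup would not apply and a different comparison would be needed.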