How do I extract text from a sentence based on a condition


Problem description

Please see the details below.

I have a text, "1200005 # 28102016_0612 Inv.ACHExtractDetail POC.DiscountDetail Scanner Test1", and also a list of keywords:

    keywords.Add(new KeywordCTIBO() { Keyword = "Scanner" });
    keywords.Add(new KeywordCTIBO() { Keyword = "Inv.ACHExtractDetail" });
    keywords.Add(new KeywordCTIBO() { Keyword = "POC.DiscountDetail" });

I need to extract "Inv.ACHExtractDetail" from the text because it is present in the list. In other words, I want the first keyword that occurs in the text, if and only if it exists in the list.

The code below works, but I am looking for something better, if possible without using loops.

    using System.Collections.Generic;
    using System.Linq;

    List<KeywordCTIBO> keywords = new List<KeywordCTIBO>();
    keywords.Add(new KeywordCTIBO() { Keyword = "Scanner" });
    keywords.Add(new KeywordCTIBO() { Keyword = "Inv.ACHExtractDetail" });
    keywords.Add(new KeywordCTIBO() { Keyword = "POC.DiscountDetail" });

    string text = "1200005 # 28102016_0612 Inv.ACHExtractDetail POC.DiscountDetail Scanner Test1";
    //var matchedKeyWords = keywords.Where(f => text.ToLower().Split(',', ' ', ';').Contains(f.Keyword.ToLower())).Select(p => new { KeyWordBO = p, Index = text.ToLower().IndexOf(p.Keyword.ToLower()) });

    // Split the lower-cased text into tokens on commas, spaces, and semicolons.
    string[] arr = text.ToLower().Split(',', ' ', ';');
    string s = string.Empty;
    foreach (var r in arr)
    {
        // Keywords whose lower-cased form occurs inside the current token.
        var matchedKeyWords = keywords
            .Where(f => r.Contains(f.Keyword.ToLower()))
            .Select(p => new { KeyWordBO = p, Index = text.ToLower().IndexOf(p.Keyword.ToLower()) });
        foreach (var f in matchedKeyWords)
        {
            s = f.KeyWordBO.Keyword;
        }
        // Stop at the first token that matched, so that s keeps the first keyword found in the text.
        if (s.Length > 0)
        {
            break;
        }
    }

    public class KeywordCTIBO
    {
        public string Keyword { get; set; }
    }




Solution

Your concepts/methods are good. If you just want optimisation, I'd look at the following points:

  • r.Contains(...) in your Linq: is Contains necessary? Were you thinking of arr.Contains? If so, it may be worthwhile to put arr into a HashSet and use a case-insensitive comparer, rather than converting everything with ToLower on each iteration.
  • As briefly mentioned above, you are converting everything with ToLower() on each iteration. Either convert things to lower case once, before the Linq statement, or use case-insensitive comparers.
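
The two points above could be combined roughly as follows. This is only a sketch under some assumptions: it treats a match as a whole token being equal to a keyword (the arr.Contains reading), not a substring match, and it puts the keywords rather than arr into the HashSet so that the tokens can still be scanned in text order. KeywordFinder and FindFirstKeyword are hypothetical names introduced here for illustration; they are not part of the original code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class KeywordCTIBO
{
    public string Keyword { get; set; }
}

// Hypothetical helper, not part of the original post.
public static class KeywordFinder
{
    private static readonly char[] Separators = { ',', ' ', ';' };

    // Returns the first token of the text that equals one of the keywords
    // (ignoring case), or null when no token matches.
    // Note: the returned string is the token as it is written in the text.
    public static string FindFirstKeyword(string text, IEnumerable<KeywordCTIBO> keywords)
    {
        // Build the lookup once; the case-insensitive comparer replaces
        // the repeated ToLower() calls.
        var keywordSet = new HashSet<string>(
            keywords.Select(k => k.Keyword),
            StringComparer.OrdinalIgnoreCase);

        // Scan the tokens in text order and stop at the first one found in the set.
        return text
            .Split(Separators, StringSplitOptions.RemoveEmptyEntries)
            .FirstOrDefault(token => keywordSet.Contains(token));
    }
}
```

Called with the text and the keyword list from the question, KeywordFinder.FindFirstKeyword(text, keywords) returns "Inv.ACHExtractDetail". The loop is still there, but it is hidden inside FirstOrDefault, which stops enumerating at the first match.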

