How do I extract certain words from a sentence?


Problem description

I am looking for the best method to extract certain words from a sentence.

I have a sentence such as "TestTest - CustID : 1200005#14102016_0412- ARF", or it may be "TestTest - CustID:1200005#14102016_0412- ARF".

The 'CustID : 1200005#' part may appear with or without spaces. I need to extract '1200005' from it.

string subject = "TestTest - CustID : 1200005#14102016_0412- ARF"

if (subject.Trim().Contains(CustID:))
{
    customerId = ExtractFromString(mailSubject, "CustID:","#");
}
else if (subject.Trim().Contains(CustID:))
{
    customerId = ExtractFromString(mailSubject, "CustID:","#");
}
}
 
private static List<string> ExtractFromString(string text, string startString, string endString)
{
    // Collects every substring found between startString and endString.
    List<string> matched = new List<string>();
    while (true)
    {
        int indexStart = text.IndexOf(startString);
        if (indexStart == -1)
            break;

        // Search for the end marker only after the start marker, so an
        // endString that occurs earlier cannot yield a negative length.
        int valueStart = indexStart + startString.Length;
        int indexEnd = text.IndexOf(endString, valueStart);
        if (indexEnd == -1)
            break;

        matched.Add(text.Substring(valueStart, indexEnd - valueStart));
        // Continue with the text that follows the end marker.
        text = text.Substring(indexEnd + endString.Length);
    }
    return matched;
}







This works, but I am looking for a better option that handles both 'CustID : 1200005#' and 'CustID:1200005#'.

What I have tried:

The code I tried is included with this question.


string subject = "TestTest - CustID : 1200005#14102016_0412- ARF"
if (subject.Trim().Contains("CustID"))
{
    customerId = ExtractFromString(mailSubject, "CustID:","#");
}
else if (subject.Trim().Contains(CustID:))
{
    customerId = ExtractFromString(mailSubject, "CustID:","#");
}
}


Recommended answer

You're over-engineering it.

string text = "TestTest - CustID : 1200005#14102016_0412- ARF";
string[] parts = text.Split(':');
string[] parts2 = parts[1].Split('#');
string result = parts2[0].Trim();





Or even:

string text = "TestTest - CustID : 1200005#14102016_0412- ARF";
string result = text.Split(':')[1].Split('#')[0].Trim();





If you need some error trapping in there, you should probably use the first version.
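
As a rough illustration of what that error trapping could look like, here is a guarded variant of the first version. It is only a sketch: the class name CustIdSplitter and the method TryGetCustomerId are hypothetical names introduced for this example, and it assumes the ID always sits between the first ':' and the following '#'.

```csharp
using System;

public static class CustIdSplitter
{
    // Hypothetical helper (not from the original answer): returns true and
    // the trimmed ID when the text contains ':' followed by '#'.
    public static bool TryGetCustomerId(string text, out string customerId)
    {
        customerId = string.Empty;
        if (string.IsNullOrEmpty(text))
            return false;

        // Same first step as the answer: split on ':'.
        string[] parts = text.Split(':');
        if (parts.Length < 2)
            return false;               // no ':' in the text

        // Same second step: split the piece after the first ':' on '#'.
        string[] parts2 = parts[1].Split('#');
        if (parts2.Length < 2)
            return false;               // no '#' after the ':'

        string candidate = parts2[0].Trim();
        if (candidate.Length == 0)
            return false;               // nothing between ':' and '#'

        customerId = candidate;
        return true;
    }
}
```

Because each intermediate array is kept in a variable, its length can be checked before indexing, which the one-line version cannot do without throwing an IndexOutOfRangeException on malformed input.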


Quote:

This is working

Are you sure?

string subject = "TestTest - CustID : 1200005#14102016_0412- ARF"
if (subject.Trim().Contains("CustID"))
{
    if (subject.Trim().Contains(CustomerID:))
    {
        customerId = ExtractFromString(mailSubject, "CustomerID:","#");
    }
    else if (subject.Trim().Contains(CustID:))
    {
        customerId = ExtractFromString(mailSubject, "CustomerID:","#");
    }
}



I see 2 syntax errors, 3 misuses and at least 4 errors of logic.
From the code, I suspect the example is not complete.
I would expect 2 main examples, with optional spaces.

It stings my eyes.

So what is the question, and what is the problem?

[Update]
Looking at your new code:

if (subject.Trim().Contains("CustID"))
    {
        customerId = ExtractFromString(mailSubject, "CustID:","#");
    }
    else if (subject.Trim().Contains(CustID:))
    {
        customerId = ExtractFromString(mailSubject, "CustID:","#");
    }
}



You do not understand what this code does. It can't be yours; it does not even compile!
Advice: learn C# properly.


Solution 1 by John Simmons / outlaw programmer is very good. As an alternative, you can use the Regex class, which finds text that matches a pattern.

I'd suggest using the Regex.Match method:

string subject = "TestTest - CustID : 1200005#14102016_0412- ARF";
string pattern = @"\d{1,}#";

Match m = Regex.Match(subject, pattern, RegexOptions.Singleline);
if(m.Success)
	Console.WriteLine(m.Value.Replace("#",""));





Note: change the pattern to suit your needs.

For details, see:
Regular Expression Language - Quick Reference
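
As one way the pattern might be adapted to the question, the sketch below anchors the match on the "CustID" label, allows optional whitespace around the ':' and before the '#', and reads the digits from a capture group instead of stripping the '#' afterwards. The class name CustIdRegex and the method ExtractCustomerId are hypothetical names used only for this example, and the pattern assumes the ID consists of digits only.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CustIdRegex
{
    // "CustID", optional whitespace, ':', optional whitespace,
    // one or more digits (captured as group 1), optional whitespace, '#'.
    private static readonly Regex Pattern =
        new Regex(@"CustID\s*:\s*(\d+)\s*#");

    // Hypothetical helper: returns the captured digits,
    // or an empty string when the subject does not match.
    public static string ExtractCustomerId(string subject)
    {
        Match m = Pattern.Match(subject ?? string.Empty);
        return m.Success ? m.Groups[1].Value : string.Empty;
    }
}
```

Anchoring on the label means an unrelated run of digits followed by '#' elsewhere in the subject is not picked up by mistake.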

