Regex: unexpected results for Unicode text?

Views: 123 | Posted: 2019/6/15 8:50:05 | Tag: regexp


Problem description

Hello,

I am trying to extract some information from a text file that contains Arabic and English strings, nested as follows:

=================================

XXXX: some Name here YYYYY:and some text here

ZZZZ:01234567890

XXXXXX:some extra text here also

=================================

Here XXXX, YYYYY, ZZZZ, and XXXXXX are Arabic words for Name, Address, and so on.

What I need is the information after the colon (:) of each field name; that information may itself be in Arabic or in English.

I have the code below, which works as needed, but only for text files that are entirely in English:


using System.IO;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical wrapper class so that the method compiles on its own.
public static class CustomerParser
{
    // Filled in by REGEXX; in the original program these are declared elsewhere.
    public static string CustomerName, CustomerID, CustomerAddress;

    public static void REGEXX(string file)
    {
        // Read the whole file as Windows-1256 (Arabic). On .NET Core / .NET 5+
        // this code page is only available after calling
        // Encoding.RegisterProvider(CodePagesEncodingProvider.Instance).
        string str = File.ReadAllText(file, Encoding.GetEncoding(1256));

        // Each field is captured lazily, up to the next (English) label.
        var re = new Regex(
            @"\n?Name:\s*(?<name>.+?)\n.+?ID:\s*(?<id>.+?)\n.+?Address:\s*"
            + @"(?<addr>.+?)Notes:",
            RegexOptions.IgnoreCase
            | RegexOptions.Singleline
            | RegexOptions.Compiled);

        //re.Options = RegexOptions.RightToLeft;

        var m = re.Match(str);
        if (m.Success)
        {
            CustomerName = m.Groups["name"].Value;
            CustomerID = m.Groups["id"].Value;
            CustomerAddress = m.Groups["addr"].Value;
        }
    }
}
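
One possible direction, offered only as a sketch: instead of hard-coding the English labels, match any label written in any script by using the Unicode categories \p{L} (letters) and \p{M} (combining marks), and capture whatever follows the colon. The class name FieldExtractor, the method Extract, and the three Arabic labels in Main are hypothetical stand-ins for XXXX, YYYYY and ZZZZ, not names from the original program. The sketch also assumes that each label is a single word directly followed by a colon, and that no value contains a "word:" sequence of its own.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class FieldExtractor
{
    // A label is a run of letters or combining marks in any script
    // (\p{L} and \p{M} cover Arabic as well as Latin), followed by a colon.
    // The value is matched lazily and stops before the next "label:",
    // before a line break, or at the end of the input.
    private static readonly Regex FieldPattern = new Regex(
        @"(?<label>[\p{L}\p{M}]+)[ \t]*:[ \t]*(?<value>[^\r\n]*?)" +
        @"(?=[ \t]*(?:[\p{L}\p{M}]+[ \t]*:|\r?\n|\z))");

    // Returns a label -> value map; a repeated label keeps its last value.
    public static Dictionary<string, string> Extract(string text)
    {
        var fields = new Dictionary<string, string>();
        foreach (Match m in FieldPattern.Matches(text))
        {
            fields[m.Groups["label"].Value] = m.Groups["value"].Value;
        }
        return fields;
    }

    public static void Main()
    {
        // Hypothetical Arabic labels, written as escapes to avoid
        // right-to-left display issues in the source file.
        string name = "\u0627\u0644\u0627\u0633\u0645";                // "name"
        string address = "\u0627\u0644\u0639\u0646\u0648\u0627\u0646"; // "address"
        string number = "\u0627\u0644\u0631\u0642\u0645";              // "number"

        string sample =
            name + ": some Name here " + address + ":and some text here\r\n" +
            number + ":01234567890\r\n";

        var fields = Extract(sample);
        Console.WriteLine(fields[name]);     // prints "some Name here"
        Console.WriteLine(fields[address]);  // prints "and some text here"
        Console.WriteLine(fields[number]);   // prints "01234567890"
    }
}
```

Because the pattern never names a specific label, the same code handles all-English files, all-Arabic files, and mixed ones. If the file really is stored as Windows-1256, it still has to be decoded with that encoding before the text reaches Extract.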
