Finding words that contain non-English characters


Problem description



Hi, I'm kinda a "not-so-good-programmer" so please keep the answers understandable.

So, I have a text file like this:

FRITZ-ULLMANN-STRAáE ÿ 9
WIESBADEN55252

So, I want to find the words that contain special characters or letters from another language. In line 1, the word STRAáE ÿ contains letters that are not part of the English alphabet, so that is the word I want as my output.

Thanks in advance,

Nandhini K


Solution

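Something like this might be a good start, although it does not fully meet your requirements: it counts each character in the text and prints its numeric code, so the non-English letters stand out by their codes above 127.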

// Namespaces needed by the snippet.
using System;
using System.Linq;
using System.Text;

string items = "FRITZ-ULLMANN-STRAáE ÿ 9 WIESBADEN552";

// Round-trip the text through ISO-8859-1 (Latin-1).
// Characters that Latin-1 cannot represent would come back as '?'.
Encoding iso = Encoding.GetEncoding("ISO-8859-1");
Encoding utf8 = Encoding.UTF8;
byte[] utfBytes = utf8.GetBytes(items);
byte[] isoBytes = Encoding.Convert(utf8, iso, utfBytes);
string itemsTranslated = iso.GetString(isoBytes);

// Group identical characters, then record each one's count and numeric code.
var characterGroup = (
    from chr in itemsTranslated.ToCharArray()
    group chr by chr into grp
    select new
    {
        Letter = grp.Key,
        Occurrences = grp.Count(),
        Code = (int)grp.Key
    })
    .ToList()
    .OrderBy((item) => item.Letter.ToString());

var results = (from item in characterGroup select item);

// Build one output line per distinct character: letter, occurrences, code.
StringBuilder sb = new StringBuilder();

foreach (var item in results)
{
    sb.AppendLine(string.Format("{0} {1} {2}",
        item.Letter,
        item.Occurrences,
        item.Code));
}

Console.WriteLine(sb.ToString());

Results (character, occurrences, character code)

- 2 45
  3 32
2 1 50
5 2 53
9 1 57
A 3 65
á 1 225
B 1 66
D 1 68
E 3 69
F 1 70
I 2 73
L 2 76
M 1 77
N 3 78
R 2 82
S 2 83
T 2 84
U 1 85
W 1 87
ÿ 1 255
Z 1 90
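
The listing above only counts characters. To get from there to the words the question asks for, one possible approach is sketched below. It is not part of the original answer and rests on two assumptions the question does not state: that a "word" is any run of characters separated by whitespace or hyphens, and that a "special" character is anything outside 7-bit ASCII (code above 127). The class name NonAsciiWordFinder and the method Find are hypothetical names chosen for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class NonAsciiWordFinder
{
    // Hypothetical helper: returns the words that contain at least one
    // character outside 7-bit ASCII.
    public static List<string> Find(string text)
    {
        // Assumption: words are separated by whitespace or hyphens.
        char[] separators = { ' ', '\t', '\r', '\n', '-' };

        return text
            .Split(separators, StringSplitOptions.RemoveEmptyEntries)
            // Assumption: "special" means any character code above 127.
            .Where(word => word.Any(c => c > 127))
            .ToList();
    }

    public static void Main()
    {
        // Same sample string as in the answer above.
        foreach (string word in Find("FRITZ-ULLMANN-STRAáE ÿ 9 WIESBADEN552"))
        {
            Console.WriteLine(word);
        }
    }
}
```

For the sample string, Find returns two entries, STRAáE and ÿ; the second is reported separately because it stands alone between spaces in that string. When the text comes from a file, the result also depends on reading the file with the encoding it was actually saved in, which the question does not specify.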

 

