Special characters in the input of hunspell are treated as spaces

Question

This question was first asked on Super User, but it got only 8 views in 7 days. People who know Hunspell tend to be on Stack Overflow, so I am asking it again here.

I am testing hunspell on the command line with a Swedish dictionary. In interactive mode, all special characters in the input (for example å, ä, ö) are replaced with blanks before spell checking.

Hunspell 1.3.2
sjögräs
& sj 15 0: SJ, aj, dj, sk, s, j, sej, sju, sjö, sjå, sa, se, ej, st, si
& gr 15 3: ge, g, r, ger, gir, gro, gör, grå, går, gry, er, nr, dr, go, kr
*

sj gr s
& sj 15 0: SJ, aj, dj, sk, s, j, sej, sju, sjö, sjå, sa, se, ej, st, si
& gr 15 3: ge, g, r, ger, gir, gro, gör, grå, går, gry, er, nr, dr, go, kr
*

As you can see, the prompt's encoding works: å, ä and ö are displayed correctly in both the input and the output.

Piping gives the same result:

echo sjögräs | hunspell -d sv_SE

I have tried passing different options to hunspell, including -i UTF-8 and -i UTF-16, while keeping the aff file's SET ISO8859-1. Nothing worked.

The same happens with French:

C:\Users\gauthier>echo résultat | hunspell -d fr-moderne
Hunspell 1.3.2
*
& sultat 2 2: sultan, rAcsultat

The output has other problems as well.

I compiled hunspell with MinGW and moved the resulting files I needed to a directory on my path, but I don't think this is very relevant.

How do I make hunspell recognize special characters in its input?

Recommended answer

By echoing the variables $LC_ALL or $LANG, you can see which language and locale configuration your terminal has.

Then you can try to change it to the charset hunspell expects by redefining those variables. For example, you can set:

LC_ALL=en_US.ISO8859-15

LANG=ca_ES.cp1252
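
As a minimal sketch of these steps, assuming a POSIX-style shell (for example the MSYS shell that ships with MinGW) and that the chosen locale is available on the system. The helper name with_locale is made up for illustration.

```shell
# Print the current locale settings; "(unset)" means the variable is not defined.
echo "LC_ALL=${LC_ALL:-(unset)}"
echo "LANG=${LANG:-(unset)}"

# Hypothetical helper: run a single command with LC_ALL overridden,
# leaving the settings of the current shell untouched.
with_locale() {
    locale_name=$1
    shift
    env LC_ALL="$locale_name" "$@"
}

# The child process sees the overridden value.
with_locale en_US.ISO8859-15 sh -c 'echo "child sees LC_ALL=$LC_ALL"'
```

With hunspell installed, the same helper could be used as with_locale en_US.ISO8859-15 hunspell -d sv_SE. Whether this is enough on a MinGW build is not confirmed here.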

As I recall, the default character set is latin1, but I'm not sure (I'm not on Linux right now).
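
The charset matters because the same letter is a different byte sequence in each encoding. The sketch below prints the bytes of ö in UTF-8 and in latin1 (ISO-8859-1), the encoding named by the aff file's SET line. It assumes iconv and od are available, and the helper name hex is made up for illustration.

```shell
# Hypothetical helper: print stdin as a compact hex string.
hex() {
    od -An -tx1 | tr -d ' \n'
}

# "ö" written as UTF-8 bytes with octal escapes, so the result
# does not depend on the encoding of this script file.
printf '\303\266' | hex; echo                                      # c3b6 (two bytes)

# The same letter converted to latin1 is a single byte.
printf '\303\266' | iconv -f UTF-8 -t ISO-8859-1 | hex; echo       # f6
```

A program that reads UTF-8 input as latin1 therefore sees two unrelated characters where ö was typed. That is one possible explanation for the French output above, though the source does not confirm it.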

Try this approach instead of modifying the hunspell software.
