hunspell: Printing the line number of the corrected word


Problem description

I am trying to use hunspell to correct an essay I have written. Unfortunately, it is useless to me as long as it doesn't print the line number of each word it flags as misspelled.

So right now I am using the -a option so that I can pipe the text into hunspell. The man page says that the -L option will "Print lines with misspelled words", but I don't see any difference in the output.

This is what I do right now:

cat myessay.txt | hunspell -d en_US,de_DE -a -L

An example output looks like this:

& JavaServer 3 412: Java Server, Java-Server, Javasee

The word "JavaServer" is on line 78, and as described in the man page, it really is at an offset of 412 characters within that line.

Is there something I am missing? Is there an easy solution to this problem, or do I really have to pipe each line into hunspell separately to find out which line number a word was on?

Thanks in advance

Solution

I ended up downloading the hunspell sources and digging in. There is an undocumented switch, -u, that produces output that is comfortable to work with, e.g.

hunspell -u -d en_US,de_DE myessay.txt

This does the trick: it prints line numbers, using a German and an American English dictionary. Alternatively, you can use -U to get an excerpt of the text as well. Other available undocumented switches are -u2 and -u3.

But be careful: these switches are experimental, and the source code says that these functions lack Unicode support.
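If relying on experimental switches is a concern, the line number can probably also be recovered from the documented -a output. The sketch below assumes the ispell-style pipe behaviour: hunspell prints one banner line starting with "@", and then, for every input line, that line's result lines followed by one empty line. Counting the empty lines then gives the input line number. The function name misspellings_with_line_numbers is made up for illustration, and the sample output is hand-written in the format shown in the question, not captured from a real run.

```python
import re

# "& word count offset: sugg1, sugg2"  (misspelled, with suggestions)
# "# word offset"                      (misspelled, no suggestions)
MISS_WITH_SUGGESTIONS = re.compile(r"^& (\S+) (\d+) (\d+): (.*)$")
MISS_WITHOUT_SUGGESTIONS = re.compile(r"^# (\S+) (\d+)$")


def misspellings_with_line_numbers(pipe_output):
    """Yield (line_no, word, offset, suggestions) from `hunspell -a` output.

    Assumption: every input line is answered by its result lines plus one
    empty line, so each empty line marks the end of one input line.
    """
    line_no = 1
    for raw in pipe_output.splitlines():
        if raw.startswith("@"):
            continue  # version banner, not tied to any input line
        if raw == "":
            line_no += 1  # end of the results for the current input line
            continue
        m = MISS_WITH_SUGGESTIONS.match(raw)
        if m:
            yield line_no, m.group(1), int(m.group(3)), m.group(4).split(", ")
            continue
        m = MISS_WITHOUT_SUGGESTIONS.match(raw)
        if m:
            yield line_no, m.group(1), int(m.group(2)), []
        # Anything else (e.g. "*" for a correct word) is ignored.


# Hand-written sample: line 1 is fine, line 2 and line 3 each contain one error.
sample = (
    "@(#) International Ispell Version 3.2.06 (but really Hunspell)\n"
    "*\n"
    "\n"
    "& JavaServer 3 4: Java Server, Java-Server, Javasee\n"
    "\n"
    "# xqzt 0\n"
    "\n"
)

for line_no, word, offset, suggestions in misspellings_with_line_numbers(sample):
    print(line_no, word, offset, suggestions)
```

In real use, the text passed in would be the captured standard output of the hunspell -a command from the question. Input lines that begin with one of the pipe-mode command characters (for example * or #) may be treated as commands and not as text, which this sketch does not handle.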

From the Hungarian documentation:

  • -u Display typical errors in the file with a replacement proposal.
  • -u2 Typical errors and their fixes, in a form that can be executed with sed.
  • -U If you want to accept all the suggestions produced by the -u option, the -U switch makes Hunspell apply the replacements automatically and write the modified file to standard output. Example: hunspell -U original_file > patch_file. The replacements are also shown on standard error, similar to the -u switch.

Some output examples:

  • -u: Line 2: liveration -> liberation
  • -u2: 2s/liveration/liberation/g; # liveration
  • -u3: (null):2: Locate: liveration | Try: liberation
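Because the -u report is line-oriented, it is easy to post-process. The following is a minimal sketch that assumes every report line looks exactly like the single -u example above ("Line N: wrong -> right"); the real output may contain other line shapes, which are skipped here. The name parse_u_output is hypothetical.

```python
import re

# Matches lines such as "Line 2: liveration -> liberation".
U_LINE = re.compile(r"^Line (\d+): (.+?) -> (.+)$")


def parse_u_output(text):
    """Return a list of (line_no, misspelled, suggestion) tuples."""
    corrections = []
    for raw in text.splitlines():
        m = U_LINE.match(raw.strip())
        if m:
            corrections.append((int(m.group(1)), m.group(2), m.group(3)))
    return corrections


report = "Line 2: liveration -> liberation\n"
for line_no, wrong, right in parse_u_output(report):
    print(f"{line_no}: {wrong} => {right}")
```

Keeping the line number as an integer makes it simple to sort the corrections or to jump to the matching line in an editor.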
