Determine the difficulty of an English word
Question
I am working on a word-based game. My word database contains around 10,000 English words (sorted alphabetically). I am planning to have 5 difficulty levels in the game. Level 1 shows the easiest words and Level 5 shows the most difficult words, relatively speaking.
I need to divide the 10,000-word list into 5 levels, from the easiest words to the most difficult ones. I am looking for a program to do this for me.
Can anyone tell me whether there is an algorithm or method to quantitatively measure the difficulty of an English word?
I have some thoughts about using "word length" and "word frequency" as factors and coming up with a formula or something similar that accomplishes this.
Recommended answer
Get a large corpus of texts (e.g. from the Gutenberg archives), do a straight frequency analysis, and eyeball the results. If they don't look satisfying, weight each text by its Flesch-Kincaid score and run the analysis again: words that show up frequently, but in "difficult" texts, will get a score boost, which is what you want.
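As a rough illustration of the two passes described above, here is a minimal Python sketch. It is not a definitive implementation: the function names (`count_syllables`, `fk_grade`, `word_stats`) are made up for this example, the syllable counter is a crude vowel-group heuristic, and interpreting the "score boost" as the count-weighted average Flesch-Kincaid grade of the texts a word appears in is an assumption about what the answer intends.

```python
import re
from collections import Counter, defaultdict

WORD_RE = re.compile(r"[a-z']+")

def count_syllables(word):
    # Crude heuristic (assumption): one syllable per run of vowels.
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def fk_grade(text):
    # Flesch-Kincaid grade level:
    # 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = WORD_RE.findall(text.lower())
    if not sentences or not words:
        return 0.0
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * len(words) / len(sentences)
            + 11.8 * syllables / len(words) - 15.59)

def word_stats(texts):
    # Returns {word: (raw_count, average grade of the texts it occurs in)}.
    # The average is weighted by how often the word occurs in each text.
    freq = Counter()
    grade_sum = defaultdict(float)
    for text in texts:
        grade = fk_grade(text)
        for word, count in Counter(WORD_RE.findall(text.lower())).items():
            freq[word] += count
            grade_sum[word] += count * grade
    return {w: (freq[w], grade_sum[w] / freq[w]) for w in freq}
```

The raw count alone gives the "straight frequency analysis"; the second value can be used to push a frequent word toward a harder level when it mostly occurs in texts that score as difficult. How to combine the two numbers into one score is left open here, since the answer does not specify it.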
If all you have is 10,000 words, though, it will probably be quicker to just do the frequency sorting as a first pass and then tweak the results by hand.
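The frequency-sorting first pass could look something like the sketch below. It assumes you already have a mapping from word to corpus count (called `freq` here); the function name `assign_levels`, the tie-breaking by word length, and the choice of equal-sized levels are all assumptions made for illustration, not something the answer prescribes.

```python
def assign_levels(words, freq, levels=5):
    # Most frequent words first; words missing from freq count as 0 and
    # therefore land in the hardest level. Ties are broken by length,
    # then alphabetically (an arbitrary choice).
    ranked = sorted(words, key=lambda w: (-freq.get(w, 0), len(w), w))
    n = len(ranked)
    # Split the ranking into `levels` buckets of roughly equal size,
    # numbered 1 (easiest) to `levels` (hardest).
    return {w: i * levels // n + 1 for i, w in enumerate(ranked)}
```

The resulting dictionary is only a starting point: individual words can then be moved between levels by hand where the frequency ranking looks wrong.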