WordCount: how inefficient is McIlroy's solution?


Question

Long story short: in 1986 an interviewer asked Donald Knuth to write a program that takes a text and a number N as input, and lists the N most used words sorted by their frequencies. Knuth produced a 10-page Pascal program, to which Douglas McIlroy replied with the following 6-line shell script:

tr -cs A-Za-z '\n' |
tr A-Z a-z |
sort |
uniq -c |
sort -rn |
sed ${1}q

http://www.leancrew.com/all-this/2011/12/more-shell-less-egg/
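For reference, the pipeline can be saved as a script (hypothetical filename wordfreq.sh) whose first argument is N, and fed text on stdin:

```shell
# A usage sketch: wrap McIlroy's pipeline in a script; $1 is N.
cat > wordfreq.sh <<'EOF'
#!/bin/sh
tr -cs A-Za-z '\n' |
tr A-Z a-z |
sort |
uniq -c |
sort -rn |
sed ${1}q
EOF
chmod +x wordfreq.sh

# Top 2 words of a tiny sample: "the" occurs 3 times, "and" twice.
printf 'The cat and the dog and the bird\n' | ./wordfreq.sh 2
```

With a real text it would be invoked the same way, e.g. `./wordfreq.sh 10 < book.txt`.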

Of course they had very different goals: Knuth was showing his concepts of literate programming and built everything from scratch, while McIlroy used a few common UNIX utilities to achieve the shortest source code.

My question is: how bad is that?
(Purely from a runtime-speed point of view, since I'm pretty sure we all agree that 6 lines of code is easier to understand/maintain than 10 pages, literate programming or not.)

I can understand that sort -rn | sed ${1}q may not be the most efficient way to extract the common words, but what's wrong with tr -cs A-Za-z '\n' | tr A-Z a-z? It looks pretty good to me. About sort | uniq -c, is that a terribly slow way to determine the frequencies?

Some notes:

  • tr should be linear time (?)
  • sort I'm not sure of, but I'm assuming it's not that bad
  • uniq should be linear time too
  • spawning processes should be linear time (in the number of processes)

Answer

The Unix script has a few linear operations and two sorts, so overall it will be of computation order O(n log(n)).
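The n in that bound comes from the first sort, which sees every word occurrence; the second sort only sees the distinct lines produced by uniq -c, which for natural language is usually a much smaller number. A quick way to compare the two input sizes on a sample (a sketch; words.txt is a scratch file):

```shell
# Compare n (total word occurrences, the first sort's input) with
# d (distinct words, the second sort's input) for a tiny sample.
printf 'The cat and the dog and the bird\n' |
tr -cs A-Za-z '\n' | tr A-Z a-z > words.txt
n=$(grep -c . words.txt)            # total word occurrences
d=$(sort -u words.txt | grep -c .)  # distinct words
echo "n=$n d=$d"
```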

For Knuth's algorithm, which takes only the top N, see selection algorithms: http://en.wikipedia.org/wiki/Selection_algorithm. There you have a few options for the time and space complexity of the algorithm, and theoretically they can be faster for some typical examples with a large number of (distinct) words.
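As a toy illustration of that selection idea in shell terms (my own sketch, not from either original solution): scan the count/word lines once and keep only the current best N, which is O(d*N) work instead of sorting all d distinct lines:

```shell
# Keep a small sorted list of the top n lines by numeric count while
# scanning once; only the current top n entries are ever stored.
printf '3 the\n1 cat\n2 and\n1 dog\n1 bird\n' |
awk -v n=2 '
  {
    for (i = 1; i <= n; i++)
      if ($1 > cnt[i]) {
        for (j = n; j > i; j--) { cnt[j] = cnt[j-1]; line[j] = line[j-1] }
        cnt[i] = $1; line[i] = $0
        break
      }
  }
  END { for (i = 1; i <= n && line[i] != ""; i++) print line[i] }
'
```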

So Knuth could be faster, certainly because the English dictionary has limited size: that could turn the log(n) into some large constant, though perhaps while consuming a lot of memory.
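The counting part of that idea can be approximated in the shell with a single awk pass (a sketch of hash-based counting, not Knuth's actual trie-based program): count in an associative array, so only the distinct words reach the final sort.

```shell
# Hash-based counting: one linear pass builds the table; only distinct
# (count, word) pairs, not every word occurrence, go through sort -rn.
printf 'The cat and the dog and the bird\n' |
tr -cs A-Za-z '\n' |
tr A-Z a-z |
awk 'NF { count[$0]++ } END { for (w in count) print count[w], w }' |
sort -rn |
sed 2q
```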

This question would be a better fit for https://cstheory.stackexchange.com/
