在计算字母时，Haskell可以击败C吗？ [英] Can Haskell beat C at counting letters?

查看：113 发布时间：2018/6/5 10:50:36 haskell

本文介绍了在计算字母时，Haskell可以击败C吗？的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

剧透：是的。见下面。

试图优化一个字母计数器来匹配C.我已经把它打到了2倍的赤字。

  letterCount :: B.ByteString  - > V.Vector Int 
 letterCount bs = 
 V.accumulate 
（\ a _  - > a + 1）
（V.replicate 256 0）
 letters1 
 where 
 len = B.length bs 
 letters1 = V.generate len（\i  - >（fromIntegral $！B.index bs i，（）））

一些注意事项：

在我将 Data.Vector 更改为 Data.Vector.Unboxed 之前，它非常慢。为什么会这样？

我认为大部分时间都会花在 accumulate 上。我错了。

70％的时间用于 generate 。

Haskell代码遭受不必要地将Word8转换为Int;此外，（）的无用部队可能实际上或可能不会实际创建。

 导入限定的Data.ByteString为B 
导入限定的Data.Vector.Unboxed为V 
 import System.Environment 
 import Text.Printf 
 
 letterCount :: B.ByteString  - > V.Vector Int 
 letterCount bs = 
 V.accumulate 
（\ a _  - > a + 1）
（V.replicate 256 0）
 letters1 
 where 
 len = B.length bs 
 letters1 = V.generate len（\i  - >（fromIntegral $！B.index bs i，（）））
 
 printCounts :: V.Vector Int  - > IO（）
 printCounts cs = 
 mapM_ 
（uncurry $ printf％c：％d\\\
）
（zip（映射到Enum [0..255]： ：String）（V.toList cs））
 
 main :: IO（）
 main = do 
 filename<  -  fmap head getArgs 
f<  -  B .readFile文件名
 let counts = letterCount f 
 printCounts counts

竞争C代码：

  #include< assert.h> 
 #include< stdio.h> 
 #include< string.h> 
 #include< sys / stat.h> 
 #include< stdlib.h> 
 
 int letcnt [256]; 
 
 int * letter_count（unsigned char * s，unsigned int len）
 {
 int i; 
 memset（letcnt，0，256 * sizeof（int））; 
 for（i = 0; i< len; i ++）{
 letcnt [*（s + i）] ++; 
} 
 return（letcnt）; 
} 
 
 void print_counts（）{
 int i; 
 for（i = 0; i <256; i ++）{
 printf（'％c'：％d \ n，（unsigned char）i，letcnt [i]）; 
 
 $ b $ // st_size 
 int main（int argc，char ** argv）
 {
 assert（argc == 2）; 
 FILE * f = fopen（argv [1]，r）; 
 struct stat st; 
 stat（argv [1]，& st）; 
 off_t len = st.st_size; 
 unsigned char * contents = calloc（len，1）; 
 fread（内容，len，1，f）; 
 fclose（f）; 
 letter_count（contents，len）; 
 print_counts（）; 
返回0; 
}

定时;

  $ time ./a.out / usr / share / dict / words> / dev / null 
 
 real 0m0.012s 
用户0m0.005s 
 sys 0m0.005s 
 
 $ time ./lettercount / usr / share /字典/单词> / dev / null 
 
 real 0m0.017s 
 user 0m0.009s 
 sys 0m0.007s

更新

我认为性能上限取决于此错误： runST不是免费的。并不是说我相信不可能进一步优化，但只要runST带来一些开销，就不太可能接近C.

另外，根据@ Zeta的评论修正了C代码。 / p>

解决方案
是的。如果使用 -fllvm 编译，则Haskell将匹配 User 时间中的C。令人惊讶的是，如果你切换到Lazy Bytestrings，那么Haskell版本将在 Real 时间的C版本上以小幅度但显着的优势胜过。
将限定的Data.ByteString.Lazy.Char8导入为B 将限定的Data.Vector.Unboxed导入为V import System.Environment import Text.Printf letterCount :: B.ByteString - > V.Vector Int letterCount bs = V.unsafeAccumulate （\ a_ - > a + 1）（V.replicate 256 0）（解析bs）解析:: B.ByteString - > V.Vector（Int，（）） parse = V.unfoldr step where step s = if B.null s then Nothing else Just（（ fromIntegral。fromEnum $ B.head s，（）），B.tail s） { - ＃INLINE parse＃ - } printCounts :: V.Vector Int - > IO（） printCounts cs = mapM_ （uncurry $ printf％c：％d\\\ ）（zip（映射到Enum [0..255]：：String）（V.toList cs）） main :: IO（） main = do filename< - fmap head getArgs f< - B .readFile文件名 let counts = letterCount f printCounts counts
请记住编译例如：

ghc -O2 -fllvm letterCount.hs C.我喜欢它！

更新

公平地更新了C代码以使用单个缓冲区，避免了预先大量分配（或任何分配），并且对缓存更友好。现在，Haskell和C代码没有显着差异，对于大型的150M输入文件，运行时间大约为190毫秒：

#include< assert.h> #include< stdio.h> #include< string.h> #include< sys / stat.h> #include< stdlib.h> #define CHUNK 16384 int letcnt [256]; int * letter_count（unsigned char * s，unsigned int len） { int i; for（i = 0; i< len; i ++）{ letcnt [*（s + i）] ++; } return（letcnt）; } int * letter_count_chunks（unsigned int len，FILE * f） { int i; unsigned char chunk [CHUNK]; memset（letcnt，0，sizeof（letcnt））; for（i = 0; i< len - CHUNK; i + = CHUNK）{ fread（chunk，CHUNK，1，f）; letter_count（chunk，CHUNK）; } fread（chunk，len - i，1，f）; letter_count（chunk，len - i）; 返回letcnt; } void print_counts（）{ int i; for（i = 0; i <256; i ++）{ printf（'％c'：％d \ n，（unsigned char）i，letcnt [i]）; $ b $ // st_size int main（int argc，char ** argv） { assert（argc == 2）; FILE * f = fopen（argv [1]，r）; struct stat st; stat（argv [1]，& st）; off_t len = st.st_size; letter_count_chunks（len，f）; fclose（f）; print_counts（）; 返回0; }

Spoiler: Yes. See below.

Trying to optimize a letter counter to match C. I've fought it to a 2x deficit.
letterCount :: B.ByteString -> V.Vector Int letterCount bs = V.accumulate (\a _ -> a + 1) (V.replicate 256 0) letters1 where len = B.length bs letters1 = V.generate len (\i -> (fromIntegral $! B.index bs i, ()))
Some notes:

It was really slow until I changed Data.Vector to Data.Vector.Unboxed. Why is that?

I thought most of the time would be spent in accumulate. I was wrong.

70% of the time is spent in generate.

Haskell code suffers from having to pointlessly convert Word8 to Int; Also, a useless army of () may or may not actually be created.

Full listing:
import qualified Data.ByteString as B import qualified Data.Vector.Unboxed as V import System.Environment import Text.Printf letterCount :: B.ByteString -> V.Vector Int letterCount bs = V.accumulate (\a _ -> a + 1) (V.replicate 256 0) letters1 where len = B.length bs letters1 = V.generate len (\i -> (fromIntegral $! B.index bs i, ())) printCounts :: V.Vector Int -> IO () printCounts cs = mapM_ (uncurry $ printf "%c: %d\n") (zip (map toEnum [0..255] :: String) (V.toList cs)) main :: IO () main = do filename <- fmap head getArgs f <- B.readFile filename let counts = letterCount f printCounts counts
Competing C code:
#include <assert.h> #include <stdio.h> #include <string.h> #include <sys/stat.h> #include <stdlib.h> int letcnt [256]; int* letter_count(unsigned char *s, unsigned int len) { int i; memset(letcnt, 0, 256 * sizeof(int)); for(i = 0; i < len; i++){ letcnt[*(s + i)]++; } return (letcnt); } void print_counts() { int i; for(i = 0; i < 256; i++) { printf("'%c': %d\n", (unsigned char) i, letcnt[i]); } } // st_size int main(int argc, char **argv) { assert(argc == 2); FILE* f = fopen(argv[1], "r"); struct stat st; stat(argv[1], &st); off_t len = st.st_size; unsigned char* contents = calloc(len, 1); fread(contents, len, 1, f); fclose(f); letter_count(contents, len); print_counts(); return 0; }
Timings;
$ time ./a.out /usr/share/dict/words > /dev/null real 0m0.012s user 0m0.005s sys 0m0.005s $ time ./lettercount /usr/share/dict/words > /dev/null real 0m0.017s user 0m0.009s sys 0m0.007s

Update

I think the performance ceiling is down to this bug: runST isn't free. Not that I believe it's impossible to optimize further but unlikely to approach C so long as runST imposes some overhead.

Also, fixed C-code based on @Zeta's comment.
解决方案
Yes. If you compile with -fllvm then Haskell will match C in User time. The big surprise is if you switch to Lazy Bytestrings, the Haskell version will beat the C version on Real time by a small but significant margin.
import qualified Data.ByteString.Lazy.Char8 as B import qualified Data.Vector.Unboxed as V import System.Environment import Text.Printf letterCount :: B.ByteString -> V.Vector Int letterCount bs = V.unsafeAccumulate (\a _ -> a + 1) (V.replicate 256 0) (parse bs) parse :: B.ByteString -> V.Vector (Int, ()) parse = V.unfoldr step where step s = if B.null s then Nothing else Just ((fromIntegral . fromEnum $ B.head s, ()), B.tail s) {-# INLINE parse #-} printCounts :: V.Vector Int -> IO () printCounts cs = mapM_ (uncurry $ printf "%c: %d\n") (zip (map toEnum [0..255] :: String) (V.toList cs)) main :: IO () main = do filename <- fmap head getArgs f <- B.readFile filename let counts = letterCount f printCounts counts
Remember to compile like:
ghc -O2 -fllvm letterCount.hs
So, Vector + ByteString.Lazy + LLVM > C. I love it!

Update

In fairness to C I updated the C code to use a single buffer which avoids doing a big allocation up front (or any allocations at all) and will be more cache friendly. Now the Haskell and C codes show no significant difference, both about 190ms in run-time on a best-of-3 basis against a large, 150M input file:
#include <assert.h> #include <stdio.h> #include <string.h> #include <sys/stat.h> #include <stdlib.h> #define CHUNK 16384 int letcnt [256]; int* letter_count(unsigned char *s, unsigned int len) { int i; for(i = 0; i < len; i++){ letcnt[*(s + i)]++; } return (letcnt); } int* letter_count_chunks(unsigned int len, FILE* f) { int i; unsigned char chunk [CHUNK]; memset(letcnt, 0, sizeof(letcnt)); for(i = 0; i < len - CHUNK; i+= CHUNK) { fread(chunk, CHUNK, 1, f); letter_count(chunk, CHUNK); } fread(chunk, len - i, 1, f); letter_count(chunk, len - i); return letcnt; } void print_counts() { int i; for(i = 0; i < 256; i++) { printf("'%c': %d\n", (unsigned char) i, letcnt[i]); } } // st_size int main(int argc, char **argv) { assert(argc == 2); FILE* f = fopen(argv[1], "r"); struct stat st; stat(argv[1], &st); off_t len = st.st_size; letter_count_chunks(len, f); fclose(f); print_counts(); return 0; }

这篇关于在计算字母时，Haskell可以击败C吗？的文章就介绍到这了，希望我们推荐的答案对大家有所帮助，也希望大家多多支持IT屋！

查看全文

在计算字母时，Haskell可以击败C吗？ [英] Can Haskell beat C at counting letters?

问题描述

更新

更新

Update

Update

相关文章

其他开发最新文章

热门教程

热门工具

登录关闭

在计算字母时，Haskell可以击败C吗？ [英] Can Haskell beat C at counting letters?

问题描述

更新

更新

Update

Update

相关文章

其他开发最新文章

热门教程

热门工具

登录 关闭

登录关闭