Compiling very large constants with GHC

Problem description

Today I asked GHC to compile an 8MB Haskell source file. GHC thought about it for about 6 minutes, swallowing almost 2GB of RAM, and then finally gave up with an out-of-memory error.

[As an aside, I'm glad GHC had the good sense to abort rather than floor my whole PC.]

Basically I've got a program that reads a text file, does some fancy parsing, builds a data structure and then uses show to dump this into a file. Rather than include the whole parser and the source data in my final application, I'd like to include the generated data as a compile-time constant. By adding some extra stuff to the output from show, you can make it a valid Haskell module. But GHC apparently doesn't enjoy compiling multi-MB source files.
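The "extra stuff" added around the output of show could look roughly like the sketch below. The type alias `Table`, the value `sample`, the function `renderModule` and the module name `Constants` are all hypothetical stand-ins, not the names used in the actual program.

```haskell
module Main (main) where

import qualified Data.Map as Map

-- Hypothetical stand-in for the parsed data structure.
type Table = Map.Map String (Map.Map Int String)

sample :: Table
sample = Map.fromList [("fruits", Map.fromList [(1, "apple"), (2, "pear")])]

-- Wrap the output of show in a module header, an import and a type signature.
-- show on a Map prints "fromList [...]", so the generated module imports fromList.
renderModule :: String -> Table -> String
renderModule name value = unlines
  [ "module " ++ name ++ " (table) where"
  , ""
  , "import Data.Map (Map, fromList)"
  , ""
  , "table :: Map String (Map Int String)"
  , "table = " ++ show value
  ]

-- Print the generated module; redirect the output to Constants.hs to compile it.
main :: IO ()
main = putStr (renderModule "Constants" sample)
```

For a tiny value like this the generated module compiles without trouble; the problem described here only shows up once the literal grows to several megabytes.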

(The weirdest part is, if you just read the data back, it actually doesn't take much time or memory. Strange, considering that both String I/O and read are supposedly very inefficient...)
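For comparison, reading the dumped text back at run time is short to write. A minimal sketch, assuming a small hypothetical nested map (`sample`) in place of the real data and using `readMaybe` so that malformed input does not crash the program:

```haskell
module Main (main) where

import qualified Data.Map as Map
import Text.Read (readMaybe)

-- Hypothetical stand-in for the real nested map.
type Table = Map.Map String (Map.Map Int String)

sample :: Table
sample = Map.fromList
  [ ("fruits", Map.fromList [(1, "apple"), (2, "pear")])
  , ("tools",  Map.fromList [(7, "hammer")])
  ]

-- Serialise with show, as the generator program does.
dump :: Table -> String
dump = show

-- Parse the text back at run time; Nothing signals malformed input.
load :: String -> Maybe Table
load = readMaybe

main :: IO ()
main = do
  let text = dump sample
  putStrLn text
  -- Check that the round trip gives back the original value.
  print (load text == Just sample)
```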

I vaguely recall that other people have had trouble getting GHC to compile huge files in the past. FWIW, I tried using -O0, which sped up the crash but did not prevent it. So what is the best way to include large compile-time constants in a Haskell program?

(In my case, the constant is just a nested Data.Map with some interesting labels.)

Initially I thought GHC might just be unhappy at reading a module consisting of one line that's eight million characters long. (!!) Something to do with the layout rule or such. Or perhaps that the deeply-nested expressions upset it. But I tried making each subexpression a top-level identifier, and that was no help. (Adding explicit type signatures to each one did appear to make the compiler slightly happier, however.) Is there anything else I might try to make the compiler's job simpler?
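To make the "one top-level identifier per subexpression" variant concrete, a generated module of that shape might look like the sketch below. The names `fruits`, `tools` and `table` and their contents are hypothetical; the real module would have far more and far larger bindings.

```haskell
module Constants (table, fruits, tools) where

import Data.Map (Map, fromList)

-- Each inner map gets its own top-level binding with an explicit signature,
-- so the compiler does not have to infer the type of a huge literal.
fruits :: Map Int String
fruits = fromList [(1, "apple"), (2, "pear")]

tools :: Map Int String
tools = fromList [(7, "hammer")]

-- The outer map only refers to the bindings above.
table :: Map String (Map Int String)
table = fromList [("fruits", fruits), ("tools", tools)]
```

As noted above, this restructuring alone did not stop the out-of-memory failure in my case; only the type signatures seemed to help a little.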

In the end, I was able to make the data structure I'm actually trying to store much smaller. (Like, 300KB.) This made GHC far happier. (And the final application much faster.) But for future reference, I'd be interested to know what the best way to approach this is.

Solution

Your best bet is probably to compile a string representation of your value into the executable. To do this in a clean manner, please refer to my answer in a previous question.

To use it, simply store your expression in myExpression.exp and do read [litFile|myExpression.exp|] with the QuasiQuotes extension enabled, and the expression will be "stored as a string literal" in the executable.
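The referenced answer is not reproduced on this page, so the following is only a guess at what a string-literal quoter of that kind could look like. The module name `Literal` and the helper `unsupported` are assumptions; `QuasiQuoter`, `quoteFile`, `litE` and `stringL` come from the template-haskell package that ships with GHC.

```haskell
module Literal (lit, litFile) where

import Language.Haskell.TH (Q, litE, stringL)
import Language.Haskell.TH.Quote (QuasiQuoter (..), quoteFile)

-- Turn the quoted text into a plain String literal expression.
lit :: QuasiQuoter
lit = QuasiQuoter
  { quoteExp  = litE . stringL
  , quotePat  = unsupported "pattern"
  , quoteType = unsupported "type"
  , quoteDec  = unsupported "declaration"
  }
  where
    -- The quoter only makes sense where an expression is expected.
    unsupported :: String -> String -> Q a
    unsupported context _ =
      fail ("lit: cannot be used in a " ++ context ++ " context")

-- Same quoter, but the quoted text is a file name and the
-- contents of that file become the string literal.
litFile :: QuasiQuoter
litFile = quoteFile lit
```

Because of the Template Haskell stage restriction, the quoter has to live in a different module from the one that uses it. In the using module, read [litFile|myExpression.exp|] then parses the embedded string at run time, at whatever type the surrounding code requires.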


I tried doing something similar for storing actual constants, but it fails for the same reason that embedding the value in a .hs file would. My attempt was:

Verbatim.hs:

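module Verbatim where

import Language.Haskell.TH
import Language.Haskell.TH.Quote
import Language.Haskell.Meta.Parse  -- parseExp, from the haskell-src-meta package

-- Parse the quoted text as a Haskell expression; a parse error aborts compilation.
readExp :: String -> Q Exp
readExp = either fail return . parseExp

-- Only expression quotes are supported; the other QuasiQuoter fields are left unset.
verbatim :: QuasiQuoter
verbatim = QuasiQuoter { quoteExp = readExp }

-- Same quoter, but the quoted text names a file whose contents are parsed.
verbatimFile :: QuasiQuoter
verbatimFile = quoteFile verbatim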

Test program:

{-# LANGUAGE QuasiQuotes #-}
module Main (main) where

import Verbatim

main :: IO ()
main = print [verbatimFile|test.exp|]

This program works for small test.exp files, but fails already at about 2MiB on this computer.
