How can I make DOT correctly process UTF-8 to PostScript and have multiple graphs/pages?


Question

This dot source

graph A
{
    a;
}
graph B
{
    "Enûma Eliš";
}

produces this warning when run with dot -Tps:

Warning: UTF-8 input uses non-Latin1 characters which cannot be handled by this PostScript driver

I can fix the UTF-8 problem by passing -Tps:cairo, but then only graph A appears in the output; it is truncated to a single page. The same happens with -Tpdf. No other PostScript drivers are available on my installation.

I could split the graphs into separate files and concatenate the results afterwards, but I'd rather not. Is there a way to get both correct UTF-8 handling and multi-page output?
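For reference, the splitting step of that workaround could be sketched roughly as follows. This is only an illustration: `split_graphs` and the `graph-N.dot` file names are made up here, and the sketch assumes every graph starts at the beginning of a line with `graph`, `digraph` or `strict`, as in the sample above.

```shell
# Hypothetical helper: split a multi-graph DOT file into graph-1.dot, graph-2.dot, ...
# Assumes each graph header starts in column one, as in the sample source.
split_graphs() {
    awk '
        # A new graph header bumps the output file counter.
        /^(strict[ \t]+)?(di)?graph([ \t]|$)/ { n++ }
        # Every line seen after the first header goes to the current file.
        n { print > ("graph-" n ".dot") }
    ' "$1"
}

# Usage (not run here): split_graphs graph.dot
```

Each resulting file could then be rendered on its own with dot -Tps:cairo and the pages joined afterwards with whatever tool is already at hand.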

Recommended answer

Apparently the dot PS driver can't handle encodings other than the old ISO8859-1. I think it can't change fonts either.
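A quick way to see whether a given file will trip that limit is to try converting it to ISO8859-1 yourself. The sketch below is an assumption-laden illustration, not part of dot: `is_latin1` is a made-up helper name, and it relies on `iconv` exiting with a non-zero status when a character has no ISO-8859-1 equivalent, which is what common implementations such as glibc's do.

```shell
# Hypothetical helper: succeeds only if the UTF-8 file converts cleanly to ISO-8859-1.
# Assumes iconv reports unconvertible characters through a non-zero exit status.
is_latin1() {
    iconv -f UTF-8 -t ISO-8859-1 "$1" > /dev/null 2>&1
}

# Usage (not run here):
#   is_latin1 graph.dot || echo "graph.dot has characters outside Latin-1"
```

With the sample source, "û" converts but "š" does not, so the check would fail for graph B.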

One thing you can do is run a filter that changes dot's PostScript output. The following Perl program does that; it's an adaptation of some code I had. It changes the encoding from UTF-8 to a modified ISO encoding in which extra characters replace unused ones.
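To make the remapping concrete, here is a small worked illustration for the label "Enûma Eliš", assuming it is the only non-ASCII text in the file. `ps_escape` is a made-up helper that mirrors the octal escape format the filter writes into PostScript strings.

```shell
# Hypothetical helper: print a character code as a PostScript octal escape.
ps_escape() {
    printf '\\%03o\n' "$1"
}

# "û" is U+00FB (251): it fits in the 8-bit encoding, so it keeps its own code.
ps_escape 251   # prints \373

# "š" is U+0161 (353): it does not fit, so it takes the first unused slot
# counting up from 128, and the encoding vector is patched to map slot 128
# to the glyph for U+0161.
ps_escape 128   # prints \200
```

In this case the replacement table passed to the PostScript side would hold the single pair 128 353.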

Of course, the output still depends on the font having the characters. Since dot (I think) only uses the default PostScript fonts, anything beyond "standard Latin" is out of the question...

It works with Ghostscript or with any interpreter that defines AdobeGlyphList.

The filter should be used like this:

dot -Tps graph.dot | perl reenc.pl > output.ps

Here it is:

#!/usr/bin/perl

use strict;
use warnings;
use open qw(:std :utf8);

my $ps = do { local $/; <STDIN> };
my %high;
my %in_use;
foreach my $char (split //, $ps) {
    # Use the character's code point; unpack("C") would wrap values above 255.
    my $code = ord($char);
    if ($code > 127) {
        $high{$char} = $code;
        if ($code < 256) {
            $in_use{$code} = 1;
        }
    }
}
my %repl;
my $i = 128;
foreach my $char (keys %high) {
    if ($in_use{$high{$char}}) {
        $ps =~ s/$char/sprintf("\\%03o", $high{$char})/ge;
        next;
    }
    while ($in_use{$i}) { $i++; }
    $repl{$i} = $high{$char};
    $ps =~ s/$char/sprintf("\\%03o", $i)/ge;
    $i++;
}
my $psprocs = <<"EOPS";
/EncReplacements <<
  @{[ join(" ", %repl) ]}
>> def
/RevList AdobeGlyphList length dict dup begin
  AdobeGlyphList { exch def } forall
end def
% code -- (uniXXXX)
/uniX { 16 6 string cvrs dup length 7 exch sub exch
  (uni0000) 7 string copy dup  4 2 roll putinterval } def
% font code -- glyphname
/unitoname { dup RevList exch known
  { RevList exch get }
  { uniX cvn } ifelse
  exch /CharStrings get 1 index known not
  { pop /.notdef } if
} def
/chg-enc { dup length array copy EncReplacements
  { currentdict exch unitoname 2 index 3 1 roll put } forall
} def
EOPS

$ps =~ s{/Encoding EncodingVector def}{/Encoding EncodingVector chg-enc def};
$ps =~ s/(%%BeginProlog)/$1\n$psprocs/;

print $ps;
