Japanese COBOL code on an IBM mainframe in Shift-JIS: how is it represented after transfer to a PC?

Question

We have a Japanese client who has source code in COBOL on a mainframe. He claims the code on the mainframe is represented in Shift-JIS2 (and we think we understand that pretty well). When that code is transferred to a PC, what is the most common encoding used? We've sent him a program to process that COBOL code, and it seems to choke. The customer won't give us the code directly, so experiments are hard. His experiments seem to indicate UTF-8; I assume the Japanese characters encodable in Shift-JIS2 are converted to their Unicode equivalents. Does anybody have any experience here?

I think we solved our mystery. The client is (duh!) using CP-932 ("ShiftJIS") on the PC, but his COBOL program has Japanese characters in the identifiers, and that's why our tool is choking.
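
A quick way to tell CP-932 apart from UTF-8 on the PC side is a strict trial decode. The following is a minimal Python sketch, not the tool discussed in the question; the function names and the sample COBOL line are invented for illustration, and it assumes the transferred file really is CP-932.

```python
def decode_cobol_source(raw: bytes, encoding: str = "cp932") -> str:
    """Decode transferred source bytes (assumed here to be CP-932)."""
    # Strict decoding raises UnicodeDecodeError on bytes invalid in the encoding.
    return raw.decode(encoding)


def is_valid_utf8(raw: bytes) -> bool:
    """Return True if the bytes form well-formed UTF-8."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# Hypothetical COBOL line with a Japanese data name, invented for illustration.
sample_line = "       01  日本語  PIC X(20)."
sample_bytes = sample_line.encode("cp932")

print(decode_cobol_source(sample_bytes))  # round-trips to the original line
print(is_valid_utf8(sample_bytes))        # CP-932 kanji bytes are not valid UTF-8
```

Once the text is decoded correctly, any remaining failure is in the parser itself, for example an identifier rule that only admits ASCII letters.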

Follow-up: a bit more of a surprise. Shift-JIS text often encodes what we think of as ASCII characters as so-called "FULLWIDTH" characters, which take the same screen space as an East Asian ideograph; conventional ASCII characters act as half-width. So there is a FULLWIDTH "A", "B", ... "Z" as well as a FULLWIDTH "-". Apparently, to process Japanese COBOL, our COBOL parser has to accept not only Western ASCII but also the FULLWIDTH equivalents, especially the FULLWIDTH letters and, surprisingly, a FULLWIDTH HYPHEN used to separate "letters" in a COBOL identifier.
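
One possible way to handle this, sketched below in Python, is to fold FULLWIDTH characters to their ASCII counterparts before matching identifiers. It relies on the Unicode fullwidth forms U+FF01 to U+FF5E mirroring ASCII U+0021 to U+007E at a fixed offset. The function name and the sample identifier are hypothetical, and whether folding is acceptable for a given COBOL dialect is an assumption to verify, not something the question confirms.

```python
# Fullwidth forms U+FF01..U+FF5E mirror ASCII U+0021..U+007E at offset 0xFEE0.
FULLWIDTH_TO_ASCII = {cp: cp - 0xFEE0 for cp in range(0xFF01, 0xFF5F)}


def fold_fullwidth(text: str) -> str:
    """Map fullwidth ASCII variants to plain ASCII; leave everything else alone."""
    return text.translate(FULLWIDTH_TO_ASCII)


# Hypothetical identifier written with fullwidth letters and a fullwidth hyphen
# (the glyphs for C, U, S, T, hyphen, N, A, M, E), spelled with escapes for clarity.
fullwidth_name = "\uff23\uff35\uff33\uff34\uff0d\uff2e\uff21\uff2d\uff25"

print(fold_fullwidth(fullwidth_name))  # CUST-NAME
print(fold_fullwidth("日本語"))         # kanji are left unchanged
```

Folding a whole source line would also rewrite fullwidth characters inside string literals, so a parser would more likely apply this only to identifier tokens, or extend its lexer rules to accept both forms directly.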

IBM Enterprise COBOL allows DBCS characters in identifiers. Yikes!

Recommended answer

There are three encodings that are all still very much in use in Japan: EUC-JP, ISO-2022-JP, and Shift-JIS.

ISO-2022-JP is usually used for e-mail, while you'll see EUC-JP on Unix machines. I personally haven't dealt with anything other than Shift-JIS, though (nor with mainframes).
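
When the encoding of a file is unknown, one rough heuristic is to try each of these encodings with strict decoding and see which ones succeed. The Python sketch below uses the standard codec names `cp932`, `euc_jp`, `iso2022_jp`, and `utf-8`; the function name is hypothetical. More than one candidate can succeed for the same bytes, so the result narrows the options rather than proving anything.

```python
CANDIDATE_ENCODINGS = ("utf-8", "cp932", "euc_jp", "iso2022_jp")


def plausible_encodings(raw: bytes) -> list[str]:
    """Return every candidate encoding under which the bytes decode cleanly."""
    matches = []
    for name in CANDIDATE_ENCODINGS:
        try:
            raw.decode(name)  # strict mode: any invalid byte raises
        except UnicodeDecodeError:
            continue
        matches.append(name)
    return matches


text = "日本語"
for encoding in ("cp932", "euc_jp", "iso2022_jp"):
    # The encoding actually used should appear among the plausible ones.
    print(encoding, plausible_encodings(text.encode(encoding)))
```

Pure ASCII input decodes under all of these candidates, so the check only becomes informative once the file contains Japanese text.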
