Is it possible to get GCC to compile UTF-8 with BOM source files?

Views: 62 | Published: 2022/1/23 20:41:02 | Tags: gcc, utf-8, g++, byte-order-mark


Question

I develop cross-platform C++ using Microsoft Visual Studio on Windows and GCC on Ubuntu Linux.

In Visual Studio I can use Unicode symbols like "π" and "²" in my code. Visual Studio always saves the source files as UTF-8 with BOM (Byte Order Mark).

For example:

// A = π.r²
double π = 3.14;

GCC happily compiles these files only if I remove the BOM first. If I do not remove the BOM, I get errors like these:

wwga_hydutils.cpp:28:9: error: stray ‘\317’ in program
wwga_hydutils.cpp:28:9: error: stray ‘\200’ in program

Which brings me to the question:

Is there a way to get GCC to compile UTF-8 files without first removing the BOM?

I am using:

Windows 7
Visual Studio 2010

and:

Ubuntu Oneiric 11.10
GCC 4.6.1 (as provided by apt-get install gcc)

As the first commenter pointed out, my problem was not the BOM, but having non-ASCII characters outside of string constants. GCC does not like non-ASCII characters in symbol names, but it turns out GCC is fully compatible with UTF-8 with BOM.
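One way to see that the BOM was not the culprit is to compare the octal values in the error messages (317 and 200) with the UTF-8 bytes of the characters involved. The sketch below is only an illustration; `encode_utf8` and `octal_bytes` are hypothetical helper names, not part of GCC or any library. It encodes a code point as UTF-8 and prints each byte in octal, the notation GCC uses for stray bytes.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper: encode one Unicode code point as UTF-8 bytes.
std::string encode_utf8(std::uint32_t cp) {
    assert(cp <= 0x10FFFF);  // outside the Unicode range otherwise
    std::string out;
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}

// Hypothetical helper: render each byte as three octal digits,
// separated by single spaces.
std::string octal_bytes(const std::string& bytes) {
    std::string out;
    char buf[8];
    for (unsigned char b : bytes) {
        std::snprintf(buf, sizeof buf, "%03o", static_cast<unsigned>(b));
        if (!out.empty()) out += ' ';
        out += buf;
    }
    return out;
}
```

Calling `octal_bytes(encode_utf8(0x3C0))` for π (U+03C0) gives `317 200`, which matches the two stray bytes reported at line 28, while the BOM (U+FEFF) encodes to `357 273 277`. So the errors point at the π in the identifier, not at the BOM.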

Recommended answer

According to the GCC Wiki, non-ASCII characters written directly in identifiers aren't supported yet. You can use -fextended-identifiers and pre-process your code to convert the identifiers to UCNs (universal character names). From the linked page:

perl -pe 'BEGIN { binmode STDIN, ":utf8"; } s/(.)/ord($1) < 128 ? $1 : sprintf("\\U%08x", ord($1))/ge;'
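For readers who would rather not depend on Perl, the same transformation can be sketched in C++. This is a minimal illustration, not the GCC Wiki's method: `to_ucn` is a hypothetical name, and the function assumes its input is well-formed UTF-8. Like the one-liner, it rewrites every non-ASCII character, including those in comments and string literals, as a `\UXXXXXXXX` universal character name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper: replace every non-ASCII UTF-8 sequence in `src`
// with a \UXXXXXXXX universal character name. ASCII bytes pass through
// unchanged. Assumes well-formed UTF-8; only truncation is checked.
std::string to_ucn(const std::string& src) {
    std::string out;
    std::size_t i = 0;
    while (i < src.size()) {
        unsigned char lead = static_cast<unsigned char>(src[i]);
        if (lead < 0x80) {  // plain ASCII
            out += src[i++];
            continue;
        }
        // Number of continuation bytes implied by the lead byte.
        std::size_t extra = lead >= 0xF0 ? 3 : lead >= 0xE0 ? 2 : 1;
        assert(i + extra < src.size());  // sequence must not be truncated
        // Keep only the payload bits of the lead byte.
        std::uint32_t cp = lead & (0x3F >> extra);
        for (std::size_t k = 1; k <= extra; ++k)
            cp = (cp << 6) | (static_cast<unsigned char>(src[i + k]) & 0x3F);
        char buf[16];
        std::snprintf(buf, sizeof buf, "\\U%08x", static_cast<unsigned>(cp));
        out += buf;
        i += extra + 1;
    }
    return out;
}
```

Applied to the declaration from the question, `to_ucn("double \xCF\x80 = 3.14;")` yields `double \U000003c0 = 3.14;`. A leading BOM is not special-cased here and would be converted like any other character, so strip it before calling the function if the file has one.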

See also "g++ unicode variable name" and "Unicode Identifiers and Source Code in C++11?"
