Are XLSX files UTF-8 encoded by definition?

Question

I'm trying to read XLSX files with PHP, using gneustaetter/XLSXReader to be exact. However, these XLSX files are generated by different companies using different software. So I wanted to check whether they all have the right encoding, and the only one I ever found was UTF-8.

Hence my question, as in the title: are XLSX files UTF-8 encoded by definition? Or are there exceptions that could break the import script I'm working on?

Recommended answer

It'd be risky to presume it's always UTF-8. I'd key your expectations to the encoding that the XML itself declares in its XML header. In my experience, Windows-1252 encoded data shows up all the time when you least expect it. You might check the XLSX specification more closely to find out more.
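
An XLSX file is a ZIP archive of XML parts, so one way to see what each part's header declares is to open the archive and read the XML declarations. Below is a minimal sketch in Python (not the asker's PHP, and not part of XLSXReader). The helper names `sniff_xml_encoding`, `declared_encodings`, and `non_utf8_parts` are made up for illustration, and a part with no declaration and no byte order mark is assumed to be UTF-8, which is the XML default.

```python
import io
import re
import zipfile

# Matches the encoding pseudo-attribute of an XML declaration.
# Only works on ASCII-compatible bytes; UTF-16 is caught via its BOM instead.
_DECL_RE = re.compile(rb'^<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z0-9._-]+)["\']')


def sniff_xml_encoding(head: bytes) -> str:
    """Return the encoding suggested by the first bytes of an XML part."""
    if head.startswith((b"\xff\xfe", b"\xfe\xff")):
        return "UTF-16"  # UTF-16 byte order mark
    if head.startswith(b"\xef\xbb\xbf"):
        head = head[3:]  # skip a UTF-8 byte order mark
    match = _DECL_RE.match(head)
    # Assumption: no declaration and no BOM means UTF-8 (the XML default).
    return match.group(1).decode("ascii") if match else "UTF-8"


def declared_encodings(source) -> dict:
    """Map each XML part in an XLSX (path, file object, or bytes) to its encoding."""
    if isinstance(source, bytes):
        source = io.BytesIO(source)
    result = {}
    with zipfile.ZipFile(source) as archive:
        for name in archive.namelist():
            if name.endswith((".xml", ".rels")):
                with archive.open(name) as part:
                    result[name] = sniff_xml_encoding(part.read(256))
    return result


def non_utf8_parts(source) -> dict:
    """Return only the parts whose encoding is something other than UTF-8."""
    return {
        name: enc
        for name, enc in declared_encodings(source).items()
        if enc.upper() not in ("UTF-8", "UTF8")
    }
```

An import script could call `non_utf8_parts` on each incoming file and log or special-case anything it returns, rather than assuming the result is always empty.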

Here's a Chromium bug relating to a Windows-1252 encoded XLSX file, so these seem to exist in the wild. Maybe they're produced by programs other than Microsoft Office. With things like LibreOffice becoming more popular, older versions that may not have had the most robust XLSX support might end up interacting with your code. You probably don't want to have a bug like this show up in your code.

Try to be as accommodating as possible unless you have a concrete reason for rejecting invalid encoding. JSON, by strict definition, is UTF-8. XLSX seems to be XML by definition, but the encoding is not as nailed down. UTF-8 simply seems to be the default convention.
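
To illustrate what "accommodating" could look like, here is a small Python sketch of a lenient decoder. The function name `decode_leniently` and the fallback order (declared encoding, then UTF-8, then Windows-1252) are assumptions made for this example, not something the XLSX specification or XLSXReader prescribes. It returns the encoding that actually worked so the caller can log mismatches.

```python
from typing import Tuple


def decode_leniently(data: bytes, declared: str = "UTF-8") -> Tuple[str, str]:
    """Decode bytes to text, returning (text, encoding_that_worked)."""
    # Try the declared encoding first, then UTF-8, both strictly.
    for encoding in (declared, "utf-8"):
        try:
            return data.decode(encoding), encoding
        except (LookupError, UnicodeDecodeError):
            # LookupError: Python does not know this encoding name.
            # UnicodeDecodeError: the bytes are not valid in this encoding.
            continue
    # Last resort (an assumption): Windows-1252, substituting U+FFFD
    # for the few byte values that code page leaves undefined.
    return data.decode("cp1252", errors="replace"), "cp1252"
```

For example, `decode_leniently(b"caf\xe9", "UTF-8")` falls through to the Windows-1252 branch, because `\xe9` on its own is not valid UTF-8. If you do have a concrete reason to reject bad input, raising an error instead of using the last-resort branch would be the stricter alternative.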
