sep=";" statement breaks UTF-8 BOM in CSV file which is generated by XSL

Problem description

I'm currently developing a CSV export with XSLT. In my case the CSV file will be opened with Excel 99 percent of the time, so I have to take Excel's behavior into account.

My first problem was German special characters in the CSV. Even though the CSV encoding is UTF-8, Excel does not open the file as UTF-8, and the special characters turn into weird symbols. I found a solution for this problem: I added 3 additional bytes (EF BB BF, a.k.a. the BOM header) at the beginning of the content bytes. The UTF-8 BOM is a way to tell Excel 'hey dude, this is UTF-8, open it properly'. Problem solved!

My second problem was the separator. The default separator can be a comma or a semicolon depending on the region. I think it is a semicolon in Germany and a comma in the UK. So, to avoid this problem, I had to add one of the lines below:

<xsl:text>sep=;</xsl:text>

or

<xsl:text>sep=,</xsl:text>

(In the actual implementation the separator is not hard-coded.)

The problem I cannot find a solution for is this: if you add "sep=;" or "sep=," at the beginning of the file while the CSV file is generated with a UTF-8 BOM, the BOM no longer helps Excel display special characters properly! And I'm sure that the BOM bytes are always at the beginning of the byte array. This screenshot is from MS Excel on Mac OS X:

The first 3 symbols belong to the BOM header.

Have you ever had a problem like this, or do you have any suggestions? Thank you.

Edit:

Here are the screenshots.

a. With BOM and <xsl:text>sep=;</xsl:text>

b. Just with BOM

The Java code:

// Write the bytes
ServletOutputStream out = resp.getOutputStream();
if (contentType.toString().equals("CSV")) {
  // Prefix the content with the UTF-8 BOM (EF BB BF) to indicate that it is UTF-8.
  out.write(0xEF);
  out.write(0xBB);
  out.write(0xBF);
}
out.write(bytes); // Content bytes, in this case the XSL output

The XSL code:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text" version="1.0" encoding="UTF-8" indent="yes" />

  <xsl:template match="/">
    <xsl:text>sep=;</xsl:text>
    <table>
      ...
    </table>
  </xsl:template>

</xsl:stylesheet>

Solution

You are right: there is no way in Excel 2007 to get it to load both the encoding and the separator correctly across different locales when someone double-clicks a CSV file.

It seems that when you specify sep= after the BOM, Excel forgets that the BOM has told it the file is UTF-8.
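To reproduce the two cases from the question outside the servlet, a minimal sketch like the following may help. It builds the byte layout with and without the sep= line so that both files can be opened in Excel side by side. The class name BomSepLayout, the helper names, and the sample rows are hypothetical and not part of the original code.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

public class BomSepLayout {
    // UTF-8 byte order mark: EF BB BF
    static final byte[] UTF8_BOM = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF};

    /** Returns BOM + optional "sep=" line + CSV body, all encoded as UTF-8. */
    static byte[] build(boolean withSepLine, char separator, String csvBody) {
        String text = (withSepLine ? "sep=" + separator + "\r\n" : "") + csvBody;
        byte[] encoded = text.getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(UTF8_BOM, 0, UTF8_BOM.length);
        out.write(encoded, 0, encoded.length);
        return out.toByteArray();
    }

    /** Formats the first 'count' bytes as space-separated uppercase hex. */
    static String hex(byte[] bytes, int count) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < Math.min(count, bytes.length); i++) {
            if (i > 0) sb.append(' ');
            sb.append(String.format("%02X", bytes[i] & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hypothetical sample rows with German umlauts (escaped to avoid source-encoding issues).
        String body = "Name;Stadt\r\nJ\u00FCrgen;K\u00F6ln\r\n";
        System.out.println(hex(build(true, ';', body), 8));  // case a: BOM followed by sep=;
        System.out.println(hex(build(false, ';', body), 8)); // case b: BOM only
    }
}
```

In both cases the BOM bytes come first, which matches the asker's claim that the BOM is always at the beginning of the byte array; the only difference is whether the sep= line follows it.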

You have to specify the separator because in certain locales Excel does not detect it. For instance, in Danish the default separator is ;. If you output tab- or comma-separated text there, Excel does not detect the separator, and in other locales a semicolon-separated file does not load correctly. You can test this by changing the locale format in the Windows settings; Excel then picks it up.

From the question "Is it possible to force Excel recognize UTF-8 CSV files automatically?" and its answers, it seems the only way is to use UTF-16 LE encoding with a BOM.

Note also that, as per http://wiki.scn.sap.com/wiki/display/ABAP/CSV+tests+of+encoding+and+column+separator?original_fqdn=wiki.sdn.sap.com, it seems that UTF-16 LE with tab separators works.

I've wondered whether Excel reads sep=; and then re-calls the method that fetches the CSV text, losing the BOM in the process. I've tried feeding it deliberately incorrect text, and I can't find any workaround that makes Excel honor both the sep line and the encoding.
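The UTF-16 LE workaround mentioned above could be sketched in Java roughly as follows. This is only an illustration under stated assumptions: the class name Utf16TabExport and the encode helper are hypothetical, fields are assumed to contain no tabs, quotes, or line breaks (no escaping is done), and whether Excel opens the result correctly still depends on the Excel version and locale.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

public class Utf16TabExport {
    // UTF-16 little-endian byte order mark: FF FE
    static final byte[] UTF16LE_BOM = {(byte) 0xFF, (byte) 0xFE};

    /** Encodes rows as tab-separated text in UTF-16 LE, prefixed with the BOM. */
    static byte[] encode(List<List<String>> rows) {
        StringBuilder text = new StringBuilder();
        for (List<String> row : rows) {
            // Assumption: fields need no escaping (no tabs, quotes, or newlines).
            text.append(String.join("\t", row)).append("\r\n");
        }
        // StandardCharsets.UTF_16LE does not emit a BOM itself, so it is written manually.
        byte[] body = text.toString().getBytes(StandardCharsets.UTF_16LE);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write(UTF16LE_BOM, 0, UTF16LE_BOM.length);
        out.write(body, 0, body.length);
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // Hypothetical sample data with German umlauts.
        List<List<String>> rows = Arrays.asList(
                Arrays.asList("Name", "Stadt"),
                Arrays.asList("J\u00FCrgen", "K\u00F6ln"));
        byte[] bytes = encode(rows);
        System.out.println(bytes.length + " bytes written");
    }
}
```

In the servlet from the question, the returned array would replace both the three BOM writes and the content bytes, and no sep= line would be emitted.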
