Convert UCS-2 file to UTF-8 with PHP


Problem description




I have a CSV file supplied by a client which has to be parsed and inserted into a database using PHP.

Before inserting the data into the DB, I want to convert it to UTF-8, but I can't seem to find out how.

This is what I got when trying to detect the file's encoding:

    $ enca -d -L zh ./artigos.txt
    ./artigos.txt: Universal character set 2 bytes; UCS-2; BMP
      CRLF line terminators
      Byte order reversed in pairs (1,2 -> 2,1)
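
The "byte order reversed in pairs" line suggests little-endian data, which matches the accepted answer. As a rough cross-check in PHP, the sketch below guesses the byte order from a byte order mark or, failing that, from where the zero bytes fall in mostly-ASCII text. `guessUcs2ByteOrder` is a hypothetical helper, not part of PHP or of the original answer, and the zero-byte heuristic is an assumption that only holds for ASCII-heavy content.

```php
<?php
// Hypothetical helper (not from the original post): guess UCS-2 byte order.
function guessUcs2ByteOrder(string $bytes): ?string
{
    // A byte order mark, when present, settles the question.
    if (strncmp($bytes, "\xFF\xFE", 2) === 0) {
        return 'UCS-2LE';
    }
    if (strncmp($bytes, "\xFE\xFF", 2) === 0) {
        return 'UCS-2BE';
    }

    // No BOM: in mostly-ASCII text the zero byte of each pair sits at
    // odd offsets for little endian and at even offsets for big endian.
    $evenZeros = 0;
    $oddZeros = 0;
    $length = min(strlen($bytes), 4096); // a prefix is enough for a guess
    for ($i = 0; $i < $length; $i++) {
        if ($bytes[$i] === "\x00") {
            if ($i % 2 === 0) {
                $evenZeros++;
            } else {
                $oddZeros++;
            }
        }
    }

    if ($oddZeros > $evenZeros) {
        return 'UCS-2LE';
    }
    if ($evenZeros > $oddZeros) {
        return 'UCS-2BE';
    }
    return null; // undecided
}

// The first six bytes of the sample line are 22 00 30 00 33 00.
echo guessUcs2ByteOrder(base64_decode('IgAwADMA')), "\n"; // prints UCS-2LE
```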

I tried using the iconv function, but it messes up the conversion and the result shows different characters than the originals.
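
The question does not show the failing iconv call, so the cause is a guess. One common way to end up with "different characters" is to convert with the wrong byte order; the minimal sketch below reproduces that effect on two characters built with `pack()`.

```php
<?php
// "AB" as UCS-2 little endian: bytes 41 00 42 00.
$bytes = pack('v*', 0x41, 0x42);

// Correct byte order: the text survives the conversion.
$right = iconv('UCS-2LE', 'UTF-8', $bytes);

// Wrong byte order: each pair is read as 0x4100 and 0x4200,
// which are unrelated CJK code points.
$wrong = iconv('UCS-2BE', 'UTF-8', $bytes);

echo $right, "\n";          // prints AB
echo bin2hex($wrong), "\n"; // prints e48480e48880 (U+4100 and U+4200 in UTF-8)
```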

First line of the file (base64 encoded):

IgAwADMAMQAxADkAIgAsACIANwAzADEAMwA0ADYAMgA2ADQAMAAwADEANQAiACwAIgBBAGcAcgBhAGYAYQBkAG8AcgAgAFIAYQBwAGkAZAAgADkAIABIAGUAYQB2AHkAIABEAHUAdAB5ACIALAAiAEEAZwByAGEAZgBvACAAOQAvADgALAAgADkALwAxADAALAAgADkALwAxADIALAAgADkALwAxADQAIgAsACIAMQAxADAAZgBsAHMAIgAsACIAIgAsACIAIgAsACIAIgAsACIAMAAzADEAMQA5AC4AagBwAGcAIgAsACIAIgAsACIAMQAsADIAMAAiACwAIgA1ADkALAA5ADAAIgAsACIAMgAiACwAIgAwACIALAAiADAAIgAsACIAMAAiACwAIgAwACIALAAiADAAIgAsACIAMAAiACwAIgAwACIALAAiADAAIgAsACIAMAAiACwAIgAwACIALAAiADAAIgAsACIAMAAiACwAIgAwACIALAAiADAAIgAsACIAMAAiACwAIgAwACIALAAiADAAIgAsACIAMAAiACwAIgAwACIALAAiADAAIgAsACIARgBhAGwAcwBlACIADQAK

Solution

This seems to work (little endian), although you didn't include any non-ASCII characters:

$s='IgAwADMAMQAxADkAIgAsACIANwAzADEAMwA0ADYAMgA2ADQAMAAwADEANQAiACwAIgBBAGcAcgBhAGYAYQBkAG8AcgAgAFIAYQBwAGkAZAAgADkAIABIAGUAYQB2AHkAIABEAHUAdAB5ACIALAAiAEEAZwByAGEAZgBvACAAOQAvADgALAAgADkALwAxADAALAAgADkALwAxADIALAAgADkALwAxADQAIgAsACIAMQAxADAAZgBsAHMAIgAsACIAIgAsACIAIgAsACIAIgAsACIAMAAzADEAMQA5AC4AagBwAGcAIgAsACIAIgAsACIAMQAsADIAMAAiACwAIgA1ADkALAA5ADAAIgAsACIAMgAiACwAIgAwACIALAAiADAAIgAsACIAMAAiACwAIgAwACIALAAiADAAIgAsACIAMAAiACwAIgAwACIALAAiADAAIgAsACIAMAAiACwAIgAwACIALAAiADAAIgAsACIAMAAiACwAIgAwACIALAAiADAAIgAsACIAMAAiACwAIgAwACIALAAiADAAIgAsACIAMAAiACwAIgAwACIALAAiADAAIgAsACIARgBhAGwAcwBlACIADQAK';
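$t = base64_decode($s);
// The sample ends in 0D 00 0A: the final 0A is only half of a 2-byte
// line feed, so drop that stray byte before converting.
echo iconv('UCS-2LE', 'UTF-8', substr($t, 0, -1));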
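
The stray last byte hints that the sample line was cut at the 0x0A byte, which splits a 2-byte line feed in half. A way to sidestep that, sketched below under the assumption that the file fits in memory, is to convert the whole content first and only then parse it as CSV. `ucs2CsvToRows` is a hypothetical helper name, and the comma delimiter and double-quote enclosure are taken from the sample line.

```php
<?php
// Hypothetical helper: convert raw UCS-2 CSV bytes to UTF-8 rows.
function ucs2CsvToRows(string $raw, string $fromEncoding = 'UCS-2LE'): array
{
    // Drop a leading byte order mark if there is one.
    if (strncmp($raw, "\xFF\xFE", 2) === 0 || strncmp($raw, "\xFE\xFF", 2) === 0) {
        $raw = substr($raw, 2);
    }
    // UCS-2 uses 2 bytes per character; discard a dangling odd byte.
    if (strlen($raw) % 2 === 1) {
        $raw = substr($raw, 0, -1);
    }

    $utf8 = iconv($fromEncoding, 'UTF-8', $raw);
    if ($utf8 === false) {
        throw new RuntimeException('Conversion to UTF-8 failed');
    }

    // Parse the converted text through an in-memory stream so that
    // fgetcsv() only ever sees UTF-8 line endings.
    $stream = fopen('php://temp', 'r+');
    fwrite($stream, $utf8);
    rewind($stream);

    $rows = [];
    while (($row = fgetcsv($stream, 0, ',', '"', '\\')) !== false) {
        if ($row === [null]) {
            continue; // skip blank lines
        }
        $rows[] = $row;
    }
    fclose($stream);

    return $rows;
}

// In-memory example; for a real file the input would come from
// file_get_contents() on the client's CSV.
$sample = iconv('UTF-8', 'UCS-2LE', "\"03119\",\"Agrafador\"\r\n");
$rows = ucs2CsvToRows($sample);
```

Each row is then plain UTF-8 and can be passed to whatever prepared statement does the insert.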
