Convert UCS-2 file to UTF-8 with PHP
Question
I have a CSV file supplied by a client which has to be parsed and inserted into a database using PHP.
Before inserting the data into the DB, I want to convert it to UTF-8, but I can't seem to find out how.
This is what I got when trying to detect the file's encoding:
$ enca -d -L zh ./artigos.txt
./artigos.txt: Universal character set 2 bytes; UCS-2; BMP
CRLF line terminators
Byte order reversed in pairs (1,2 -> 2,1)
I tried using the iconv function, but it garbles the conversion: the result contains different characters from the originals.
First line of the file (base64 encoded):
IgAwADMAMQAxADkAIgAsACIANwAzADEAMwA0ADYAMgA2ADQAMAAwADEANQAiACwAIgBBAGcAcgBhAGYAYQBkAG8AcgAgAFIAYQBwAGkAZAAgADkAIABIAGUAYQB2AHkAIABEAHUAdAB5ACIALAAiAEEAZwByAGEAZgBvACAAOQAvADgALAAgADkALwAxADAALAAgADkALwAxADIALAAgADkALwAxADQAIgAsACIAMQAxADAAZgBsAHMAIgAsACIAIgAsACIAIgAsACIAIgAsACIAMAAzADEAMQA5AC4AagBwAGcAIgAsACIAIgAsACIAMQAsADIAMAAiACwAIgA1ADkALAA5ADAAIgAsACIAMgAiACwAIgAwACIALAAiADAAIgAsACIAMAAiACwAIgAwACIALAAiADAAIgAsACIAMAAiACwAIgAwACIALAAiADAAIgAsACIAMAAiACwAIgAwACIALAAiADAAIgAsACIAMAAiACwAIgAwACIALAAiADAAIgAsACIAMAAiACwAIgAwACIALAAiADAAIgAsACIAMAAiACwAIgAwACIALAAiADAAIgAsACIARgBhAGwAcwBlACIADQAK
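One way to check the byte order by hand is to look at the raw bytes of this line. The sketch below is not part of the original post; it decodes only the first 16 base64 characters of the line above and compares both byte orders. The variable names are illustrative. A zero byte following each ASCII byte suggests little-endian data, which may explain the garbled output if iconv was called with a big-endian or unspecified UCS-2 variant (an assumption, since the original iconv call is not shown).

```php
<?php
// First 16 base64 characters of the sample line (12 bytes, 6 characters).
$prefix = base64_decode('IgAwADMAMQAxADkA');

// Each ASCII byte is followed by 00, i.e. low byte first (little endian).
echo bin2hex($prefix), "\n";   // prints 220030003300310031003900

// Decoding as little endian gives the expected text.
$le = iconv('UCS-2LE', 'UTF-8', $prefix);
echo $le, "\n";                // prints "03119

// Decoding the same bytes as big endian gives unrelated characters
// (U+2200, U+3000, ...), which looks like a "messed up" conversion.
$be = iconv('UCS-2BE', 'UTF-8', $prefix);
echo bin2hex($be), "\n";
```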
Solution
This seems to work (little endian), although you didn't include any non-ASCII characters:
$s = 'IgAwADMAMQAxADkAIgAsACIANwAzADEAMwA0ADYAMgA2ADQAMAAwADEANQAiACwAIgBBAGcAcgBhAGYAYQBkAG8AcgAgAFIAYQBwAGkAZAAgADkAIABIAGUAYQB2AHkAIABEAHUAdAB5ACIALAAiAEEAZwByAGEAZgBvACAAOQAvADgALAAgADkALwAxADAALAAgADkALwAxADIALAAgADkALwAxADQAIgAsACIAMQAxADAAZgBsAHMAIgAsACIAIgAsACIAIgAsACIAIgAsACIAMAAzADEAMQA5AC4AagBwAGcAIgAsACIAIgAsACIAMQAsADIAMAAiACwAIgA1ADkALAA5ADAAIgAsACIAMgAiACwAIgAwACIALAAiADAAIgAsACIAMAAiACwAIgAwACIALAAiADAAIgAsACIAMAAiACwAIgAwACIALAAiADAAIgAsACIAMAAiACwAIgAwACIALAAiADAAIgAsACIAMAAiACwAIgAwACIALAAiADAAIgAsACIAMAAiACwAIgAwACIALAAiADAAIgAsACIAMAAiACwAIgAwACIALAAiADAAIgAsACIARgBhAGwAcwBlACIADQAK';
$t = base64_decode($s);
// The decoded data ends in a lone 0x0A byte (half of a two-byte "\n"),
// so the last byte is invalid and is dropped before converting.
echo iconv('UCS-2LE', 'UTF-8', substr($t, 0, -1));
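To apply the same idea to a whole file rather than one base64-encoded line, something like the following sketch might work. It is not from the original answer: ucs2_to_utf8() is a hypothetical helper, the little-endian default is an assumption based on the sample line, and the BOM check only matters if the file actually starts with one. The sketch converts the full byte string first and splits it into lines afterwards. Reading UCS-2 data line by line with fgets() would cut each line after the 0x0A byte, which is probably where the stray final byte in the sample came from.

```php
<?php
// Hypothetical helper, not part of the original answer: converts a complete
// UCS-2 byte string to UTF-8, using a BOM (if present) to pick the byte order.
function ucs2_to_utf8(string $bytes, string $defaultEncoding = 'UCS-2LE'): string
{
    $encoding = $defaultEncoding;

    // A byte order mark tells us the byte order; strip it once detected.
    if (strncmp($bytes, "\xFF\xFE", 2) === 0) {
        $encoding = 'UCS-2LE';
        $bytes = substr($bytes, 2);
    } elseif (strncmp($bytes, "\xFE\xFF", 2) === 0) {
        $encoding = 'UCS-2BE';
        $bytes = substr($bytes, 2);
    }

    // UCS-2 uses two bytes per character, so an odd trailing byte is dropped.
    if (strlen($bytes) % 2 !== 0) {
        $bytes = substr($bytes, 0, -1);
    }

    $utf8 = iconv($encoding, 'UTF-8', $bytes);
    if ($utf8 === false) {
        throw new RuntimeException("iconv could not convert from $encoding");
    }
    return $utf8;
}

// Demo on an in-memory sample that contains one non-ASCII character (é).
$sample = "\xFF\xFE" . iconv('UTF-8', 'UCS-2LE', "\"caf\u{E9}\",\"1,20\"\r\n");
$utf8 = ucs2_to_utf8($sample);

// Split into lines only after conversion, then parse each CSV row.
$rows = [];
foreach (preg_split('/\r\n|\n/', $utf8, -1, PREG_SPLIT_NO_EMPTY) as $line) {
    $rows[] = str_getcsv($line, ',', '"', '\\');
}
print_r($rows);
```

For the file in the question, the input would presumably come from file_get_contents('./artigos.txt') instead of the in-memory sample. It is worth testing with rows that contain accented characters, since the sample line is pure ASCII.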