file_get_contents() converts UTF-8 to ISO-8859-1


Question

I am trying to get search results from yahoo.com.

But file_get_contents() returns the content in ISO-8859-1 instead of UTF-8, the charset that Yahoo uses.

I tried:

$filename = "http://search.yahoo.com/search;_ylt=A0oG7lpgGp9NTSYAiQBXNyoA?p=naj%C5%A1%C5%A5astnej%C5%A1%C3%AD&fr2=sb-top&fr=yfp-t-701&type_param=&rd=pref";

echo file_get_contents($filename);

Attempts such as the following:

header('Content-Type: text/html; charset=UTF-8');

<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />

$er = mb_convert_encoding($filename , 'UTF-8');

$s2 = iconv("ISO-8859-1","UTF-8",$filename );

echo utf8_encode(file_get_contents($filename));

None of these help, because after fetching the web content, special characters such as š, ť and ž are replaced with question marks (???).

I would appreciate any help.

Recommended answer

This seems to be a content negotiation problem: file_get_contents probably sends a request that only accepts ISO 8859-1 as the character encoding.

You can use stream_context_create to build a custom stream context for file_get_contents that explicitly states that you accept UTF-8:

$opts = array('http' => array('header' => 'Accept-Charset: UTF-8, *;q=0'));
$context = stream_context_create($opts);

$filename = "http://search.yahoo.com/search;_ylt=A0oG7lpgGp9NTSYAiQBXNyoA?p=naj%C5%A1%C5%A5astnej%C5%A1%C3%AD&fr2=sb-top&fr=yfp-t-701&type_param=&rd=pref";
echo file_get_contents($filename, false, $context);
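If the server ignores the Accept-Charset header, a fallback is to read the charset the response actually declares and convert the body yourself. The following is a minimal sketch of that idea, not part of the original answer: the helper names charset_from_headers and body_to_utf8 are hypothetical, and it assumes that a response without a declared charset is already UTF-8.

```php
<?php
// Hypothetical helper: extract the charset from raw HTTP response header
// lines, e.g. the $http_response_header array that PHP fills in after an
// http:// fetch. Returns null when no charset is declared.
function charset_from_headers(array $headers): ?string
{
    $charset = null;
    foreach ($headers as $line) {
        // The last Content-Type line wins, which matters after redirects.
        if (preg_match('/^Content-Type:(.*)$/i', $line, $m)) {
            $charset = preg_match('/charset\s*=\s*"?([\w.:-]+)/i', $m[1], $c)
                ? strtoupper($c[1])
                : null;
        }
    }
    return $charset;
}

// Hypothetical helper: return the body as UTF-8, converting only when the
// declared charset is something else.
// Assumption: a missing charset means the body is already UTF-8.
function body_to_utf8(string $body, ?string $charset): string
{
    if ($charset === null || $charset === 'UTF-8') {
        return $body;
    }
    // iconv() returns false for unknown charsets or invalid input;
    // in that case the body is returned unchanged.
    $converted = @iconv($charset, 'UTF-8', $body);
    return $converted === false ? $body : $converted;
}
```

After a call such as $body = file_get_contents($filename, false, $context), the result could be passed through body_to_utf8($body, charset_from_headers($http_response_header)). Note that š, ť and ž do not exist in ISO-8859-1, so if the server has already replaced them with question marks, no conversion on the client can bring them back; this only helps when the original bytes arrive intact.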
