Why is file_get_contents returning strange characters?
Question
I am trying to parse http://www.desi-tashan.com/category/pakistan-tvs/aaj-tv/3-idiots/ with file_get_contents.
But it returns very unusual characters and symbols.
Whereas if I parse http://www.desi-tashan.com/ it works fine. Could someone tell me why this is happening?
Is some kind of encoding or decoding involved?
The page seems to be made with WordPress.
Recommended answer
The content you are seeing is gzipped.
You may be interested in gzdecode or zlib_decode (note that Zlib support is not enabled in PHP by default).
Your code might look like this:
$url = 'http://www.desi-tashan.com/category/pakistan-tvs/aaj-tv/3-idiots/';
$content = file_get_contents($url);
$decoded_content = gzdecode($content); // or zlib_decode($content);
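Calling gzdecode on a body that is not compressed fails, which matters here because the site's home page reportedly comes back readable while the category page does not. A minimal sketch of a guard, assuming the Zlib extension is loaded; the helper name decode_if_gzipped is hypothetical and not part of the original answer:

```php
<?php
// Hypothetical helper (not from the original answer): decode the body only
// when it actually looks like a gzip stream.
function decode_if_gzipped($body)
{
    // Every gzip stream starts with the two magic bytes 0x1f 0x8b.
    if (strncmp($body, "\x1f\x8b", 2) === 0) {
        $decoded = gzdecode($body);
        // gzdecode() returns false on corrupt input; fall back to the raw body.
        return $decoded === false ? $body : $decoded;
    }
    // Not gzipped: return the body unchanged.
    return $body;
}
```

With this in place, $content = decode_if_gzipped(file_get_contents($url)); could be used for both the home page and the category page.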
Another solution, posted here on Stack Overflow, adds an Accept-Encoding HTTP header to the request, telling the server NOT to gzip the response.
However, it doesn't work on www.desi-tashan.com: the server ignores the Accept-Encoding header and always returns gzipped content.
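For servers that do honor the header, the request-side approach can be sketched as follows. The header value identity is an assumption (the answer does not say which value the linked solution sends), and both helper names are hypothetical. Because a server may ignore the header, as this one does, the sketch also checks the response headers for Content-Encoding: gzip:

```php
<?php
// Hypothetical helper: builds a stream context that asks the server for an
// uncompressed body. The "identity" value is an assumption, not taken from
// the original answer.
function no_gzip_context()
{
    return stream_context_create([
        'http' => [
            'method' => 'GET',
            'header' => "Accept-Encoding: identity\r\n",
        ],
    ]);
}

// Hypothetical helper: returns true if any raw response header line
// declares a gzip content encoding.
function response_is_gzipped(array $headerLines)
{
    foreach ($headerLines as $line) {
        if (preg_match('/^Content-Encoding:\s*gzip\b/i', $line)) {
            return true;
        }
    }
    return false;
}

// Usage (performs a network request, so it is left commented out):
// $content = file_get_contents($url, false, no_gzip_context());
// if (response_is_gzipped($http_response_header)) {
//     $content = gzdecode($content); // the server ignored the header
// }
```

The usage lines rely on $http_response_header, which PHP's HTTP wrapper fills in after file_get_contents; treat that part as a sketch rather than a tested recipe for this particular site.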