How to get Content-type using html simple dom?
Problem description
I tried find('meta[http-equiv="Content-type"]'), but it failed to retrieve that information.
Recommended answer
SimpleHTMLDom doesn't use quoted string literals in the selector; it's just elem[attr=value]. The comparison of value also seems to be case-sensitive (there may be a way to make it case-insensitive, but I don't know of one).*
For example:
require 'simple_html_dom.php';
$html = file_get_html('http://www.google.com/');
// most likely only one element, but foreach doesn't hurt
foreach( $html->find('meta[http-equiv=content-type]') as $ct ) {
echo $ct->content, "\n";
}
This prints text/html; charset=ISO-8859-1.
*edit: yes, there is a way to perform a case-insensitive match: use *= instead of =
find('meta[http-equiv*=content-type]')
edit 2: By the way, http-equiv*=content-type would also match <meta http-equiv="haha-no-content-types"... (it only tests whether the string occurs somewhere in the attribute's value). But it's the only case-insensitive function/operator I could find. I guess you can live with it in this case ;-)
edit 3: It uses preg_match('.../i'), and the pattern/selector is passed directly to that function. Therefore you could do something like http-equiv*=^content-type$ to match http-equiv="Content-type" but not http-equiv="xyzContent-typeabc". But I don't know whether this is an intended feature that you can rely on.
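To make the difference between the plain and the anchored pattern concrete, here is a minimal sketch that calls PHP's preg_match directly, outside of SimpleHTMLDom. The helper matches_ci and the sample attribute values are hypothetical and only mimic the case-insensitive test described in edit 3; how the library builds its pattern internally is an assumption taken from that edit, not something verified here.

```php
<?php
// Hypothetical sample values for the http-equiv attribute (illustration only).
$values = ['Content-Type', 'content-type', 'haha-no-content-types', 'xyzContent-typeabc'];

// Hypothetical helper: case-insensitive regex test of $pattern against $value.
// This only mimics the preg_match('.../i') behaviour described above;
// it is not SimpleHTMLDom's actual code.
function matches_ci(string $pattern, string $value): bool
{
    // '~' is used as the delimiter, so $pattern must not contain '~'.
    return preg_match('~' . $pattern . '~i', $value) === 1;
}

foreach ($values as $value) {
    printf(
        "%-24s substring: %-3s anchored: %s\n",
        $value,
        matches_ci('content-type', $value) ? 'yes' : 'no',
        matches_ci('^content-type$', $value) ? 'yes' : 'no'
    );
}
```

The unanchored pattern accepts all four values, while the anchored one accepts only Content-Type and content-type. If the selector text really is handed to preg_match unchanged, the same distinction should carry over to http-equiv*=^content-type$.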