How can I get an HTML page as a string via PHP?
Question
I am fetching some info from a webpage via PHP, using simple_html_dom and curl. The problem is that the page's HTML is malformed, so the DOM object contains erroneous info.
How can I get the HTML file as a string in a PHP var so that I can run a regular expression through it?
Curl doesn't work, as it ignores the bad part. simple_html_dom.php has the same issue. wget doesn't work since I don't have permissions for it on the server.
Recommended answer
file_get_contents: reads an entire file into a string
string file_get_contents ( string $filename [, int $flags = 0 [, resource $context [, int $offset = -1 [, int $maxlen = -1 ]]]] )
This function is similar to file(), except that file_get_contents() returns the file in a string, starting at the specified offset up to maxlen bytes. On failure, file_get_contents() will return FALSE.
file_get_contents() is the preferred way to read the contents of a file into a string. It will use memory mapping techniques if supported by your OS to enhance performance.
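To illustrate the offset and maxlen parameters described above, here is a minimal sketch that reads only a slice of a file. The helper name read_slice and the sample text are assumptions made for this example; they are not part of the original answer.

```php
<?php
// Hypothetical helper: read $length bytes starting at byte $offset.
// The second argument (flags / use_include_path) and the third
// (stream context) are left at their neutral values.
function read_slice(string $path, int $offset, int $length): string
{
    $data = file_get_contents($path, false, null, $offset, $length);
    if ($data === false) {
        throw new RuntimeException("Could not read $path");
    }
    return $data;
}

// Usage with a temporary file holding sample text (assumed content).
$demo = tempnam(sys_get_temp_dir(), 'fgc');
file_put_contents($demo, 'Hello, world!');
echo read_slice($demo, 7, 5), PHP_EOL; // bytes 7..11 of the file
unlink($demo);
```

Seeking with an offset is reliable for local files; for remote URLs it may not be supported, so treat this as a local-file technique.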
It works with both web pages and files. You can grab the HTML just by passing a URL such as "http://whatever.com/page.html" as $filename (this requires the allow_url_fopen setting to be enabled).
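Putting this together with the question's goal of running a regular expression over the raw markup, a minimal sketch might look like the following. The helper names fetch_html and extract_title and the title pattern are assumptions made for illustration, and the URL is the placeholder from the answer.

```php
<?php
// Hypothetical helper: fetch a page (or a local file) as a raw string.
// Nothing is parsed, so malformed markup is returned exactly as served.
function fetch_html(string $source): string
{
    $html = @file_get_contents($source);
    if ($html === false) {
        throw new RuntimeException("Could not read $source");
    }
    return $html;
}

// Hypothetical helper: return the text of the first <title> element,
// or null if the pattern does not match.
function extract_title(string $html): ?string
{
    // "i" = case-insensitive, "s" = let "." match newlines.
    if (preg_match('/<title[^>]*>(.*?)<\/title>/is', $html, $m) === 1) {
        return trim($m[1]);
    }
    return null;
}

// Usage (placeholder URL; requires allow_url_fopen for remote sources):
// $html = fetch_html('http://whatever.com/page.html');
// var_dump(extract_title($html));
```

Because the string is never passed through a DOM parser, the "bad part" of the page stays in $html and remains available to the regular expression.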