File Crawler PHP


Problem description

Just wondering how it would be possible to recursively search through a website's directory (the same one the script is uploaded to), open/read every file, and search for a specific string?

For example, I might have this:

search.php?string=hello%20world

This would run a process and then output something like:

"hello world found inside"

httpdocs
/index.php
/contact.php

httpdocs/private/
../prviate.php
../morestuff.php
../tastey.php

httpdocs/private/love
../../goodness.php

I don't want it to crawl by following links, since there are private and unlinked files around, but I'd like every other non-binary file to be searched.

Many thanks

Owen

Solution

Two immediate solutions come to mind.

1) Using grep with the exec command (only if the server supports it):

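$query = $_GET['string'];
$found = array();
// escapeshellarg() adds its own quotes, so don't wrap the result in more quotes
$cmd = 'grep -Ril -- ' . escapeshellarg($query) . ' ' . escapeshellarg($_SERVER['DOCUMENT_ROOT']);
exec($cmd, $found);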

Once it finishes, $found holds the path of every file that contains the query. You can iterate over this array and process or display it as needed.
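
As a rough sketch of that "iterate and display" step, the helper below groups the paths in $found by directory, which is close to the layout asked for in the question. The function name group_by_directory and the sample paths are made up for illustration; they are not part of the original answer.

```php
<?php
// Hypothetical helper: groups file paths by their parent directory,
// relative to $root.
function group_by_directory(array $paths, $root)
{
    $root = rtrim($root, '/');
    $groups = array();
    foreach ($paths as $path) {
        // Strip the document root so only the site-relative path is shown.
        $relative = (strpos($path, $root . '/') === 0)
            ? substr($path, strlen($root))
            : $path;
        $groups[dirname($relative)][] = basename($relative);
    }
    ksort($groups);
    return $groups;
}

// Sample data standing in for the $found array filled by exec().
$found = array(
    '/var/www/httpdocs/index.php',
    '/var/www/httpdocs/private/morestuff.php',
    '/var/www/httpdocs/contact.php',
);

foreach (group_by_directory($found, '/var/www/httpdocs') as $dir => $files) {
    echo $dir, "\n";
    foreach ($files as $file) {
        // Escape names in case the output ends up in an HTML page.
        echo '    ', htmlspecialchars($file), "\n";
    }
}
```

Printing plain names rather than links keeps private files from being exposed as clickable URLs.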

2) Recursively loop through the folder and open each file, search for the string, and save it if found:

function search($file, $query, &$found) {
    if (is_file($file)) {
        $contents = file_get_contents($file);
        if ($contents !== false && strpos($contents, $query) !== false) {
            // file contains the query string
            $found[] = $file;
        }
    } elseif (is_dir($file) && ($dh = opendir($file)) !== false) {
        // $file is a readable directory
        // compare against false so an entry named "0" doesn't end the loop early
        while (($entry = readdir($dh)) !== false) {
            if ($entry != '.' && $entry != '..') {
                // call search() on the found file/directory
                search($file . '/' . $entry, $query, $found);
            }
        }
        closedir($dh);
    }
}

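$query = isset($_GET['string']) ? $_GET['string'] : '';
$found = array();
if ($query !== '') {
    // an empty needle would match every file (or raise a warning before PHP 8)
    search($_SERVER['DOCUMENT_ROOT'], $query, $found);
}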

This is untested, but it should recursively search every subfolder and file for the requested string. Each file that contains it ends up in the $found array.
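
The question also asks to leave out binary files, which neither approach above does explicitly. One possible variation, sketched here with PHP's built-in SPL iterators instead of the hand-written recursion, treats any file with a NUL byte in its first 8000 bytes as binary. That check is only a heuristic assumed for this example, and the names looks_binary and search_text_files are made up.

```php
<?php
// Heuristic (an assumption for this sketch): a NUL byte in the first
// 8000 bytes means the file is probably binary. Unreadable files are
// treated as binary so they get skipped too.
function looks_binary($path)
{
    $head = @file_get_contents($path, false, null, 0, 8000);
    return $head === false || strpos($head, "\0") !== false;
}

// Walks $root recursively and returns the sorted paths of the text
// files that contain $query (case-sensitive, like strpos()).
function search_text_files($root, $query)
{
    $found = array();
    if ($query === '' || !is_dir($root)) {
        return $found;
    }
    $iterator = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($root, FilesystemIterator::SKIP_DOTS)
    );
    foreach ($iterator as $file) {
        $path = $file->getPathname();
        if (!$file->isFile() || looks_binary($path)) {
            continue;
        }
        if (strpos(file_get_contents($path), $query) !== false) {
            $found[] = $path;
        }
    }
    sort($found);
    return $found;
}
```

It could be called as search_text_files($_SERVER['DOCUMENT_ROOT'], $query). Note that RecursiveDirectoryIterator throws an exception when it hits a directory it cannot open, so a real script would probably want a try/catch around the loop.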
