Parse Apache log in PHP using preg_match
Question
I need to save data in a table (for reporting, stats, etc.) so that a user can search by time, user agent, and so on. I have a script that runs every day, reads the Apache log, and inserts it into the database.
Log format:
10.1.1.150 - - [29/September/2011:14:21:49 -0400] "GET /info/ HTTP/1.1" 200 9955 "http://www.domain.com/download/" "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_8; de-at) AppleWebKit/533.21.1 (KHTML, like Gecko) Version/5.0.5 Safari/533.21.1"
My regex:
preg_match('/^(\S+) (\S+) (\S+) \[([^:]+):(\d+:\d+:\d+) ([^\]]+)\] \"(\S+) (.*?) (\S+)\" (\S+) (\S+) (\".*?\") (\".*?\")$/',$log, $matches);
Now when I print the matches:
print_r($matches);
Array
(
[0] => 10.1.1.150 - - [29/September/2011:14:21:49 -0400] "GET /info/ HTTP/1.1" 200 9955 "http://www.domain.com/download/" "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_8; de-at) AppleWebKit/533.21.1 (KHTML, like Gecko) Version/5.0.5 Safari/533.21.1"
[1] => 10.1.1.150
[2] => -
[3] => -
[4] => 29/September/2011
[5] => 14:21:49
[6] => -0400
[7] => GET
[8] => /info/
[9] => HTTP/1.1
[10] => 200
[11] => 9955
[12] => "http://www.domain.com/download/"
[13] => "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_6_8; de-at) AppleWebKit/533.21.1 (KHTML, like Gecko) Version/5.0.5 Safari/533.21.1"
)
I get "http://www.domain.com/download/" with the quotes included, and the same for the user agent. How can I get rid of these " characters in the regex? Bonus: is there a quick way to insert the date/time easily?
Thanks
Recommended answer
To parse an Apache access_log line in PHP, you can use this regex. It moves the quotes outside the last two capture groups, so the referrer and user agent are captured without them:
$regex = '/^(\S+) (\S+) (\S+) \[([^:]+):(\d+:\d+:\d+) ([^\]]+)\] \"(\S+) (.*?) (\S+)\" (\S+) (\S+) "([^"]*)" "([^"]*)"$/';
preg_match($regex, $log, $matches);
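For the bonus question about the date/time, one possible approach is to rebuild the timestamp from captures 4 to 6 and let DateTime::createFromFormat parse it. The sketch below reuses the regex above. The function name parseAccessLogLine, the returned array keys, and the choice to store the time in UTC are assumptions for illustration, not part of the original answer.

```php
<?php
// Sketch only: parseAccessLogLine and its array keys are hypothetical names.
function parseAccessLogLine(string $line): ?array
{
    // Same pattern as in the answer above.
    $regex = '/^(\S+) (\S+) (\S+) \[([^:]+):(\d+:\d+:\d+) ([^\]]+)\] \"(\S+) (.*?) (\S+)\" (\S+) (\S+) "([^"]*)" "([^"]*)"$/';
    if (preg_match($regex, $line, $m) !== 1) {
        return null; // the line does not follow the expected format
    }

    // Captures 4, 5 and 6 are the date, the time and the UTC offset.
    // "M" accepts both short ("Sep") and full ("September") month names.
    $when = DateTime::createFromFormat('d/M/Y H:i:s O', "$m[4] $m[5] $m[6]");
    if ($when === false) {
        return null; // the timestamp could not be parsed
    }
    // Assumption: timestamps are stored in UTC.
    $when->setTimezone(new DateTimeZone('UTC'));

    return [
        'ip'         => $m[1],
        'time_utc'   => $when->format('Y-m-d H:i:s'), // fits a DATETIME column
        'method'     => $m[7],
        'path'       => $m[8],
        'status'     => (int) $m[10],
        'referer'    => $m[12], // no surrounding quotes
        'user_agent' => $m[13], // no surrounding quotes
    ];
}
```

Calling parseAccessLogLine($log) on the sample line from the question should return an array whose referer and user_agent values no longer carry quotes, and whose time_utc value can be bound directly to a prepared INSERT statement.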
To match the Apache error_log format, you can use this regex:
$regex = '/^\[([^\]]+)\] \[([^\]]+)\] (?:\[client ([^\]]+)\])?\s*(.*)$/i';
preg_match($regex, $log, $matches);
$matches[1] = date and time
$matches[2] = severity
$matches[3] = client address (if present)
$matches[4] = log message
It matches lines with or without the client:
[Tue Feb 28 11:42:31 2012] [notice] Apache/2.4.1 (Unix) mod_ssl/2.4.1 OpenSSL/0.9.8k PHP/5.3.10 configured -- resuming normal operations
[Tue Feb 28 14:34:41 2012] [error] [client 192.168.50.10] Symbolic link not allowed or link target not accessible: /usr/local/apache2/htdocs/x.js
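As a minimal sketch of how the error_log regex might be used, the function below maps the four captures to named fields. When the optional client group does not take part in the match, preg_match leaves $matches[3] as an empty string, so the sketch turns that into null. The function name parseErrorLogLine and the array keys are assumptions for illustration.

```php
<?php
// Sketch only: parseErrorLogLine and its array keys are hypothetical names.
function parseErrorLogLine(string $line): ?array
{
    // Same pattern as in the answer above.
    $regex = '/^\[([^\]]+)\] \[([^\]]+)\] (?:\[client ([^\]]+)\])?\s*(.*)$/i';
    if (preg_match($regex, $line, $m) !== 1) {
        return null; // the line does not follow the expected format
    }

    return [
        'time'     => $m[1],
        'severity' => $m[2],
        // An unmatched optional group yields '', which is stored as null here.
        'client'   => $m[3] !== '' ? $m[3] : null,
        'message'  => $m[4],
    ];
}
```

With the two sample lines above, the first should yield a null client and the second a client of 192.168.50.10.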