Building a connection log system
Problem description
I'm building a "smart" log system that monitors customer connections, for example the times at which each connection to the server starts and stops.
Raw log:
Dec 19 00:00:03 172.16.20.24 pppoe,ppp,info <pppoe-customer1>: terminating... - peer is not responding
Dec 19 00:00:03 172.16.20.24 pppoe,ppp,info,account customer1 logged out, 4486 1009521 23444247 12573 18159
Dec 19 00:00:03 172.16.20.24 pppoe,ppp,info <pppoe-customer1>: disconnected
Dec 19 00:00:07 172.16.20.24 pppoe,info PPPoE connection established from 60:E3:27:A2:60:09
Dec 19 00:00:08 172.16.20.24 pppoe,ppp,info,account customer2 logged in, 10.171.3.185
Dec 19 00:00:08 172.16.20.24 pppoe,ppp,info <pppoe-customer2>: authenticated
Dec 19 00:00:08 172.16.20.24 pppoe,ppp,info <pppoe-customer2>: connected
Dec 19 00:00:13 172.16.20.24 pppoe,info PPPoE connection established from C0:25:E9:7F:C0:41
Dec 19 00:00:14 172.16.20.24 pppoe,ppp,error <ccfa>: user customer3 authentication failed
Dec 19 00:00:32 172.16.20.24 pppoe,info PPPoE connection established from C0:25:E9:7F:C0:41
Dec 19 00:00:36 172.16.20.24 pppoe,ppp,error <ccfb>: user customer3 authentication failed
Dec 19 00:01:06 172.16.20.24 pppoe,info PPPoE connection established from C0:25:E9:7F:C0:41
What matters to me is capturing the lines that contain the connected and disconnected strings.
Here is what I have so far:
import os
import re
import sys

f = open('log.log','r')
log = []
for line in f:
    if re.search(r': connected|: disconnected',line):
        ob = dict()
        ob['USER'] = re.search(r'<pppoe(.*?)>',line).group(0).replace("<pppoe-","").replace(">","")
        ob['DATA'] = re.search(r'^\w{3} \d{2} \d{2}:\d{2}:\d{2}',line).group(0)
        ob['CONNECTION'] = re.search(r': .*',line).group(0).replace(": ", "")
        log.append(ob)
I'm still learning, so that's not the most brilliant regex, but it works. Now I need to refine this log list and get to something like this sample:
{"connection" : [{
    "start" : "Dec 19 10:12:58",
    "username" : "customer2"}]}
{"connection" : [{
    "start" : "Dec 20 10:12:58",
    "username" : "customer1"}]}
{"connection" : [{
    "start" : "Dec 19 10:12:58",
    "stop" : "Dec 22 10:04:35",
    "username" : "customer4"}]}
{"connection" : [{
    "start" : "Dec 19 10:12:58",
    "stop" : "Dec 24 10:04:35",
    "username" : "customer3"}]}
My obstacles:
- The raw log is constantly being generated, so I need to check whether a user already exists. If so, I update the connection (customer2 dropped his connection, so I need to record it!). But what happens if he drops connections constantly?
For example:
Dec 19 10:20:58 172.16.20.24 pppoe,ppp,info <pppoe-customer2>: disconnected
Dec 19 01:00:36 172.16.20.24 pppoe,ppp,error <ccfb>: user customer3 authentication failed
Dec 19 01:01:06 172.16.20.24 pppoe,info PPPoE connection established from C0:25:E9:7F:C0:41
Dec 19 10:21:38 172.16.20.24 pppoe,ppp,info <pppoe-customer2>: authenticated
Dec 19 10:21:48 172.16.20.24 pppoe,ppp,info <pppoe-customer2>: connected
Dec 19 10:22:38 172.16.20.24 pppoe,ppp,info <pppoe-customer3>: authenticated
Dec 19 10:22:58 172.16.20.24 pppoe,ppp,info <pppoe-customer2>: disconnected
The first disconnection is simple to add.
{"connection" : [{
    "start" : "Dec 19 10:12:58",
    "stop" : "Dec 19 10:20:58",
    "username" : "customer2"}]}
On the next authentication, I need to find this specific user, insert the new "start" connection time and erase "stop". And so on.
{"connection" : [{
    "start" : "Dec 19 10:21:48",
    "username" : "customer2"}]}
- My next challenge is creating this new refined list.
I tried the following, but it does not work!
conn = []
for l in log:
    obcon = dict()
    if not obcon:
        obcon['USER'] = l['USER']
        if l['DATA'] == 'connected':
            obcon['START'] = l['DATA']
            obcon['STOP'] = ""
        else:
            obcon['STOP'] = l['DATA']
    conn.append(obcon)
Before building the new list, I need to check whether the user already exists; if not, create it! I use ['CONNECTION'] to identify where connections start and stop:
Disconnected -> STOP
Connected -> START
I don't know if I need to be more specific. I need ideas. Please!
Recommended answer
In my opinion, the variable log should be a dict, as that makes it easier to find an existing user's data.
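To illustrate why a dict helps here, the following is a minimal sketch comparing a scan over a list of records with a lookup in a dict keyed by user name. The sample records and the helper name find_in_list are made up for this illustration and are not part of the original code.

```python
# Hypothetical sample records, shaped like the entries built in the question.
records = [
    {'USER': 'customer1', 'CONNECTION': 'disconnected'},
    {'USER': 'customer2', 'CONNECTION': 'connected'},
]


def find_in_list(items, user):
    """Scan the list until a record for `user` is found; return None otherwise."""
    for item in items:
        if item['USER'] == user:
            return item
    return None


# With a dict keyed by user name, finding a user is a single key access.
by_user = {rec['USER']: rec for rec in records}

# setdefault returns the existing entry, or stores and returns the given default.
existing = by_user.setdefault('customer2', {'USER': 'customer2'})
created = by_user.setdefault('customer3', {'USER': 'customer3'})
```

With the list, every check for an existing user walks the records one by one; with the dict, the same check and the "create if missing" step collapse into one setdefault call.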
Next, you used re.search(...).group(0) everywhere, which is the entire matched string. For example, when extracting the user name you wrote '<pppoe(.*?)>', but the name itself is in group(1) (in a regex, parentheses capture the part of the match you want to extract).
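As a small illustration of the difference between the two groups, here is a sketch that runs the pattern against one line taken from the raw log above. The variable names whole and user are chosen only for this example.

```python
import re

line = 'Dec 19 00:00:08 172.16.20.24 pppoe,ppp,info <pppoe-customer2>: connected'
match = re.search(r'<pppoe-(.*?)>', line)

whole = match.group(0)  # the entire matched text: '<pppoe-customer2>'
user = match.group(1)   # only the first parenthesised group: 'customer2'
```

Using group(1) removes the need for the chained replace() calls that strip "<pppoe-" and ">" afterwards.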
My suggestion is (note: I removed the imports of sys and os, as they are not used):
import re

log = dict()
with open('log.log', 'r') as f:
    for line in f:
        # matches ": connected" or ": disconnected"; group(1) holds the state
        reg = re.search(r': ((?:dis)?connected)', line)
        if reg is not None:
            user = re.search(r'<pppoe-(.*?)>', line).group(1)
            # reuse the user's entry if it exists, else create one holding the user name
            ob = log.setdefault(user, {'USER': user})
            ob['CONNECTION'] = reg.group(1)
            time = re.search(r'^\w{3} \d{2} \d{2}:\d{2}:\d{2}', line).group(0)
            if ob['CONNECTION'].startswith('dis'):
                ob['END'] = time
            else:
                # a new connection: record its start and drop any END from the previous one
                ob['START'] = time
                if 'END' in ob:
                    ob.pop('END')
If the log file is:
Dec 19 00:00:03 172.16.20.24 pppoe,ppp,info <pppoe-customer1>: terminating... - peer is not responding
Dec 19 00:00:03 172.16.20.24 pppoe,ppp,info,account customer1 logged out, 4486 1009521 23444247 12573 18159
Dec 19 00:00:03 172.16.20.24 pppoe,ppp,info <pppoe-customer1>: disconnected
Dec 19 00:00:07 172.16.20.24 pppoe,info PPPoE connection established from 00:00:00:00:00:00
Dec 19 00:00:08 172.16.20.24 pppoe,ppp,info,account customer2 logged in, 127.0.0.1
Dec 19 00:00:08 172.16.20.24 pppoe,ppp,info <pppoe-customer2>: authenticated
Dec 19 00:00:08 172.16.20.24 pppoe,ppp,info <pppoe-customer2>: connected
Dec 19 00:00:13 172.16.20.24 pppoe,info PPPoE connection established from 00:00:00:00:00:00
Dec 19 00:00:14 172.16.20.24 pppoe,ppp,error <ccfa>: user customer3 authentication failed
Dec 19 00:02:03 172.16.20.24 pppoe,ppp,info,account customer2 logged out, 4486 1009521 23444247 12573 18159
Dec 19 00:02:03 172.16.20.24 pppoe,ppp,info <pppoe-customer2>: disconnected
Dec 19 00:02:08 172.16.20.24 pppoe,ppp,info,account customer3 logged in, 127.0.0.1
Dec 19 00:02:08 172.16.20.24 pppoe,ppp,info <pppoe-customer3>: authenticated
Dec 19 00:02:08 172.16.20.24 pppoe,ppp,info <pppoe-customer3>: connected
the value of log will be:
{
    'customer1': {
        'CONNECTION': 'disconnected',
        'END': 'Dec 19 00:00:03',
        'USER': 'customer1'
    },
    'customer3': {
        'START': 'Dec 19 00:02:08',
        'CONNECTION': 'connected',
        'USER': 'customer3'
    },
    'customer2': {
        'START': 'Dec 19 00:00:08',
        'CONNECTION': 'disconnected',
        'END': 'Dec 19 00:02:03',
        'USER': 'customer2'
    }
}
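If the goal is the "connection" layout sketched in the question, one possible way to get there is to reshape the dict afterwards. The following is only a sketch: the function name to_connections is hypothetical, the output shape (one {"connection": [...]} object per user) is my reading of the question's sample, and the log value is copied from the result above.

```python
import json

# Copy of the result shown above; in practice this would be the dict built by the loop.
log = {
    'customer1': {'CONNECTION': 'disconnected', 'END': 'Dec 19 00:00:03', 'USER': 'customer1'},
    'customer3': {'START': 'Dec 19 00:02:08', 'CONNECTION': 'connected', 'USER': 'customer3'},
    'customer2': {'START': 'Dec 19 00:00:08', 'CONNECTION': 'disconnected',
                  'END': 'Dec 19 00:02:03', 'USER': 'customer2'},
}


def to_connections(log):
    """Reshape the per-user dict into a list of {"connection": [...]} records."""
    out = []
    for user in sorted(log):
        entry = log[user]
        conn = {}
        # "start" and "stop" are only emitted when the log contained them
        if 'START' in entry:
            conn['start'] = entry['START']
        if 'END' in entry:
            conn['stop'] = entry['END']
        conn['username'] = user
        out.append({'connection': [conn]})
    return out


print(json.dumps(to_connections(log), indent=2))
```

Note that the loop above keeps only the latest session per user: each new connection overwrites START and removes END, so a user who drops repeatedly still has a single entry.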