Extracting strings from logs with regex in a Pig script


Question

I have log data, and I want to extract each piece of information into its own variable.

The following is one sample log line:

{:id=>306, :name=>"bblite", :cpu=>{:quota=>4, :allocated=>4, :actual=>0}, :memory=>{:quota=>8192, :allocated=>8192, :actual=>8578}, :cluster_stats=>{"wc1104"=>{:cpu=>0, :mem=>8578}}}

I need one variable that holds all the ids, one that holds all the names, one that holds the CPU values, and one that holds all the cluster stats.

The following is a portion of my Pig script. I can store the ids, but I have no idea how to extract the rest of the fields using regex.

. .

matching_messages = FILTER raw_lines BY (LOWER(message) MATCHES '.*cc_altus-plaform.*');

ids = FOREACH matching_messages GENERATE REGEX_EXTRACT(message,'id=>\\d*',0);

names = FOREACH matching_messages GENERATE REGEX_EXTRACT(message,'name=>\\"\\",',0);

line_with_date = FOREACH matching_messages GENERATE
DateFormatter(timestamp) AS formatted_time: chararray, message;

DUMP names;

Recommended answer

The following code snippet contains the regexes I wrote, which work:

id = FOREACH matching_messages GENERATE REGEX_EXTRACT(message,'(?<=id=>)\\d*',0);

name = FOREACH matching_messages GENERATE REGEX_EXTRACT(message,'name=>\\"[\\w]*\\"',0);

cpu = FOREACH matching_messages GENERATE REPLACE( REGEX_EXTRACT(message, 'cpu=>\\{.*?\\}',0), ',','');

memory = FOREACH matching_messages GENERATE REGEX_EXTRACT(message,'memory=>\\{.*?\\}',0);

cluster = FOREACH matching_messages GENERATE REGEX_EXTRACT(message,'cluster_stats=>\\{.*?\\}',0);
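The patterns above can be checked outside Pig, since Pig's regex built-ins are backed by java.util.regex. The following is a minimal sketch in plain Java, not part of the original answer. The class `PigRegexSketch` and the helper `regexExtract` are hypothetical names, and the helper assumes that REGEX_EXTRACT returns the requested group of the first find-style match (which is what the answer's patterns rely on). It runs the patterns against the sample line from the question and illustrates two points: a capture group with index 1 returns only the value, and the lazy `.*?\\}` stops at the first closing brace, so the nested cluster_stats block comes back without its outer brace.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class for illustration only; it is not a Pig API.
class PigRegexSketch {
    // Sample log line copied from the question.
    static final String SAMPLE =
        "{:id=>306, :name=>\"bblite\", :cpu=>{:quota=>4, :allocated=>4, :actual=>0}, "
        + ":memory=>{:quota=>8192, :allocated=>8192, :actual=>8578}, "
        + ":cluster_stats=>{\"wc1104\"=>{:cpu=>0, :mem=>8578}}}";

    // Assumed behaviour of REGEX_EXTRACT(input, regex, group):
    // return the given group of the first match found, or null if nothing matches.
    static String regexExtract(String input, String regex, int group) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(group) : null;
    }

    public static void main(String[] args) {
        // Lookbehind from the answer: only the digits are returned.
        System.out.println(regexExtract(SAMPLE, "(?<=id=>)\\d*", 0));

        // Answer's name pattern returns the key and the quotes as well.
        System.out.println(regexExtract(SAMPLE, "name=>\"[\\w]*\"", 0));

        // Variant with a capture group: index 1 returns just the value.
        System.out.println(regexExtract(SAMPLE, "name=>\"(\\w*)\"", 1));

        // Lazy match stops at the first '}', which is enough for the flat cpu block.
        System.out.println(regexExtract(SAMPLE, "cpu=>\\{.*?\\}", 0));

        // For the nested cluster_stats block the same shape drops the outer '}'.
        System.out.println(regexExtract(SAMPLE, "cluster_stats=>\\{.*?\\}", 0));

        // Requiring two closing braces keeps the block balanced,
        // assuming exactly one level of nesting inside cluster_stats.
        System.out.println(regexExtract(SAMPLE, "cluster_stats=>\\{.*?\\}\\}", 0));
    }
}
```

If the same assumptions hold in Pig, the capture-group variant would be written as REGEX_EXTRACT(message,'name=>\\"(\\w*)\\"',1), and the balanced cluster pattern as 'cluster_stats=>\\{.*?\\}\\}' with index 0. Both should be verified against real log lines, since deeper nesting inside cluster_stats would need a different approach.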
