Parsing a log file into CSV format

Question

I'm trying to develop a small piece of code to help me delimit a log file that stores a large number of traceroutes to different IPs, sent over a period of a week.

I'm hoping to split them using the '--- START ---' and '--- END ---' tags at the beginning and end of each traceroute. An example follows:

--- START ---
Mon Mar 12 22:45:05 GMT 2012
traceroute to xxxxxxx (xxxxxxxx), 30 hops max, 60 byte packets
1  xxxxxxx (xxxxx)  1.085 ms  1.662 ms  2.244 ms
2  xxxxxxxx (xxxxxxx)  0.792 ms  0.782 ms  0.772 ms
3  xxxxxxxxx (xxxxxxxxx)  8.545 ms  9.170 ms  9.644 ms
4  etc
5  etc
--- END ---
--- START ---
Mon Mar 12 22:45:05 GMT 2012
traceroute to xxxxxxxxx (xxxxxxxx), 30 hops max, 60 byte packets
1  139.222.0.1 (xxxxxxxxx)  0.925 ms  1.318 ms  1.954 ms
2  10.0.0.1 (xxxxxxxx)  0.345 ms  0.438 ms  0.496 ms
3  172.16.0.34 (xxxxxxxxx)  0.830 ms  2.553 ms  0.809 ms
4 etc
5 etc
6 etc
--- END ---

Can anyone give me a hand with how this would be done, in MATLAB or Java? I'm also trying to count the number of hops each traceroute makes, that is, the number given just before the --- END --- tag.

Any help would be much appreciated.

Cheers.

Recommended answer

A quick and dirty example:

import java.io.*;
import java.text.*;
import java.util.*;
import java.util.regex.*;

public class Main
{
  private static class Record
  {
    public Date date;

    public String to;
    public int hops;
    public int packetSize;

    public String toString()
    {
      return date + " ::: " + to + ", " + hops + " hops, "
        + packetSize + " bytes";
    }
  }

  public static void main(String[] args) throws Exception
  {
    Scanner s = new Scanner(new FileReader(new File("input.txt")));

    Record currentRecord = null;
    ArrayList<Record> list = new ArrayList<Record>();

    while (s.hasNextLine()) {
      String line = s.nextLine();

      if ("--- START ---".equals(line)) {
        currentRecord = new Record();

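      } else if (currentRecord == null) {
        // Skip anything outside a START/END block; without this guard,
        // stray lines would trigger a NullPointerException below.
        continue;

      } else if ("--- END ---".equals(line)) {
        list.add(currentRecord);
        currentRecord = null;

      } else if (currentRecord.date == null) {
        // The first line of a block is the timestamp. Locale.ENGLISH keeps
        // "Mon"/"Mar" parseable regardless of the machine's default locale.
        currentRecord.date = new SimpleDateFormat(
          "EEE MMM dd HH:mm:ss zzz yyyy", Locale.ENGLISH).parse(line);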

      } else if (line.startsWith("traceroute to ")) {
        Pattern p = Pattern.compile(
          "traceroute to ([^ ]+) [^,]+, ([^ ]+) hops max, ([^ ]+) byte packets");
        Matcher m = p.matcher(line);

        if (m.matches()) {
          currentRecord.to = m.group(1);
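          // Note: this is the "hops max" limit from the header line,
          // not the number of hops the trace actually took.
          currentRecord.hops = Integer.parseInt(m.group(2));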
          currentRecord.packetSize = Integer.parseInt(m.group(3));
        }
      }
    }

    for (int i = 0; i < list.size(); i++)
      System.out.println(list.get(i).toString());
  }

}

Output:

Tue Mar 13 04:15:05 GMT+05:30 2012 ::: 203.98.69.105, 30 hops, 60 bytes
Tue Mar 13 04:15:05 GMT+05:30 2012 ::: 62.193.36.27, 30 hops, 60 bytes
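
The "30 hops" in that output comes from the "30 hops max" part of the header line, so it is the limit rather than the hop count the question asks about. A minimal sketch of one way to get the actual count and emit CSV rows, assuming the last numbered line before --- END --- carries the hop count; the HopCsv class, its toCsv and quote helpers, and the date,target,hops column layout are hypothetical names chosen for illustration:

```java
import java.util.*;

class HopCsv
{
  // Hypothetical helper: turns raw log lines into CSV rows "date,target,hops".
  static List<String> toCsv(List<String> lines)
  {
    List<String> rows = new ArrayList<String>();
    rows.add("date,target,hops");

    String date = null;
    String target = null;
    int hops = 0;
    boolean inBlock = false;

    for (String raw : lines) {
      String line = raw.trim();

      if ("--- START ---".equals(line)) {
        // Begin a new record and reset the per-trace state.
        inBlock = true;
        date = null;
        target = null;
        hops = 0;

      } else if (!inBlock || line.isEmpty()) {
        // Ignore blank lines and anything outside a START/END block.
        continue;

      } else if ("--- END ---".equals(line)) {
        rows.add(quote(date) + "," + quote(target) + "," + hops);
        inBlock = false;

      } else if (date == null) {
        // First line of a block: keep the timestamp as plain text.
        date = line;

      } else if (line.startsWith("traceroute to ")) {
        // Third whitespace-separated token is the destination.
        target = line.split("\\s+")[2];

      } else if (line.matches("\\d+\\s.*")) {
        // Hop lines start with the hop number; the last one seen wins.
        hops = Integer.parseInt(line.split("\\s+")[0]);
      }
    }
    return rows;
  }

  // Wraps a field in double quotes, doubling any embedded quotes.
  private static String quote(String s)
  {
    return s == null ? "" : "\"" + s.replace("\"", "\"\"") + "\"";
  }

  public static void main(String[] args)
  {
    List<String> sample = Arrays.asList(
      "--- START ---",
      "Mon Mar 12 22:45:05 GMT 2012",
      "traceroute to example.org (192.0.2.1), 30 hops max, 60 byte packets",
      "1  a (b)  1.085 ms  1.662 ms  2.244 ms",
      "2  c (d)  0.792 ms  0.782 ms  0.772 ms",
      "--- END ---");

    for (String row : toCsv(sample))
      System.out.println(row);
  }
}
```

The lines could be read with the same Scanner loop as above; the date is kept as text here so the sketch does not depend on any date-parsing settings.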

I'm pointing you in a bunch of different directions (Scanner, Pattern, SimpleDateFormat, etc.). For the individual "gateway" items, you could also use String.split() with "  " (two spaces) as the delimiter.
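
The String.split() suggestion could look roughly like the sketch below. It assumes the fields of a hop line are separated by at least two spaces, as in the sample log, and uses the regex " {2,}" (two or more spaces) instead of exactly two so that extra padding does not produce empty fields. HopLine and fields are hypothetical names.

```java
import java.util.*;

class HopLine
{
  // Hypothetical helper: splits one hop line on runs of two or more spaces.
  static String[] fields(String line)
  {
    // trim() first, since traceroute may pad the hop number with a leading space.
    return line.trim().split(" {2,}");
  }

  public static void main(String[] args)
  {
    String[] f = fields("1  139.222.0.1 (xxxxxxxxx)  0.925 ms  1.318 ms  1.954 ms");

    // prints [1, 139.222.0.1 (xxxxxxxxx), 0.925 ms, 1.318 ms, 1.954 ms]
    System.out.println(Arrays.toString(f));
  }
}
```

The first field is the hop number, the second is the gateway (host and address), and the rest are the round-trip times.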
