Groovy parsing text file


Problem description


I have a log file that I would like to parse, and I am having some issues. At first it seemed it would be simple. I'll post the source I have come up with and then explain what I am trying to do.

The file I'm trying to parse contains this data:

HDD Device 0 : /dev/sda
HDD Model ID  : ST3160815A
HDD Serial No : 5RA020QY
HDD Revision  : 3.AAA
HDD Size     : 152628 MB
Interface    : IDE/ATA
Temperature         : 33 C
Health  : 100%
Performance  : 70%
Power on Time : 27 days, 13 hours
Est. Lifetime : more than 1000 days

HDD Device 1 : /dev/sdb
HDD Model ID  : TOSHIBA MK1237GSX
HDD Serial No : 97LVF9MHS
HDD Revision  : DL130M
HDD Size     : 114473 MB
Interface    : S-ATA
Temperature  : 30 C
Health  : 100%
Performance  : 100%
Power on Time : 38 days, 11 hours
Est. Lifetime : more than 1000 days

My source code (below) basically breaks up the file line by line and then splits the line into two (key:value).

Source:

def dataList = [:]
def theInfoName = "C:\\testdata.txt"

File theInfoFile = new File(theInfoName)

def words
def key
def value

if (!theInfoFile.exists()) {
  println "File does not exist"
} else {
  theInfoFile.eachLine { line ->
    if (line.trim().size() == 0) {
      return null
    } else {
      words = line.split("\t: ")
      key = words[0]
      value = words[1]
      dataList[key] = value

      println "${words[0]}=${words[1]}"
    }
  }
  println "$dataList.Performance"  // test if Performance has overwritten the previous Performance value
}

The problem with my source is that when I read a value back (such as $dataList.Performance), it only shows the last one in the file rather than one for each drive.

So I'm wondering, how do I parse the file so that it keeps the information for both hard drives? Is there a way to pack the info into a 'hard drive object'?

Any and all help is appreciated.

A few side notes:

The file is on a Windows machine (even though the info is grabbed from a *nix system).

The fields in the text file are separated by a tab, a colon, and a space (as in my source code). I mention it because the spacing does not render that way on this page.

Solution

This will read the data in blocks (with blank lines separating the blocks):

def dataList = []
def theInfoName = 'testdata.txt'

File theInfoFile = new File( theInfoName )

if( !theInfoFile.exists() ) {
  println "File does not exist"
} else {
  def driveInfo = [:]
  // Step through each line in the file
  theInfoFile.eachLine { line ->
    // If the line isn't blank
    if( line.trim() ) {
      // Split into a key and value
      def (key,value) = line.split( '\t: ' ).collect { it.trim() }
      // and store them in the driveInfo Map
      driveInfo."$key" = value
    }
    else {
      // If the line is blank, and we have some info
      if( driveInfo ) {
        // store it in the list
        dataList << driveInfo
        // and clear it
        driveInfo = [:]
      }
    }
  }
  // when we've finished the file, store any remaining data
  if( driveInfo ) {
    dataList << driveInfo
  }
}

dataList.eachWithIndex { it, index ->
  println "Drive $index"
  it.each { k, v ->
    println "\t$k = $v"
  }
}

Fingers crossed you have blank lines between your HDD info sections (you showed one in your test data) :-)

btw: I get the following output:

Drive 0
    HDD Device 0 = /dev/sda
    HDD Model ID = ST3160815A
    HDD Serial No = 5RA020QY
    HDD Revision = 3.AAA
    HDD Size = 152628 MB
    Interface = IDE/ATA
    Temperature = 33 C
    Health = 100%
    Performance = 70%
    Power on Time = 27 days, 13 hours
    Est. Lifetime = more than 1000 days
Drive 1
    HDD Device 1 = /dev/sdb
    HDD Model ID = TOSHIBA MK1237GSX
    HDD Serial No = 97LVF9MHS
    HDD Revision = DL130M
    HDD Size = 114473 MB
    Interface = S-ATA
    Temperature = 30 C
    Health = 100%
    Performance = 100%
    Power on Time = 38 days, 11 hours
    Est. Lifetime = more than 1000 days
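
Because each drive ends up in its own map inside the list, the per-drive values can be read back individually, which is what the single map in the question could not do. The following is a minimal sketch of that, not part of the original answer. It assumes the sample data is inlined as a string (so it runs without testdata.txt) and uses only two labels per drive.

```groovy
// Sample data inlined for illustration; the real script reads testdata.txt
def text = "HDD Device 0\t: /dev/sda\nPerformance\t: 70%\n\nHDD Device 1\t: /dev/sdb\nPerformance\t: 100%\n"

def dataList = []
def driveInfo = [:]
text.eachLine { line ->
  if( line.trim() ) {
    // Non-blank line: split into key and value and add to the current drive
    def (key, value) = line.split( '\t: ' ).collect { it.trim() }
    driveInfo[ key ] = value
  } else if( driveInfo ) {
    // Blank line: the current drive is complete, start a new one
    dataList << driveInfo
    driveInfo = [:]
  }
}
if( driveInfo ) dataList << driveInfo

println dataList[ 0 ].Performance      // value for the first drive only
println dataList*.Performance          // one value per drive, as a list
println dataList[ 1 ].'HDD Device 1'   // keys containing spaces need quoting
```

The spread-dot operator (`*.`) collects the same key from every map, so nothing is overwritten when two drives share a label such as Performance.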


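On the question of packing the info into a "hard drive object": one possible approach is to convert each parsed map into a small class. This is only a sketch. The `HardDrive` class, its field names, and the `fromMap` helper are hypothetical and do not appear in the original answer. Only the labels on the right-hand side come from the log file.

```groovy
// Hypothetical class; the field names are illustrative, not from the original answer
class HardDrive {
  String device
  String model
  String health
  String performance

  // Build a HardDrive from one parsed block (a Map of label -> value)
  static HardDrive fromMap( Map info ) {
    new HardDrive(
      // The device label carries the drive index ("HDD Device 0"), so match it by prefix
      device:      info.find { it.key.startsWith( 'HDD Device' ) }?.value,
      model:       info[ 'HDD Model ID' ],
      health:      info[ 'Health' ],
      performance: info[ 'Performance' ]
    )
  }
}

// One map as produced by the parsing code, written out by hand here
def driveInfo = [ 'HDD Device 0': '/dev/sda', 'HDD Model ID': 'ST3160815A', 'Health': '100%', 'Performance': '70%' ]
def drive = HardDrive.fromMap( driveInfo )
println "${drive.device} (${drive.model}): performance ${drive.performance}"
```

With the list from the parsing code, `dataList.collect { HardDrive.fromMap( it ) }` would give one object per drive. Whether a class is worth it over plain maps depends on how the values are used afterwards.
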
Messing around, I also got the code down to:

def dataList = []
def theInfoFile = new File( 'testdata.txt' )

if( !theInfoFile.exists() ) {
  println "File does not exist"
} else {
  // Split the text of the file into blocks separated by \n\n
  // Then, starting with an empty list go through each block of text in turn
  dataList = theInfoFile.text.split( '\n\n' ).inject( [] ) { list, block ->
    // Split the current block into lines (based on the newline char)
    // Then starting with an empty map, go through each line in turn
    // when done, add this map to the list we created in the line above
    list << block.split( '\n' ).inject( [:] ) { map, line ->
      // Split the line up into a key and a value (trimming each element)
      def (key,value) = line.split( '\t: ' ).collect { it.trim() }
      // Then, add this key:value mapping to the map we created 2 lines above
      map << [ (key): value ] // The leftShift operator also returns the map 
                              // the inject closure has to return the accumulated
                              // state each time the closure is called
    }
  }
}

dataList.eachWithIndex { it, index ->
  println "Drive $index"
  it.each { k, v ->
    println "\t$k = $v"
  }
}

But that has to load the whole file into memory at once (and relies on \n as the end-of-line character).
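
If the file turns out to use Windows line endings, the end-of-line dependence can be relaxed by splitting on a regular expression that accepts both \n and \r\n. The following is a sketch of that idea, not the original author's code. It uses `collect` and `collectEntries` in place of `inject`, and inlines sample data with \r\n endings as an assumption. It still loads the whole text into memory.

```groovy
// Inline sample with Windows line endings (an assumption for illustration)
def text = "Health\t: 100%\r\nPerformance\t: 70%\r\n\r\nHealth\t: 100%\r\nPerformance\t: 100%\r\n"

// Split into blocks on a blank line, accepting either \n or \r\n
def dataList = text.split( /\r?\n\r?\n/ ).collect { block ->
  // readLines() handles \n, \r\n and \r; drop any empty lines
  block.readLines().findAll { it.trim() }.collectEntries { line ->
    def (key, value) = line.split( '\t: ' ).collect { it.trim() }
    [ (key): value ]
  }
}

println dataList*.Performance
```

For a real file, `text` would be `theInfoFile.text` as in the code above.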
