Process data from a text file and convert it into CSV

Question

In our organization, a few jobs run every month and collect data at the server level: they find what is running on each server and also perform some checks. These files are plain-text files copied to one repository server. The file name is <servername>_20200911.log

This sample file checks for servers where postgreSQL is running.

Date Collected                  || 11-10-2020 03:20:42 GMT ||
Server Name                     || pglinux1             ||
Operating system                || RHEL                     || passed
OS Version                      || 6.9                      || passed
Kernel version                  || 2.6.32-735.23.1.el6      || passed
Kernel architecture             || x86-64                   || passed
Total Memory                    || 16 gig                   || passed
/opt/postgres fs free           || 32 gig                   || passed
/opt/postgres/data fs free      || 54 gig                   || passed
Is cron jobs exist              || yes                      || passed
Is postgres installed           || yes                      || passed
Postgres version >10            || no                       || failed
repmgr installed                || yes                      || passed
repmgr version  >4              || yes                      || passed
How may pg cluster running      || 3                        || Passed
pgbackrest installed            || yes                      || passed
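For reference, the `||`-delimited columns in such a report can be split with awk's regex field separator. A small self-contained sketch (the file name and shortened rows below are illustrative, not from the actual reports):

```shell
# Illustrative only: made-up file name and two shortened report rows,
# to show how awk splits the "||"-delimited columns.
cd "$(mktemp -d)"
cat > report.log <<'EOF'
Total Memory                    || 16 gig                   || passed
Postgres version >10            || no                       || failed
EOF

# FS ' *[|][|] *' splits on "||" and strips the surrounding padding,
# so $1 is the check name, $2 the value, and $3 the status.
status=$(awk -F ' *[|][|] *' '$1 == "Postgres version >10" {print $3}' report.log)
echo "$status"
```

The brackets in `[|][|]` keep `|` literal; a bare `-F "||"` is treated as the regex alternation of two empty strings and splits on every character.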

We will get similar files for different technologies, like oracle, mysql, weblogic, ... Every month we need to process these files, identify the failed checks, and work with the corresponding team. Right now I am consolidating the data for all postgreSQL/oracle servers. In my case I get a lot of files, read each text file, and convert the data to CSV as below:

Date Collected,server name,Operating system,OS Version,Kernel version,Kernel architecture,Total Memory,/opt/postgres fs free,/opt/postgres/data fs free,Is cron jobs exist,
11-10-2020 03:20:42 GMT,pglinux1,RHEL,passed,passed,passed,passed,passed,passed,passed,passed,failed
11-10-2020 03:20:42 GMT,pglinux2,RHEL,passed,passed,passed,passed,passed,passed,passed,passed,failed
11-10-2020 03:20:42 GMT,pglinux3,RHEL,passed,passed,passed,passed,passed,passed,passed,passed,failed

Initially I thought I would convert these text files into CSV, pick the second row from each file, and consolidate them into one file. That attempt failed because some of the file data is not consistent. Now I am thinking of creating a file called servercheck.txt containing all the checks, using that checks file to grep the data in all files, and printing into a CSV file (one row per server).

#! /bin/bash
# header row: join the check names from servercheck.txt with commas
awk -v ORS=',' '{print $0}' servercheck.txt | sed 's/,$//' > serverchecks.csv
for file in *_20200911.log
do
    while read -r line
    do
        # third ||-delimited column holds the passed/failed status
        grep -F "$line" "$file" | awk -F ' *[|][|] *' -v ORS=',' '{print $3}' >> serverchecks.csv
    done < servercheck.txt
done

The above code writes everything (the heading and all data) on the same row.
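That single-row symptom comes from `ORS=','`, which replaces every newline with a comma and never terminates a row. A minimal sketch of the same grep/awk approach with a newline printed after each server's checks (the check list and log contents below are made-up sample data, not the real reports):

```shell
# Sketch: same grep/awk idea, but each server's row is terminated with
# a newline. The check list and log file below are illustrative samples.
cd "$(mktemp -d)"
printf '%s\n' 'Total Memory' 'Postgres version >10' > servercheck.txt
cat > pglinux1_20200911.log <<'EOF'
Total Memory                    || 16 gig                   || passed
Postgres version >10            || no                       || failed
EOF

{
  paste -s -d, servercheck.txt            # header row from the check names
  for file in *_20200911.log; do
    while IFS= read -r check; do
      # third ||-delimited column is the passed/failed status
      grep -F "$check" "$file" | awk -F ' *[|][|] *' -v ORS=',' '{print $3}'
    done < servercheck.txt
    printf '\n'                           # terminate this server's row
  done
} > serverchecks.csv
cat serverchecks.csv
```

Each data row still carries a trailing comma from `ORS=','`; stripping it with `sed 's/,$//'` is left as a refinement.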

I hope I have provided all the necessary details. Please help us with code, recommendations, and the best approach to handle this.

Answer

This may help you:

for inputfile in *
do
  awk -F '[|][|]' '
    { for (i = 1; i <= NF; i++) a[NR, i] = $i }   # store every cell
    NF > p { p = NF }                             # track the widest row
    END {
      for (j = 1; j <= p; j++) {                  # print each column as a row
        str = a[1, j]
        for (i = 2; i <= NR; i++) str = str " " a[i, j]
        print str
      }
    }' "$inputfile" | sed 's/ \{2,\}/,/g' > tmpfile && mv tmpfile "$inputfile"
done

Edited as suggested by @Ed Morton

for inputfile in *
do
  awk -F '[|][|]' '
    { for (i = 1; i <= NF; i++) a[NR, i] = $i }   # store every cell
    NF > p { p = NF }                             # track the widest row
    END {
      for (j = 1; j <= p; j++) {                  # print each column as a row
        str = a[1, j]
        for (i = 2; i <= NR; i++) str = str " " a[i, j]
        gsub(/ + /, ",", str)                     # runs of spaces -> commas
        print str
      }
    }' "$inputfile" > tmpfile && mv tmpfile "$inputfile"
done
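To illustrate what the transpose produces, here is a self-contained run of the same store-and-join idea on two shortened sample rows (the file name and values are made up); note the output keeps stray leading spaces and trailing commas that come from the padded input:

```shell
# Illustrative run of the transpose: input rows become output columns.
# The file name and shortened rows are made-up sample data.
cd "$(mktemp -d)"
cat > pglinux1_20200911.log <<'EOF'
Total Memory   || 16 gig   || passed
Postgres >10   || no       || failed
EOF

awk -F '[|][|]' '
  { for (i = 1; i <= NF; i++) a[NR, i] = $i }   # store every cell
  NF > p { p = NF }                             # track the widest row
  END {
    for (j = 1; j <= p; j++) {                  # one output line per column
      str = a[1, j]
      for (i = 2; i <= NR; i++) str = str " " a[i, j]
      gsub(/ + /, ",", str)                     # runs of >=2 spaces -> commas
      print str
    }
  }' pglinux1_20200911.log > transposed.txt
cat transposed.txt
```

Line 1 of the output holds all the check names and line 3 all the statuses; any downstream consolidation would pick the status line from each transposed file.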
