Process data from text file and convert into CSV
Problem description
In our organization, a few jobs run every month at the server level and collect data: they find out what is running on each server and also perform some checks. These files are text files and are copied to one repository server. The file name will be <servername>_20200911.log
This sample file is from a check of a server running PostgreSQL.
Date Collected || 11-10-2020 03:20:42 GMT ||
Server Name || pglinux1 ||
Operating system || RHEL || passed
OS Version || 6.9 || passed
Kernel version || 2.6.32-735.23.1.el6 || passed
Kernel architecture || x86-64 || passed
Total Memory || 16 gig || passed
/opt/postgres fs free || 32 gig || passed
/opt/postgres/data fs free || 54 gig || passed
Is cron jobs exist || yes || passed
Is postgres installed || yes || passed
Postgres version >10 || no || failed
repmgr installed || yes || passed
repmgr version >4 || yes || passed
How may pg cluster running || 3 || Passed
pgbackrest installed || yes || passed
We get similar files for different technologies, such as Oracle, MySQL, WebLogic ... Every month we need to process these files, identify the failed checks, and work with the corresponding team. Right now I am consolidating the data for all PostgreSQL/Oracle servers. In my case I get a lot of files, and I read each text file and convert the data to CSV, as below:
Date Collected,server name,Operating system,OS Version,Kernel version,Kernel architecture,Total Memory,/opt/postgres fs free,/opt/postgres/data fs free,Is cron jobs exist,
11-10-2020 03:20:42 GMT,pglinux1,RHEL,passed,passed,passed,passed,passed,passed,passed,passed,failed
11-10-2020 03:20:42 GMT,pglinux2,RHEL,passed,passed,passed,passed,passed,passed,passed,passed,failed
11-10-2020 03:20:42 GMT,pglinux3,RHEL,passed,passed,passed,passed,passed,passed,passed,passed,failed
Initially I thought I would convert these text files into CSV, pick the second row from each file, and consolidate everything into one file. That attempt failed, since the data in some files is not consistent. Now I am thinking of creating a file called servercheck.txt that contains all the checks, then using this checks file to grep the data in all the files and print it into a CSV file (one row per server).
#!/bin/bash
awk -v ORS=' ' '{print $0 ","}' /tmp/servecheck.txt | sed 's/ *$//g' > serverchecks.csv
for file in `ls -lart *2020091t.log | awk '{print $9}'`
do
    while read line
    do
        grep "$line" $file | awk -F "||" '{print $3}' | awk -v ORS=' ' '{print $3 ","}' >> serverchecks.csv
    done < servercheck.txt
done
The above code writes everything on one row (the headings and the data together).
I hope I have provided all the necessary details. Please help with code, recommendations, and the best approach to handle this.
Solution
This may help you:
for inputfile in *
do
    awk -F '[|][|]' '                 # "||" must be escaped: as a plain regex it is invalid
    {
        for (i=1; i<=NF; i++) {
            a[NR,i] = $i              # store each field, indexed by row and column
        }
    }
    NF>p { p = NF }                   # track the widest row
    END {
        for (j=1; j<=p; j++) {        # transpose: every column becomes an output row
            str = a[1,j]
            for (i=2; i<=NR; i++) {
                str = str " " a[i,j]
            }
            print str
        }
    }' "$inputfile" | sed -E 's/ {2,}/,/g' > tmpfile && mv tmpfile "$inputfile"
done
Edited as suggested by @Ed Morton:
for inputfile in *
do
    awk -F '[|][|]' '                 # "||" must be escaped: as a plain regex it is invalid
    {
        for (i=1; i<=NF; i++) {
            a[NR,i] = $i              # store each field, indexed by row and column
        }
    }
    NF>p { p = NF }                   # track the widest row
    END {
        for (j=1; j<=p; j++) {        # transpose: every column becomes an output row
            str = a[1,j]
            for (i=2; i<=NR; i++) {
                str = str " " a[i,j]
            }
            gsub(/  +/, ",", str)     # runs of two or more spaces become the CSV comma
            print str
        }
    }' "$inputfile" > tmpfile && mv tmpfile "$inputfile"
done