Using first row as variable with python
Problem description
I want to change this piece of code where indicated to be more dynamic and specific. I would like to use the first row information in each column as a header that substitutes 'numAtts'. That way, the first row would also not be included in the data underneath the @data.
Here is my code:
    # -*- coding: UTF-8 -*-

    import logging
    from optparse import OptionParser
    import sys

    def main():
        LEVELS = {'debug': logging.DEBUG,
                  'info': logging.INFO,
                  'warning': logging.WARNING,
                  'error': logging.ERROR,
                  'critical': logging.CRITICAL}
        usage = "usage: arff automate [options]\n ."
        parser = OptionParser(usage=usage, version="%prog 1.0")
        #Defining options
        parser.add_option("-l", "--log", dest="level_name", default="info", help="choose the logging level: debug, info, warning, error, critical")
        #Parsing arguments
        (options, args) = parser.parse_args()
        #Mandatory arguments
        if len(args) != 1:
            parser.error("incorrect number of arguments")
        inputPath = args[0]
        # Start program ------------------
        with open(inputPath, "r") as f:
            strip = str.strip
            split = str.split
            data = [split(strip(line)) for line in f]
        ###############################################################
        ## modify here##
        numAtts = len(data[0])
        logging.info(" Number of attributes : " + str(numAtts))
        print "@RELATION relationData"
        print ""
        for e in range(numAtts):
            print "@ATTRIBUTE att{0} NUMERIC".format(e)
        ###############################################################
        classSet = set()
        for e in data:
            classSet.add(e[-1])
        print "@ATTRIBUTE class {%s}" % (",".join(classSet))
        print ""
        print "@DATA"
        for item in data:
            print ",".join(item[0:])

    if __name__ == "__main__":
        main()
The input file is like this (tab-separated):
    F1    F2    F3     F4    F5  F6  STRING
    7209  3004  15302  5203  2   1   EXAMPLEA
    6417  3984  16445  5546  15  1   EXAMPLEB
    8822  3973  23712  7517  18  0   EXPAMPLEC
The output file (actual) is like this:
    @RELATION relationData
    @ATTRIBUTE att0 NUMERIC
    @ATTRIBUTE att1 NUMERIC
    @ATTRIBUTE att2 NUMERIC
    @ATTRIBUTE att3 NUMERIC
    @ATTRIBUTE att4 NUMERIC
    @ATTRIBUTE att5 NUMERIC
    @ATTRIBUTE att6 NUMERIC
    @ATTRIBUTE class {EXAMPLEB,STRING,EXPAMPLEC,EXAMPLEA}
    @DATA
    F1,F2,F3,F4,F5,{0,1},STRING
    7209,3004,15302,5203,2,1,EXAMPLEA
    6417,3984,16445,5546,15,1,EXAMPLEB
    8822,3973,23712,7517,18,0,EXPAMPLEC
The desired output file is like this:
    @RELATION relationData
    @attribute 'att[F1]' numeric
    @attribute 'att[F2]' numeric
    @attribute 'att[F3]' numeric
    @attribute 'att[F4]' numeric
    @attribute 'att[F5]' numeric
    @attribute 'att[F6]' {0,1}
    @attribute 'class' STRING
    @data
    7209,3004,15302,5203,2,1,EXAMPLEA
    6417,3984,16445,5546,15,1,EXAMPLEB
    8822,3973,23712,7517,18,1,EXPAMPLEC
So, as you can see, my code is almost there, but I am unsure how to treat the first row as the header and start processing the data from row 2.
Thus, my question is: How can I format the output to use the 1st row as a header? Does anyone have any insight? Thanks!
Solution
You are not actually inserting the column names into the output. Here

    for e in range(numAtts):
        print "@ATTRIBUTE att{0} NUMERIC".format(e)

you are merely formatting the value of e, which is just the column index. You need to read the name from data[0] instead:

    for e in range(numAtts):
        print "@ATTRIBUTE 'att[{0}]' NUMERIC".format(data[0][e])
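To make the header lookup concrete, here is a minimal sketch written in Python 3 syntax (the question's code is Python 2). The helper name attribute_lines is hypothetical. Treating the last column as the class label and every other column as NUMERIC is an assumption carried over from the question's code, so the sketch does not reproduce the {0,1} type shown for F6 in the desired output.

```python
def attribute_lines(header):
    """Build one @ATTRIBUTE line per feature column, named after the header row.

    `header` is the first row of the parsed file (data[0] in the question).
    Assumption: the last column is the class label, so it is skipped here.
    """
    return ["@ATTRIBUTE 'att[{0}]' NUMERIC".format(name) for name in header[:-1]]


header = ["F1", "F2", "F3", "F4", "F5", "F6", "STRING"]
for line in attribute_lines(header):
    print(line)
```

Iterating over the names directly avoids the index variable entirely, which is usually easier to read than range(numAtts).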
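Later, in the data section, you can use range/xrange to skip index 0, which is the header row. Note that the upper bound must be the number of rows, not the number of columns:

    for e in range(1, len(data)):
        print ",".join(data[e])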
Also, there is no need to store the str methods in variables; you can use method chaining to get the desired value. Instead of this:

    data = [split(strip(line)) for line in f]

use this:

    data = [line.strip().split() for line in f]
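One caveat: split() with no argument splits on any run of whitespace, so a tab-separated field that itself contains a space would be broken apart. If that can happen in your file, one possible alternative is to split on the tab character explicitly. The helper name parse_line is made up for illustration, and the sample value "EXAMPLE A" is invented to show the difference.

```python
def parse_line(line):
    """Split one tab-separated line into fields, dropping only the line ending."""
    return line.rstrip("\r\n").split("\t")


raw = "7209\t3004\tEXAMPLE A\n"   # invented sample with a space inside a field
print(parse_line(raw))            # fields split on tabs only
print(raw.strip().split())        # fields split on any whitespace
```

For the input shown in the question, both approaches produce the same fields, since none of the values contain spaces.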
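*********** Edited to include this option ***********
next also lets you skip the first row, so that the data section begins with the second. It has to be called on one iterator that you then keep looping over; calling next(iter(data)) on its own creates a throwaway iterator and skips nothing:

    rows = iter(data)
    header = next(rows)
    for item in rows:
        print ",".join(item)

Alternatively, slice the list:

    for item in data[1:]:
        print ",".join(item)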
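The header/data split can also be sketched as a small helper, again in Python 3 syntax. The name split_header is hypothetical, and the sample rows are copied from the question's input. Because the class values are collected from the data rows only, the header label STRING no longer leaks into the class set as it does in the actual output above. Sorting the class values is an added choice to make the output order stable; the original code prints them in arbitrary set order.

```python
def split_header(lines):
    """Return (header, data) from an iterable of whitespace-separated lines.

    The first non-empty line is treated as the header; the rest are data rows.
    """
    rows = (line.strip().split() for line in lines if line.strip())
    header = next(rows)   # consumes the first row from the iterator
    data = list(rows)     # everything after the header
    return header, data


sample = [
    "F1\tF2\tF3\tF4\tF5\tF6\tSTRING\n",
    "7209\t3004\t15302\t5203\t2\t1\tEXAMPLEA\n",
    "6417\t3984\t16445\t5546\t15\t1\tEXAMPLEB\n",
    "8822\t3973\t23712\t7517\t18\t0\tEXPAMPLEC\n",
]
header, data = split_header(sample)
class_values = sorted(set(row[-1] for row in data))

print("@ATTRIBUTE class {%s}" % ",".join(class_values))
print("@DATA")
for row in data:
    print(",".join(row))
```

In the real script, the open file object f could be passed in place of sample, since a file is also an iterable of lines.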