Using first row as variable with Python


Question

I want to change the marked section of this code to be more dynamic and specific. I would like to use the first-row value of each column as that column's header, in place of the names generated from 'numAtts'. That way, the first row would also no longer appear in the data under @data.

Here is my code:

# -*- coding: UTF-8 -*-

import logging
from optparse import OptionParser
import sys

def main():
    LEVELS = {'debug': logging.DEBUG,
              'info': logging.INFO,
              'warning': logging.WARNING,
              'error': logging.ERROR,
              'critical': logging.CRITICAL}

    usage = "usage: arff automate [options]\n ."
    parser = OptionParser(usage=usage, version="%prog 1.0")

    #Defining options   
    parser.add_option("-l", "--log", dest="level_name", default="info", help="choose the logging level: debug, info, warning, error, critical")    

    #Parsing arguments
    (options, args) = parser.parse_args()

    #Mandatory arguments    
    if len(args) != 1:
        parser.error("incorrect number of arguments")

    inputPath = args[0]


    # Start program ------------------

    with open(inputPath, "r") as f:
        strip = str.strip
        split = str.split
        data = [split(strip (line)) for line in f]

###############################################################
## modify here##

    numAtts = len(data[0])
    logging.info(" Number of attributes : "+str(numAtts) )

    print "@RELATION relationData"
    print ""

    for e in range(numAtts):
        print "@ATTRIBUTE att{0} NUMERIC".format(e)

###############################################################

    classSet = set()
    for e in data:
        classSet.add(e[-1])

    print "@ATTRIBUTE class {%s}" % (",".join(classSet))
    print ""

    print "@DATA"

    for item in data:
        print ",".join(item[0:])


if __name__ == "__main__":
    main()

The input file is like this (tab-separated):

F1  F2  F3  F4  F5  F6  STRING
7209    3004    15302   5203    2   1   EXAMPLEA
6417    3984    16445   5546    15  1   EXAMPLEB
8822    3973    23712   7517    18  0   EXPAMPLEC

The output file (actual) is like this:

@RELATION relationData

@ATTRIBUTE att0 NUMERIC
@ATTRIBUTE att1 NUMERIC
@ATTRIBUTE att2 NUMERIC
@ATTRIBUTE att3 NUMERIC
@ATTRIBUTE att4 NUMERIC
@ATTRIBUTE att5 NUMERIC
@ATTRIBUTE att6 NUMERIC
@ATTRIBUTE class {EXAMPLEB,STRING,EXPAMPLEC,EXAMPLEA}

@DATA
F1,F2,F3,F4,F5,{0,1},STRING
7209,3004,15302,5203,2,1,EXAMPLEA
6417,3984,16445,5546,15,1,EXAMPLEB
8822,3973,23712,7517,18,0,EXPAMPLEC

The desired output file is like this:

@RELATION relationData
@attribute 'att[F1]' numeric
@attribute 'att[F2]' numeric
@attribute 'att[F3]' numeric
@attribute 'att[F4]' numeric
@attribute 'att[F5]' numeric
@attribute 'att[F6]' {0,1}
@attribute 'class' STRING

@data
7209,3004,15302,5203,2,1,EXAMPLEA
6417,3984,16445,5546,15,1,EXAMPLEB
8822,3973,23712,7517,18,1,EXPAMPLEC

As you can see, my code is almost there, but I am unsure how to treat the first row as the header and start processing the data from row 2.

Thus, my question is: How can I format the output to use the 1st row as a header? Does anyone have any insight? Thanks!

Solution

You are not actually formatting the header names into the output. Here

for e in range(numAtts):
        print "@ATTRIBUTE att{0} NUMERIC".format(e)

you are only formatting the loop index e into the output. You need to index into data[0] (the header row) here instead:

for e in range(numAtts):
    print "@ATTRIBUTE 'att[{0}]' NUMERIC".format(data[0][e])
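
As a minimal sketch of this idea, the header row can be turned into attribute lines in a small helper. It is written for Python 3 (print as a function), whereas the code above uses Python 2 print statements. The helper name attribute_lines is hypothetical, and treating the last column as the class label is an assumption taken from the desired output in the question.

```python
def attribute_lines(header):
    """Build one @ATTRIBUTE line per feature column in the header row.

    Assumption: the last column holds the class label, so it is skipped here.
    """
    return ["@ATTRIBUTE 'att[{0}]' NUMERIC".format(name) for name in header[:-1]]


header = ["F1", "F2", "F3", "STRING"]  # first row of the input file
for line in attribute_lines(header):
    print(line)
```

Iterating over the names directly avoids the index variable e altogether.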

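Later, when printing the data section, you can start the range at 1 to skip index 0 (the header row). Note that the upper bound must be the number of rows, len(data), not numAtts (the number of columns):

for e in range(1, len(data)):
    print ",".join(data[e])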
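
An alternative to looping over indices is to slice the header off once and iterate over the remaining rows. The sketch below is Python 3, and the function name data_lines is hypothetical.

```python
def data_lines(data):
    """Return the comma-joined data rows, leaving out the header row at index 0."""
    rows = data[1:]  # everything after the header
    return [",".join(row) for row in rows]


data = [
    ["F1", "F2", "STRING"],
    ["7209", "3004", "EXAMPLEA"],
    ["6417", "3984", "EXAMPLEB"],
]
for line in data_lines(data):
    print(line)
```

Slicing copies the list, which is fine for small files like the sample; for very large inputs an iterator may be preferable.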

Also, there is no need to store the str methods in variables; you can chain the method calls to get the same result. Instead of this:

data = [split(strip (line)) for line in f]

use this:

data = [line.strip().split() for line in f]
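
One detail worth knowing when chaining: split() with no argument splits on any run of whitespace, which works for the sample file. If a field could itself contain spaces, splitting on the tab character explicitly may be safer. The sketch below is Python 3; read_rows is a hypothetical helper, io.StringIO stands in for the opened file, the "EXAMPLE A" value is made up for illustration, and skipping blank lines is an added assumption.

```python
import io


def read_rows(f, sep=None):
    """Split each non-blank line of f into fields (sep=None means any whitespace)."""
    rows = []
    for line in f:
        line = line.strip()
        if line:  # assumption: blank lines carry no data and are skipped
            rows.append(line.split(sep))
    return rows


sample = "F1\tF2\tSTRING\n7209\t3004\tEXAMPLE A\n"
print(read_rows(io.StringIO(sample)))        # whitespace split breaks "EXAMPLE A" in two
print(read_rows(io.StringIO(sample), "\t"))  # tab split keeps it as one field
```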

*********** Edited to include this option ***********

Calling next on an iterator over the rows also skips the first row, so the data segment begins with the second:

rows = iter(data)
header = next(rows)
for item in rows:
    print ",".join(item)
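
The same idea can be applied while reading, so the header never ends up in the data list at all. This is a sketch in Python 3 under stated assumptions: split_header is a hypothetical name, io.StringIO stands in for the opened file, and the input has at least one line (next raises StopIteration on empty input).

```python
import io


def split_header(lines):
    """Return (header, rows) from an iterable of whitespace-separated lines."""
    rows = (line.strip().split() for line in lines)  # lazy generator over parsed lines
    header = next(rows)  # consumes the first line only
    return header, list(rows)  # the remaining lines are the data


f = io.StringIO("F1\tF2\tSTRING\n7209\t3004\tEXAMPLEA\n6417\t3984\tEXAMPLEB\n")
header, rows = split_header(f)
print(header)
for item in rows:
    print(",".join(item))
```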
