Load svmlight format error


Problem description


When I try to use the svmlight Python package with data I have already converted to svmlight format, I get an error. This should be pretty basic, but I don't understand what's happening. Here's the code:

import svmlight
training_data = open('thedata', "w")
model=svmlight.learn(training_data, type='classification', verbosity=0)

I've also tried:

training_data = numpy.load('thedata')

and

training_data = __import__('thedata')

Solution

One obvious problem is that you are truncating your data file when you open it because you are specifying write mode "w". This means that there will be no data to read.
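
The truncation is easy to confirm without svmlight at all. The following is a minimal sketch using a temporary file (the file contents are made up for illustration): opening an existing file in "w" mode empties it, while "r" mode leaves it intact.

```python
import os
import tempfile

# Create a throwaway file holding one made-up line in svmlight format.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "w") as f:
    f.write("1 1:0.5 2:1.0\n")

size_before = os.path.getsize(path)

# Opening in read mode leaves the contents alone.
with open(path, "r") as f:
    contents = f.read()

# Opening in write mode truncates the file immediately,
# even if nothing is ever written to it.
with open(path, "w"):
    pass

size_after = os.path.getsize(path)
os.remove(path)

print(size_before, size_after)
```

The second number printed should be zero, which is what happened to 'thedata' in the question's code.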

In any case, you don't need to read the file like that. If your data file is like the one in this example, you need to import it, because it is a Python file. This should work:

import svmlight
from data import train0 as training_data    # assuming your data file is named data.py
# or you could use __import__()
#training_data = __import__('data').train0

model = svmlight.learn(training_data, type='classification', verbosity=0)

You might want to compare your data against that of the example.

Edit after data file format clarified

The input file needs to be parsed into a list of tuples like this:

[(target, [(feature_1, value_1), (feature_2, value_2), ... (feature_n, value_n)]),
 (target, [(feature_1, value_1), (feature_2, value_2), ... (feature_n, value_n)]),
 ...
]
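
As a concrete illustration of that shape, here is a small sketch with two invented examples. The targets, feature numbers, and values are made up; only the nesting (a list of `(target, [(feature, value), ...])` tuples) is taken from the description above.

```python
# Two made-up training examples with three features each.
# Targets, feature numbers and values are invented for illustration only.
training_data = [
    (1, [(1, 0.5), (2, 1.0), (3, 0.25)]),
    (-1, [(1, 0.1), (2, 0.0), (3, 0.9)]),
]

# Each element unpacks into a target and a list of (feature, value) pairs.
for target, features in training_data:
    print(target, dict(features))
```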

The svmlight package does not appear to support reading from a file in the SVM file format, and there aren't any parsing functions, so it will have to be implemented in Python. SVM files look like this:

<target> <feature>:<value> <feature>:<value> ... <feature>:<value> # <info>

So here is a parser that converts from the file format to the structure required by the svmlight package:

def svm_parse(filename):
    """Yield (target, features) tuples parsed from a file in SVM format."""

    def _convert(t):
        """Convert feature and value to appropriate types"""
        return (int(t[0]), float(t[1]))

    with open(filename) as f:
        for line in f:
            line = line.split('#')[0].strip()  # remove any comment
            if not line:
                continue  # skip blank lines and comment-only lines
            data = line.split()
            target = float(data[0])
            features = [_convert(feature.split(':')) for feature in data[1:]]
            yield (target, features)

And you can use it like this:

import svmlight

training_data = list(svm_parse('thedata'))
model=svmlight.learn(training_data, type='classification', verbosity=0)
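
If the call still fails, it may help to check the parsed data before handing it to the package. The helper below is a hypothetical sketch, not part of svmlight: `check_training_data` is a name invented here, and the accepted types are an assumption based on the tuple structure described earlier.

```python
def check_training_data(data):
    """Return True if data looks like [(target, [(feature, value), ...]), ...].

    Hypothetical helper; raises ValueError describing the first problem found.
    """
    for i, example in enumerate(data):
        if not (isinstance(example, tuple) and len(example) == 2):
            raise ValueError("example %d is not a (target, features) tuple" % i)
        target, features = example
        if not isinstance(target, (int, float)):
            raise ValueError("example %d has a non-numeric target" % i)
        for pair in features:
            if not (isinstance(pair, tuple) and len(pair) == 2):
                raise ValueError("example %d has a malformed feature pair" % i)
            feature, value = pair
            # Assumption: feature numbers are ints and values are numbers.
            if not isinstance(feature, int) or not isinstance(value, (int, float)):
                raise ValueError("example %d has a wrongly typed feature pair" % i)
    return True


# Made-up data in the expected shape passes the check.
print(check_training_data([(1.0, [(1, 0.5), (3, 1.2)])]))
```

Passing a file object, as in the original question, would fail this check on the first line it yields, since a line of text is a string rather than a tuple.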
