sklearn doesn't have attribute 'datasets'

Question

I have started using scikit-learn for my work, so I was going through the tutorial, which gives the standard procedure for loading some datasets:

$ python
>>> from sklearn import datasets
>>> iris = datasets.load_iris()
>>> digits = datasets.load_digits()

However, for my convenience, I tried loading the data in the following way:

In [1]: import sklearn

In [2]: iris = sklearn.datasets.load_iris()

However, this throws the following error:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-2-db77d2036db5> in <module>()
----> 1 iris = sklearn.datasets.load_iris()

AttributeError: 'module' object has no attribute 'datasets'

However, if I use the apparently similar method:

In [3]: from sklearn import datasets

In [4]: iris = datasets.load_iris()

It works without a problem. In fact, the following also works:

In [5]: iris = sklearn.datasets.load_iris()

I am completely confused about this. Am I missing something very trivial? What is the difference between the two approaches?

Recommended answer

sklearn is a package. This answer puts it very succinctly:

When you import a package, only the variables/functions/classes in the __init__.py file of that package are directly visible, not sub-packages or modules.

datasets is a sub-package of sklearn. This is why this happens:

In [1]: import sklearn

In [2]: sklearn.datasets
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-2-325a2bfc35d0> in <module>()
----> 1 sklearn.datasets

AttributeError: module 'sklearn' has no attribute 'datasets'

However, the reason why this works:

In [3]: from sklearn import datasets

In [4]: sklearn.datasets
Out[4]: <module 'sklearn.datasets' from '/home/ethan/.virtualenvs/test3/lib/python3.5/site-packages/sklearn/datasets/__init__.py'>

is that when you load the sub-package datasets by doing from sklearn import datasets, it is automatically added to the namespace of the package sklearn. This is one of the lesser-known "traps" of the Python import system.
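The binding behaviour described above can be reproduced without scikit-learn installed. The following is a minimal sketch that builds a throwaway package on disk; the names demo_pkg and load_numbers are hypothetical stand-ins for sklearn and load_iris, not part of any real library.

```python
import importlib
import pathlib
import sys
import tempfile

# Build a throwaway package on disk. demo_pkg and load_numbers are
# hypothetical stand-ins for sklearn and load_iris.
root = pathlib.Path(tempfile.mkdtemp())
(root / "demo_pkg" / "datasets").mkdir(parents=True)
# An empty __init__.py: the package itself imports none of its sub-packages.
(root / "demo_pkg" / "__init__.py").write_text("")
(root / "demo_pkg" / "datasets" / "__init__.py").write_text(
    "def load_numbers():\n    return [1, 2, 3]\n"
)
sys.path.insert(0, str(root))
importlib.invalidate_caches()

import demo_pkg

# The sub-package has not been imported yet, so it is not an attribute.
before = hasattr(demo_pkg, "datasets")

# Importing the sub-package also binds it as an attribute of its parent.
from demo_pkg import datasets

after = hasattr(demo_pkg, "datasets")
numbers = demo_pkg.datasets.load_numbers()

print(before, after, numbers)  # False True [1, 2, 3]
```

Writing import demo_pkg.datasets instead of the from form has the same binding effect, which is why spelling out the sub-package in the import statement is the usual way to make the dotted access work.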

Also, note that if you look at the __init__.py for sklearn, you will see 'datasets' as a member of __all__, but this only allows you to do:

In [1]: from sklearn import *
In [2]: datasets
Out[2]: <module 'sklearn.datasets' from '/home/ethan/.virtualenvs/test3/lib/python3.5/site-packages/sklearn/datasets/__init__.py'>
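The effect of __all__ can be checked on a small scale as well. This sketch assumes a hypothetical package named star_pkg whose __init__.py contains nothing but an __all__ list. The star import is run through exec only so that the example also works when pasted inside a function, where a bare star import is not allowed.

```python
import importlib
import pathlib
import sys
import tempfile

# star_pkg is a hypothetical package whose __init__.py only declares __all__.
root = pathlib.Path(tempfile.mkdtemp())
(root / "star_pkg" / "datasets").mkdir(parents=True)
(root / "star_pkg" / "__init__.py").write_text("__all__ = ['datasets']\n")
(root / "star_pkg" / "datasets" / "__init__.py").write_text("NAME = 'datasets'\n")
sys.path.insert(0, str(root))
importlib.invalidate_caches()

import star_pkg

# Listing a sub-package in __all__ does not import it on a plain import.
plain_import_sees_it = hasattr(star_pkg, "datasets")

# A star import, however, imports every sub-package named in __all__.
namespace = {}
exec("from star_pkg import *", namespace)
star_import_sees_it = "datasets" in namespace

print(plain_import_sees_it, star_import_sees_it)  # False True
```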

One last point to note is that if you inspect either sklearn or datasets, you will see that, although they are packages, their type is module. This is because all packages are considered modules; however, not all modules are packages.
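The package/module distinction can be inspected directly. This sketch uses two standard-library modules rather than sklearn: json, which is a package, and math, which is not. The helper is_package is a hypothetical name introduced for illustration. It relies on the rule from the Python documentation that any module with a __path__ attribute is considered a package.

```python
import json   # a package (a directory with an __init__.py)
import math   # a plain module, not a package
import types


def is_package(module):
    """Return True if the module is a package, i.e. it has a __path__."""
    return hasattr(module, "__path__")


# Both objects have the same type, whether or not they are packages.
both_are_modules = isinstance(json, types.ModuleType) and isinstance(
    math, types.ModuleType
)
json_is_package = is_package(json)
math_is_package = is_package(math)

print(both_are_modules, json_is_package, math_is_package)  # True True False
```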
