sklearn doesn't have attribute 'datasets'


Question

I have started using scikit-learn for my work, so I was going through the tutorial, which gives the standard procedure for loading some datasets:

$ python
>>> from sklearn import datasets
>>> iris = datasets.load_iris()
>>> digits = datasets.load_digits()

However, for my convenience, I tried loading the data in the following way:

In [1]: import sklearn

In [2]: iris = sklearn.datasets.load_iris()

However, this throws the following error:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-2-db77d2036db5> in <module>()
----> 1 iris = sklearn.datasets.load_iris()

AttributeError: 'module' object has no attribute 'datasets'

However, if I use the apparently similar method:

In [3]: from sklearn import datasets

In [4]: iris = datasets.load_iris()

it works without a problem. In fact, after that import, the following also works:

In [5]: iris = sklearn.datasets.load_iris()

I am completely confused by this. Am I missing something trivial? What is the difference between the two approaches?

Recommended answer

sklearn is a package. This answer said it very succinctly:

When you import a package, only the variables/functions/classes in the __init__.py file of that package are directly visible, not sub-packages or modules.

datasets is a sub-package of sklearn. This is why this happens:

In [1]: import sklearn

In [2]: sklearn.datasets
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-2-325a2bfc35d0> in <module>()
----> 1 sklearn.datasets

AttributeError: module 'sklearn' has no attribute 'datasets'

However, the reason this works:

In [3]: from sklearn import datasets

In [4]: sklearn.datasets
Out[4]: <module 'sklearn.datasets' from '/home/ethan/.virtualenvs/test3/lib/python3.5/site-packages/sklearn/datasets/__init__.py'>

is that when you load the sub-package datasets by doing from sklearn import datasets, it is automatically added to the namespace of the package sklearn. This is one of the lesser-known "traps" of the Python import system.
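The same behaviour can be reproduced without scikit-learn. The following is a minimal sketch, assuming a throwaway package created in a temporary directory; the names demo_pkg, sub and other are hypothetical and exist only for this illustration. It shows that a sub-package is not an attribute of its parent until something imports it, and that import demo_pkg.other is another way to make the dotted access work.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway package on disk (all names here are hypothetical):
#   demo_pkg/__init__.py        (empty, so it imports nothing itself)
#   demo_pkg/sub/__init__.py    (a sub-package)
#   demo_pkg/other.py           (a plain sub-module)
root = Path(tempfile.mkdtemp())
pkg_dir = root / "demo_pkg"
(pkg_dir / "sub").mkdir(parents=True)
(pkg_dir / "__init__.py").write_text("")
(pkg_dir / "sub" / "__init__.py").write_text("VALUE = 42\n")
(pkg_dir / "other.py").write_text("NAME = 'other'\n")
sys.path.insert(0, str(root))
importlib.invalidate_caches()

import demo_pkg

# Importing the parent alone does not load or bind the sub-package.
visible_before = hasattr(demo_pkg, "sub")

# Importing the sub-package binds it as an attribute of the parent.
from demo_pkg import sub
visible_after = hasattr(demo_pkg, "sub")
same_object = demo_pkg.sub is sub

# The dotted form has the same side effect for any submodule.
import demo_pkg.other
other_visible = hasattr(demo_pkg, "other")

print(visible_before, visible_after, same_object, other_visible)
```

Applied to the question, this suggests writing import sklearn.datasets when the dotted name sklearn.datasets.load_iris() is wanted.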

Also, note that if you look at the __init__.py for sklearn you will see 'datasets' as a member of __all__, but this only allows you to do:

In [1]: from sklearn import *
In [2]: datasets
Out[2]: <module 'sklearn.datasets' from '/home/ethan/.virtualenvs/test3/lib/python3.5/site-packages/sklearn/datasets/__init__.py'>
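As a rough illustration of what __all__ does and does not do, here is a sketch using another temporary package; demo_all_pkg and sub are hypothetical names, not part of scikit-learn. Listing a sub-package in __all__ does not make it an attribute after a plain import, but a star import does load it. The star import is run through exec into a separate namespace only so the example can live anywhere in a script.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical package whose __init__.py only declares __all__.
root = Path(tempfile.mkdtemp())
pkg_dir = root / "demo_all_pkg"
(pkg_dir / "sub").mkdir(parents=True)
(pkg_dir / "__init__.py").write_text("__all__ = ['sub']\n")
(pkg_dir / "sub" / "__init__.py").write_text("VALUE = 1\n")
sys.path.insert(0, str(root))
importlib.invalidate_caches()

import demo_all_pkg

# __all__ names the sub-package, yet a plain import still has not loaded it.
listed = "sub" in demo_all_pkg.__all__
attr_after_plain_import = hasattr(demo_all_pkg, "sub")

# A star import consults __all__ and imports the listed sub-package.
namespace = {}
exec("from demo_all_pkg import *", namespace)
star_bound = "sub" in namespace
attr_after_star_import = hasattr(demo_all_pkg, "sub")

print(listed, attr_after_plain_import, star_bound, attr_after_star_import)
```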

One last point to note is that if you inspect either sklearn or datasets you will see that, although they are packages, their type is module. This is because all packages are considered modules; however, not all modules are packages.
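One way to tell the two apart, sketched here with the standard library's json package rather than sklearn so that it runs without extra installs: both objects have type module, but only a package carries a __path__ attribute.

```python
import json          # a package: a directory containing __init__.py
import json.decoder  # a plain module: a single file inside that package
import types

# Both are instances of the same type.
pkg_is_module = isinstance(json, types.ModuleType)
mod_is_module = isinstance(json.decoder, types.ModuleType)

# Only packages define __path__, which the import system uses
# to locate their submodules.
pkg_has_path = hasattr(json, "__path__")
mod_has_path = hasattr(json.decoder, "__path__")

print(pkg_is_module, mod_is_module, pkg_has_path, mod_has_path)
```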
