How do feature columns work in TensorFlow?

Question

I read that feature columns in TensorFlow are used to define our data, but how and why? How do feature columns work, and why do they even exist if we can make a custom estimator without them?

And if they are necessary, why don't libraries like Keras use them?

Recommended answer

Colloquially

This may be too general to answer. You may want to watch some videos or do more reading on machine learning, because this is a broad topic.

I will try to explain what features of data are used for.

A "feature" of the data is a meaningful variable that should separate two classes from each other. For example, if we choose the feature "weight", we can tell elephants apart from squirrels. They have very different weights, and our machine learning algorithm can learn to "understand" that an animal with a heavy weight is more likely to be an elephant than it is to be a squirrel. In a real scenario you would generally have more than one feature.
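
To make the weight example concrete, here is a minimal sketch in plain Python. The weights are made-up illustrative numbers, and `fit_threshold` and `predict` are hypothetical helper names, not part of TensorFlow or Keras. It only shows how a single well-chosen feature can separate two classes.

```python
# Hypothetical toy data: (weight in kg, label). The numbers are illustrative only.
samples = [
    (0.3, "squirrel"),
    (0.5, "squirrel"),
    (0.6, "squirrel"),
    (2700.0, "elephant"),
    (4000.0, "elephant"),
    (5400.0, "elephant"),
]


def fit_threshold(data):
    """Place the decision boundary midway between the two class means."""
    means = {}
    for label in ("squirrel", "elephant"):
        weights = [weight for weight, lab in data if lab == label]
        means[label] = sum(weights) / len(weights)
    return (means["squirrel"] + means["elephant"]) / 2


def predict(weight_kg, threshold):
    """Classify an animal using only its weight feature."""
    return "elephant" if weight_kg > threshold else "squirrel"


threshold = fit_threshold(samples)
print(predict(0.4, threshold))
print(predict(3500.0, threshold))
```

With more than one feature, the same idea carries over: the model learns a boundary in a space with one dimension per feature instead of a single cutoff.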

I'm not sure why you would say that Keras does not use features. They are a fundamental aspect of many classification problems. Some datasets may contain labelled data or labelled features, like this one: https://keras.io/datasets/#cifar100-small-image-classification

When we "don't use features", I think a more accurate way to state that would be that the data is unlabelled. In this case, a machine learning algorithm can still find relationships in the data, but without human labels applied to the data.

If you Ctrl+F for the word "features" on this page you will see places where Keras accepts them as an argument: https://keras.io/layers/core/

I am not a machine learning expert so if anyone is able to correct my answer, I would appreciate that too.

In TensorFlow

My understanding of TensorFlow's feature column implementation in particular is that it lets you cast raw data into typed columns, which allow the algorithm to better distinguish what type of data you are passing. For example, latitude and longitude could be passed as two numeric columns, but as the docs say, using a crossed column for latitude × longitude may allow the model to train on the data in a more meaningful and effective way. After all, what "latitude" and "longitude" together really mean is "location". As for why Keras does not have this functionality, I am not sure; hopefully someone else can offer insight on this topic.
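
The idea behind a crossed column can be sketched in plain Python. This is a conceptual illustration only, not TensorFlow's implementation: in the library the corresponding pieces are `tf.feature_column.numeric_column`, `tf.feature_column.bucketized_column` and `tf.feature_column.crossed_column`, while the bucket boundaries, the CRC32 hash, and the helper names below are assumptions chosen for the example.

```python
import zlib

# Assumed bucket boundaries in degrees; a real model would tune these.
LAT_BOUNDS = [-60, -30, 0, 30, 60]
LON_BOUNDS = [-120, -60, 0, 60, 120]
HASH_BUCKET_SIZE = 50  # assumed size of the crossed feature's vocabulary


def bucketize(value, boundaries):
    """Return the index of the bucket that `value` falls into."""
    index = 0
    for boundary in boundaries:
        if value >= boundary:
            index += 1
    return index


def cross(lat, lon):
    """Combine the two bucket indices into one hashed categorical id."""
    lat_bucket = bucketize(lat, LAT_BOUNDS)
    lon_bucket = bucketize(lon, LON_BOUNDS)
    key = f"{lat_bucket}_x_{lon_bucket}".encode("utf-8")
    # CRC32 is deterministic across runs, unlike Python's built-in hash().
    return zlib.crc32(key) % HASH_BUCKET_SIZE


def one_hot(index, size):
    """Encode a categorical id as a one-hot vector."""
    return [1 if i == index else 0 for i in range(size)]


new_york = cross(40.7, -74.0)
boston = cross(42.4, -71.1)
print(new_york == boston)  # both points fall in the same latitude/longitude cell
print(sum(one_hot(new_york, HASH_BUCKET_SIZE)))
```

The point of the cross is that the model sees one "location cell" feature rather than two independent numbers, so it can learn a separate weight per region. Because the cell id is hashed into a fixed number of buckets, distinct cells can occasionally collide.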
