What does dimensionality reduction mean?


Question

What exactly does dimensionality reduction mean?

I searched for its meaning and only found that it means transforming raw data into a more useful form. What is the benefit of having data in a more useful form? In other words, how can I use it in a practical application?

Recommended answer

Dimensionality reduction is about converting data of very high dimensionality into data of much lower dimensionality, such that each of the lower dimensions conveys much more information.

This is typically done while solving machine learning problems, to get better features for a classification or regression task.

Here's a contrived example. Suppose you have a list of 100 movies and 1000 people, and for each person you know whether they like or dislike each of the 100 movies. So for each instance (which in this case means each person) you have a binary vector of length 100 [position i is 0 if that person dislikes the i'th movie, 1 otherwise].
You can perform your machine learning task on these vectors directly. But instead, you could decide upon 5 genres of movies and, using the data you already have, figure out whether the person likes or dislikes each entire genre. In this way you reduce your data from a vector of size 100 to a vector of size 5 [position i is 1 if the person likes genre i, 0 otherwise].
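As a rough illustration of the reduction described above, here is a minimal Python sketch. The answer does not say how movies map to genres or how "likes the entire genre" is decided, so both are assumptions here: the hypothetical `movie_genre` list assigns movie i to genre i % 5, and the hypothetical `reduce_to_genres` helper treats a genre as liked when the person likes more than half of its movies. The like/dislike data is randomly generated purely for demonstration.

```python
import random

NUM_PEOPLE, NUM_MOVIES, NUM_GENRES = 1000, 100, 5

# Assumption: movie i belongs to genre i % 5 (the answer gives no real mapping).
movie_genre = [i % NUM_GENRES for i in range(NUM_MOVIES)]


def reduce_to_genres(likes, movie_genre, num_genres):
    """Map a 0/1 vector over movies to a 0/1 vector over genres.

    Assumed rule: a genre is "liked" when the person likes more than
    half of the movies assigned to it.
    """
    liked = [0] * num_genres
    total = [0] * num_genres
    for like, genre in zip(likes, movie_genre):
        liked[genre] += like
        total[genre] += 1
    return [1 if 2 * liked[g] > total[g] else 0 for g in range(num_genres)]


# Synthetic data: 1000 people, each with a random like/dislike per movie.
rng = random.Random(0)
people = [[rng.randint(0, 1) for _ in range(NUM_MOVIES)]
          for _ in range(NUM_PEOPLE)]

reduced = [reduce_to_genres(p, movie_genre, NUM_GENRES) for p in people]
print(len(people[0]), "->", len(reduced[0]))  # 100 -> 5
```

Each person is now described by 5 numbers instead of 100. In practice the mapping is usually learned from the data rather than chosen by hand as it is here.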

The vector of length 5 can be thought of as a good representative of the vector of length 100, because most people probably like movies only in their preferred genres.

However, it is not going to be an exact representative, because there might be cases where a person hates all movies of a genre except one.
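The loss of information in that case can be made concrete with a small sketch. Everything here is an illustrative assumption rather than part of the original answer: a toy catalogue of 10 movies in 2 genres, a majority rule for deciding whether a genre is liked, and the hypothetical helpers `reduce_to_genres` and `expand_from_genres`.

```python
# Assumption: movies 0-4 belong to genre 0, movies 5-9 to genre 1.
movie_genre = [0] * 5 + [1] * 5


def reduce_to_genres(likes, movie_genre, num_genres):
    """Assumed rule: a genre is liked if more than half of its movies are liked."""
    liked = [0] * num_genres
    total = [0] * num_genres
    for like, genre in zip(likes, movie_genre):
        liked[genre] += like
        total[genre] += 1
    return [1 if 2 * liked[g] > total[g] else 0 for g in range(num_genres)]


def expand_from_genres(genre_likes, movie_genre):
    """Best guess at the per-movie vector given only the genre vector."""
    return [genre_likes[g] for g in movie_genre]


# This person dislikes every genre-0 movie except movie 0, and likes all of genre 1.
likes = [1, 0, 0, 0, 0, 1, 1, 1, 1, 1]

reduced = reduce_to_genres(likes, movie_genre, 2)
approx = expand_from_genres(reduced, movie_genre)
mismatches = sum(a != b for a, b in zip(likes, approx))

print(reduced, mismatches)  # [0, 1] 1
```

The genre vector records genre 0 as disliked, so the single liked movie in that genre cannot be recovered from it. The other nine entries are reproduced correctly.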

The point is that the reduced vector conveys most of the information in the larger one while consuming far less space and being faster to compute with.
