What is big data & what classifies as big data?


Problem description


I have gone through a lot of articles, but I don't seem to get a perfectly clear answer on what exactly big data is. On one page I saw "any data which is bigger for your usage, is big data i.e. 100 MB is considered big data for your mailbox but not your hard disc". Another article said "big data to be usually more than 1 TB with different volume / variety / velocity and couldn't be stored in a single system". I have also read that the data should be stored in a NoSQL database, with Hadoop used to transform it.

Further, I have been working on a solution and was wondering whether I could classify it as big data. Details of the solution are below:

- Millions of raw data records, usually 500+ GB of data.
- A SQL database as the back end, with SSIS / SQL queries to cleanse and process the data and convert it into a meaningful form.
- Visualization using Spotfire.

Any help would be much appreciated. Thank you.

What I have tried:

I have researched this on several web sites such as Quora and Stack Overflow, and on personal blogs.

Solution

I'd suggest reading this: Big data - Wikipedia, the free encyclopedia

Wiki wrote:

What is considered "big data" varies depending on the capabilities of the users and their tools, and expanding capabilities make big data a moving target. "For some organizations, facing hundreds of gigabytes of data for the first time may trigger a need to reconsider data management options. For others, it may take tens or hundreds of terabytes before data size becomes a significant consideration."



As you can see, there is no way to define a fixed amount of data that counts as "big data", because it depends on:
1) who is judging (person, organization/concern),
2) what technology is used to store the data (text files (txt, xml, csv), media files (such as mp3, video), etc.),
3) what devices are used (mobile device, local computer),
4) what technology is used to connect the devices (LAN/WAN/WiFi),
5) etc.

You can find my personal scale in the comment to the question. Do not tie yourself to this scale, because you will get as many different answers as the number of people you ask.


How long is a piece of string?

The definition of "big" depends on you and your point of view, but the general consensus is what you said: the 3 V's, Volume/Variety/Velocity.

Also, "big data" is generally considered to be data that you store "as is" and process or compute answers from later, hence solutions like map/reduce, Hadoop, etc.
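To make the "store as is, compute later" idea a bit more concrete, here is a minimal single-machine sketch of the map/reduce pattern in plain Python. It is not Hadoop code and uses no Hadoop API. The record layout (date, sensor name, reading) and all the names (`RAW_RECORDS`, `map_record`, `shuffle`, `reduce_values`, `run_job`) are made up for illustration only.

```python
from collections import defaultdict

# Hypothetical raw records, kept exactly as they arrived (no schema applied up front).
RAW_RECORDS = [
    "2016-01-03,sensor-a,21.5",
    "2016-01-03,sensor-b,19.0",
    "2016-01-04,sensor-a,22.5",
    "malformed line",
    "2016-01-04,sensor-b,18.0",
]


def map_record(line):
    """Map step: parse one raw line into zero or more (key, value) pairs."""
    parts = line.split(",")
    if len(parts) != 3:
        return []  # skip lines that do not match the assumed layout
    _, sensor, reading = parts
    try:
        return [(sensor, float(reading))]
    except ValueError:
        return []  # skip lines whose reading is not a number


def shuffle(pairs):
    """Group values by key, which a framework would do between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_values(values):
    """Reduce step: collapse all values for one key into a single answer."""
    return sum(values) / len(values)


def run_job(records):
    """Run map, shuffle, and reduce over the raw records and return one result per key."""
    pairs = [pair for line in records for pair in map_record(line)]
    return {key: reduce_values(vals) for key, vals in shuffle(pairs).items()}


if __name__ == "__main__":
    print(run_job(RAW_RECORDS))
```

The point of the sketch is that the raw lines are never cleaned or loaded into a schema beforehand: the question being asked (here, the average reading per sensor) is applied at read time. In a real cluster the map and reduce steps would run in parallel over many machines, which this toy version does not attempt.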

