What database to use for big data storage and manipulation?

Question

I have to decide which database server to use for my next project. Defaulting to MySQL, as I have for almost all of my previous projects, is a harder call this time because I expect a very large number of records.

The database will store a user list, some other unrelated tables, and finally a table of user-collected data. Say I have 6000 users answering a quiz about each other. If every user completes the quiz about every other user (and in my project that is 99% certain to happen), I'll end up with 35.99 million records (users exclude themselves, so the count is 6000 * 5999 = 35,994,000). Unfortunately, 6000 is probably a small number, and the real one is growing day by day.
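The growth described above is quadratic in the number of users. As a rough illustration (the function name `quiz_records` and the larger user counts are assumptions made for this sketch, not figures from the project), the record count can be estimated like this:

```python
def quiz_records(users: int) -> int:
    """Ordered (rater, rated) pairs when nobody answers about themselves."""
    return users * (users - 1)


# 6000 is the figure from the question; the other counts are hypothetical.
for users in (6_000, 12_000, 60_000):
    print(f"{users:>6} users -> {quiz_records(users):>13,} records")
```

Doubling the user count roughly quadruples the number of records, so the table size depends far more on user growth than on the current 6000.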

What should I choose? MySQL, and if things go well and the project grows, expand it into a cluster? PostgreSQL? MSSQL? Oracle?

I've read about all of them. Each one has its pros and cons, but I still don't know what to choose. The advantage of MySQL and PostgreSQL is, of course, the starting price of $0, which is pretty nice for a typical self-funded startup.

Any opinions or suggestions?

Recommended answer

Most of the truly large-scale web properties use a distributed key-value store. That said, 35 million is large, but not that large. With most modern databases, your two main scaling worries should be throughput and what happens when no single box can hold your entire database anymore. Both problems can be solved to some degree for any database you choose (caching, replication, sharding, etc.).
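To make the sharding idea concrete, here is a minimal sketch of application-level shard routing. It assumes quiz records are keyed by the id of the user who answered; `NUM_SHARDS` and `shard_for` are hypothetical names for illustration and are not part of MySQL or any other database mentioned here.

```python
import hashlib

NUM_SHARDS = 4  # hypothetical number of database servers


def shard_for(rater_id: int, num_shards: int = NUM_SHARDS) -> int:
    """Map a user id to a shard index using a stable hash.

    A cryptographic hash is used instead of Python's built-in hash()
    so that the mapping stays the same across processes and restarts.
    """
    digest = hashlib.sha256(str(rater_id).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


# All answers given by one user land on the same shard.
for rater_id in (1, 2, 3, 5999):
    print(f"user {rater_id} -> shard {shard_for(rater_id)}")
```

One caveat with simple modulo routing like this: changing the shard count remaps most keys, so a real deployment would likely need a migration plan or a scheme such as consistent hashing.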

Use MySQL until you can't anymore. At that point you ought to be rolling in dough anyway, and you'll have a very desirable problem.
