Dealing with large amounts of data


Problem description

We are looking to store a large amount of user data that will be
changed and accessed daily by a large number of people. We expect
around 6-8 million subscribers to our service, with each record being
approximately 2000-2500 bytes. The system needs to be running 24/7
and therefore cannot be shut down. What is the best way to implement
this? We were thinking of setting up a cluster of servers to hold the
information and another cluster to back up the information. Is this
practical?
Also, what software is available out there that can distribute query
calls across different servers and manage large amounts of query
requests?

Thank you in advance.

Ben

Recommended answers

Digety wrote:
> We are looking to store a large amount of user data that will be
> changed and accessed daily by a large number of people. We expect
> around 6-8 million subscribers to our service, with each record being
> approximately 2000-2500 bytes. The system needs to be running 24/7
> and therefore cannot be shut down. What is the best way to implement
> this? We were thinking of setting up a cluster of servers to hold the
> information and another cluster to back up the information. Is this
> practical?
> Also, what software is available out there that can distribute query
> calls across different servers and manage large amounts of query
> requests?
>
> Thank you in advance.
>
> Ben

You need to consult a professional firm that has experience with this
kind of thing. I wouldn't trust us yahoos! :D

Zach


This question is really too big for a newsgroup. The obvious point is
that if you want to build and run a 24/7 server farm for 8 million
subscribers, then you'll need a team of architecture, development and
operations staff with many years of experience in high-volume,
high-availability environments... so why not ask them this question, since
they'll be in a better position to understand your requirements?

SQL Server's clustering support is for Failover Clustering, so that may be
part of your solution for high availability. In SQL Server, distributed
queries can be implemented using partitioned views for load balancing.

You may find some useful information on Microsoft's scalability site:
http://www.microsoft.com/sql/techinf...calability.asp

--
David Portas
SQL Server MVP
--
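
To illustrate the partitioned-view idea mentioned in the reply above: the data is split across member servers by key range, and each request goes to the member whose range covers the key. The sketch below models only that routing step in Python. It is not how SQL Server implements it, and the server names, range boundaries, `MAX_ID`, and the `route` function are all hypothetical.

```python
from bisect import bisect_right

# Hypothetical layout: each member server owns one contiguous range of
# subscriber IDs, much as each member table behind a partitioned view
# is restricted to a key range. Names and boundaries are made up.
RANGE_STARTS = [1, 2_000_001, 4_000_001, 6_000_001]  # inclusive lower bounds
SERVERS = ["server_a", "server_b", "server_c", "server_d"]
MAX_ID = 8_000_000  # assumed upper bound, from the "6-8 million" estimate


def route(subscriber_id: int) -> str:
    """Return the server whose key range contains subscriber_id."""
    if not 1 <= subscriber_id <= MAX_ID:
        raise ValueError(f"subscriber_id out of range: {subscriber_id}")
    # bisect_right counts how many lower bounds are <= the ID;
    # subtracting 1 gives the index of the owning range.
    return SERVERS[bisect_right(RANGE_STARTS, subscriber_id) - 1]
```

With this layout, `route(2_000_000)` returns `"server_a"` and `route(2_000_001)` returns `"server_b"`. Range partitioning keeps neighbouring keys on the same server, but it only balances load if activity is spread evenly across the ID space.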

"Digety" <be******@hotmail.com> wrote in message
news:82**************************@posting.google.c om...
> We are looking to store a large amount of user data that will be
> changed and accessed daily by a large number of people. We expect
> around 6-8 million subscribers to our service, with each record being
> approximately 2000-2500 bytes. The system needs to be running 24/7
> and therefore cannot be shut down. What is the best way to implement
> this? We were thinking of setting up a cluster of servers to hold the
> information and another cluster to back up the information. Is this
> practical?

Practical? What's your budget? What are your response time requirements?

Clustering ain't cheap.

6-8 million rows isn't all that much, btw. What's more important is how much
it changes.

> Also, what software is available out there that can distribute query
> calls across different servers and manage large amounts of query
> requests?

If you're just doing queries, my guess is you won't need this.

To give you an example, I've got a quad-CPU Xeon box (700 MHz) that runs at
about 50% CPU these days (some new code just helped). It INSERTs, I think, 17
million rows a day (which then get moved to another server overnight).

> Thank you in advance.
>
> Ben
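
For a rough sense of scale, the figures quoted in this thread can be turned into back-of-the-envelope numbers. The sketch below is only an estimate under stated assumptions: it counts raw record bytes (no indexes, row overhead, or logs), uses decimal gigabytes, and treats the "17 million rows a day" figure, which the poster gave from memory, as evenly spread over 24 hours. The function names are hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def raw_data_size_gb(rows: int, bytes_per_row: int) -> float:
    """Raw record size in decimal GB, ignoring indexes and overhead."""
    return rows * bytes_per_row / 1e9


def average_rate_per_second(events_per_day: int) -> float:
    """Average events per second, assuming an even spread over the day."""
    return events_per_day / SECONDS_PER_DAY


# Range from the question: 6-8 million records of 2000-2500 bytes each.
low = raw_data_size_gb(6_000_000, 2000)
high = raw_data_size_gb(8_000_000, 2500)
print(f"raw data: {low:.0f}-{high:.0f} GB")  # raw data: 12-20 GB

# Insert volume from the reply above, averaged over a day.
inserts = average_rate_per_second(17_000_000)
print(f"average inserts: about {inserts:.0f} per second")
```

On these assumptions the raw data comes to roughly 12-20 GB, which fits the remark that the row count alone is not the hard part. Peak load will be higher than the daily average, so the rate of change still has to be measured rather than estimated.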


