Data preparation for uploading into a Redis server

Problem description

I have a 10GB .xml file that I want to upload into a Redis server using mass insert. I need advice on how to convert this .xml data into key-value pairs or any other data structure supported by Redis. I am working with the Stack Overflow dumps; take comments.xml as an example.

Data pattern: <row Id="5" PostId="5" Score="9" Text="this is a super theoretical AI question. An interesting discussion! but out of place..." CreationDate="2014-05-14T00:23:15.437" UserId="34" />

Let's say I want to retrieve all comments made by a particular user ID or on a particular date. How do I do that?

Specifically:

  1. How do I prepare this .xml data as a data structure suitable for Redis?

  2. How can I upload it into Redis? I am using Redis on Windows. The pipe and cat commands do not seem to work. I have tried using CentOS, but I prefer using Redis on Windows.

Recommended answer

Before you choose a data structure, you need to understand what types of queries you will make. For example, if you have user-specific data and need to group different activities per user and produce aggregated results, you will need different structures, indexes, data split into chunks, and so on.
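As one possible starting point for the comments.xml case in the question, the sketch below streams the rows and writes one hash per comment in the Redis protocol format that mass insertion consumes. The key name comment:<Id> and the helper names are assumptions made for illustration, not something the question or answer specifies.

```python
import xml.etree.ElementTree as ET


def resp_encode(*args):
    """Encode one Redis command in the RESP wire format used for mass insertion."""
    parts = [b"*%d\r\n" % len(args)]
    for arg in args:
        data = str(arg).encode("utf-8")
        parts.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(parts)


def iter_rows(source):
    """Yield the attributes of each <row> element without loading the whole file."""
    context = ET.iterparse(source, events=("start", "end"))
    _event, root = next(context)  # the first event is the start of the root element
    for event, elem in context:
        if event == "end" and elem.tag == "row":
            yield dict(elem.attrib)
            root.clear()  # drop rows already processed so memory use stays flat


def write_mass_insert(source, out):
    """Write one HSET per row to the binary stream `out`; return the row count."""
    count = 0
    for row in iter_rows(source):
        fields = []
        for name, value in row.items():
            if name != "Id":
                fields += [name, value]
        if not fields:
            continue
        # Hypothetical key schema: comment:<Id>.
        # Multi-field HSET needs Redis 4.0+; on older servers use HMSET instead.
        out.write(resp_encode("HSET", f"comment:{row['Id']}", *fields))
        count += 1
    return count
```

Calling write_mass_insert("comments.xml", open("comments.resp", "wb")) would produce a file intended as input for redis-cli --pipe. Writing the protocol to a file first sidesteps the need for cat, though how the file is fed to redis-cli on Windows depends on the shell in use.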

For a relatively large amount of aggregated data (45GB), I found sorted sets with ZRANGE usable, because ZRANGE has better complexity than LRANGE: ZRANGE is O(log(N)+M), while LRANGE is O(S+N) and so pays for the offset S of the start of the range. You can split your data into chunks based on its size, process each ZRANGE individually in its own thread, and then combine the results.
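Applied to the "comments on a particular date" query from the question, the chunked sorted-set idea might look like the sketch below. It assumes one sorted set per month under the hypothetical key comments:by_date:<YYYY-MM>, with the creation time in epoch milliseconds as the score, and it treats the dump timestamps as UTC. Because the lookup is by score, it builds ZRANGEBYSCORE arguments rather than a plain index-based ZRANGE.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
DAY_MS = 86_400_000


def creation_score(creation_date):
    """Convert a timestamp such as 2014-05-14T00:23:15.437 to epoch milliseconds."""
    parsed = datetime.strptime(creation_date, "%Y-%m-%dT%H:%M:%S.%f")
    parsed = parsed.replace(tzinfo=timezone.utc)  # assumption: dump times are UTC
    return (parsed - EPOCH) // timedelta(milliseconds=1)


def chunk_key(creation_date):
    """One sorted set per month keeps each chunk bounded (hypothetical key name)."""
    return "comments:by_date:" + creation_date[:7]


def zadd_command(row):
    """Command that indexes one comment id by its creation time."""
    created = row["CreationDate"]
    return ("ZADD", chunk_key(created), creation_score(created), row["Id"])


def day_query(day):
    """ZRANGEBYSCORE arguments selecting every comment created on one day (YYYY-MM-DD)."""
    start = creation_score(day + "T00:00:00.000")
    return ("ZRANGEBYSCORE", chunk_key(day), start, start + DAY_MS - 1)
```

A query spanning several months would issue one such command per chunk, which is where the per-chunk threads and the final merge described above come in.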

On top of that structure you can add indexes built with lists, for cases where you only need to iterate over relatively small amounts of data.
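For the "comments by a particular user" query, a list-based index could be sketched as follows: each user gets a list of comment ids, read back in pages. The key name user:<UserId>:comments and both helper functions are assumptions for illustration.

```python
def user_index_command(row):
    """RPUSH the comment id onto a per-user list; None when the row has no UserId."""
    user_id = row.get("UserId")
    if not user_id:
        return None
    # Hypothetical key schema: user:<UserId>:comments
    return ("RPUSH", f"user:{user_id}:comments", row["Id"])


def lrange_pages(key, total, page_size=100):
    """Yield LRANGE commands covering `total` items; LRANGE's stop index is inclusive."""
    for start in range(0, total, page_size):
        stop = min(start + page_size, total) - 1
        yield ("LRANGE", key, start, stop)
```

The ids returned by each LRANGE would then be resolved against wherever the comment bodies are stored. The length passed as total can be obtained with LLEN.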
