为什么很多人将 Cassandra 称为面向列的数据库? [英] Why many refer to Cassandra as a Column oriented database?

查看:19
本文介绍了为什么很多人将 Cassandra 称为面向列的数据库?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

在互联网上阅读了几篇论文和文档,我发现了许多关于 Cassandra 数据模型的相互矛盾的信息.有很多人将其识别为面向列的数据库,其他人将其识别为面向行的数据库,然后将其定义为两者的混合方式.

Reading several papers and documents on internet, I found many contradictory information about the Cassandra data model. There are many which identify it as a column oriented database, other as a row-oriented and then who define it as a hybrid way of both.

根据我对 Cassandra 如何存储文件的了解,它使用 *-Index.db 文件访问 *-Data.db 文件的正确位置,其中存储了布隆过滤器、列索引,然后是所需行的列.

According to what I know about how Cassandra stores file, it uses the *-Index.db file to access at the right position of the *-Data.db file where it is stored the bloom filter, column index and then the columns of the required row.

在我看来,这是严格面向行的.我有什么遗漏吗?

In my opinion, this is strictly row-oriented. Is there something I'm missing?

推荐答案

  • 如果您查看 Apache Cassandra git repoReadme 文件>,上面写着,
  • Cassandra 是一个分区行存储.行被组织成表格具有必需的主键.

    Cassandra is a partitioned row store. Rows are organized into tables with a required primary key.

    分区意味着 Cassandra 可以将您的数据分布在多台机器在一个应用程序透明的事情.卡桑德拉将当机器被添加和删除时自动重新分区集群.

    Partitioning means that Cassandra can distribute your data across multiple machines in an application-transparent matter. Cassandra will automatically repartition as machines are added and removed from the cluster.

    行存储意味着像关系数据库一样,Cassandra 组织按行和列的数据.

    Row store means that like relational databases, Cassandra organizes data by rows and columns.

    • 面向列或列式数据库按列存储在磁盘上.

      • Column oriented or columnar databases are stored on disk column wise.

        例如:表格奖金表格

          ID         Last    First   Bonus
          1          Doe     John    8000
          2          Smith   Jane    4000
          3          Beck    Sam     1000
        

      • 面向行的数据库管理系统中,数据会这样存储:1,Doe,John,8000;2,Smith,Jane,4000;3,贝克,山姆,1000;

      • In a row-oriented database management system, the data would be stored like this: 1,Doe,John,8000;2,Smith,Jane,4000;3,Beck,Sam,1000;

        面向列的数据库管理系统中,数据会这样存储:
        1,2,3;Doe,Smith,Beck;John,Jane,Sam;8000,4000,1000;

        In a column-oriented database management system, the data would be stored like this:
        1,2,3;Doe,Smith,Beck;John,Jane,Sam;8000,4000,1000;

        Cassandra 基本上是一个列族存储

        Cassandra is basically a column-family store

        Cassandra 会将上述数据存储为,

        Cassandra would store the above data as,

             "Bonuses" : {
                   row1 : { "ID":1, "Last":"Doe", "First":"John", "Bonus":8000},
                   row2 : { "ID":2, "Last":"Smith", "First":"Jane", "Bonus":4000}
                   ...
             }
        

        • 每行的列数也不必相同.一行可以有 100 列,下一行只能有 1 列.

          • Also the number of columns in each row does't have to be the same. One row can have 100 columns and the next row can have only 1 column.

            阅读了解更多细节.

            这篇关于为什么很多人将 Cassandra 称为面向列的数据库?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆