Use Chapel to handle massive matrices

Question

I've recently come across Chapel and I'm very keen to try it out. I have a two-fold problem I'm hoping it can solve.

I typically work in Python or C++. Java when backed into a corner.

I have two matrices I and V. Both are sparse and of dimension about 600K x 600K, populated at about 1% density.

First, using SciPy, I can load both from a SQL database into memory at the moment. However, I expect our next iteration will be simply too large for our machines. Perhaps 1.5M^2. In a case like that, RDDs from Spark may work for the load. I wasn't able to get PyTables to make this happen. I understand this is described as an "Out-of-core" problem.
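
To put rough numbers on "too large for our machines", the sizes above can be turned into a back-of-the-envelope memory estimate. This is only a sketch: it assumes CSR storage with 8-byte values, 4-byte column indices, and 8-byte row pointers, none of which is stated in the question, and `csr_bytes` is a hypothetical helper rather than a SciPy function.

```python
def csr_bytes(n_rows, n_cols, density_pct, value_bytes=8, index_bytes=4, ptr_bytes=8):
    """Approximate bytes for one CSR matrix: values + column indices + row pointers.

    The byte widths are assumptions (float64 values, int32 column indices,
    int64 row pointers); adjust them to match the actual storage format.
    """
    nnz = n_rows * n_cols * density_pct // 100   # stored (nonzero) entries
    return nnz * (value_bytes + index_bytes) + (n_rows + 1) * ptr_bytes


GIB = 2 ** 30
for n in (600_000, 1_500_000):
    size = csr_bytes(n, n, density_pct=1)
    print(f"n = {n:>9,}: about {size / GIB:.0f} GiB per matrix")
```

Under those assumptions, each matrix needs roughly 40 GiB at 600K x 600K and roughly 250 GiB at 1.5M x 1.5M, before any intermediate results are counted.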

Even if they do get loaded, computing I'IV (where I' is the transpose of I) runs out of memory within minutes, so I'm looking into distributing this multiplication over multiple cores (which SciPy can do) and multiple machines (which it cannot, so far as I know). Here Spark falls down, but Chapel appears to answer my prayers, so to speak.
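
One reason a product like this can exhaust memory is the intermediate result. If the nonzeros were spread uniformly at random (an assumption; real data is often more structured), each entry of I'I would sum 600K terms that are each nonzero with probability 0.01 x 0.01, about 60 nonzero terms on average, so the intermediate would be essentially dense: about 3.6e11 entries, or close to 2.9 TB at 8 bytes each. The product can be grouped as (I'I)V or I'(IV), and the grouping decides which intermediate is materialized. The sketch below uses a hypothetical dict-of-rows representation and tiny made-up matrices to show that both groupings give the same result while letting the intermediate sizes be measured.

```python
from collections import defaultdict


def transpose(a):
    """Transpose a sparse matrix stored as {row: {col: value}}."""
    t = defaultdict(dict)
    for i, row in a.items():
        for j, v in row.items():
            t[j][i] = v
    return dict(t)


def matmul(a, b):
    """Row-by-row sparse product; only stored entries are visited."""
    out = {}
    for i, row in a.items():
        acc = defaultdict(float)
        for k, a_ik in row.items():
            for j, b_kj in b.get(k, {}).items():
                acc[j] += a_ik * b_kj
        if acc:
            out[i] = dict(acc)
    return out


def nnz(a):
    """Number of stored entries."""
    return sum(len(row) for row in a.values())


# Tiny stand-ins for I and V (values are made up for illustration).
I = {0: {0: 1.0, 2: 2.0}, 1: {1: 3.0}, 2: {0: 4.0, 2: 5.0}}
V = {0: {1: 1.0}, 1: {0: 2.0}, 2: {2: 3.0}}

It = transpose(I)
gram = matmul(It, I)          # intermediate for (I'I)V
iv = matmul(I, V)             # intermediate for I'(IV)
left = matmul(gram, V)
right = matmul(It, iv)

print("same result:", left == right)
print("stored entries in I'I:", nnz(gram), "and in IV:", nnz(iv))
```

On real data, comparing the two intermediate counts on a sampled sub-block is a cheap way to choose the grouping before committing to a full run.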

A serious limitation is budget on machines. I can't afford a Cray, for instance. Does the Chapel community have a pattern for this?

Recommended answer

Starting with a few high-level points:


  • At its core, the Chapel language is more about arrays (data structures) than about matrices (mathematical objects), though one can obviously use an array to represent a matrix. Think of the distinction as being the set of supported operations (e.g., iteration, access, and elemental operations for arrays vs. transpose, cross-products, and factorings for matrices).
  • Chapel supports sparse and associative arrays as well as dense ones.
  • Chapel arrays can be stored local to a single memory or distributed across multiple memories / compute nodes.
  • In Chapel, you should expect matrix/linear algebra operations to be supported through libraries rather than the language. While Chapel has a start at such libraries, they are still being expanded -- specifically, Chapel does not have library support for distributed linear algebra operations as of Chapel 1.15, meaning that users would have to write such operations manually.

In more detail:

The following program creates a Block-distributed dense array:

use BlockDist;
config const n = 10;

const D = {1..n, 1..n} dmapped Block({1..n, 1..n});  // distributed dense index set
var A: [D] real;                                     // distributed dense array

// assign the array elements in parallel based on the owning locale's (compute node's) ID 
forall a in A do
  a = here.id;

// print out the array
writeln(A);

For example, when run on 6 nodes (./myProgram -nl 6), the output is:

0.0 0.0 0.0 0.0 0.0 1.0 1.0 1.0 1.0 1.0
0.0 0.0 0.0 0.0 0.0 1.0 1.0 1.0 1.0 1.0
0.0 0.0 0.0 0.0 0.0 1.0 1.0 1.0 1.0 1.0
0.0 0.0 0.0 0.0 0.0 1.0 1.0 1.0 1.0 1.0
2.0 2.0 2.0 2.0 2.0 3.0 3.0 3.0 3.0 3.0
2.0 2.0 2.0 2.0 2.0 3.0 3.0 3.0 3.0 3.0
2.0 2.0 2.0 2.0 2.0 3.0 3.0 3.0 3.0 3.0
4.0 4.0 4.0 4.0 4.0 5.0 5.0 5.0 5.0 5.0
4.0 4.0 4.0 4.0 4.0 5.0 5.0 5.0 5.0 5.0
4.0 4.0 4.0 4.0 4.0 5.0 5.0 5.0 5.0 5.0
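
The pattern in that output can be reproduced with a few lines of index arithmetic. The Python sketch below is not Chapel's implementation: it assumes the six locales are arranged as a 3 x 2 grid numbered row-major (inferred from the output above) and that each dimension is cut into contiguous chunks by `floor((i - lo) * numBlocks / n)`. The names `block_of` and `owner_grid` are hypothetical.

```python
def block_of(i, lo, hi, num_blocks):
    """0-based chunk that owns index i when lo..hi is cut into num_blocks contiguous chunks."""
    return (i - lo) * num_blocks // (hi - lo + 1)


def owner_grid(n, grid_rows, grid_cols):
    """Owning locale ID for every (i, j) in {1..n, 1..n}, locales numbered row-major."""
    return [
        [block_of(i, 1, n, grid_rows) * grid_cols + block_of(j, 1, n, grid_cols)
         for j in range(1, n + 1)]
        for i in range(1, n + 1)
    ]


# 6 locales as a 3 x 2 grid (an assumption inferred from the printed array).
for row in owner_grid(10, 3, 2):
    print(" ".join(str(float(x)) for x in row))
```

With n = 10, the three row chunks come out as 4, 3, and 3 rows and the two column chunks as 5 columns each, which matches the listing.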

Note that running a Chapel program on multiple nodes requires configuring it to use multiple locales. Such programs can be run on clusters or networked workstations in addition to Crays.

Here's a program that declares a distributed sparse array:

use BlockDist;

config const n = 10;

const D = {1..n, 1..n} dmapped Block({1..n, 1..n});  // distributed dense index set
var SD: sparse subdomain(D);                         // distributed sparse subset
var A: [SD] real;                                    // distributed sparse array

// populate the sparse index set
SD += (1,1);
SD += (n/2, n/4);
SD += (3*n/4, 3*n/4);
SD += (n, n);

// assign the sparse array elements in parallel
forall a in A do
  a = here.id + 1;

// print a dense view of the array
for i in 1..n {
  for j in 1..n do
    write(A[i,j], " ");
  writeln();
}

Running on six locales gives:

1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 
0.0 3.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 
0.0 0.0 0.0 0.0 0.0 0.0 4.0 0.0 0.0 0.0 
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 
0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 6.0 

In both the examples above, the forall loops will compute on the distributed arrays / indices using multiple nodes in an owner-computes fashion, and using the multiple cores per node to do the local work.
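
As a rough illustration of "owner-computes", the Python sketch below groups the four sparse indices from the second program by the locale that would store them, then lets each locale assign only its own elements. It is a sequential stand-in, not how Chapel schedules the loop: the 3 x 2 locale grid and the chunking formula are assumptions inferred from the outputs above, and `block_of` and `owner` are hypothetical helpers.

```python
from collections import defaultdict


def block_of(i, lo, hi, num_blocks):
    """0-based chunk that owns index i when lo..hi is cut into num_blocks contiguous chunks."""
    return (i - lo) * num_blocks // (hi - lo + 1)


def owner(idx, n, grid_rows=3, grid_cols=2):
    """Locale ID that stores index (i, j), assuming a row-major 3 x 2 locale grid."""
    i, j = idx
    return block_of(i, 1, n, grid_rows) * grid_cols + block_of(j, 1, n, grid_cols)


n = 10
sparse_indices = [(1, 1), (n // 2, n // 4), (3 * n // 4, 3 * n // 4), (n, n)]

# Group the work by owner: each locale sees only the indices it stores.
work = defaultdict(list)
for idx in sparse_indices:
    work[owner(idx, n)].append(idx)

# Each locale writes its own elements (here.id + 1 in the Chapel program).
A = {}
for locale_id, local_indices in sorted(work.items()):
    for idx in local_indices:
        A[idx] = float(locale_id + 1)

for idx, value in sorted(A.items()):
    print(idx, value)
```

Locales 1 and 4 own none of the four indices, which is why the values 2.0 and 5.0 never appear in the printed array.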

Now for some caveats:


  • Distributed sparse array support is still in its infancy as of Chapel 1.15.0, as most of the project's focus on distributed memory to date has been on task parallelism and distributed dense arrays. A paper and talk from Berkeley at this year's annual Chapel workshop, "Towards a GraphBLAS Library in Chapel", highlighted several performance and scalability issues, some of which have since been fixed on the master branch and others of which still require attention. Feedback and interest from users in such features is the best way to accelerate improvements in these areas.

As mentioned at the outset, Linear Algebra libraries are a work-in-progress for Chapel. Past releases have added Chapel modules for BLAS and LAPACK. Chapel 1.15 included the start of a higher-level LinearAlgebra library. But none of these support distributed arrays at present (BLAS and LAPACK by design, LinearAlgebra because it's still early days).

Chapel does not have an SQL interface (yet), though a few community members have made rumblings about adding such support. It may also be possible to use Chapel's I/O features to read the data in some textual or binary format. Or, you could potentially use Chapel's interoperability features to interface with a C library that could read the SQL.
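
One way to follow the textual-format route is to dump the nonzeros from SQL as "row col value" lines, which any language with basic file I/O can then read. The Python sketch below uses an in-memory SQLite database as a stand-in for the real one; the table `entries`, its columns `row_idx`, `col_idx`, and `val`, and the function `export_triples` are all hypothetical, since the question does not describe the schema.

```python
import io
import sqlite3


def export_triples(conn, out):
    """Write one 'row col value' line per stored entry; return the number of lines written.

    Assumes a hypothetical table entries(row_idx, col_idx, val).
    """
    count = 0
    query = "SELECT row_idx, col_idx, val FROM entries ORDER BY row_idx, col_idx"
    for r, c, v in conn.execute(query):
        out.write(f"{r} {c} {v}\n")
        count += 1
    return count


# Stand-in database with three made-up nonzeros.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entries (row_idx INTEGER, col_idx INTEGER, val REAL)")
conn.executemany(
    "INSERT INTO entries VALUES (?, ?, ?)",
    [(5, 2, 3.0), (1, 1, 1.0), (7, 7, 4.0)],
)

buf = io.StringIO()          # replace with open("matrix.txt", "w") for a real export
written = export_triples(conn, buf)
print(written, "entries written")
print(buf.getvalue(), end="")
```

Sorting by row and column keeps each row's entries contiguous, so a reader that distributes the matrix by row blocks can process the file in a single pass.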
