Compression library using Nvidia's CUDA

Question

Does anyone know a project which implements standard compression methods (like Zip, GZip, BZip2, LZMA,...) using NVIDIA's CUDA library?

I was wondering if algorithms which can make use of a lot of parallel tasks (like compression) wouldn't run much faster on a graphics card than on a dual- or quad-core CPU.

What do you think about the pros and cons of such an approach?

Answer

Not aware of anyone having done that and made it public. Just IMHO, it doesn't sound very promising.

As Martinus points out, some compression algorithms are highly serial. Block compression algorithms like LZW can be parallelized by coding each block independently. Zipping a large tree of files can be parallelized at the file level.
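
As a rough illustration of that block-level idea (not part of the original answer), the host-side C++ sketch below compresses fixed-size chunks independently, one chunk per std::async task, using zlib's DEFLATE rather than LZW purely for convenience; the block size and the helper name compress_blocks are my own arbitrary choices.

#include <zlib.h>
#include <algorithm>
#include <cstddef>
#include <future>
#include <vector>

// Compress each fixed-size block on its own task; the blocks are completely
// independent, so they can run on as many CPU cores as std::async hands out.
std::vector<std::vector<unsigned char>>
compress_blocks(const std::vector<unsigned char>& input,
                std::size_t block_size = 1 << 20)        // 1 MiB blocks, arbitrary choice
{
    std::vector<std::future<std::vector<unsigned char>>> jobs;
    for (std::size_t off = 0; off < input.size(); off += block_size) {
        std::size_t len = std::min(block_size, input.size() - off);
        jobs.push_back(std::async(std::launch::async, [&input, off, len] {
            uLongf out_len = compressBound(len);          // worst-case deflate size
            std::vector<unsigned char> out(out_len);
            compress(out.data(), &out_len, input.data() + off, len);
            out.resize(out_len);                          // shrink to the real size
            return out;
        }));
    }
    std::vector<std::vector<unsigned char>> blocks;
    for (auto& j : jobs) blocks.push_back(j.get());       // collect blocks in order
    return blocks;
}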

However, none of these is really SIMD-style parallelism (Single Instruction Multiple Data), and they're not massively parallel.

GPUs are basically vector processors, where you can be doing hundreds or thousands of ADD instructions all in lock step, and executing programs where there are very few data-dependent branches.
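
A minimal CUDA kernel makes that concrete: every thread issues the same ADD on its own element, warps execute it in lock step, and the only branch is a bounds guard. The kernel and launch below are an illustrative sketch, not something from the original answer.

__global__ void vec_add(const float* a, const float* b, float* c, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // one element per thread
    if (i < n)                                      // bounds guard, not a data-dependent branch
        c[i] = a[i] + b[i];                         // the same ADD across thousands of threads
}

// Launch with one thread per element, e.g.:
// vec_add<<<(n + 255) / 256, 256>>>(d_a, d_b, d_c, n);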

Compression algorithms in general sound more like an SPMD (Single Program Multiple Data) or MIMD (Multiple Instruction Multiple Data) programming model, which is better suited to multi-core CPUs.

Video compression algorithms can be accelerated by GPGPU processing like CUDA only to the extent that there is a very large number of pixel blocks being cosine-transformed or convolved (for motion detection) in parallel, and the IDCT or convolution subroutines can be expressed with branchless code.
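
A hedged sketch of what that looks like in CUDA: one thread block per 8x8 pixel block, one thread per DCT coefficient, and fixed-count loops with no data-dependent branching. The naive forward DCT below is my own illustration (the kernel name, PI constant, and pitch parameter, the row stride in floats, are assumptions), but it is the shape of routine that maps well onto the hardware.

#define PI 3.14159265358979f

// Naive forward 8x8 DCT-II: each CUDA block handles one 8x8 pixel block,
// each thread computes one frequency coefficient (u, v).
__global__ void dct8x8(const float* image, float* coeffs, int pitch)
{
    __shared__ float tile[8][8];

    int bx = blockIdx.x * 8, by = blockIdx.y * 8;     // top-left pixel of this block
    int u = threadIdx.x, v = threadIdx.y;             // this thread's output frequency

    tile[v][u] = image[(by + v) * pitch + (bx + u)];  // stage the block in shared memory
    __syncthreads();

    float au = (u == 0) ? rsqrtf(8.0f) : sqrtf(2.0f / 8.0f);  // DCT normalization factors
    float av = (v == 0) ? rsqrtf(8.0f) : sqrtf(2.0f / 8.0f);

    float sum = 0.0f;
    for (int x = 0; x < 8; ++x)                       // fixed trip counts, so the loops
        for (int y = 0; y < 8; ++y)                   // contain no data-dependent branches
            sum += tile[y][x]
                 * __cosf((2 * x + 1) * u * PI / 16.0f)
                 * __cosf((2 * y + 1) * v * PI / 16.0f);

    coeffs[(by + v) * pitch + (bx + u)] = au * av * sum;
}

// Launch: dim3 threads(8, 8); dim3 blocks(width / 8, height / 8);
// dct8x8<<<blocks, threads>>>(d_image, d_coeffs, width);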

GPUs also like algorithms that have high numeric intensity (the ratio of math operations to memory accesses). Algorithms with low numeric intensity (like adding two vectors) can be massively parallel and SIMD, but still run slower on the GPU than on the CPU because they're memory-bound.
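
To put rough numbers on that (my own back-of-the-envelope figures, not the answer's): the vector add above moves 12 bytes per element for a single floating-point operation, about 0.08 flop/byte, so memory bandwidth is the limit on any processor. Something like the hypothetical kernel below keeps its operand in a register and does dozens of fused multiply-adds per element, so the GPU's arithmetic units actually get used.

// Hypothetical high-intensity kernel: one 4-byte load and one 4-byte store per
// element, with 64 FMAs (128 flops) in registers in between
// (roughly 16 flop/byte, versus ~0.08 flop/byte for the plain vector add).
__global__ void poly_eval(const float* x, float* y, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        float v = x[i];                    // single load from global memory
        float acc = 0.0f;
        #pragma unroll
        for (int k = 0; k < 64; ++k)       // all arithmetic stays in registers
            acc = acc * v + 1.0f;
        y[i] = acc;                        // single store back
    }
}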
