Need help mapping pre-reserved **cacheable** DMA buffer on Xilinx/ARM SoC (Zynq 7000)


Question


I've got a Xilinx Zynq 7000-based board with a peripheral in the FPGA fabric that has DMA capability (on an AXI bus). We've developed a circuit and are running Linux on the ARM cores. We're having performance problems accessing a DMA buffer from user space after it's been filled by hardware.

Summary:


We have pre-reserved at boot time a section of DRAM for use as a large DMA buffer. We're apparently using the wrong APIs to map this buffer, because it appears to be uncached, and the access speed is terrible.


Using it even as a bounce-buffer is untenably slow due to horrible performance. IIUC, ARM caches are not DMA coherent, so I would really appreciate some insight on how to do the following:


  1. Map a region of DRAM into the kernel's virtual address space, but ensure it is cacheable.

  2. Ensure that mapping it into user space doesn't have undesirable side effects, even if that requires us to provide our own mmap call in our driver.

  3. Explicitly invalidate a region of physical memory from the cache hierarchy before doing a DMA, to ensure coherency (see the sketch just after this list).
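
On point 3, my understanding is that the kernel's streaming DMA API is what normally performs that invalidate/clean step. A minimal sketch of the idea, assuming we had a valid struct device *dev for our peripheral and a kernel virtual address for the block (both assumptions here, since getting that struct device is part of my problem):

```c
#include <linux/dma-mapping.h>

/* Hand one block of the buffer to the device, let it DMA into it,
 * then give the block back to the CPU with the caches invalidated. */
static int receive_block(struct device *dev, u8 *block, size_t len)
{
        dma_addr_t bus = dma_map_single(dev, block, len, DMA_FROM_DEVICE);

        if (dma_mapping_error(dev, bus))
                return -EIO;

        /* ... program the FPGA DMA engine with 'bus' and wait ... */

        /* Invalidates the covering cache lines before the CPU reads. */
        dma_unmap_single(dev, bus, len, DMA_FROM_DEVICE);
        return 0;
}
```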

More information:


I've been trying to research this thoroughly before asking. Unfortunately, this being an ARM SoC/FPGA, there's very little information available on this, so I have to ask the experts directly.


Since this is an SoC, a lot of stuff is hard-coded for u-boot. For instance, the kernel and a ramdisk are loaded to specific places in DRAM before handing control over to the kernel. We've taken advantage of this to reserve a 64MB section of DRAM for a DMA buffer (it does need to be that big, which is why we pre-reserve it). There isn't any worry about conflicting memory types or the kernel stomping on this memory, because the boot parameters tell the kernel what region of DRAM it has control over.
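
For reference, the reservation itself is just the usual mem= boot-argument trick; a sketch of what our u-boot setup does (the addresses and sizes here are illustrative, not our exact layout):

```
# 512 MB part: give Linux the first 448 MB and keep the top 64 MB
# (0x1C000000 - 0x1FFFFFFF) out of the kernel's memory map for DMA.
setenv bootargs console=ttyPS0,115200 root=/dev/ram rw mem=448M
```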


Initially, we tried to map this physical address range into kernel space using ioremap, but that appears to mark the region uncacheable, and the access speed is horrible, even if we try to use memcpy to make it a bounce buffer. We also map it into userspace via /dev/mem, and I've timed memcpy at around 70 MB/s.
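
For completeness, the userspace side is roughly the following (BUF_PHYS and BUF_SIZE are placeholders for our reserved region, not real values):

```c
/* memmap_demo.c -- sketch of our current /dev/mem mapping */
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

#define BUF_PHYS 0x1C000000UL   /* placeholder: reserved region base */
#define BUF_SIZE (64UL << 20)   /* placeholder: 64 MB */

int main(void)
{
    int fd = open("/dev/mem", O_RDWR | O_SYNC);
    if (fd < 0) { perror("open"); return 1; }

    uint8_t *buf = mmap(NULL, BUF_SIZE, PROT_READ | PROT_WRITE,
                        MAP_SHARED, fd, BUF_PHYS);
    if (buf == MAP_FAILED) { perror("mmap"); return 1; }

    /* On ARM this mapping comes back uncached (Device memory),
     * which is why plain memcpy from it crawls at ~70 MB/s. */
    printf("mapped %lu MB at %p\n", BUF_SIZE >> 20, (void *)buf);

    munmap(buf, BUF_SIZE);
    close(fd);
    return 0;
}
```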


Based on a fair amount of searching on this topic, it appears that although half the people out there want to use ioremap like this (which is probably where we got the idea from), ioremap is not supposed to be used for this purpose and that there are DMA-related APIs that should be used instead. Unfortunately, it appears that DMA buffer allocation is totally dynamic, and I haven't figured out how to tell it, "here's a physical address already allocated -- use that."
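
The dynamic allocation I mean is along these lines; as far as I can tell, nothing in this call lets you point it at an already-reserved physical region (dev is assumed to be a valid struct device *):

```c
#include <linux/dma-mapping.h>

static void *alloc_example(struct device *dev, dma_addr_t *bus_addr)
{
        /* The API chooses the physical pages itself; there is no
         * parameter meaning "use my pre-reserved region instead". */
        return dma_alloc_coherent(dev, 1UL << 20, bus_addr, GFP_KERNEL);
}
```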


One document I looked at is this one, but it's way too x86 and PC-centric: https://www.kernel.org/doc/Documentation/DMA-API-HOWTO.txt

And this question also comes up at the top of my searches, but there's no real answer: get the physical address of a buffer under Linux (http://stackoverflow.com/questions/17075005/get-the-physical-address-of-a-buffer-under-linux)


Looking at the standard calls, dma_set_mask_and_coherent and family won't take a pre-defined address and want a device structure for PCI. I don't have such a structure, because this is an ARM SoC without PCI. I could manually populate such a structure, but that smells to me like abusing the API, not using it as intended.
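
For what it's worth, the struct device on an SoC would normally come from a platform driver rather than from PCI; a sketch under that assumption (the driver name is made up):

```c
#include <linux/dma-mapping.h>
#include <linux/module.h>
#include <linux/platform_device.h>

static int fpgadma_probe(struct platform_device *pdev)
{
        /* The platform core hands us a real struct device; no PCI involved. */
        return dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(32));
}

static struct platform_driver fpgadma_driver = {
        .probe  = fpgadma_probe,
        .driver = { .name = "fpgadma" },  /* hypothetical name */
};
module_platform_driver(fpgadma_driver);
MODULE_LICENSE("GPL");
```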


BTW: This is a ring buffer, where we DMA data blocks into different offsets, but we align to cache line boundaries, so there is no risk of false sharing.
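
Concretely, that alignment is just rounding each block's starting offset up to the L1 line size (32 bytes on the Cortex-A9):

```c
#define CACHE_LINE 32u  /* Cortex-A9 L1 cache line size */

/* Round a ring-buffer offset up to the next cache-line boundary so
 * no line is shared between a DMA block and CPU-owned data. */
static inline size_t align_offset(size_t off)
{
        return (off + CACHE_LINE - 1) & ~(size_t)(CACHE_LINE - 1);
}
```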


Thank you a million for any help you can provide!


UPDATE: It appears that there's no such thing as a cacheable DMA buffer on ARM if you do it the normal way. Maybe if I don't make the ioremap call, the region won't be marked as uncacheable, but then I have to figure out how to do cache management on ARM, which I can't figure out. One of the problems is that memcpy in userspace appears to really suck. Is there a memcpy implementation that's optimized for uncached memory I can use? Maybe I could write one. I have to figure out if this processor has Neon.
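
On the Neon question: the Zynq-7000's Cortex-A9 cores do include Neon (it shows up as the neon flag in /proc/cpuinfo). A sketch of a Neon copy loop using GCC intrinsics, assuming len is a multiple of 64 and the pointers are suitably aligned; whether this actually beats glibc memcpy on an uncached source would need measuring:

```c
#include <arm_neon.h>
#include <stddef.h>
#include <stdint.h>

/* Copy in 64-byte chunks using 128-bit Neon loads and stores. */
static void neon_copy64(uint8_t *dst, const uint8_t *src, size_t len)
{
        for (size_t i = 0; i < len; i += 64) {
                uint8x16_t a = vld1q_u8(src + i);
                uint8x16_t b = vld1q_u8(src + i + 16);
                uint8x16_t c = vld1q_u8(src + i + 32);
                uint8x16_t d = vld1q_u8(src + i + 48);
                vst1q_u8(dst + i,      a);
                vst1q_u8(dst + i + 16, b);
                vst1q_u8(dst + i + 32, c);
                vst1q_u8(dst + i + 48, d);
        }
}
```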

Answer


Have you tried implementing your own char device with an mmap() method remapping your buffer as cacheable (by means of remap_pfn_range())?
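
A minimal sketch of how such a driver might look, assuming the reserved region's base and size are known (BUF_PHYS, BUF_SIZE, and the device name are placeholders; this illustrates the idea, it is not a tested driver):

```c
#include <linux/fs.h>
#include <linux/miscdevice.h>
#include <linux/mm.h>
#include <linux/module.h>

#define BUF_PHYS 0x1C000000UL   /* placeholder: reserved region base */
#define BUF_SIZE (64UL << 20)   /* placeholder: 64 MB */

static int dmabuf_mmap(struct file *filp, struct vm_area_struct *vma)
{
        unsigned long size = vma->vm_end - vma->vm_start;
        unsigned long pfn  = (BUF_PHYS >> PAGE_SHIFT) + vma->vm_pgoff;

        if (size > BUF_SIZE)
                return -EINVAL;

        /* Deliberately leave vma->vm_page_prot alone (no pgprot_noncached()),
         * so user space gets a normal, cacheable mapping of the buffer. */
        return remap_pfn_range(vma, vma->vm_start, pfn, size,
                               vma->vm_page_prot);
}

static const struct file_operations dmabuf_fops = {
        .owner = THIS_MODULE,
        .mmap  = dmabuf_mmap,
};

static struct miscdevice dmabuf_dev = {
        .minor = MISC_DYNAMIC_MINOR,
        .name  = "dmabuf",      /* hypothetical device name */
        .fops  = &dmabuf_fops,
};

static int __init dmabuf_init(void)
{
        return misc_register(&dmabuf_dev);
}

static void __exit dmabuf_exit(void)
{
        misc_deregister(&dmabuf_dev);
}

module_init(dmabuf_init);
module_exit(dmabuf_exit);
MODULE_LICENSE("GPL");
```

The flip side of a cacheable mapping is that coherency becomes your problem: the cache lines covering a block have to be invalidated (for example via the streaming DMA API sketched earlier) before the CPU reads freshly DMA'd data.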
