mmap slower than ioremap
Question
I am developing for an ARM device running Linux 2.6.37. I am trying to toggle an IO pin as fast as possible. I made a little kernel module and a user space application. I tried two things:
- Manipulate the GPIO control registers directly from kernel space using ioremap.
- mmap() the GPIO control registers without caching and use them from user space.
Both methods work, but the second is about three times slower than the first (observed on an oscilloscope). I think I disabled all caching mechanisms.
Of course, I'd like to get the best of both worlds: the flexibility and ease of development of user space with the speed of kernel space.
Does anybody know why mmap() could be slower than ioremap()?
Here is my code:
static int ti81xx_usmap_mmap(struct file* pFile, struct vm_area_struct* pVma)
{
    pVma->vm_flags |= VM_RESERVED;
    pVma->vm_page_prot = pgprot_noncached(pVma->vm_page_prot);

    if (io_remap_pfn_range(pVma, pVma->vm_start, pVma->vm_pgoff,
                           pVma->vm_end - pVma->vm_start, pVma->vm_page_prot))
        return -EAGAIN;

    pVma->vm_ops = &ti81xx_usmap_vm_ops;
    return 0;
}
static void ti81xx_usmap_test_gpio(void)
{
    u32* pGpIoRegisters = ioremap_nocache(TI81XX_GPIO0_BASE, 0x400);
    const u32 pin = 1 << 24;
    int i;

    /* I should use IO read/write functions instead of pointer dereferencing,
     * but portability isn't the issue here */
    pGpIoRegisters[OMAP4_GPIO_OE >> 2] &= ~pin;   /* Set pin as output */

    for (i = 0; i < 200000000; ++i)
    {
        pGpIoRegisters[OMAP4_GPIO_SETDATAOUT >> 2] = pin;
        pGpIoRegisters[OMAP4_GPIO_CLEARDATAOUT >> 2] = pin;
    }

    pGpIoRegisters[OMAP4_GPIO_OE >> 2] |= pin;    /* Set pin as input */
    iounmap(pGpIoRegisters);
}
User space application code:
int main(int argc, char** argv)
{
    int file, i;
    ulong* pGpIoRegisters = NULL;
    ulong pin = 1 << 24;

    file = open("/dev/ti81xx-usmap", O_RDWR | O_SYNC);
    if (file < 0)
    {
        printf("open failed (%d)\n", errno);
        return 1;
    }

    printf("Toggle from kernel space...");
    fflush(stdout);
    ioctl(file, TI81XX_USMAP_IOCTL_TEST_GPIO);
    printf(" done\n");

    pGpIoRegisters = mmap(NULL, 0x400, PROT_READ | PROT_WRITE, MAP_SHARED, file, TI81XX_GPIO0_BASE);

    printf("Toggle from user space...");
    fflush(stdout);
    pGpIoRegisters[OMAP4_GPIO_OE >> 2] &= ~pin;

    for (i = 0; i < 30000000; ++i)
    {
        pGpIoRegisters[OMAP4_GPIO_SETDATAOUT >> 2] = pin;
        pGpIoRegisters[OMAP4_GPIO_CLEARDATAOUT >> 2] = pin;
    }

    pGpIoRegisters[OMAP4_GPIO_OE >> 2] |= pin;
    printf(" done\n");
    fflush(stdout);

    munmap(pGpIoRegisters, 0x400);
    close(file);
    return 0;
}
Answer
This is because ioremap_nocache() still enables the CPU write buffer in your VM mapping, whereas pgprot_noncached() disables both bufferability and cacheability.
An apples-to-apples comparison would be to use ioremap_strongly_ordered() instead.
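Going the other direction, if the goal is to make the user-space mapping as fast as the ioremap_nocache() kernel mapping, the mmap handler could keep the write buffer enabled by using pgprot_writecombine() instead of pgprot_noncached(). A sketch of that variant, assuming the ARM 2.6.37-era API (verify pgprot_writecombine() behavior on your tree and that buffered writes are acceptable for this device):

```c
/* Hypothetical variant of the mmap handler above: map the registers
 * as write-combining (non-cacheable but bufferable) rather than
 * strongly-ordered, so user-space stores can use the CPU write
 * buffer just as the ioremap_nocache() mapping does. */
static int ti81xx_usmap_mmap_wc(struct file* pFile, struct vm_area_struct* pVma)
{
    pVma->vm_flags |= VM_RESERVED;
    pVma->vm_page_prot = pgprot_writecombine(pVma->vm_page_prot);

    if (io_remap_pfn_range(pVma, pVma->vm_start, pVma->vm_pgoff,
                           pVma->vm_end - pVma->vm_start, pVma->vm_page_prot))
        return -EAGAIN;

    pVma->vm_ops = &ti81xx_usmap_vm_ops;
    return 0;
}
```

Note that with a bufferable mapping, writes may be posted: a store can complete from the CPU's point of view before it reaches the device, which is usually fine for toggling a GPIO but matters when write ordering against other devices is required.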