mmap比ioremap慢 [英] mmap slower than ioremap

查看:158
本文介绍了mmap比ioremap慢的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在为运行Linux 2.6.37的ARM设备进行开发.我正在尝试尽快切换IO引脚.我制作了一个小内核模块和一个用户空间应用程序.我尝试了两件事:

I am developing for an ARM device running Linux 2.6.37. I am trying to toggle an IO pin as fast as possible. I made a little kernel module and a user space application. I tried two things :

  1. 使用ioremap直接从内核空间操纵GPIO控制寄存器.
  2. mmap() GPIO控制寄存器,无需从用户空间缓存和使用它们.
  1. Manipulate the GPIO control registers directly from the kernel space using ioremap.
  2. mmap() the GPIO control registers without caching and using them from user space.

两种方法都可以,但是第二种方法比第一种方法慢3倍(在示波器上观察到).我想我禁用了所有缓存机制.

Both methods work, but the second is about 3 times slower than the first (observed on oscilloscope). I think I disabled all caching mechanisms.

当然,我想获得两个世界中的最好的:灵活性和易于使用的用户空间以及内核空间的开发速度.

Of course I'd like to get the best of the two worlds : flexibility and ease of development from user space with the speed of kernel space.

有人知道为什么mmap()可能比ioremap()慢吗?

Does anybody know why the mmap() could be slower than the ioremap() ?

这是我的代码:

static int ti81xx_usmap_mmap(struct file* pFile, struct vm_area_struct* pVma)
{
  pVma->vm_flags |= VM_RESERVED;
  pVma->vm_page_prot = pgprot_noncached(pVma->vm_page_prot);

  if (io_remap_pfn_range(pVma, pVma->vm_start, pVma->vm_pgoff,
                          pVma->vm_end - pVma->vm_start, pVma->vm_page_prot))
     return -EAGAIN;

  pVma->vm_ops = &ti81xx_usmap_vm_ops;
  return 0;
}

static void ti81xx_usmap_test_gpio(void)
{
  u32* pGpIoRegisters = ioremap_nocache(TI81XX_GPIO0_BASE, 0x400);
  const u32 pin = 1 << 24;
  int i;

  /* I should use IO read/write functions instead of pointer deferencing, 
   * but portability isn't the issue here */

  pGpIoRegisters[OMAP4_GPIO_OE >> 2] &= ~pin;    /* Set pin as output*/

  for (i = 0; i < 200000000; ++i)
  {
     pGpIoRegisters[OMAP4_GPIO_SETDATAOUT >> 2] = pin;
     pGpIoRegisters[OMAP4_GPIO_CLEARDATAOUT >> 2] = pin;
  }

  pGpIoRegisters[OMAP4_GPIO_OE >> 2] |= pin;    /* Set pin as input*/

  iounmap(pGpIoRegisters);
}

用户空间应用代码

int main(int argc, char** argv)
{
   int file, i;
   ulong* pGpIoRegisters = NULL;
   ulong pin = 1 << 24;

   file = open("/dev/ti81xx-usmap", O_RDWR | O_SYNC);

   if (file < 0)
   {
      printf("open failed (%d)\n", errno);
      return 1;
   }


   printf("Toggle from kernel space...");
   fflush(stdout);

   ioctl(file, TI81XX_USMAP_IOCTL_TEST_GPIO);

   printf(" done\n");    

   pGpIoRegisters = mmap(NULL, 0x400, PROT_READ | PROT_WRITE, MAP_SHARED, file, TI81XX_GPIO0_BASE);
   printf("Toggle from user space...");
   fflush(stdout);

   pGpIoRegisters[OMAP4_GPIO_OE >> 2] &= ~pin;

   for (i = 0; i < 30000000; ++i)
   {
      pGpIoRegisters[OMAP4_GPIO_SETDATAOUT >> 2] = pin;
      pGpIoRegisters[OMAP4_GPIO_CLEARDATAOUT >> 2] = pin;
   }

   pGpIoRegisters[OMAP4_GPIO_OE >> 2] |= pin;

   printf(" done\n");
   fflush(stdout);
   munmap(pGpIoRegisters, 0x400);    

   close(file);    
   return 0;
}

推荐答案

这是因为ioremap_nocache()仍在您的VM映射中启用了CPU写缓冲区,而pgprot_noncached()同时禁用了可缓存性和可缓存性.

This is because ioremap_nocache() still enables the CPU write buffer in your VM mapping whereas pgprot_noncached() disables both bufferability and cacheability.

苹果与苹果的比较将是使用ioremap_strongly_ordered()代替.

Apples to apples comparison would be to use ioremap_strongly_ordered() instead.

这篇关于mmap比ioremap慢的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆