What are Ring 0 and Ring 3 in the context of operating systems?


Problem description

    I've been learning the basics of driver development on Windows, and I keep coming across the terms Ring 0 and Ring 3. What do these refer to? Are they the same thing as kernel mode and user mode?

    Solution

    Linux x86 ring usage overview

    Understanding how rings are used in Linux will give you a good idea of what they are designed for.

    In x86 protected mode, the CPU is always in one of 4 rings. The Linux kernel only uses 0 and 3:

    • 0 for kernel
    • 3 for users

    This is the most hard-and-fast definition of kernel vs. userland.

    For why Linux does not use rings 1 and 2, see: "CPU Privilege Rings: Why rings 1 and 2 aren't used?"

    How is the current ring determined?

    The current ring is selected by a combination of:

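    • the global descriptor table (GDT): an in-memory table of entries, each of which has a field Privl that encodes the ring.

      The LGDT instruction sets the address of the current descriptor table.

      See also: http://wiki.osdev.org/Global_Descriptor_Table

    • the segment registers CS, DS, etc., which hold selectors. The upper 13 bits of a selector are the index of a descriptor table entry, bit 2 chooses between the GDT and a local table, and the low 2 bits are a privilege level. For CS, those 2 bits are the current ring.

      For example, CS = 0x08 means that GDT entry 1 is currently active for the executing code, in ring 0. (Entry 0 is the null descriptor and cannot be used for code.)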
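To make the selector layout concrete, here is a minimal sketch in Python that splits a 16-bit selector into its fields. The names `Selector` and `decode_selector` are made up for this illustration, and the two sample values are only examples; a real kernel chooses its own selector values.

```python
from typing import NamedTuple


class Selector(NamedTuple):
    index: int  # which descriptor table entry is selected
    table: str  # "GDT" or "LDT"
    rpl: int    # privilege level bits; for CS this is the current ring


def decode_selector(value: int) -> Selector:
    """Split a 16-bit x86 segment selector into its three fields."""
    if not 0 <= value <= 0xFFFF:
        raise ValueError("a segment selector is 16 bits wide")
    return Selector(
        index=value >> 3,                          # bits 15..3
        table="LDT" if value & 0b100 else "GDT",   # bit 2, the table indicator
        rpl=value & 0b11,                          # bits 1..0
    )


# Sample values for illustration only (hypothetical kernel and user CS).
print(decode_selector(0x10))
print(decode_selector(0x33))
```

With these sample values, the first selector decodes to ring 0 and the second to ring 3, which is the kernel/userland split described above.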

    What can each ring do?

    The CPU chip is physically built so that:

    • ring 0 can do anything

    • ring 3 cannot run several instructions and write to several registers, most notably:

      • cannot change its own ring! Otherwise, it could set itself to ring 0 and rings would be useless.

        In other words, cannot modify the current segment descriptor, which determines the current ring.

      • cannot modify the page tables. See: "How does x86 paging work?"

        In other words, cannot modify the CR3 register, and paging itself prevents modification of the page tables.

        This prevents one process from seeing the memory of other processes for security / ease of programming reasons.

      • cannot register interrupt handlers. Those are configured by writing to memory locations, which is also prevented by paging.

        Handlers run in ring 0, and would break the security model.

        In other words, cannot use the LGDT and LIDT instructions.

      • cannot execute IO instructions such as in and out, and thus cannot make arbitrary hardware accesses.

        Otherwise, for example, file permissions would be useless if any program could directly read from disk.

        More precisely, thanks to Michael Petch: the OS can allow IO instructions in ring 3. This is controlled by the IOPL field of the flags register and by the IO permission bitmap in the Task State Segment.

        What is not possible is for ring 3 to give itself permission to do so if it didn't have it in the first place.

        Linux disallows it by default, and only grants it to privileged processes that ask through the ioperm or iopl system calls. See also: "Why doesn't Linux use the hardware context switch via the TSS?"
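The IO rule in the last bullet can be modelled in a few lines of Python. This is a simplified sketch of the protected-mode check as I understand it (the current ring is compared with the IOPL field of the flags register, and only if that fails is the TSS permission bitmap consulted, where a set bit means "denied"). The function name `io_allowed` and the bitmap contents are assumptions made for the example, not anything a real kernel exposes.

```python
def io_allowed(cpl: int, iopl: int, io_bitmap: bytes, port: int, width: int = 1) -> bool:
    """Model the x86 check performed before an in/out instruction.

    cpl: current ring (0..3).
    iopl: IO privilege level stored in the flags register (0..3).
    io_bitmap: TSS IO permission bitmap, one bit per port, 1 = denied.
    width: number of consecutive ports touched by the access.
    """
    if cpl <= iopl:
        return True  # privileged enough, the bitmap is not consulted
    for p in range(port, port + width):
        byte_index, bit = divmod(p, 8)
        if byte_index >= len(io_bitmap):
            return False  # ports beyond the end of the bitmap are denied
        if (io_bitmap[byte_index] >> bit) & 1:
            return False  # this port is explicitly denied
    return True


# Hypothetical setup: deny every port, then open port 0x80 only.
bitmap = bytearray(b"\xff" * 8192)
bitmap[0x80 // 8] &= ~(1 << (0x80 % 8)) & 0xFF

print(io_allowed(cpl=3, iopl=0, io_bitmap=bytes(bitmap), port=0x80))
print(io_allowed(cpl=3, iopl=0, io_bitmap=bytes(bitmap), port=0x81))
print(io_allowed(cpl=0, iopl=0, io_bitmap=bytes(bitmap), port=0x81))
```

Note that nothing in the model lets ring 3 edit `iopl` or `io_bitmap` itself: both live in state that only ring 0 can change, which is the point made above.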

    How do programs and operating systems transition between rings?

    • when the CPU is turned on, it starts running the initial program in ring 0 (well, kind of, but it is a good approximation). You can think of this initial program as being the kernel (but it is normally a bootloader that then calls the kernel, still in ring 0).

    • when a userland process wants the kernel to do something for it, like write to a file, it uses an instruction that traps into the kernel, such as int 0x80 (a software interrupt) or syscall. x86-64 Linux syscall hello world example:

    .data
    hello_world:
        .ascii "hello world\n"
        hello_world_len = . - hello_world
    .text
    .global _start
    _start:
        /* write */
        mov $1, %rax
        mov $1, %rdi
        mov $hello_world, %rsi
        mov $hello_world_len, %rdx
        syscall
    
        /* exit */
        mov $60, %rax
        mov $0, %rdi
        syscall
    

    compile and run:

    as -o hello_world.o hello_world.S
    ld -o hello_world.out hello_world.o
    ./hello_world.out
    

    GitHub upstream.

    When this happens, the CPU calls an interrupt callback handler which the kernel registered at boot time. Here is a concrete baremetal example that registers a handler and uses it.

    This handler runs in ring 0: it decides whether the kernel will allow the action, performs it, and resumes the userland program in ring 3.

    • when the exec system call is used (or when the kernel starts /init), the kernel prepares the registers and memory of the new userland process, then jumps to the entry point and switches the CPU to ring 3.

    • If the program tries to do something naughty like write to a forbidden register or memory address (because of paging), the CPU also calls some kernel callback handler in ring 0.

      But since the userland was naughty, the kernel might kill the process this time, or give it a warning with a signal.

    • When the kernel boots, it sets up a hardware clock with some fixed frequency, which generates interrupts periodically.

      These interrupts run a ring 0 handler, which lets the kernel schedule which userland processes to wake up.

      This way, scheduling can happen even if the processes are not making any system calls.
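The system call transition described above is not specific to assembly. As a rough illustration, the Python sketch below uses `os.pipe`, `os.write` and `os.read`, which are thin wrappers around the corresponding system calls on Linux, so each call should enter the kernel in ring 0 and return to ring 3. The helper name `roundtrip` is made up for this example.

```python
import os


def roundtrip(message: bytes) -> bytes:
    """Send bytes through a kernel pipe and read them back.

    Userland never touches the pipe buffer directly: every step below
    asks the kernel to do the work through a system call.
    """
    read_fd, write_fd = os.pipe()                # kernel creates the pipe
    try:
        written = os.write(write_fd, message)    # kernel copies data in
        return os.read(read_fd, written)         # kernel copies data out
    finally:
        os.close(read_fd)
        os.close(write_fd)


print(roundtrip(b"hello world\n"))
```

On Linux, running the script under `strace` should show the `pipe`, `write`, `read` and `close` calls as they cross into the kernel.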

    What is the point of having multiple rings?

    There are two major advantages of separating kernel and userland:

    • it is easier to make programs as you are more certain one won't interfere with the other. E.g., one userland process does not have to worry about overwriting the memory of another program because of paging, nor about putting hardware in an invalid state for another process.
    • it is more secure. E.g. file permissions and memory separation could prevent a hacking app from reading your bank data. This supposes, of course, that you trust the kernel.

    How to play around with it?

    I've created a bare metal setup that should be a good way to manipulate rings directly: https://github.com/cirosantilli/x86-bare-metal-examples

    I didn't have the patience to make a userland example unfortunately, but I did go as far as paging setup, so userland should be feasible. I'd love to see a pull request.

    Alternatively, Linux kernel modules run in ring 0, so you can use them to try out privileged operations, e.g. reading the control registers. See: "How to access the control registers cr0,cr2,cr3 from a program? Getting segmentation fault"

    Here is a convenient QEMU + Buildroot setup to try it out without killing your host.

    The downside of kernel modules is that other kthreads are running and could interfere with your experiments. But in theory you can take over all interrupt handlers with your kernel module and own the system; that would actually be an interesting project.

    Negative rings

    While negative rings are not actually referenced in the Intel manual, there are actually CPU modes which have further capabilities than ring 0 itself, and so are a good fit for the "negative ring" name.

    One example is the hypervisor mode used in virtualization.

    For further details see:

    ARM

    In ARM, the rings are called Exception Levels instead, but the main ideas remain the same.

    There exist 4 exception levels in ARMv8, commonly used as:

    • EL0: userland

    • EL1: kernel ("supervisor" in ARM terminology).

      Entered with the svc instruction (SuperVisor Call), previously known as swi before unified assembly, which is the instruction used to make Linux system calls. Hello world ARMv8 example:

      hello.S

      .text
      .global _start
      _start:
          /* write */
          mov x0, 1
          ldr x1, =msg
          ldr x2, =len
          mov x8, 64
          svc 0
      
          /* exit */
          mov x0, 0
          mov x8, 93
          svc 0
      msg:
          .ascii "hello syscall v8\n"
      len = . - msg
      

      GitHub upstream.

      Test it out with QEMU on Ubuntu 16.04 (hello.S is AArch64 code, so it needs the AArch64 toolchain and emulator):

      sudo apt-get install qemu-user gcc-aarch64-linux-gnu
      aarch64-linux-gnu-as -o hello.o hello.S
      aarch64-linux-gnu-ld -o hello hello.o
      qemu-aarch64 hello
      

      Here is a concrete baremetal example that registers an SVC handler and does an SVC call.

    • EL2: hypervisors, for example Xen.

      Entered with the hvc instruction (HyperVisor Call).

      A hypervisor is to an OS, what an OS is to userland.

      For example, Xen allows you to run multiple OSes such as Linux or Windows on the same system at the same time, and it isolates the OSes from one another for security and ease of debug, just like Linux does for userland programs.

      Hypervisors are a key part of today's cloud infrastructure: they allow multiple servers to run on a single piece of hardware, which keeps hardware utilization high and saves a lot of money.

      AWS for example used Xen until 2017 when its move to KVM made the news.

    • EL3: yet another level, used by the secure monitor. TODO example.

      Entered with the smc instruction (Secure Monitor Call).

    The ARMv8 Architecture Reference Manual DDI 0487C.a - Chapter D1 - The AArch64 System Level Programmer's Model - Figure D1-1 illustrates this beautifully:

    The ARM situation changed a bit with the advent of ARMv8.1 Virtualization Host Extensions (VHE). This extension allows the kernel to run in EL2 efficiently:

    VHE was created because in-Linux-kernel virtualization solutions such as KVM have gained ground over Xen (see e.g. AWS' move to KVM mentioned above), because most clients only need Linux VMs, and as you can imagine, being all in a single project, KVM is simpler and potentially more efficient than Xen. So now the host Linux kernel acts as the hypervisor in those cases.

    From the image we can see that when the bit E2H of register HCR_EL2 equals 1, then VHE is enabled, and:

    • the Linux kernel runs in EL2 instead of EL1
    • when HCR_EL2.TGE == 1, we are a regular host userland program. Using sudo can destroy the host as usual.
    • when HCR_EL2.TGE == 0, we are a guest OS (e.g. when you run an Ubuntu OS inside QEMU KVM inside the host Ubuntu). Doing sudo cannot destroy the host unless there's a QEMU/host kernel bug.
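The three cases above can be written as a small decision function. This is only a sketch: the bit positions used for E2H (bit 34) and TGE (bit 27) are what I recall from the Arm Architecture Reference Manual and should be checked against it, and the function name and returned strings are made up for the example.

```python
# Assumed bit positions within HCR_EL2; verify against the Arm ARM.
HCR_EL2_TGE = 1 << 27  # Trap General Exceptions
HCR_EL2_E2H = 1 << 34  # EL2 Host, i.e. VHE enabled


def describe_hcr_el2(value: int) -> str:
    """Classify a raw HCR_EL2 value according to the list above."""
    if not value & HCR_EL2_E2H:
        return "VHE disabled"
    if value & HCR_EL2_TGE:
        return "VHE enabled: host userland is running"
    return "VHE enabled: a guest OS is running"


for raw in (0, HCR_EL2_E2H | HCR_EL2_TGE, HCR_EL2_E2H):
    print(hex(raw), "->", describe_hcr_el2(raw))
```

In a real system only EL2 or EL3 code can read or write this register, so the host kernel is what flips TGE when it switches between host userland and a guest.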

    Note how ARM, maybe due to the benefit of hindsight, has a better naming convention for the privilege levels than x86, without the need for negative levels: 0 is the lowest and 3 the highest. New levels tend to be added at the more privileged end rather than the less privileged one.

    The current EL can be queried with the MRS instruction. See: "What is the current execution mode/exception level, etc?"
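As a small illustration of what that query returns: on AArch64 the MRS instruction reads the CurrentEL system register, and as far as I know the exception level sits in bits [3:2] of the value, with the other bits reserved. The Python sketch below only decodes such a raw value; `current_el` is a name made up here, and the sample values are illustrative. Reading CurrentEL from EL0 itself is not permitted, so in practice this value is obtained by kernel or bare-metal code.

```python
def current_el(register_value: int) -> int:
    """Extract the exception level from a raw CurrentEL register value."""
    return (register_value >> 2) & 0b11  # EL field, bits [3:2]


# Raw values as they would be read at EL1, EL2 and EL3 respectively.
for raw in (0x4, 0x8, 0xC):
    print(hex(raw), "-> EL%d" % current_el(raw))
```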

    ARM does not require all exception levels to be present, so that implementations that don't need a level can leave it out and save chip area. ARMv8 "Exception levels" says:

    An implementation might not include all of the Exception levels. All implementations must include EL0 and EL1. EL2 and EL3 are optional.

    QEMU, for example, defaults to EL1, but EL2 and EL3 can be enabled with command line options. See: "qemu-system-aarch64 entering el1 when emulating a53 power up"

    Code snippets tested on Ubuntu 18.10.
