How does an assembly instruction turn into voltage changes on the CPU?


Problem description


I've been working in C and CPython for the past 3 - 5 years. Consider that my base of knowledge here.

If I were to issue an assembly instruction such as MOV AL, 61h to a processor that supported it, what exactly is inside the processor that interprets this code and dispatches it as voltage signals? How would such a simple instruction likely be carried out?

Assembly even feels like a high level language when I try to think of the multitude of steps contained in MOV AL, 61h or even XOR EAX, EBX.

EDIT: I read a few comments asking why I put this as embedded when the x86-family is not common in embedded systems. Welcome to my own ignorance. Now I figure that if I'm ignorant about this, there are likely others ignorant of it as well.

It was difficult for me to pick a favorite answer considering the effort you all put into your answers, but I felt compelled to make a decision. No hurt feelings, fellas.

I often find that the more I learn about computers the less I realize I actually know. Thank you for opening my mind to microcode and transistor logic!

EDIT #2: Thanks to this thread, I have just comprehended why XOR EAX, EAX is faster than MOV EAX, 0h. :)

Solution

I recently started reading Charles Petzold's book titled Code, which so far covers exactly the kinds of things I assume you are curious about. But I have not gotten all the way through it yet, so thumb through the book first before buying/borrowing.

This is my relatively short answer, not Petzold's... and hopefully in line with what you were curious about.

You have heard of the transistor, I assume. The original way to use a transistor was for things like a transistor radio. It is basically an amplifier: take the tiny little radio signal floating in the air and feed it into the input of the transistor, which opens or closes the flow of current on a circuit next to it. And you wire that circuit with higher power, so you can take a very small signal, amplify it, and feed it into a speaker, for example, and listen to the radio station (there is more to it, isolating the frequency and keeping the transistor balanced, but you get the idea, I hope).

Once the transistor existed, that led to a way to use a transistor as a switch, like a light switch. The radio is like a dimmer light switch: you can turn it anywhere from all the way on to all the way off. A non-dimmer light switch is either all on or all off; there is some magic place in the middle of the switch where it changes over. We use transistors the same way in digital electronics. Take the output of one transistor and feed it into another transistor's input. The output of the first is certainly not a small signal like the radio wave; it forces the second transistor all the way on or all the way off. That leads to the concept of TTL, or transistor-transistor logic. Basically you have one transistor that drives a high voltage, let's call it a 1, and one that sinks to zero voltage, let's call that a 0. And you arrange the inputs with other electronics so that you can create AND gates (if both inputs are a 1 then the output is a 1), OR gates (if either one or the other input is a 1 then the output is a 1), inverters, NAND gates, NOR gates (an OR with an inverter), etc. There used to be a TTL handbook, and you could buy 8-or-so-pin chips that had one or two or four of some kind of gate (NAND, NOR, AND, etc.) inside, with two inputs and an output for each. Now we don't need those; it is cheaper to create programmable logic or dedicated chips with many millions of transistors. But we still think in terms of AND, OR, and NOT gates for hardware design (usually more like NAND and NOR).
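
To make the "everything from NAND" point concrete, here is a minimal C sketch (my own illustration, not anything from a TTL datasheet) that builds NOT, AND, and OR purely out of a NAND function:

    /* A sketch of gate composition: NAND alone is enough to build the rest. */
    #include <stdio.h>

    /* NAND: output is 0 only when both inputs are 1. */
    static int nand(int a, int b) { return !(a && b); }

    /* Every other gate composed from NAND alone. */
    static int not_(int a)        { return nand(a, a); }          /* NOT  */
    static int and_(int a, int b) { return not_(nand(a, b)); }    /* AND  */
    static int or_(int a, int b)  { return nand(not_(a), not_(b)); } /* OR */

    int main(void) {
        for (int a = 0; a <= 1; a++)
            for (int b = 0; b <= 1; b++)
                printf("a=%d b=%d  NAND=%d AND=%d OR=%d NOT a=%d\n",
                       a, b, nand(a, b), and_(a, b), or_(a, b), not_(a));
        return 0;
    }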

I don't know what they teach now, but the concept is the same: for memory, a flip-flop can be thought of as two of these TTL pairs (NANDs) tied together, with the output of one going to the input of the other. Let's leave it at that. That is basically a single bit in what we call SRAM, or static RAM. SRAM takes basically four to six transistors per bit. DRAM, or dynamic RAM, the memory sticks you put in your computer yourself, takes one transistor per bit, so for starters you can see why DRAM is the thing you buy gigabytes' worth of. SRAM bits remember what you set them to so long as the power doesn't go out. DRAM starts to forget what you told it as soon as you tell it; basically DRAM uses the transistor in yet a third way, with some capacitance (as in capacitor, won't get into that here) that is like a tiny rechargeable battery: as soon as you charge it and unplug the charger, it starts to drain.

Think of a row of glasses on a shelf with little holes in each glass; these are your DRAM bits. You want some of them to be ones, so you have an assistant fill up the glasses you want to be a one. That assistant has to constantly refill the pitcher and go down the row, keeping the "one" bit glasses full enough with water and letting the "zero" bit glasses remain empty. At any time you want to see what your data is, you can look over and read the ones and zeros by treating a water level definitely above the middle as a one and a level definitely below the middle as a zero. So even with the power on, if the assistant is not able to keep the glasses full enough to tell a one from a zero, they will eventually all look like zeros and drain out. It's the trade-off for more bits per chip.

So the short story here is that outside the processor we use DRAM for our bulk memory, and there is assistant logic that takes care of keeping the ones a one and the zeros a zero. But inside the chip, the AX register and DS register, for example, keep your data using flip-flops or SRAM. And for every bit you know about, like the bits in the AX register, there are likely hundreds or thousands more that are used to get the bits into and out of that AX register.
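
Here is a small C sketch of that cross-coupled NAND idea (my simplified model of a classic SR latch, with active-low set/reset inputs; the loop just lets the two gates settle, the way the real circuit does electrically):

    /* Two NAND gates, each one's output feeding the other's input:
     * the heart of a static memory bit. */
    #include <stdio.h>

    static int nand(int a, int b) { return !(a && b); }

    /* q and q_n are the cross-coupled outputs; iterate until they settle. */
    static void sr_latch(int set_n, int reset_n, int *q, int *q_n) {
        for (int i = 0; i < 4; i++) {          /* a few passes to stabilize */
            *q   = nand(set_n,   *q_n);
            *q_n = nand(reset_n, *q);
        }
    }

    int main(void) {
        int q = 0, q_n = 1;
        sr_latch(0, 1, &q, &q_n); printf("after set:   q=%d\n", q); /* q=1 */
        sr_latch(1, 1, &q, &q_n); printf("holding:     q=%d\n", q); /* still 1 */
        sr_latch(1, 0, &q, &q_n); printf("after reset: q=%d\n", q); /* q=0 */
        return 0;
    }

With both inputs held at 1, the latch simply remembers its last value, which is exactly the "remembers as long as the power stays on" property described above.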

You know that processors run at some clock speed, these days around 2 gigahertz, or two billion clocks per second. Think of the clock, which is generated by a crystal (another topic), but the logic sees that clock as a voltage that goes high and low, high and low, at this clock rate, 2 GHz or whatever (Game Boy Advances are 17 MHz, old iPods around 75 MHz, the original IBM PC 4.77 MHz).

So transistors used as switches allow us to take voltage and turn it into the ones and zeros we are familiar with both as hardware engineers and software engineers, and go so far as to give us AND, OR, and NOT logic functions. And we have these magic crystals that allow us to get an accurate oscillation of voltage.

So we can now do things like: if the clock is a one, and my state variable says I am in the fetch-instruction state, then I need to switch some gates so that the address of the instruction I want, which is in the program counter, goes out on the memory bus, so that the memory logic can give me my instruction, MOV AL, 61h. You can look this up in an x86 manual and find that some of those opcode bits say this is a mov operation, the target is the lower 8 bits of the EAX register, and the source of the mov is an immediate value, which means it is in the memory location after this instruction. So we need to save that instruction/opcode somewhere and fetch the next memory location on the next clock cycle. Now we have saved the mov al, immediate, we have the value 61h read from memory, and we can switch some transistor logic so that bit 0 of that 61h is stored in the bit-0 flip-flop of AL, bit 1 into bit 1, and so on.
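
As a toy model of that fetch/decode/execute sequence, here is a C sketch (the interpreter structure is my own simplification, though the encoding is real: on x86, MOV AL, 61h assembles to the two bytes B0 61):

    /* A toy fetch/decode/execute loop for exactly one instruction. */
    #include <stdint.h>
    #include <stdio.h>

    int main(void) {
        uint8_t memory[] = { 0xB0, 0x61 };   /* MOV AL, 61h */
        uint32_t eax = 0;                    /* AL is the low 8 bits of EAX */
        uint32_t pc  = 0;                    /* program counter */

        uint8_t opcode = memory[pc++];       /* fetch: address goes out on the bus */
        if (opcode == 0xB0) {                /* decode: MOV AL, imm8 */
            uint8_t imm = memory[pc++];      /* fetch the immediate next cycle */
            eax = (eax & 0xFFFFFF00u) | imm; /* execute: latch each bit into AL */
        }
        printf("EAX = 0x%08X\n", (unsigned)eax);  /* EAX = 0x00000061 */
        return 0;
    }

In real hardware there is no loop of software, of course; the "if opcode == 0xB0" is a pile of gates comparing wires, and the masking line is a row of flip-flop inputs being switched.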

How does all that happen, you ask? Think about a Python function performing some math formula. You start at the top of the program with some inputs to the formula that come in as variables, you have individual steps through the program that might add a constant here or call the square root function from a library, and at the bottom you return the answer. Hardware logic is done the same way, and today programming languages are used for it, one of which looks a lot like C. The main difference is that your hardware functions might have hundreds or thousands of inputs while the output is a single bit. On every clock cycle, bit 0 of the AL register is being computed with a huge algorithm, depending on how far out you want to look. Think about that square root function you called for your math operation: that function itself is one of these, some inputs produce an output, and it may call other functions, maybe a multiply or divide. So you likely have a bit somewhere that you can think of as the last step before bit 0 of the AL register, and its function is: if clock is one then AL[0] = AL_next[0]; else AL[0] = AL[0]. But there is a higher function that contains that next AL bit computed from other inputs, and a higher function, and a higher function, and much of this is created by the compiler, in the same way that your three lines of Python can turn into hundreds or thousands of lines of assembler. A few lines of HDL can become hundreds or thousands or more transistors. Hardware folks don't normally look at the lowest-level formula for a particular bit to find out all the possible inputs and all the possible ANDs and ORs and NOTs it takes to compute it, any more than you probably inspect the assembler generated by your programs. But you could if you wanted to.
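
That last-step function can be written out directly; here is a tiny C sketch of it (names like al_next and the load enable are my own labels for whatever the surrounding decode logic computes):

    /* One simulated clock cycle for a single register bit. */
    #include <stdio.h>

    static int clock_bit(int clock, int load, int al_bit, int al_next_bit) {
        if (clock && load)
            return al_next_bit;  /* latch the freshly computed value */
        return al_bit;           /* otherwise hold: AL[0] = AL[0] */
    }

    int main(void) {
        int al0 = 0;
        /* Suppose the decode logic computed al_next[0] = 1 (bit 0 of 61h). */
        al0 = clock_bit(1, 1, al0, 0x61 & 1);
        printf("AL[0] = %d\n", al0);  /* prints 1 */
        return 0;
    }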

A note on microcoding: most processors do not use microcoding. You get into it with the x86, for example, because it was a fine instruction set for its day, but on the surface it struggles to keep up with modern times. Other instruction sets do not need microcoding and use logic directly in the way I described above. You can think of microcoding as a different processor, using a different instruction set/assembly language, that is emulating the instruction set you see on the surface. Not as complicated as when you try to emulate Windows on a Mac or Linux on Windows, etc. The microcoding layer is designed specifically for the job. You may think of there being only the four registers AX, BX, CX, DX, but there are many more inside, and naturally that one assembly program can somehow get executed on multiple execution paths in one core or multiple cores.

Just like the processor in your alarm clock or washing machine, the microcode program is simple and small, debugged and burned into the hardware, hopefully never needing a firmware update. At least ideally. But like your iPod or phone, for example, you sometimes do want a bug fix or whatever, and there is a way to upgrade your processor (the BIOS or other software loads a patch on boot). Say you open the battery compartment of your TV remote control or calculator: you might see a hole where some bare metal contacts sit in a row, maybe three or five or more. For some remotes and calculators, if you really wanted to, you could reprogram them, update the firmware. Normally not, though; ideally that remote is perfect, or perfect enough to outlive the TV set. Microcoding provides the ability to get a very complicated product (millions, hundreds of millions of transistors) on the market and fix the big and fixable bugs in the field down the road. Imagine a 200-million-line Python program your team wrote in, say, 18 months, and having to deliver it or the company will lose to the competition's product. Same kind of thing, except only a small portion of that code can be updated in the field; the rest has to remain carved in stone. For the alarm clock or toaster, if there is a bug or the thing needs help, you just throw it out and get another.
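
To make that "processor within a processor" idea concrete, here is a purely hypothetical C sketch; the micro-op names and the routine are invented for illustration and do not correspond to any real x86 microcode:

    /* Hypothetical microcode: the visible instruction selects a small
     * internal program of simpler micro-ops that the hardware replays. */
    #include <stdio.h>

    typedef enum { UOP_FETCH_IMM8, UOP_WRITE_AL, UOP_DONE } micro_op;

    /* Invented microcode routine for opcode 0xB0 (MOV AL, imm8). */
    static const micro_op mov_al_imm8[] = { UOP_FETCH_IMM8, UOP_WRITE_AL, UOP_DONE };

    int main(void) {
        unsigned char imm = 0, al = 0;
        const unsigned char code[] = { 0x61 };   /* the byte after the B0 opcode */

        for (const micro_op *u = mov_al_imm8; ; u++) {
            switch (*u) {
            case UOP_FETCH_IMM8: imm = code[0]; break; /* read byte after opcode */
            case UOP_WRITE_AL:   al  = imm;     break; /* route it into the AL latches */
            case UOP_DONE:       printf("AL = 0x%02X\n", al); return 0;
            }
        }
    }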

If you dig through Wikipedia or just google stuff, you can look at the instruction sets and machine language for things like the 6502, Z80, 8080, and other processors. There may be 8 registers and 250 instructions, and you can get a feel, from the number of transistors, that those 250 assembly instructions still make up a very high-level language compared to the sequence of logic gates it takes to compute each bit in a flip-flop per clock cycle. You are correct in that assumption. Except for the microcoded processors, this low-level logic is not re-programmable in any way; you have to fix the hardware bugs with software (for hardware that is or is going to be delivered and not scrapped).

Look up that Petzold book; he does an excellent job of explaining stuff, far superior to anything I could ever write.
