大会:创建链接节点组成的数组 [英] Assembly: Creating an array of linked nodes

查看:122
本文介绍了大会:创建链接节点组成的数组的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

首先,如果这个问题是不恰当的,因为我不提供任何code或没有做我自己的想法,我很抱歉,我会删除这个问题。

Firstly, if this question is inappropriate because I am not providing any code or not doing any thinking on my own, I apologize, and I will delete this question.

有关的分配,我们需要创建节点的数组来模拟一个链表。每个节点具有一个整数值和一个指向列表中的下一个节点。这里是我的 .DATA 部分

For an assignment, we are required to create an array of nodes to simulate a linked list. Each node has an integer value and a pointer to the next node in the list. Here is my .DATA section

.DATA
linked_list DWORD 5 DUP (?) ;We are allowed to assume the linked list will have 5 items
linked_node STRUCT
    value BYTE  ?
    next BYTE ?
linked_node ENDS

我不确定如果我定义我的 STRUCT 正确的,因为我不能确定的是什么接下来的类型应该。另外,我很困惑,如何处理这个问题。要插入一个节点到 linked_list 我应该能够编写 MOV [ESI +型linked_list * ECX] ,正确吗?当然,我需要 INC ECX 每次。我感到困惑的是怎么做的 MOV linked_node.next,指针指向下一个节点有一些运营商,让我设置的指针下一个索引数组中等于 linked_node.next ?还是我想这个错误?任何帮助将是AP preciated!

I am unsure if I am defining my STRUCT correctly, as I am unsure of what the type of next should be. Also, I am confused as how to approach this problem. To insert a node into linked_list I should be able to write mov [esi+TYPE linked_list*ecx], correct? Of course, I'd need to inc ecx every time. What I'm confused about is how to do mov linked_node.next, "pointer to next node" Is there some sort of operator that would allow me to set the pointer to the next index in the array equal to a linked_node.next ? Or am I thinking about this incorrectly? Any help would be appreciated!

推荐答案

想想在语言上你的设计,你的的熟悉。 preferably C,因为指针和值C是直接映射到ASM的概念。

Think about your design in terms of a language you are familiar with. Preferably C, because pointers and values in C are concepts that map directly to asm.

假设你想通过存储指针head元素来跟踪你的链表。

Let's say you want to keep track of your linked list by storing a pointer to the head element.

#include <stdint.h>  // for int8_t

struct node {
    int8_t next;  // array index.  More commonly, you'd use  struct node *next;
                  // negative values for .next are a sentinel, like a NULL pointer, marking the end of the list
    int8_t val;
};

struct node storage[5];  // .next field indexes into this array
uint8_t free_position = 0;  // when you need a new node, take index = free_position++;

int8_t head = -1;  // start with an empty list

有技巧,以减少边角的情况下,就像表头是一个完整的节点,而不是仅仅是一个参考(指针或索引)。你可以把它当作第一要素,而不必检查空列表的案例比比皆是。

There are tricks to reduce corner cases, like having the list head be a full node, rather than just a reference (pointer or index). You can treat it as a first element, instead of having to check for the empty-list case everywhere.

总之,给定一个节点引用中int8_t P (其中p是一个指针链表节点,在链表code标准的变量名)时,下一个节点是存储[p.next] 。下一个节点的 VAL 存储[p.next] .VAL

Anyway, given a node reference int8_t p (where p is the standard variable name for a pointer to a list node, in linked list code), the next node is storage[p.next]. The next node's val is storage[p.next].val.

让我们看看这个看起来像ASM。该 NASM手动谈判有关它是如何的宏系统可以帮助您使用全局做出code结构更具可读性,但我没有做任何东西,宏观此。您可以定义宏 VAL 什么的,用0和1,所以可以说 [存储+ RDX * 2 + NEXT] 。甚至可以说,需要一个参数,所以你可以说宏 [NEXT(RDX * 2)] 。如果你不小心,你可以用code结束那的更多的困惑阅读,虽然。

Let's see what this looks like in asm. The NASM manual talks about how it's macro system can help you make code using global structs more readable, but I haven't done any macro stuff for this. You might define macros for NEXT and VAL or something, with 0 and 1, so you can say [storage + rdx*2 + NEXT]. Or even a macro that takes an argument, so you could say [NEXT(rdx*2)]. If you're not careful, you could end up with code that's more confusing to read, though.

section .bss
storage: resw 5   ;; reserve 5 words of zero-initialized space
free_position: db 0   ;; uint8_t free_position = 0;


section .data
head: db -1       ;; int8_t head = -1;

section .text


; p is stored in rdx.  It's an integer index into storage
;  We'll access  storage  directly, without loading it into a register.
;  (normally you'd have it in a reg, since it would be space you got from malloc/realloc)

     ; lea rsi, [rel storage]   ;; If you want RIP-relative addressing. 
     ;; There is no [RIP+offset + scale*index] addressing mode, because global arrays are for tiny / toy programs.

    test    edx, edx
    js  .err_empty_list       ;; check for p=empty list (sign-bit means negative)

    movsx   eax, byte [storage + 2*rdx]   ;; load p.next into eax, with sign-extension
    test    eax, eax
    js  .err_empty_list       ;; check that there is a next element

    movsx   eax, byte [storage + 2*rax + 1]  ;;  load storage[p.next].val, sign extended into eax
        ;; The final +1 in the effective address is because the val byte is 2nd.
        ;; you could have used a 3rd register if you wanted to keep p.next around for future use

    ret  ;; or not, if this is just the middle of some larger function


.err_empty_list:   ; .symbol is a local symbol, doesn't have to be unique for the whole file
    ud2   ; TODO: report an error instead of running an invalid insns

请注意,我们将通过一个32位章,而不是完整的64位RAX符号扩展逃脱较短的指令编码。如果该值是负的,我们不打算使用 RAX 作为地址的一部分。我们只是使用 MOVSX 作为一种零出寄存器的休息,因为 MOV人,[存储+ 2 * RDX] 将离开 RAX 的高56位与旧的内容。

Notice that we get away with shorter instruction encoding by sign-extending into a 32bit reg, not to the full 64bit rax. If the value is negative, we aren't going to use rax as part of an address. We're just using movsx as a way to zero-out the rest of the register, because mov al, [storage + 2*rdx] would leave the upper 56 bits of rax with the old contents.

另一种方式做,这将是 MOVZX EAX,字节[...] /测试人,人,由于8位测试是一样快EN code和执行作为32位测试指令。此外, MOVZX 作为负载比 MOVSX 一个周期的延迟更低,对AMD推土机系列CPU(虽然它们都仍然需要一个整数执行单元,不像英特尔其中 MOVSX / ZX 是由负载端口完全处理)。

Another way to do this would be to movzx eax, byte [...] / test al, al, because the 8-bit test is just as fast to encode and execute as a 32bit test instruction. Also, movzx as a load has one cycle lower latency than movsx, on AMD Bulldozer-family CPUs (although they both still take an integer execution unit, unlike Intel where movsx/zx is handled entirely by a load port).

无论哪种方式, MOVSX MOVZX 是装载的8位数据,因为你避免问题的​​好办法与阅读全章写的部分章后,和/或假的依赖(在章的高位的previous内容,即使的的知道你已经归零了, CPU硬件还是要跟踪它)。但如果你知道你不优化英特尔pre-Haswell的,那么你就不必担心局部寄存器写入。 Haswell的做双簿记或东西,以避免额外的微指令阅读时用旧的全部价值合并部分值。 AMD的CPU,P4和Silvermont不分开全章跟踪局部暂存器,所以你不用担心是假的依赖。

Either way, movsx or movzx is a good way to load 8-bit data, because you avoid problems with reading the full reg after writing a partial reg, and/or a false-dependency (on the previous contents of the upper bits of the reg, even if you know you already zeroed it, the CPU hardware still has to track it). Except if you know you're not optimizing for Intel pre-Haswell, then you don't have to worry about partial-register writes. Haswell does dual-bookkeeping or something to avoid extra uops to merge the partial value with the old full value when reading. AMD CPUs, P4, and Silvermont don't track partial-regs separately from the full-reg, so all you have to worry about is the false dependency.

另外请注意,您可以加载接下来 VAL 挤在一起,像

Also note that you can load the next and val packed together, like

.search_loop:
    movzx    eax,  word [storage + rdx*2]   ; next in al,  val in ah
    test     ah, ah
    jz   .found_a_zero_val
    movzx    edx, al        ; use .next for the next iteration
    test     al, al
    jns   .search_loop

    ;; if we get here, we didn't find a zero val
    ret

.found_a_zero_val:
    ;; do something with the element referred to by `rdx`

请注意我们如何使用 MOVZX 无论如何,因为在一个有效的地址,所有的寄存器必须是相同的大小。 (所以字[存储+人* 2] 不起作用。)

Notice how we have to use movzx anyway, because all the registers in an effective address have to be the same size. (So word [storage + al*2] doesn't work.)

这可能是比较有用走另一条路,一个节点的两个字段存储与单店,如 MOV [存储+ RDX * 2],AX 或东西,让后接下来 VAL 进入,可能是从不同的来源。 (这是你可能希望使用一个普通字节的负载,而不是MOVZX,如果你不已经有它在另一个寄存器的情况)。这不是什么大不了的事:不要让你的code难读或更复杂,只是为了避免做两字节商店。至少,直到你发现存储端口微指令在某些循环中的瓶颈。

This is probably more useful going the other way, to store both fields of a node with a single store, like mov [storage + rdx*2], ax or something, after getting next into al, and val into ah, probably from separate sources. (This is a case where you might want to use a regular byte load, instead of a movzx, if you don't already have it in another register). This isn't a big deal: don't make your code hard to read or more complex just to avoid doing two byte-stores. At least, not until you find out that store-port uops are the bottleneck in some loop.

使用索引到一个数组,而不是一个指针,可以节省大量的空间,尤其。在哪里指针采取8个字节的64位系统。如果你没有需要释放单个节点(即数据结构,只有永远的增长,或一次全部,当它被删除删除),则新节点的分配很简单:只要保持在数组粘底他们,和的realloc(3)。或者使用C ++ 的std ::矢量

Using an index into an array, instead of a pointer, can save a lot of space, esp. on 64bit systems where pointers take 8 bytes. If you don't need to free individual nodes (i.e. data structure only ever grows, or is deleted all at once when it is deleted), then an allocator for new nodes is trivial: just keep sticking them at the end of the array, and realloc(3). Or use a c++ std::vector.

使用这些积木,你应该是所有设置来实现通常的链表交易算法。只是存储字节与 MOV [存储+ RDX * 2],人或什么的。

With those building blocks, you should be all set to implement the usual linked list algos. Just store bytes with mov [storage + rdx*2], al or whatever.

如果您需要关于如何实现与处理所有的特殊情况下,用尽可能少的分支尽可能干净交易算法链表的想法,看一看的这codereview问题。它是Java,但是我的回答是非常C风格。其他答案有一些很好的技巧,也有些我借了我的答案。 (例如,使用虚节点避免分支来处理插入作为一种新的头特殊情况)。

If you need ideas on how to implement linked lists with clean algos that handle all the special-cases with as few branches as possible, have a look at this Codereview question. It's for Java, but my answer is very C-style. The other answers have some nice tricks, too, some of which I borrowed for my answer. (e.g. using a dummy node avoids branching to handle the insertion-as-a-new-head special case).

这篇关于大会:创建链接节点组成的数组的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆