如何使用内核哈希表API? [英] How to use the kernel hashtable API?

查看:287
本文介绍了如何使用内核哈希表API?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在尝试理解和使用内核哈希表格,我已经阅读了链接,但我听不懂.我的第一个问题是:为什么我们的结构必须在其中包含struct h_list?如果我们要通过struct h_list访问我们的结构,我们的结构不应该在struct h_list之内,而不是相反吗?阅读本教程后,我尝试编写以下代码:

I'm trying to understand and use the kernel hash tables and I've already read this and this link, but I didn't understand none of them. My first question is: Why our struct has to have the struct h_list inside it? If we're gonna access our struct through the struct h_list our struct shouldn't be within struct h_list and not the opposite? After reading the tutorial I've tried to write the following code:

DECLARE_HASHTABLE(nodes_hash, 3)

hash_init(nodes_hash);

struct h_node{
    int data;
    char name[MAX_NAME_SIZE]; /*The key is the name of lua state*/
    struct hlist_node node;
};

struct h_node a = {
    .data = 3,
    .name = "foo",
    .node = 0   
};

struct h_node b = {
    .data = 7,
    .name = "bar",
    .node = 0   
};    

hash_add(nodes_hash,&a.node, "foo");
hash_add(nodes_hash,&b.node, "bar");

但是,这甚至无法编译.我做错了什么?我需要密钥与struct h_node中存在的名称相同.所以我希望我的哈希表是这样的:

But this does not even compile. What I'm doing wrong? I need that the key be the same name present in the struct h_node. So I would like that my hash table be like this:

PS:在我的哈希表中,它永远不会发生冲突(我将其处理为永远不会发生),因此键可以是struct h_node

PS: In my hash table it will never occur a collision (I'll handle it to never occur) so the key can be the name in the struct h_node

推荐答案

为什么我们的结构必须在其中包含struct h_list?如果我们要通过struct h_list访问我们的结构,那么我们的结构不应该在struct h_list之内,而不是相反的吗?

Why our struct has to have the struct h_list inside it? If we're gonna access our struct through the struct h_list our struct shouldn't be within struct h_list and not the opposite?

这是因为在Linux内核中如何实现哈希表.哈希表只是一个固定大小为struct hlist_head的数组.这些每个代表一个存储桶,并且是链表的头.哈希表仅包含一堆struct hlist_node链接列表,仅此而已.它并没有真正地存储"或存储.整个用户定义的结构,它仅包含指向每个元素struct hlist_node字段的指针.

That's because how hash tables are implemented in the Linux kernel. An hashtable is just an array of fixed size of struct hlist_head. Each of those represents a bucket, and is the head of a linked list. The hashtable only contains a bunch of linked lists of struct hlist_node, nothing else. It does not really "store" the entire user defined struct, it merely holds a pointer to the struct hlist_node field of each element.

将元素添加到哈希表时,将选择一个存储桶,并将指向结构的struct hlist_node字段的指针插入到存储桶列表中.以后当您检索一个元素时(例如通过hash_for_each()),container_of()宏用于取回您的真实结构,知道其类型以及用户定义的结构内类型为struct hlist_node的结构成员的名称.

When you add an element to the hashtable, a bucket is selected, and a pointer to the struct hlist_node field of your struct is inserted in the bucket list. When you later retrieve an element (for example through hash_for_each()), the container_of() macro is used to get your real structure back, knowing its type and the name of the struct member of type struct hlist_node inside your user defined structure.

可以在源代码后面看到.例如,对于hash_for_each(),我们有:

This can be seen following the source code. For example, for hash_for_each() we have:

  1. hash_for_each(name, bkt, obj, member) 确实:

 for ((bkt) = 0, obj = NULL; obj == NULL && (bkt) < HASH_SIZE(name);\
                 (bkt)++)\
         hlist_for_each_entry(obj, &name[bkt], member)

  • hlist_for_each_entry() 确实:

  • hlist_for_each_entry() does:

     for (pos = hlist_entry_safe((head)->first, typeof(*(pos)), member);\
          pos;                           \
          pos = hlist_entry_safe((pos)->member.next, typeof(*(pos)), member))
    

  • hlist_entry_safe() 确实:

  • hlist_entry_safe() does:

     ({ typeof(ptr) ____ptr = (ptr); \
        ____ptr ? hlist_entry(____ptr, type, member) : NULL; \
     })
    

  • 最后是 使用container_of()来获取真实的结构:

  • And finally hlist_entry() uses container_of() to get the real struct:

     #define hlist_entry(ptr, type, member) container_of(ptr,type,member)
    

  • 我需要密钥与struct h_node中存在的名称相同.

    这在本地是不可能的. Linux内核哈希表API仅处理整数键.如果您看一下 linux/hashtable.h中的实现,您可以看到使用的哈希函数是 hash_64() ,它们都采用无符号整数值(分别为u32u64).

    This is not possible natively. The Linux kernel hashtable API only deals with integer keys. If you take a look at the implementation in linux/hashtable.h, you can see the hash functions used are hash_32() and hash_64(), and they both take unsigned integer values (u32 and u64 respectively).

    Linux内核哈希表API非常有限,并且它肯定不会实现与其他编程语言所使用的哈希表相同的哈希表.它不能使用字符串作为键,并且具有固定的大小.

    The Linux kernel hashtable API is very limited and it definitely does not implement the same kind of hashtable that you are used to in other programming languages. It cannot use strings as keys, and it has a fixed size.

    如果要使用字符串,则必须对这些字符串进行哈希处理以生成无符号整数.为此,您可以使用 xxhash() 或编写您自己的函数. xxhash()函数是一个相对较新的函数,似乎尚未在内核代码中使用,因此,我认为您的内核很可能没有它就已配置,并且您没有可用的内核.

    If you want to use strings, you will have to hash those strings to generate an unsigned integer. To do so you can either use xxhash() or write your own function. The xxhash() function is relatively new and does not yet seem to be used in kernel code, so I think your kernel was most likely configured without it and you don't have it available.

    无论如何,请注意,如果哈希函数将不同的字符串转换为相同的键,或者hash_add()最终在哈希表数组中选择了相同的索引要插入元素,则将两个元素放置在同一哈希表存储桶中.因此,在检索任何元素(例如使用hash_for_each_possible())时,您需要将此因素考虑在内并正确检查其name.

    In any case, beware that if the hash function transforms different strings into the same key, or if hash_add() ends up choosing the same index in the hashtable array to insert the elements, then the two elements will be placed in the same hashtable bucket. Therefore, when retrieving any element (using for example hash_for_each_possible()) you need to take this into consideration and correctly check its name.

    这是一个完整的工作示例,用于演示内核哈希表的基本用法,已在内核v4.9上进行了测试,但也应在最新的v5.7上工作.请注意,在此示例中,为简单起见,我在模块_init函数的堆栈上分配了变量.这意味着除了该函数内部的代码外,我无法在代码的其他任何地方执行hash_for_each_possible().如果您想要一个全局哈希表,该哈希表能够保存以后由不同函数访问的元素,则需要使用kmalloc()动态分配它们.

    Here's a complete working example to demonstrate basic usage of kernel hashtables, tested on kernel v4.9, but should work on the latest v5.7 too. Note that in this example I am allocating the variables on the stack of the module _init function just for simplicity. This means that I cannot do hash_for_each_possible() from anywhere else in the code except from inside that function. If you want a global hashtable capable of holding elements that are later accessed by different functions, you will need to allocate them dynamically using kmalloc().

    // SPDX-License-Identifier: GPL-3.0
    #include <linux/hashtable.h> // hashtable API
    #include <linux/module.h>    // module_{init,exit}, MODULE_*
    #include <linux/string.h>    // strcpy, strcmp
    #include <linux/types.h>     // u32 etc.
    
    #define MAX 32
    
    struct h_node {
        int data;
        char name[MAX];
        struct hlist_node node;
    };
    
    DECLARE_HASHTABLE(tbl, 3);
    
    // Just to demonstrate the behavior when two keys are equal.
    static u32 myhash(const char *s) {
        u32 key = 0;
        char c;
    
        while ((c = *s++))
            key += c;
    
        return key;
    }
    
    static int __init myhashtable_init(void)
    {
        struct h_node a, b, *cur;
        u32 key_a, key_b;
        unsigned bkt;
    
        pr_info("myhashtable: module loaded\n");
    
        a.data = 3;
        strcpy(a.name, "foo");
    
        b.data = 7;
        strcpy(b.name, "oof");
    
        /* Calculate key for each element.
         * Since the above hash function only sums the characters, we will
         * end up having two identical keys here.
         */
        key_a = myhash(a.name);
        key_b = myhash(b.name);
    
        pr_info("myhashtable: key_a = %u, key_b = %u\n", key_a, key_b);
    
        // Initialize the hashtable.
        hash_init(tbl);
    
        // Insert the elements.
        hash_add(tbl, &a.node, key_a);
        hash_add(tbl, &b.node, key_b);
    
        // List all elements in the table.
        hash_for_each(tbl, bkt, cur, node) {
            pr_info("myhashtable: element: data = %d, name = %s\n",
                cur->data, cur->name);
        }
    
        // Get the element with name = "foo".
        hash_for_each_possible(tbl, cur, node, key_a) {
            pr_info("myhashtable: match for key %u: data = %d, name = %s\n",
                key_a, cur->data, cur->name);
    
            // Check the name.
            if (!strcmp(cur->name, "foo")) {
                pr_info("myhashtable: element named \"foo\" found!\n");
                break;
            }
        }
    
        // Remove elements.
        hash_del(&a.node);
        hash_del(&b.node);
    
        return 0;
    }
    
    static void __exit myhashtable_exit(void)
    {
        // Do nothing (needed to be able to unload the module).
        pr_info("myhashtable: module unloaded\n");
    }
    
    
    module_init(myhashtable_init);
    module_exit(myhashtable_exit);
    MODULE_VERSION("0.1");
    MODULE_DESCRIPTION("Silly kernel hashtable API example module.");
    MODULE_AUTHOR("Marco Bonelli");
    MODULE_LICENSE("GPL");
    

    dmesg在我的计算机上输出:

    dmesg output on my machine:

    [ 3174.567029] myhashtable: key_a = 324, key_b = 324
    [ 3174.567030] myhashtable: element: data = 7, name = oof
    [ 3174.567031] myhashtable: element: data = 3, name = foo
    [ 3174.567032] myhashtable: match for key 324: data = 7, name = oof
    [ 3174.567033] myhashtable: match for key 324: data = 3, name = foo
    [ 3174.567033] myhashtable: element named "foo" found!
    

    这篇关于如何使用内核哈希表API?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

    查看全文
    登录 关闭
    扫码关注1秒登录
    发送“验证码”获取 | 15天全站免登陆