How to understand this dmesg error message?

Problem description

I have written this simple module to handle a device and call some of its power management methods, such as .suspend and .resume. At initialization, the module simply looks up a particular device and tries to call its methods.

#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/device.h>
#include <linux/pci.h>

static int __init mfps_driver_init(void){

struct pci_dev    *dev      = NULL;
struct pci_driver *driver   = NULL;
struct device     *device   = NULL;

dev = pci_get_device(0x8086, 0x15a2, NULL);

if((dev == NULL) || (dev == 0)){

    printk(KERN_INFO "LEONZO: NOTHING FOUND SIZE %ld\n", sizeof(dev));

} else {

    driver = dev->driver;

    printk(KERN_INFO "LEONZO: I FOUND THE DEVICE OF THE SIZE %ld\n", sizeof(dev));
    printk(KERN_INFO "LEONZO: HERE IS ITS DRIVER NAME %s\n", driver->name);
    printk(KERN_INFO "LEONZO: CALLING IT SUSPEND METHOD\n");

    *device = dev->dev;

    device_lock(device);

    device_unlock(device);
}

return 0;

}

static void __exit mfps_driver_exit(void){

}


module_init(mfps_driver_init);
module_exit(mfps_driver_exit);

The code compiles successfully, but I get a kernel bug when I load the module:

sudo insmod MyFirstPowerState.ko

And dmesg shows the following output:

[   59.545180] MyFirstPowerState: module license 'unspecified' taints   kernel. 
[   59.545183] Disabling lock debugging due to kernel taint
[   59.546010] LEONZO: I FOUND THE DEVICE OF THE SIZE 8
[   59.546012] LEONZO: HERE IS ITS DRIVER NAME e1000e
[   59.546013] LEONZO: CALLING IT SUSPEND METHOD
[   59.546021] BUG: unable to handle kernel NULL pointer dereference         at           (null)
[   59.546051] IP: [<ffffffffc011907e>] mfps_driver_init+0x7e/0x1000         [MyFirstPowerState]
[   59.546077] PGD 0 
[   59.546085] Oops: 0002 [#1] SMP 
[   59.546097] Modules linked in: MyFirstPowerState(POE+) xt_CHECKSUM arc4 iwlmvm mac80211 snd_hda_codec_hdmi snd_hda_codec_realtek iwlwifi snd_hda_codec_generic rtsx_pci_ms memstick cfg80211 nf_conntrack_netbios_ns nf_conntrack_broadcast ipt_MASQUERADE nf_nat_masquerade_ipv4 xt_tcpudp ip6t_REJECT nf_reject_ipv6 ipt_REJECT nf_reject_ipv4 xt_conntrack ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw iptable_filter ip_tables x_tables dm_crypt hp_wmi sparse_keymap intel_rapl iosf_mbi x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm crct10dif_pclmul dm_multipath crc32_pclmul scsi_dh aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd joydev serio_raw lpc_ich uvcvideo snd_seq_midi snd_seq_midi_event snd_rawmidi snd_hda_intel snd_hda_controller snd_hda_codec videobuf2_vmalloc snd_hwdep shpchp snd_pcm videobuf2_memops videobuf2_core v4l2_common snd_seq e1000e(OE) i915_bpo ptp mei_me pps_core mei videodev media snd_seq_device intel_ips snd_timer drm_kms_helper drm btusb snd i2c_algo_bit soundcore 8250_fintek hp_accel lis3lv02d input_polldev tpm_infineon hp_wireless mac_hid parport_pc ppdev lp parport rfcomm bnep bluetooth binfmt_misc btrfs xor raid6_pq dm_mirror dm_region_hash dm_log uas usb_storage hid_generic usbhid hid rtsx_pci_sdmmc ahci psmouse libahci rtsx_pci wmi video
[   59.546577] CPU: 1 PID: 4180 Comm: insmod Tainted: P           OE   3.19.0-51-generic #58~14.04.1-Ubuntu
[   59.546613] Hardware name: Hewlett-Packard HP EliteBook 840 G2/2216, BIOS M71 Ver. 01.05 03/26/2015
[   59.546648] task: ffff880241a7b110 ti: ffff880242f68000 task.ti: ffff880242f68000
[   59.546678] RIP: 0010:[<ffffffffc011907e>]  [<ffffffffc011907e>] mfps_driver_init+0x7e/0x1000 [MyFirstPowerState]
[   59.546720] RSP: 0018:ffff880242f6bd18  EFLAGS: 00010246
[   59.546741] RAX: 0000000000000000 RBX: ffff880245b4d000 RCX: 00000000000000ae
[   59.546772] RDX: 0000000000000000 RSI: ffff880245b4d098 RDI: 0000000000000000
[   59.546807] RBP: ffff880242f6bd28 R08: 000000000000000a R09: 0000000000000000
[   59.546839] R10: 0000000000000d53 R11: ffff880242f6b9de R12: ffffffffc06a8000
[   59.546868] R13: 0000000000000000 R14: ffffffffc0119000 R15: ffff880242f6bef8
[   59.546900] FS:  00007f8787aa6740(0000) GS:ffff88024f440000(0000) knlGS:0000000000000000
[   59.546921] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   59.546936] CR2: 0000000000000000 CR3: 0000000244393000 CR4: 00000000003407e0
[   59.546955] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   59.546978] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   59.547006] Stack:
[   59.547014]  ffffffff81c1d060 ffff880204cd3280 ffff880242f6bda8 ffffffff81002144
[   59.547046]  0000000000000001 0000000000000002 ffff8801f8ddc4c0 0000000000000001
[   59.547079]  ffff880242f6bd88 ffffffff811cef19 ffffffff810f7aac 0000000000000018
[   59.547114] Call Trace:
[   59.547131]  [<ffffffff81002144>] do_one_initcall+0xd4/0x210
[   59.547162]  [<ffffffff811cef19>] ? kmem_cache_alloc_trace+0x199/0x220
[   59.547194]  [<ffffffff810f7aac>] ? load_module+0x164c/0x1cc0
[   59.547222]  [<ffffffff810f7ae5>] load_module+0x1685/0x1cc0
[   59.547247]  [<ffffffff810f3380>] ? store_uevent+0x40/0x40
[   59.547274]  [<ffffffff810f8296>] SyS_finit_module+0x86/0xb0
[   59.547298]  [<ffffffff817b788d>] system_call_fastpath+0x16/0x1b
[   59.547314] Code: c7 80 c0 4b c0 31 c0 e8 19 14 69 c1 48 c7 c7 a8 c0  4b c0 31 c0 e8 0b 14 69 c1 31 c0 48 8d b3 98 00 00 00 b9 ae 00 00 00 48 89 c7 <f3> a5 bf 60 00 00 00 e8 26 c7 69 c1 bf 60 00 00 00 e8 ac c5 69 
[   59.547393] RIP  [<ffffffffc011907e>] mfps_driver_init+0x7e/0x1000 [MyFirstPowerState]
[   59.547416]  RSP <ffff880242f6bd18>
[   59.547425] CR2: 0000000000000000
[   59.554577] ---[ end trace 42e3b1c73677cdfa ]---

I also notice that it is therefore impossible to remove the module:

sudo rmmod MyFirstPowerState.ko 
rmmod: ERROR: Module MyFirstPowerState is in use

Any idea what this output means and how to correct the error?

Solution

I will attempt to explain the massive wall of text from dmesg below. As a note, the values in brackets on the left are timestamps (seconds since boot); for your purposes they don't really matter.

> [ 59.545180] MyFirstPowerState: module license 'unspecified' taints kernel.
> [ 59.545183] Disabling lock debugging due to kernel taint

This is because you did not declare a module license. You will usually see people put something like this in their code, next to module_init:

MODULE_LICENSE("GPL");

> [ 59.546010] LEONZO: I FOUND THE DEVICE OF THE SIZE 8
> [ 59.546012] LEONZO: HERE IS ITS DRIVER NAME e1000e
> [ 59.546013] LEONZO: CALLING IT SUSPEND METHOD

These are your printk messages; nothing really special here.

> [ 59.546021] BUG: unable to handle kernel NULL pointer dereference at (null)

This is where the cause of your crash actually lives. The kernel tried to dereference a NULL pointer, which is the kernel-space counterpart of a segmentation fault. For more details on what exactly that means, see http://stackoverflow.com/questions/4007268/what-exactly-is-meant-by-de-referencing-a-null-pointer. As Ian noted in the comments earlier, it looks like the cause of your crash is that you wrote *device = dev->dev instead of device = &dev->dev (dev->dev is a struct embedded in the pci_dev, so you need its address). The line you wrote tries to copy dev->dev into the memory that device points to; since device is still NULL at that point, the copy writes through a NULL pointer and causes the crash.
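The difference between copying a struct through a pointer and pointing at an embedded struct can be sketched in plain userspace C. This is only an illustration: the two struct definitions are tiny hypothetical stand-ins for the kernel's struct device and struct pci_dev, and embedded_device and copy_device are made-up helper names, not kernel APIs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; the real ones are far larger. */
struct device { int power_state; };
struct pci_dev { int vendor; struct device dev; };

/* Pointer form: refer to the device embedded in pdev. Nothing is copied. */
static struct device *embedded_device(struct pci_dev *pdev)
{
    assert(pdev != NULL);
    return &pdev->dev;
}

/* Copy form: the same kind of struct assignment the module performed.
 * It is only valid when dst points at real storage; with dst == NULL it
 * becomes a write through a NULL pointer. */
static void copy_device(struct device *dst, const struct pci_dev *pdev)
{
    assert(dst != NULL && pdev != NULL);
    *dst = pdev->dev;
}
```

In the module, the pointer form corresponds to device = &dev->dev, after which device_lock(device) operates on the real device rather than on a copy.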

> [ 59.546051] IP: [<ffffffffc011907e>] mfps_driver_init+0x7e/0x1000 [MyFirstPowerState]
> ...
> [ 59.546648] task: ffff880241a7b110 ti: ffff880242f68000 task.ti: ffff880242f68000

The chunk of output between the two lines above is not very valuable to you right now; it is more useful to people who have deployed something and are debugging a problem on a specific user's machine. It lists things like the installed hardware, the module that caused the crash, and the other modules that were loaded alongside it, all of which are already well known in your case.

> [ 59.546678] RIP: 0010:[<ffffffffc011907e>] [<ffffffffc011907e>] mfps_driver_init+0x7e/0x1000 [MyFirstPowerState]
> ...
> [ 59.547079] ffff880242f6bd88 ffffffff811cef19 ffffffff810f7aac 0000000000000018

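Everything in this section is assembly-level information, which means little if you have no assembly experience, although I would suggest learning the basics because it does help in cases like this. The top half shows the registers and their values at the time of the fault, and the bottom half shows the raw contents of the stack. Note that CR2, which holds the faulting address, and RDI are both 0, which is consistent with a write through a NULL pointer.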
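As a rough illustration of how much the register section can tell you, the sketch below decodes the low bits of the x86 page-fault error code printed after "Oops:" and works out the size of the interrupted copy. The names fault_info, decode_fault, rep_movsl_bytes and check_against_log are hypothetical helpers, and reading the marked `<f3> a5` bytes in the Code: line as a "rep movsl" (4 bytes per step, count in RCX) is my interpretation of this particular dump.

```c
#include <assert.h>
#include <stdbool.h>

/* Low three bits of the x86 page-fault error code shown after "Oops:". */
struct fault_info {
    bool page_present;   /* bit 0: set = protection fault, clear = page not present */
    bool is_write;       /* bit 1: set = write access, clear = read access */
    bool from_user_mode; /* bit 2: set = CPU was in user mode, clear = kernel mode */
};

static struct fault_info decode_fault(unsigned long code)
{
    struct fault_info info = {
        .page_present   = (code & 0x1) != 0,
        .is_write       = (code & 0x2) != 0,
        .from_user_mode = (code & 0x4) != 0,
    };
    return info;
}

/* "rep movsl" copies RCX units of 4 bytes each. */
static unsigned long rep_movsl_bytes(unsigned long rcx)
{
    return rcx * 4;
}

/* Self-check against the values in the log: "Oops: 0002" and RCX = 0xae. */
static void check_against_log(void)
{
    struct fault_info f = decode_fault(0x0002);
    assert(!f.page_present && f.is_write && !f.from_user_mode);
    assert(rep_movsl_bytes(0xae) == 696);
}
```

Read this way, the oops describes a kernel-mode write to an unmapped page at address 0, at the very start of a 696-byte copy whose source is RSI = RBX + 0x98. That fits a struct assignment through a NULL destination pointer.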

> [ 59.547114] Call Trace:
> [ 59.547131] [<ffffffff81002144>] do_one_initcall+0xd4/0x210
> [ 59.547162] [<ffffffff811cef19>] ? kmem_cache_alloc_trace+0x199/0x220
> [ 59.547194] [<ffffffff810f7aac>] ? load_module+0x164c/0x1cc0

Everything within the call trace can be exceptionally helpful, especially when the module becomes long and difficult to debug with things like interrupts. Basically, it lists the chain of function calls that led to the crash; entries marked with ? are addresses found on the stack that may not be part of the actual call chain. In your case, since the crash happened directly in the module's init function, the trace only contains load_module along with some wrappers and the system call entry path. However, if your init function had called another function and that function had caused the crash, you would see that call path here.
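Each trace entry has the form symbol+offset/size, where the offset is the position inside the function and the size is the function's total length, both in hexadecimal. A minimal sketch of pulling those fields apart is shown below; trace_entry and parse_trace_entry are hypothetical names for illustration, not part of any kernel tool.

```c
#include <assert.h>
#include <stdio.h>

struct trace_entry {
    char symbol[64];
    unsigned long offset; /* bytes from the start of the function */
    unsigned long size;   /* total size of the function in bytes */
};

/* Returns 1 on success, 0 if text is not in symbol+0xOFF/0xSIZE form. */
static int parse_trace_entry(const char *text, struct trace_entry *out)
{
    assert(text != NULL && out != NULL);
    /* %63[^+] reads the symbol up to the '+'; %lx accepts the 0x prefix. */
    return sscanf(text, "%63[^+]+%lx/%lx",
                  out->symbol, &out->offset, &out->size) == 3;
}
```

For example, parsing "mfps_driver_init+0x7e/0x1000" yields offset 0x7e, which is the value you would look up in a disassembly of the module to find the faulting instruction.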

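The last little bit is a dump of the machine code bytes around the faulting instruction (the Code: line), followed by a repeat of a few registers (RIP, RSP and CR2).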

Hopefully that explains the wall of text that you get from dmesg when you cause a kernel issue (this one is an oops rather than a panic, as the "Oops: 0002 [#1] SMP" line indicates). If anything is still vague, I'll try to explain, although I am by no means an expert on this.
