Open MPI and Valgrind
Question
I used Valgrind to test one of the examples provided in openmpi-1.4/example:
mpirun.openmpi --np 2 valgrind --log-file=output.dat --leak-check=full --tool=memcheck ./ring_c
Then I found the following in output.dat:
==30450== Syscall param writev(vector[...]) points to uninitialised byte(s)
==30450==    at 0x54DC150: __writev_nocancel (syscall-template.S:81)
==30450==    by 0x7E3B312: mca_oob_tcp_msg_send_handler (in /usr/lib/openmpi/lib/openmpi/mca_oob_tcp.so)
==30450==    by 0x7E3C50A: mca_oob_tcp_peer_send (in /usr/lib/openmpi/lib/openmpi/mca_oob_tcp.so)
==30450==    by 0x7E40266: mca_oob_tcp_send_nb (in /usr/lib/openmpi/lib/openmpi/mca_oob_tcp.so)
==30450==    by 0x7C2FFB7: orte_rml_oob_send (in /usr/lib/openmpi/lib/openmpi/mca_rml_oob.so)
==30450==    by 0x7C30637: orte_rml_oob_send_buffer (in /usr/lib/openmpi/lib/openmpi/mca_rml_oob.so)
==30450==    by 0x824CBAE: ??? (in /usr/lib/openmpi/lib/openmpi/mca_grpcomm_bad.so)
==30450==    by 0x4E900FB: ompi_mpi_init (in /usr/lib/openmpi/lib/libmpi.so.1.0.8)
==30450==    by 0x4EA8499: PMPI_Init (in /usr/lib/openmpi/lib/libmpi.so.1.0.8)
==30450==    by 0x4009AD: main (ring_c.c:19)
==30450==  Address 0x65c0321 is 161 bytes inside a block of size 256 alloc'd
==30450==    at 0x4C2DEAE: realloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==30450==    by 0x4F1E619: opal_dss_buffer_extend (in /usr/lib/openmpi/lib/libmpi.so.1.0.8)
==30450==    by 0x4F1E9D0: opal_dss_copy_payload (in /usr/lib/openmpi/lib/libmpi.so.1.0.8)
==30450==    by 0x4EFA3DD: orte_grpcomm_base_pack_modex_entries (in /usr/lib/openmpi/lib/libmpi.so.1.0.8)
==30450==    by 0x824CA8F: ??? (in /usr/lib/openmpi/lib/openmpi/mca_grpcomm_bad.so)
==30450==    by 0x4E900FB: ompi_mpi_init (in /usr/lib/openmpi/lib/libmpi.so.1.0.8)
==30450==    by 0x4EA8499: PMPI_Init (in /usr/lib/openmpi/lib/libmpi.so.1.0.8)
==30450==    by 0x4009AD: main (ring_c.c:19)
==30450== HEAP SUMMARY:
==30450==     in use at exit: 298,974 bytes in 1,482 blocks
==30450==   total heap usage: 7,740 allocs, 6,258 frees, 13,223,431 bytes allocated
... ... ...
==30450== LEAK SUMMARY:
==30450==    definitely lost: 51,132 bytes in 69 blocks
==30450==    indirectly lost: 14,378 bytes in 39 blocks
==30450==      possibly lost: 0 bytes in 0 blocks
==30450==    still reachable: 233,464 bytes in 1,374 blocks
==30450==         suppressed: 0 bytes in 0 blocks
==30450== Reachable blocks (those to which a pointer was found) are not shown.
==30450== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==30450==
==30450== For counts of detected and suppressed errors, rerun with: -v
==30450== Use --track-origins=yes to see where uninitialized values come from
==30450== ERROR SUMMARY: 63 errors from 63 contexts (suppressed: 0 from 0)
According to the memcheck results, the program leaks memory. Since the example is provided by the openmpi-1.4 developers, does this mean that every program using openmpi-1.4 as a library will leak memory?
Fred
Recommended answer
For performance reasons, Open MPI is not Valgrind-clean. However, as per the FAQ, a suppression file is provided:
mpirun -np 2 valgrind --suppressions=$PREFIX/share/openmpi/openmpi-valgrind.supp
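As a rough sketch, the suppression file could be combined with the Valgrind options from the question as shown below. Several things here are assumptions rather than part of the answer: the default value /usr for PREFIX (use your actual Open MPI installation prefix), the ./ring_c binary taken from the question, and the %p placeholder in --log-file, which Valgrind expands to the process ID so that each rank writes its own log instead of sharing output.dat.

```shell
#!/bin/sh
# Assumption: PREFIX is the Open MPI installation prefix; /usr is only a guess.
PREFIX="${PREFIX:-/usr}"
SUPP="$PREFIX/share/openmpi/openmpi-valgrind.supp"

# Same memcheck options as in the question, plus the suppression file.
# %p is expanded by Valgrind to the PID, giving one log file per rank.
CMD="mpirun -np 2 valgrind --tool=memcheck --leak-check=full --suppressions=$SUPP --log-file=output.%p.dat ./ring_c"

# Only run when everything needed is present; otherwise just print the command.
if [ -f "$SUPP" ] && [ -x ./ring_c ] \
   && command -v mpirun >/dev/null 2>&1 \
   && command -v valgrind >/dev/null 2>&1; then
    $CMD
else
    echo "Would run: $CMD"
fi
```

With the suppression file in place, reports that originate inside Open MPI itself (such as the writev warning during MPI_Init above) should be filtered out, so whatever remains is more likely to come from your own code.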