cmake - linking static library pytorch cannot find its internal functions during build


Problem description

I'm trying to build a program using CMake. For several reasons, the program must be built with static libraries rather than dynamic ones, and I need to use PyTorch, so this is what I've done:

  1. Downloaded and installed the PyTorch static library (I found libtorch.a in the expected path, /home/me/pytorch/torch/lib).
  2. Wrote a CMakeLists.txt with the following contents:

cmake_minimum_required(VERSION 3.5.1 FATAL_ERROR)
project(example-app LANGUAGES CXX)
find_package(Torch REQUIRED)
add_executable(example-app example-app.cpp argparse/argparse.cpp)
target_link_libraries(example-app "${TORCH_LIBRARIES}" -static -fopenmp)
set_property(TARGET example-app PROPERTY CXX_STANDARD 14)

FYI, example-app.cpp is the file containing the main function, and argparse/ is a directory with source code for functions called from example-app.cpp.

Everything works up to cmake -DCMAKE_PREFIX_PATH=/home/me/pytorch/torch .., but the subsequent build fails with errors saying it cannot find references to some functions, namely those starting with fbgemm::. fbgemm is (as far as I know) some sort of GEMM library used in implementing PyTorch.

It seems to me that when the static PyTorch library is linked, its internal libraries such as fbgemm are not linked properly, but I'm not an expert on CMake and honestly not entirely sure.

Am I doing something wrong, or is there a workaround for this problem? Any help or a push in the right direction would be greatly appreciated.

P.S.

  1. The exact error is not posted because it is far too long, but it consists mostly of "undefined reference to ~" errors. If seeing the error message would help, I'd be happy to edit the question and post it.

  2. Building and running the file works fine if I remove the parts of the code that call the library's functions, while leaving #include <torch/torch.h> in example-app.cpp.

Recommended answer

I recently went through a similar process of statically linking PyTorch, and to be honest it wasn't pretty.

I will outline the steps I took. The exact source code is in torchlambda: its CMakeLists.txt (which also includes the AWS SDK and AWS Lambda static builds) and a script that builds PyTorch from source (cloning it and building via /scripts/build_mobile.sh). This covers CPU support only, though similar steps should work if you need CUDA; it will at least get you started.

First of all, you need pre-built static library files. All of them must be static, so no .so files; only files with the .a extension are suitable.

To be honest, I looked for them on PyTorch's installation page, but only the shared version is offered there. In a GitHub issue I found a way to download them, as follows.

Instead of downloading the shared libraries (here via wget):

$ wget https://download.pytorch.org/libtorch/cu101/libtorch-shared-with-deps-1.4.0.zip

you replace shared with static in the URL (as described in this issue), so it becomes:

$ wget https://download.pytorch.org/libtorch/cu101/libtorch-static-with-deps-1.4.0.zip
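The rename can also be scripted, which helps when you want to try several versions. This is only a sketch: the variable names are mine, and the answer gives no guarantee that a static archive exists for any other version or CUDA variant.

```shell
# URL of the shared build, exactly as quoted above
shared_url="https://download.pytorch.org/libtorch/cu101/libtorch-shared-with-deps-1.4.0.zip"

# Swap the single word "shared" for "static" to get the static build's URL
static_url=$(printf '%s\n' "$shared_url" | sed 's/-shared-/-static-/')

echo "$static_url"
# wget "$static_url"   # uncomment to actually download
```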

Yet the downloaded archive has no libtorch.a under its lib folder (nor did I find libcaffe2.a, which this issue mentions), so I was left with building explicitly from source.

If you somehow have those files (if so, please say where you got them), you can skip the next step.
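A quick way to see what an unpacked download actually contains is sketched below. Both function names are hypothetical helpers, and the lib/libtorch.a layout is assumed from the description above rather than checked against every release.

```shell
# list_static_archives DIR: print every static archive (*.a) found under DIR
list_static_archives() {
    find "$1" -type f -name '*.a' | sort
}

# has_libtorch_archive DIR: succeed only if DIR/lib/libtorch.a exists
has_libtorch_archive() {
    [ -f "$1/lib/libtorch.a" ]
}

# Usage (assumes the zip was unpacked into ./libtorch):
#   list_static_archives libtorch
#   has_libtorch_archive libtorch || echo "no libtorch.a, build from source"
```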

For the CPU version I used the /pytorch/scripts/build_mobile.sh script; you can base your own version on it if you need GPU support (maybe you only have to pass -DUSE_CUDA=ON to this script, but I'm not sure).

Most important is CMake's -DBUILD_SHARED_LIBS=OFF, which builds everything as static libraries. You can also check the script from my tool, which passes arguments to build_mobile.sh as well.

Running the above gives you static files in /pytorch/build_mobile/install by default, which contains everything you need.
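Put together, the build step might look roughly like the sketch below. The function name and the pytorch checkout directory are assumptions of mine, and the sketch assumes that build_mobile.sh forwards extra arguments to CMake, as the answer suggests. The script location reflects the PyTorch version used at the time (around 1.4), so newer checkouts may differ.

```shell
# build_static_torch [EXTRA_CMAKE_ARGS...]
# Hypothetical helper: clone PyTorch with its submodules into ./pytorch and
# run the CPU-only mobile build script with static libraries requested.
# Any extra arguments are passed on to build_mobile.sh unchanged.
build_static_torch() {
    git clone --recursive https://github.com/pytorch/pytorch pytorch &&
        (cd pytorch && ./scripts/build_mobile.sh -DBUILD_SHARED_LIBS=OFF "$@")
}

# Usage (slow, and needs network access and a C++ toolchain):
#   build_static_torch
#   ls pytorch/build_mobile/install/lib
```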

Now you can either copy the built files to /usr/local (better not to, unless you are using Docker as torchlambda does) or set the path to them from within your CMakeLists.txt like this:

set(LIBTORCH "/path/to/pytorch/build_mobile/install")

# Below will append libtorch to path so CMake can see files
set(CMAKE_PREFIX_PATH "${CMAKE_PREFIX_PATH};${LIBTORCH}")

The rest is fine except for target_link_libraries, which should be used with the -Wl,--whole-archive linker flag (as indicated by this issue; see the related issues listed there for additional reference). That brought me to this:

target_link_libraries(example-app PRIVATE -lm
        -Wl,--whole-archive "${TORCH_LIBRARIES}"
        -Wl,--no-whole-archive
        -lpthread
        ${CMAKE_DL_LIBS})

You may not need -lm, -lpthread or ${CMAKE_DL_LIBS}, though I needed them when building on Amazon Linux AMI.
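Some background on the flag: by default a linker takes from a static archive only the object files that resolve symbols still undefined at that point, whereas -Wl,--whole-archive pulls in every object until -Wl,--no-whole-archive switches the behaviour back. My reading, which the answer does not state, is that PyTorch needs this because parts of it register themselves through static initializers that nothing references by name. The sketch below only shows the argument ordering; whole_archive_args is a hypothetical helper.

```shell
# whole_archive_args LIB...: print the given static archives wrapped in the
# whole-archive / no-whole-archive pair, in the order the linker must see them
whole_archive_args() {
    printf '%s' '-Wl,--whole-archive'
    for lib in "$@"; do
        printf ' %s' "$lib"
    done
    printf ' %s\n' '-Wl,--no-whole-archive'
}

# Usage (illustrative only, a real link line needs PyTorch's other archives too):
#   g++ example-app.o $(whole_archive_args /path/to/libtorch.a) -lpthread -ldl -lm -o example-app
```

Libraries listed after -Wl,--no-whole-archive (here -lpthread and friends) are linked the normal, selective way.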

Now you can build your application. The standard libtorch way should be fine, but here is another command I used:

mkdir build &&
  cd build &&
  cmake .. &&
  cmake --build . --config Release

The above creates a build folder, where the example-app binary should now be located.
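If the build still stops with undefined references (such as the fbgemm:: ones from the question), it can help to see which archive references a symbol and which one defines it. The filter below is a sketch: undefined_symbols is a hypothetical helper, and the archive names in the usage comment are assumptions about what your install directory contains.

```shell
# undefined_symbols PATTERN: read `nm` output on stdin and print the names of
# undefined symbols (type "U") whose line matches PATTERN
undefined_symbols() {
    awk -v pat="$1" '$1 == "U" && $0 ~ pat { sub(/^[ \t]*U[ \t]+/, ""); print }'
}

# Usage (assumed archive names; -C demangles C++ symbol names):
#   nm -C /path/to/lib/libtorch.a | undefined_symbols 'fbgemm::'
#   nm -C --defined-only /path/to/lib/libfbgemm.a | grep 'fbgemm::'
```

A symbol that shows up as undefined in one archive and is defined in another archive means that the second archive must also be on the link line.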

Finally, run ldd build/example-app to verify that everything from PyTorch was statically linked; see point 5 of the aforementioned issue, your output should look similar.
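A small filter can make that check easier to read. The sketch below assumes the tool meant above is ldd, which lists the shared objects a binary loads at run time. dynamic_torch_deps is a hypothetical helper, and its library-name pattern is a guess at what a dynamically linked PyTorch would show.

```shell
# dynamic_torch_deps: read `ldd` output on stdin and print only the lines that
# name a torch, c10 or caffe2 shared object. Empty output suggests that
# PyTorch itself was linked statically.
dynamic_torch_deps() {
    grep -E 'lib(torch|c10|caffe2)[^ ]*\.so' || true
}

# Usage (assumes the binary built above):
#   ldd build/example-app | dynamic_torch_deps
```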
