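How to make CUDA 11.6 work with CMake 3.23 and MSVC 2019

I can't find a way to use the CUDA language in a CMake project on Windows with the standard MSVC 2019 compiler.

I am trying to configure and compile this hello-cmake-cuda repository (also described in this blog post).

Contents of the CMakeLists.txt file: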

cmake_minimum_required(VERSION 3.8 FATAL_ERROR)
project(hello LANGUAGES CXX CUDA)
enable_language(CUDA)
add_executable(hello hello.cu)

Here is the output of the cmake .. command, run from the build directory:

PS C:\GitRepo\cuda_hello\build> cmake ..
-- Selecting Windows SDK version 10.0.18362.0 to target Windows 10.0.22000.
CMake Error at C:/Program Files/CMake/share/cmake-3.23/Modules/CMakeDetermineCUDACompiler.cmake:311 (message):
  CMAKE_CUDA_ARCHITECTURES must be valid if set.
Call Stack (most recent call first):
  CMakeLists.txt:5 (project)

-- Configuring incomplete, errors occurred!
See also "C:/GitRepo/cuda_hello/build/CMakeFiles/CMakeOutput.log".
See also "C:/GitRepo/cuda_hello/build/CMakeFiles/CMakeError.log".

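This means that architectures_tested, from CMakeDetermineCUDACompiler.cmake:311, is empty...

How can I get CMake to finish configuring and building this simple program?

My development environment

  • OS: Windows 11, version 10.0.22000, Build 22000
  • Compiler: Microsoft Visual Studio Community 2019, version 16.11.11
  • CMake version: 3.23
  • CUDA version: 11.6

I have tried different versions of each piece of software and keep running into the same problem. For now I have decided to stick with these versions.

My GPU is set up correctly: it shows up in nvidia-smi, and I am also able to build and run the deviceQuery CUDA sample: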

CUDA Device Query (Runtime API) version (CUDART static linking)

Detected 1 CUDA Capable device(s)

Device 0: "NVIDIA GeForce GTX 1650"
  CUDA Driver Version / Runtime Version          11.6 / 11.6
  CUDA Capability Major/Minor version number:    7.5
  etc. etc. ...

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 11.6, CUDA Runtime Version = 11.6, NumDevs = 1
Result = PASS

My PATH environment variable:

PS C:\GitRepo\hello-cuda-cmake-master> $env:path -split ";"
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.6\bin
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.6\libnvvp
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.3\bin
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.3\libnvvp

C:\Program Files (x86)\Common Files\Oracle\Java\javapath
C:\Python38\Scripts
C:\Python38
C:\Windows\system32
C:\Windows
C:\Windows\System32\Wbem
C:\Windows\System32\WindowsPowerShell\v1.0
C:\Windows\System32\OpenSSH
C:\Program Files (x86)\NVIDIA Corporation\PhysX\Common
C:\Program Files\NVIDIA Corporation\NVIDIA NvDLISR
C:\Program Files\PuTTY
C:\Program Files (x86)\PuTTY
C:\Program Files\Microsoft SQL Server\110\Tools\Binn
C:\Program Files\TortoiseSVN\bin
C:\Program Files\TortoiseGit\bin
C:\Program Files\Microsoft VS Code\bin
C:\WINDOWS\system32
C:\WINDOWS
C:\WINDOWS\System32\Wbem
C:\WINDOWS\System32\WindowsPowerShell\v1.0
C:\WINDOWS\System32\OpenSSH
C:\Program Files\Docker\Docker\resources\bin
C:\ProgramData\DockerDesktop\version-bin
C:\Program Files\Git\cmd
C:\WINDOWS\system32
C:\WINDOWS
C:\WINDOWS\System32\Wbem
C:\WINDOWS\System32\WindowsPowerShell\v1.0
C:\WINDOWS\System32\OpenSSH
C:\Program Files\NVIDIA Corporation\Nsight Compute 2022.1.1
C:\Program Files\CMake\bin
C:\Ruby30-x64\bin
C:\Users\Thibault GEFFROY\.cargo\bin
C:\Users\Thibault GEFFROY\AppData\Local\Microsoft\WindowsApps
C:\Program Files\OpenCppCoverage
C:\intelFPGA\20.1\modelsim_ase\win32aloem

What I have tried that did not work

If I set the desired CMAKE_CUDA_ARCHITECTURES explicitly:

set(CMAKE_CUDA_ARCHITECTURES 75)

I get:

PS C:\GitRepo\cuda_hello\build> cmake ..
-- Selecting Windows SDK version 10.0.18362.0 to target Windows 10.0.22000.
-- The CUDA compiler identification is unknown
CMake Error at C:/Program Files/CMake/share/cmake-3.23/Modules/CMakeDetermineCUDACompiler.cmake:654 (message):
  The CMAKE_CUDA_ARCHITECTURES:

    75

  do not all work with this compiler.  Try:

  instead.
Call Stack (most recent call first):
  CMakeLists.txt:5 (project)

-- Configuring incomplete, errors occurred!
See also "C:/GitRepo/cuda_hello/build/CMakeFiles/CMakeOutput.log".
See also "C:/GitRepo/cuda_hello/build/CMakeFiles/CMakeError.log".

If I try to use the FindCUDA module to set CMAKE_CUDA_ARCHITECTURES (the solution given by @alfC here), I get:

PS C:\GitRepo\cuda_hello\build> cmake ..
CMake Error at C:/Program Files/CMake/share/cmake-3.23/Modules/FindCUDA/select_compute_arch.cmake:120 (file):
  file failed to open for writing (Permission denied):

    /detect_cuda_compute_capabilities.cpp
Call Stack (most recent call first):
  CMakeLists.txt:4 (CUDA_DETECT_INSTALLED_GPUS)

CMake Error: The source directory "CMAKE_FLAGS" does not exist.
Specify --help for usage, or press the help button on the CMake GUI.
CMake Error at C:/Program Files/CMake/share/cmake-3.23/Modules/FindCUDA/select_compute_arch.cmake:141 (try_run):
  Failed to configure test project build system.
Call Stack (most recent call first):
  CMakeLists.txt:4 (CUDA_DETECT_INSTALLED_GPUS)

CMake Error: TRY_COMPILE attempt to remove -rf directory that does not contain CMakeTmp:/detect_cuda_compute_capabilities.cpp
-- Configuring incomplete, errors occurred!
See also "C:/GitRepo/cuda_hello/build/CMakeFiles/CMakeOutput.log".
See also "C:/GitRepo/cuda_hello/build/CMakeFiles/CMakeError.log".

Finally, if I try calling find_package(CUDA), I get:

PS C:\GitRepo\cuda_hello\build> cmake ..
CMake Error at C:/Program Files/CMake/share/cmake-3.23/Modules/FindCUDA.cmake:677 (cmake_initialize_per_config_variable):
  Unknown CMake command "cmake_initialize_per_config_variable".
Call Stack (most recent call first):
  CMakeLists.txt:2 (find_package)

-- Configuring incomplete, errors occurred!
See also "C:/GitRepo/cuda_hello/build/CMakeFiles/CMakeOutput.log".
See also "C:/GitRepo/cuda_hello/build/CMakeFiles/CMakeError.log".

Edit 1:

In reply to this solution from @einpoklum:

Thanks for the suggestion, but it does not work either.

Here is the output of the cmake -B build command, run in your repository:

PS C:\GitRepo\hello-cuda-cmake-master> cmake -B build
-- Building for: Visual Studio 16 2019
-- Selecting Windows SDK version 10.0.18362.0 to target Windows 10.0.22000.
-- The CUDA compiler identification is unknown
CMake Error at C:/Program Files/CMake/share/cmake-3.23/Modules/CMakeDetermineCUDACompiler.cmake:633 (message):
  Failed to detect a default CUDA architecture.

  Compiler output:

Call Stack (most recent call first):
  CMakeLists.txt:2 (project)

-- Configuring incomplete, errors occurred!
See also "C:/GitRepo/hello-cuda-cmake-master/build/CMakeFiles/CMakeOutput.log".
See also "C:/GitRepo/hello-cuda-cmake-master/build/CMakeFiles/CMakeError.log".

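The output is the same whether I use PowerShell or the MSVC command prompt.

Here are the CMake variables and their values when using cmake-gui:

[Screenshot: cmake-gui]

When I use a plain nvcc build command (nvcc hello.cu, run from the MSVC command prompt), I get: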

nvcc fatal   : Could not set up the environment for Microsoft Visual Studio using 'c:/Program Files (x86)/Microsoft Visual Studio/2019/Community/VC/Tools/MSVC/14.29.30133/bin/HostX86/x86/../../../../../../../VC/Auxiliary/Build/vcvars64.bat'

But the path is valid, and the vcvars64.bat script exists at that location.

What happens if I add find_package(CUDAToolkit) to CMakeLists.txt

The new CMakeLists.txt:

cmake_minimum_required(VERSION 3.18 FATAL_ERROR)
find_package(CUDAToolkit)
project(hello LANGUAGES CUDA)
add_executable(hello hello.cu)

Output:

PS C:\GitRepo\hello-cuda-cmake-master> cmake -B build
-- Building for: Visual Studio 16 2019
-- Found CUDAToolkit: C:/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v11.6/include (found version "11.6.124")
-- Selecting Windows SDK version 10.0.18362.0 to target Windows 10.0.22000.
-- The CUDA compiler identification is unknown
CMake Error at C:/Program Files/CMake/share/cmake-3.23/Modules/CMakeDetermineCUDACompiler.cmake:633 (message):
  Failed to detect a default CUDA architecture.

  Compiler output:

Call Stack (most recent call first):
  CMakeLists.txt:3 (project)

-- Configuring incomplete, errors occurred!
See also "C:/GitRepo/hello-cuda-cmake-master/build/CMakeFiles/CMakeOutput.log".
See also "C:/GitRepo/hello-cuda-cmake-master/build/CMakeFiles/CMakeError.log".

Edit 2:

I am now trying to compile the CUDA sample BlackScholes without CMake, using the provided MSVC 2019 solution.

I end up with this error:

Severity        Code        Description        Project        File        Line        Suppression State
Error        MSB3721        The command ""C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.6\bin\nvcc.exe" -gencode=arch=compute_35,code="sm_35,compute_35" -gencode=arch=compute_37,code="sm_37,compute_37" -gencode=arch=compute_50,code="sm_50,compute_50" -gencode=arch=compute_52,code="sm_52,compute_52" -gencode=arch=compute_60,code="sm_60,compute_60" -gencode=arch=compute_61,code="sm_61,compute_61" -gencode=arch=compute_70,code="sm_70,compute_70" -gencode=arch=compute_75,code="sm_75,compute_75" -gencode=arch=compute_80,code="sm_80,compute_80" -gencode=arch=compute_86,code="sm_86,compute_86" --use-local-env -ccbin "C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.29.30133\bin\HostX86\x64" -x cu   -I./ -I../../../Common -I./ -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.6\/include" -I../../../Common -I"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.6\include"  -G   --keep-dir x64\Debug  -maxrregcount=0  --machine 64 --compile -cudart static -Xcompiler "/wd 4819"  --threads 0 -g  -DWIN32 -DWIN32 -D_MBCS -D_MBCS -Xcompiler "/EHsc /W3 /nologo /Od /Fdx64/Debug/vc142.pdb /FS /Zi /RTC1 /MTd " -o "C:\ProgramData\NVIDIA Corporation\CUDA Samples\v11.6\cuda-samples\Samples\5_Domain_Specific\BlackScholes\x64\Debug\BlackScholes.cu.obj" "C:\ProgramData\NVIDIA Corporation\CUDA Samples\v11.6\cuda-samples\Samples\5_Domain_Specific\BlackScholes\BlackScholes.cu"" exited with code 1.        BlackScholes        C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\MSBuild\Microsoft\VC\v160\BuildCustomizations\CUDA 11.6.targets        790

When I build the BlackScholes sample under WSL 2 with Ubuntu 20.04, following this CUDA installation and these instructions, I get the following output:

$ sudo make BlackScholes
/usr/local/cuda/bin/nvcc -ccbin g++ -I../../../Common  -m64    -maxrregcount=16 --threads 0 --std=c++11 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -gencode arch=compute_86,code=compute_86 -o BlackScholes.o -c BlackScholes.cu
nvcc warning : The 'compute_35', 'compute_37', 'sm_35', and 'sm_37' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
ptxas warning : For profile sm_86 adjusting per thread register count of 16 to lower bound of 24
ptxas warning : For profile sm_80 adjusting per thread register count of 16 to lower bound of 24
ptxas warning : For profile sm_70 adjusting per thread register count of 16 to lower bound of 24
ptxas warning : For profile sm_75 adjusting per thread register count of 16 to lower bound of 24
/usr/local/cuda/bin/nvcc -ccbin g++ -I../../../Common  -m64    -maxrregcount=16 --threads 0 --std=c++11 -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -gencode arch=compute_86,code=compute_86 -o BlackScholes_gold.o -c BlackScholes_gold.cpp
nvcc warning : The 'compute_35', 'compute_37', 'sm_35', and 'sm_37' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
/usr/local/cuda/bin/nvcc -ccbin g++   -m64      -gencode arch=compute_35,code=sm_35 -gencode arch=compute_37,code=sm_37 -gencode arch=compute_50,code=sm_50 -gencode arch=compute_52,code=sm_52 -gencode arch=compute_60,code=sm_60 -gencode arch=compute_61,code=sm_61 -gencode arch=compute_70,code=sm_70 -gencode arch=compute_75,code=sm_75 -gencode arch=compute_80,code=sm_80 -gencode arch=compute_86,code=sm_86 -gencode arch=compute_86,code=compute_86 -o BlackScholes BlackScholes.o BlackScholes_gold.o
nvcc warning : The 'compute_35', 'compute_37', 'sm_35', and 'sm_37' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
mkdir -p ../../../bin/x86_64/linux/release
cp BlackScholes ../../../bin/x86_64/linux/release

$ ./BlackScholes
[./BlackScholes] - Starting...
GPU Device 0: "Turing" with compute capability 7.5

Initializing data...
...allocating CPU memory for options.
...allocating GPU memory for options.
...generating input data in CPU mem.
...copying input data to GPU mem.
Data init done.

Executing Black-Scholes GPU kernel (512 iterations)...
Options count             : 8000000
BlackScholesGPU() time    : 0.722482 msec
Effective memory bandwidth: 110.729334 GB/s
Gigaoptions per second    : 11.072933

BlackScholes, Throughput = 11.0729 GOptions/s, Time = 0.00072 s, Size = 8000000 options, NumDevsUsed = 1, Workgroup = 128

Reading back GPU results...
Checking the results...
...running CPU calculations.

Comparing the results...
L1 norm: 1.741792E-07
Max absolute error: 1.192093E-05

Shutting down...
...releasing GPU memory.
...releasing CPU memory.
Shutdown done.

[BlackScholes] - Test Summary

NOTE: The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.

Test passed

Answers:

Answer 1:

As of CMake 3.18, we no longer use the FindCUDA.cmake module, neither directly nor via find_package(CUDA). It has been replaced by find_package(CUDAToolkit), which uses the FindCUDAToolkit.cmake module.
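For illustration, here is a minimal sketch of what the find_package(CUDAToolkit) route can look like for a host-only C++ program that just links against the CUDA runtime. The project name host_only, the target host_app and the source file main.cpp are hypothetical; CUDA::cudart is one of the imported targets the FindCUDAToolkit module provides. It only works if CMake can locate the toolkit.

```cmake
cmake_minimum_required(VERSION 3.18 FATAL_ERROR)

# Hypothetical host-only project: no .cu files, so only CXX is enabled.
project(host_only LANGUAGES CXX)

# Locate the CUDA toolkit (headers and libraries); fail early if it is missing.
find_package(CUDAToolkit REQUIRED)
message(STATUS "Found CUDA toolkit version: ${CUDAToolkit_VERSION}")

# main.cpp is a placeholder for a C++ file that calls the CUDA runtime API.
add_executable(host_app main.cpp)

# Link against the CUDA runtime through the imported target.
target_link_libraries(host_app PRIVATE CUDA::cudart)
```

Note that find_package() is called after project() here, once the languages and platform have been set up.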

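But actually, for your simple hello-world project you don't even need that, because since CMake 3.8 CUDA has been a "first-class citizen" language in CMake. Well, sort of. So here is a CMakeLists.txt file you can use: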

cmake_minimum_required(VERSION 3.18 FATAL_ERROR)
PROJECT(cuda_hello LANGUAGES CUDA)
add_executable(hello hello.cu)

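I have tested this with CUDA 11.6 and Visual Studio 16 (a.k.a. VS 2019) on a Windows 10 (Enterprise Evaluation) VM.

Note: the version number in the cmake_minimum_required() line may be critical! With the version number used in the cuda_hello repository it did not work for me, because a CMAKE_CUDA_ARCHITECTURES value was demanded.

Now, after configuring with CMake, you can run ccmake, where you will see the CMAKE_CUDA_ARCHITECTURES value. Change it to whatever you want to use. Again, I am giving you the simplest and most basic way of doing things, not necessarily the fanciest or most robust.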
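As an alternative to editing the cache value by hand, the architecture can also be pinned per target with the CUDA_ARCHITECTURES property, available since CMake 3.18. The sketch below assumes the value 75, chosen to match the compute capability 7.5 reported by deviceQuery in the question. It still relies on project() detecting a working CUDA compiler, so it will not by itself cure a failing compiler identification.

```cmake
cmake_minimum_required(VERSION 3.18 FATAL_ERROR)
project(cuda_hello LANGUAGES CUDA)

add_executable(hello hello.cu)

# Assumption: 75 corresponds to compute capability 7.5 (GTX 1650 in the question).
# This overrides whatever CMAKE_CUDA_ARCHITECTURES would have supplied for this target.
set_property(TARGET hello PROPERTY CUDA_ARCHITECTURES 75)
```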


I have set all of this up for you in a fork of the hello-cuda-cmake repository.

Answer 2:

Try adding:

set(CMAKE_CUDA_ARCHITECTURES 60 61 62 70 72 75 86)
set(CMAKE_CUDA_COMPILER /usr/local/cuda-11.6/bin/nvcc)

Look up your CUDA architecture at https://arnon.dk/matching-sm-architectures-arch-and-gencode-for-various-nvidia-cards/ and change the arguments of CMAKE_CUDA_ARCHITECTURES accordingly.

Also point CMAKE_CUDA_COMPILER at nvcc.

Here is my complete CMakeLists.txt:

cmake_minimum_required(VERSION 3.20 FATAL_ERROR)

set(CMAKE_CUDA_ARCHITECTURES 60 61 62 70 72 75 86)
set(CMAKE_CUDA_COMPILER /usr/local/cuda-11.6/bin/nvcc)

project(cudatest CUDA)
find_package(CUDAToolkit)

set(CMAKE_CUDA_STANDARD 14)

add_executable(cudatest main.cu)

set_target_properties(cudatest PROPERTIES
    CUDA_SEPARABLE_COMPILATION ON)

My GPU is a GeForce GTX 1660, with CMake version 3.23 and CUDA version 11.6.
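To check what a configuration like this actually picked up, it can help to print the relevant variables after project(). This is only a diagnostic sketch built on standard CMake variables; the messages appear only if project() succeeds, so it cannot diagnose a compiler identification that fails outright.

```cmake
cmake_minimum_required(VERSION 3.20 FATAL_ERROR)
project(cudatest CUDA)

# Report which CUDA compiler CMake selected and how it was identified.
message(STATUS "CUDA compiler:      ${CMAKE_CUDA_COMPILER}")
message(STATUS "CUDA compiler ID:   ${CMAKE_CUDA_COMPILER_ID}")
message(STATUS "CUDA version:       ${CMAKE_CUDA_COMPILER_VERSION}")

# Report the architectures that targets will be built for by default.
message(STATUS "CUDA architectures: ${CMAKE_CUDA_ARCHITECTURES}")
```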

Here is a Docker image I made for developing some of my projects: https://github.com/GuangchenJ/cuda-dev . You can try using it.

Answer 3:

Environment:

  1. OS: Windows 10 (Visual Studio 2022 Community)
  2. CUDA: CUDA 11.6, nvcc
  3. C++ standard: 17
  4. IDE: VS Code
  5. CMake

The project name is: hellogpu

CMake file:

cmake_minimum_required(VERSION 3.0.0)
project(hellogpu CUDA)

include(CTest)
enable_testing()

add_executable(${PROJECT_NAME} main.cu)

set_target_properties(${PROJECT_NAME} PROPERTIES
        CUDA_SEPARABLE_COMPILATION ON)
set(CPACK_PROJECT_NAME ${PROJECT_NAME})
set(CPACK_PROJECT_VERSION ${PROJECT_VERSION})
include(CPack)
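
Answer 4:

I had the same problem and solved it by installing an older version of CMake. More precisely: a version earlier than 3.18.

Apparently CMake added first-party language support for CUDA in 3.18, and that is where these nonsensical messages ("Try: instead") come from.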