Configuring the CUDA Environment on Windows Vista 64-bit with VS2008

2009-9-10 Author: Tobeabetterman_He Source: CUDA中文站

Keywords: CUDA, VS2008, environment configuration

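  1.  Software preparation

  1.1   cudadriver_2.3_winvista_64_190.38_general

  1.2   cudatoolkit_2.3_win_64

  1.3   cudasdk_2.3_win_64

  1.4   VS2008

  Before installing, uninstall any previously installed SDK, toolkit, and driver, then install the packages above in the order listed. If the development machine has no CUDA-capable graphics card, cudadriver_2.3_winvista_64_190.38_general does not need to be installed.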

  2.  Installation check

  2.1 Run nvcc -V in cmd to check the installed version

  nvcc: NVIDIA (R) Cuda compiler driver
  Copyright (c) 2005-2009 NVIDIA Corporation
  Built on Mon_Aug__3_19:43:55_PDT_2009
  Cuda compilation tools, release 2.3, V0.2.1221
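
  The version check in 2.1 can also be scripted. The following is a minimal Python sketch, not part of the original setup: it assumes the nvcc -V output has the shape shown above, and the function name parse_nvcc_release is hypothetical.

```python
import re

# Sample text copied from the nvcc -V output shown in step 2.1.
SAMPLE_NVCC_OUTPUT = """\
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2009 NVIDIA Corporation
Built on Mon_Aug__3_19:43:55_PDT_2009
Cuda compilation tools, release 2.3, V0.2.1221
"""


def parse_nvcc_release(text):
    """Return the toolkit release (e.g. '2.3') found in nvcc -V output, or None."""
    match = re.search(r"release\s+(\d+(?:\.\d+)*)", text)
    return match.group(1) if match else None


if __name__ == "__main__":
    print(parse_nvcc_release(SAMPLE_NVCC_OUTPUT))  # prints 2.3
```

  On a machine with the toolkit installed, the text could be obtained with subprocess.run(["nvcc", "-V"], capture_output=True, text=True).stdout instead of the sample string.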

  2.2 Run bandwidthTest to check that the configuration works

  Change to the \ProgramData\NVIDIA Corporation\NVIDIA GPU Computing SDK\C\bin\win64\Release directory and run

  .\bandwidthTest.exe --memory=pinned --mode=range --start=10240000 --end=10240000 --increment=10240000

  If everything is working, the output looks similar to this:

  Running on......
  device 0:Quadro FX 580

  Range Mode
  Host to Device Bandwidth for Pinned memory
  Transfer Size (Bytes)   Bandwidth(MB/s)
  10240000               5101.1

  Range Mode
  Device to Host Bandwidth for Pinned memory
  Transfer Size (Bytes)   Bandwidth(MB/s)
  10240000               4650.8

  Range Mode
  Device to Device Bandwidth
  Transfer Size (Bytes)   Bandwidth(MB/s)
  10240000               14812.5

  &&&& Test PASSED
  Press ENTER to exit...
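
  To compare runs, the three bandwidth figures can be pulled out of this output automatically. This is a rough Python sketch under the assumption that the output keeps the layout shown above; parse_bandwidth and the sample constant are hypothetical names, not part of the SDK.

```python
import re

# Sample text copied from the bandwidthTest output shown in step 2.2.
SAMPLE_BANDWIDTH_OUTPUT = """\
Range Mode
Host to Device Bandwidth for Pinned memory
Transfer Size (Bytes)   Bandwidth(MB/s)
10240000               5101.1
Range Mode
Device to Host Bandwidth for Pinned memory
Transfer Size (Bytes)   Bandwidth(MB/s)
10240000               4650.8
Range Mode
Device to Device Bandwidth
Transfer Size (Bytes)   Bandwidth(MB/s)
10240000               14812.5
&&&& Test PASSED
"""

# A section header names the transfer direction; a data row is "<bytes> <MB/s>".
HEADER = re.compile(r"^(Host to Device|Device to Host|Device to Device) Bandwidth")
ROW = re.compile(r"^(\d+)\s+(\d+(?:\.\d+)?)$")


def parse_bandwidth(text):
    """Return ({direction: MB/s}, passed) parsed from bandwidthTest output."""
    results = {}
    direction = None
    for raw in text.splitlines():
        line = raw.strip()
        header = HEADER.match(line)
        if header:
            direction = header.group(1)
            continue
        row = ROW.match(line)
        if row and direction is not None:
            results[direction] = float(row.group(2))
    return results, "Test PASSED" in text


if __name__ == "__main__":
    figures, passed = parse_bandwidth(SAMPLE_BANDWIDTH_OUTPUT)
    print(figures, passed)
```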

  2.3 Run deviceQuery.exe to see the graphics card's model and properties

  .\deviceQuery.exe

  If everything is working, the output looks similar to this:

  CUDA Device Query (Runtime API) version (CUDART static linking)
  There is 1 device supporting CUDA

  Device 0: "Quadro FX 580"
  CUDA Driver Version:                           2.30
  CUDA Runtime Version:                          2.30
  CUDA Capability Major revision number:         1
  CUDA Capability Minor revision number:         1
  Total amount of global memory:                 536870912 bytes
  Number of multiprocessors:                     4
  Number of cores:                               32
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       16384 bytes
  Total number of registers available per block: 8192
  Warp size:                                     32
  Maximum number of threads per block:           512
  Maximum sizes of each dimension of a block:    512 x 512 x 64
  Maximum sizes of each dimension of a grid:     65535 x 65535 x 1
  Maximum memory pitch:                          262144 bytes
  Texture alignment:                             256 bytes
  Clock rate:                                    1.13 GHz
  Concurrent copy and execution:                 Yes
  Run time limit on kernels:                     No
  Integrated:                                    No
  Support host page-locked memory mapping:       No
  Compute mode:                                  Default (multiple host threads can use this device simultaneously)

  Test PASSED
  Press ENTER to exit...

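  From this information, the card's peak single-precision floating-point throughput can be estimated as 3 × 32 × 1.13 = 108.48 GFLOPS, that is, 3 floating-point operations per core per clock × 32 cores × 1.13 GHz. The factor 3 presumably counts one multiply-add (2 operations) plus one multiply issued per core per clock.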
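
  The same estimate can be written as a tiny Python sketch, which makes it easy to plug in another card's deviceQuery figures. The function name peak_gflops is hypothetical, and the default of 3 operations per core per clock is simply the factor used in the calculation above, treated here as an assumption.

```python
def peak_gflops(cores, clock_ghz, flops_per_core_per_clock=3):
    """Estimate peak single-precision GFLOPS from deviceQuery figures.

    flops_per_core_per_clock defaults to 3, the factor used in the article;
    it is an assumption about the hardware, not something deviceQuery reports.
    """
    return flops_per_core_per_clock * cores * clock_ghz


if __name__ == "__main__":
    # Quadro FX 580 figures from the deviceQuery output: 32 cores at 1.13 GHz.
    print(f"{peak_gflops(32, 1.13):.2f} GFLOPS")  # prints 108.48 GFLOPS
```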

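  3.  Setting system environment variables

  3.1 Add the CUDA SDK binary paths to the system environment variables:

  For example, under C:\ProgramData\NVIDIA Corporation\NVIDIA GPU Computing SDK\C\bin\win64

  the directories

  ├─Debug

  ├─EmuDebug

  └─EmuRelease

  should all be added to the system PATH environment variable, so that the required DLLs can be found when a program runs.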
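
  As an illustration of step 3.1, the Python sketch below builds the extended PATH string from the directories listed above. It only computes the value and does not modify any system setting; the function name extend_path is hypothetical, and the SDK location is the example path from this article, which may differ on other machines.

```python
import ntpath  # Windows path rules, usable on any platform

# Example SDK location and configuration directories taken from step 3.1.
SDK_BIN = r"C:\ProgramData\NVIDIA Corporation\NVIDIA GPU Computing SDK\C\bin\win64"
CONFIGS = ["Debug", "EmuDebug", "EmuRelease"]


def extend_path(current_path, sdk_bin=SDK_BIN, configs=CONFIGS):
    """Return current_path with the SDK configuration directories appended.

    Entries already present (compared case-insensitively, as Windows does)
    are not added a second time.
    """
    entries = [entry for entry in current_path.split(";") if entry]
    seen = {entry.lower() for entry in entries}
    for config in configs:
        candidate = ntpath.join(sdk_bin, config)
        if candidate.lower() not in seen:
            entries.append(candidate)
            seen.add(candidate.lower())
    return ";".join(entries)


if __name__ == "__main__":
    print(extend_path(r"C:\Windows\system32;C:\Windows"))
```

  The resulting string can then be pasted into the PATH entry of the system environment variables dialog.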

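  3.2 Make the header files needed for compilation available to VS2008

  Copy the directory C:\ProgramData\NVIDIA Corporation\NVIDIA GPU Computing SDK\C\common into C:\Users\dawning\Documents\Visual Studio 2008 (dawning is the Windows user name in this example).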
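
  The copy in step 3.2 can be done in Explorer; for repeatable setups it could also be scripted. The following is a minimal Python sketch using the standard library's shutil.copytree. The helper name copy_into is hypothetical, and the paths in the usage note are the example paths from this article.

```python
import os
import shutil


def copy_into(src_dir, dest_parent):
    """Copy src_dir into dest_parent, keeping its directory name.

    Returns the path of the new copy. shutil.copytree raises
    FileExistsError if the target directory already exists.
    """
    name = os.path.basename(os.path.normpath(src_dir))
    target = os.path.join(dest_parent, name)
    shutil.copytree(src_dir, target)
    return target
```

  For step 3.2 the call would look like copy_into(r"C:\ProgramData\NVIDIA Corporation\NVIDIA GPU Computing SDK\C\common", r"C:\Users\dawning\Documents\Visual Studio 2008"), with the user directory adjusted to the local account.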

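  4.  Creating a simple CUDA project in VS2008

  4.1 Copy the template project C:\ProgramData\NVIDIA Corporation\NVIDIA GPU Computing SDK\C\src\template to the VS2008 projects directory C:\Users\dawning\Documents\Visual Studio 2008\Projects

  4.2 Start VS2008 and open the template project template_vc90

  4.3 Right-click template.cu and choose the custom build options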



Editor: 熊东旭 (Xiong Dongxu)