
Posts Tagged ‘cudaMemcpy’


Programming a Matrix Multiplication for GPUs with CUDA

Posted by JeGX - 2008/10/14 at 10:45

Categories: NVIDIA CUDA, Programming

CUDA makes it possible to program the GPU in C. This article walks through the steps needed to code a matrix multiplication routine in CUDA:

  • allocate memory on the GPU with cudaMalloc or cudaMallocPitch (for aligned memory allocation)
  • move data to the GPU with cudaMemcpy2D
  • select the kernel domain (the grid and block dimensions), write the kernel and run it
  • move results back from the GPU to the host with cudaMemcpy2D
  • free resources with cudaFree
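
The steps above can be sketched as a small, self-contained program. This is a minimal illustration rather than the article's own code: the names `matmul_kernel`, `matmul_gpu`, `max_abs_error` and `CUDA_CHECK` are hypothetical, the matrices are assumed to be square, row-major `float` arrays, and the kernel is a naive one-thread-per-element version with no shared-memory tiling. Only the runtime calls named in the list (`cudaMallocPitch`, `cudaMemcpy2D`, `cudaFree`) are taken from the text.

```cuda
#include <cmath>
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical helper: abort with a message if a CUDA runtime call fails.
#define CUDA_CHECK(call)                                                  \
    do {                                                                  \
        cudaError_t err_ = (call);                                        \
        if (err_ != cudaSuccess) {                                        \
            std::fprintf(stderr, "CUDA error: %s (%s:%d)\n",              \
                         cudaGetErrorString(err_), __FILE__, __LINE__);   \
            std::exit(EXIT_FAILURE);                                      \
        }                                                                 \
    } while (0)

// One thread computes one element of C = A * B (all n x n, row-major).
// Pitched allocations pad each row, so rows are addressed via a byte pitch.
__global__ void matmul_kernel(const float* A, size_t pitchA,
                              const float* B, size_t pitchB,
                              float* C, size_t pitchC, int n) {
    int row = blockIdx.y * blockDim.y + threadIdx.y;
    int col = blockIdx.x * blockDim.x + threadIdx.x;
    if (row >= n || col >= n) return;  // threads outside the matrix do nothing

    const float* a = (const float*)((const char*)A + (size_t)row * pitchA);
    float sum = 0.0f;
    for (int k = 0; k < n; ++k) {
        const float* b = (const float*)((const char*)B + (size_t)k * pitchB);
        sum += a[k] * b[col];
    }
    float* c = (float*)((char*)C + (size_t)row * pitchC);
    c[col] = sum;
}

// Host wrapper (assumed name) following the five steps in the list.
void matmul_gpu(const float* hA, const float* hB, float* hC, int n) {
    float *dA = nullptr, *dB = nullptr, *dC = nullptr;
    size_t pA = 0, pB = 0, pC = 0;
    size_t rowBytes = (size_t)n * sizeof(float);

    // 1. Allocate aligned (pitched) memory on the GPU.
    CUDA_CHECK(cudaMallocPitch((void**)&dA, &pA, rowBytes, n));
    CUDA_CHECK(cudaMallocPitch((void**)&dB, &pB, rowBytes, n));
    CUDA_CHECK(cudaMallocPitch((void**)&dC, &pC, rowBytes, n));

    // 2. Move the inputs to the GPU; host rows are tightly packed.
    CUDA_CHECK(cudaMemcpy2D(dA, pA, hA, rowBytes, rowBytes, n, cudaMemcpyHostToDevice));
    CUDA_CHECK(cudaMemcpy2D(dB, pB, hB, rowBytes, rowBytes, n, cudaMemcpyHostToDevice));

    // 3. Select the kernel domain and run the kernel.
    dim3 block(16, 16);
    dim3 grid((n + block.x - 1) / block.x, (n + block.y - 1) / block.y);
    matmul_kernel<<<grid, block>>>(dA, pA, dB, pB, dC, pC, n);
    CUDA_CHECK(cudaGetLastError());

    // 4. Move the result back to the host (this copy waits for the kernel).
    CUDA_CHECK(cudaMemcpy2D(hC, rowBytes, dC, pC, rowBytes, n, cudaMemcpyDeviceToHost));

    // 5. Free GPU resources.
    CUDA_CHECK(cudaFree(dA));
    CUDA_CHECK(cudaFree(dB));
    CUDA_CHECK(cudaFree(dC));
}

// Multiply two n x n test matrices on the GPU and return the largest
// absolute difference from a plain CPU triple loop.
float max_abs_error(int n) {
    std::vector<float> A(n * n), B(n * n), C(n * n);
    for (int i = 0; i < n * n; ++i) {
        A[i] = (float)(i % 7) * 0.5f;
        B[i] = (float)(i % 5) - 2.0f;
    }
    matmul_gpu(A.data(), B.data(), C.data(), n);

    float worst = 0.0f;
    for (int r = 0; r < n; ++r)
        for (int c = 0; c < n; ++c) {
            float ref = 0.0f;
            for (int k = 0; k < n; ++k) ref += A[r * n + k] * B[k * n + c];
            float diff = std::fabs(ref - C[r * n + c]);
            if (diff > worst) worst = diff;
        }
    return worst;
}

int main() {
    float err = max_abs_error(50);  // 50 is not a multiple of 16: exercises the bounds check
    std::printf("max abs error vs CPU reference: %g\n", err);
    return err < 1e-3f ? 0 : 1;
}
```

Assuming the file is saved as `matmul.cu`, it should build with `nvcc matmul.cu -o matmul`. The pitch returned by `cudaMallocPitch` is in bytes, which is why the kernel steps between rows through `char*` pointers rather than indexing with `row * n`.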