Topic: CUDA vs Quad Core Performance Test

Stefan

CUDA vs Quad Core Performance Test
« on: December 01, 2010, 12:29:48 AM »
Quote
...matrix multiplication program that can multiply two 1,536x1,536 dimension matrices using a single Nvidia GPU, a single thread on your primary CPU, and 4 threads using OpenMP
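For readers who want a feel for the CPU side of such a test, here is a minimal sketch of a row-parallel matrix multiplication with a simple timer. It is not the quoted program's source: the helper names `multiply` and `time_multiply_ms` are hypothetical, and the loop order and the use of `num_threads` are my own assumptions about one reasonable way to do it.

```cpp
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the quoted program): C = A * B for square
// n x n matrices stored row-major. When built with OpenMP (e.g. -fopenmp),
// the rows of C are divided among `threads` threads; without OpenMP the
// pragma is ignored and the loop simply runs on one thread.
std::vector<float> multiply(const std::vector<float>& a,
                            const std::vector<float>& b,
                            int n, int threads) {
    const std::size_t dim = static_cast<std::size_t>(n);
    std::vector<float> c(dim * dim, 0.0f);
    #pragma omp parallel for num_threads(threads)
    for (int i = 0; i < n; ++i) {
        for (int k = 0; k < n; ++k) {
            // i-k-j order keeps the inner loop walking B and C contiguously.
            const float aik = a[i * dim + k];
            for (int j = 0; j < n; ++j) {
                c[i * dim + j] += aik * b[k * dim + j];
            }
        }
    }
    return c;
}

// Hypothetical helper: wall-clock time of one n x n multiply, in milliseconds.
double time_multiply_ms(int n, int threads) {
    const std::size_t dim = static_cast<std::size_t>(n);
    std::vector<float> a(dim * dim, 1.0f);
    std::vector<float> b(dim * dim, 2.0f);
    const auto start = std::chrono::steady_clock::now();
    const std::vector<float> c = multiply(a, b, n, threads);
    const auto stop = std::chrono::steady_clock::now();
    volatile float sink = c[0];  // keep the result observable to the optimizer
    (void)sink;
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Calling `time_multiply_ms(1536, 1)` and `time_multiply_ms(1536, 4)` would give a single-thread and a 4-thread timing comparable in spirit to the CPU figures the program reports, though absolute numbers will depend on the compiler, flags, and CPU.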

Code:
******Matrix Multiplication Performance Analysis CUDA program*******
based on Nvidia reference program with OpenMP for CPU multithreading

Select which GPU to run the test on. Enter 1 for the first GPU, etc.
Select the number of threads for the CPU test.
Select the block multiple for the matrix size. (version one is 96)
(64 for 1024x1024, 96 for 1536x1536, 128 for 2048x2048, etc.)

device name: GeForce GTX 465    <----- creating CUDA context on this device
device sharedMemPerBlock: 49152
device totalGlobalMem: 1041694720
device regsPerBlock: 32768
device warpSize: 32
device memPitch: 2147483647
device maxThreadsPerBlock: 1024
device maxThreadsDim[0]: 1024
device maxThreadsDim[1]: 1024
device maxThreadsDim[2]: 64
device maxGridSize[0]: 65535
device maxGridSize[1]: 65535
device maxGridSize[2]: 1
device totalConstMem: 65536
device major: 2
device minor: 0
device clockRate: 810000
device textureAlignment: 512
device deviceOverlap: 1
device multiProcessorCount: 11
Total CUDA cores: 352

Processing time for GPU: 16 (ms)
Processing time for CPU 1 thread: 3703 (ms)
Processing time for CPU 4 threads: 984 (ms)
CPU multithread speedup: 3.763211, efficiency: 94.080284
CPU to GPU time ratio (CUDA Speedup): 61.500000