NVIDIA Kepler GK110 Architecture Whitepaper: 2880 CUDA Cores and Compute Capability 3.5

NVIDIA Kepler GK110 die

At GTC 2012, NVIDIA published a whitepaper that details some compute aspects of the upcoming high-end Kepler GPU, the GK110. This GPU is clearly focused on computing, with 7.1 billion transistors, 15 SMX units, 2880 CUDA cores (192 per SMX) and 240 texture units (16 per SMX).
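As a quick sanity check on those numbers, the chip totals follow directly from the per-SMX figures. The snippet below is only a minimal sketch of that arithmetic; the helper name `chip_total` is made up for illustration and does not come from the whitepaper.

```python
# Per-SMX figures quoted above for the GK110.
SMX_COUNT = 15
CUDA_CORES_PER_SMX = 192
TEXTURE_UNITS_PER_SMX = 16


def chip_total(units_per_smx: int, smx_count: int = SMX_COUNT) -> int:
    """Total units on the chip, assuming smx_count SMX units are enabled.

    Hypothetical helper, written only to illustrate the arithmetic.
    """
    return units_per_smx * smx_count


total_cuda_cores = chip_total(CUDA_CORES_PER_SMX)        # 15 * 192
total_texture_units = chip_total(TEXTURE_UNITS_PER_SMX)  # 15 * 16

print(f"CUDA cores:    {total_cuda_cores}")
print(f"Texture units: {total_texture_units}")
```

The same helper shows how the totals would scale on a hypothetical part with fewer SMX units enabled: `chip_total(CUDA_CORES_PER_SMX, smx_count=14)` gives 2688.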

The GK110 GPU supports the new CUDA Compute Capability 3.5.

Among the new features available with the GK110, GPUDirect sounds really interesting:

When working with a large amount of data, increasing the data throughput and reducing latency is vital to increasing compute performance. Kepler GK110 supports the RDMA feature in NVIDIA GPUDirect, which is designed to improve performance by allowing direct access to GPU memory by third‐party devices such as IB adapters, NICs, and SSDs. When using CUDA 5.0, GPUDirect provides the following important features:

  • Direct memory access (DMA) between NIC and GPU without the need for CPU side data buffering.
  • Significantly improved MPISend/MPIRecv efficiency between GPU and other nodes in a network.
  • Eliminates CPU bandwidth and latency bottlenecks
  • Works with variety of 3rd party network, capture, and storage devices
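To make the first bullet a bit more concrete, here is a deliberately simplified toy model of the two data paths, written in plain Python. It is not GPUDirect code and calls no CUDA or MPI API: the `Hop` class, the path functions and the hop names are all assumptions made for illustration, and a real non-RDMA path may involve more steps than the two shown.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Hop:
    """One data movement step (hypothetical model, not a real API)."""
    src: str
    dst: str
    touches_host_memory: bool


def staged_path() -> List[Hop]:
    # Assumed classic path: the data is first copied into a buffer in
    # CPU memory, then the NIC reads it from that buffer.
    return [
        Hop("GPU memory", "host staging buffer", True),
        Hop("host staging buffer", "NIC", True),
    ]


def rdma_path() -> List[Hop]:
    # Assumed GPUDirect RDMA path: the NIC accesses GPU memory directly,
    # so no CPU-side buffer is involved.
    return [Hop("GPU memory", "NIC", False)]


def summarize(path: List[Hop]) -> Dict[str, int]:
    """Count the copies and how many of them go through host memory."""
    return {
        "copies": len(path),
        "host_memory_hops": sum(1 for hop in path if hop.touches_host_memory),
    }


for name, path in (("staged", staged_path()), ("GPUDirect RDMA", rdma_path())):
    print(f"{name}: {summarize(path)}")
```

In this model the RDMA path has one copy and no hop through host memory, which is the intuition behind the "without the need for CPU side data buffering" claim. Real latency and bandwidth gains depend on the hardware and drivers and are not modeled here.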

NVIDIA Kepler GK110 GPUDirect technology

You can download the GK110 whitepaper HERE.