Topic: NVIDIA CUDA Toolkit 4.0 RC available to Registered Developers

Stefan

Quote
CUDA Toolkit 4.0 RC (March 2011)

Release Highlights

Easier Application Porting

    * Share GPUs across multiple threads
    * Use all GPUs in the system concurrently from a single host thread
    * No-copy pinning of system memory, a faster alternative to cudaMallocHost()
    * C++ new/delete and support for virtual functions
    * Support for inline PTX assembly
    * Thrust library of templated performance primitives such as sort, reduce, etc.
    * NVIDIA Performance Primitives (NPP) library for image/video processing
    * Layered Textures for working with same size/format textures at larger sizes and higher performance
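The "no-copy pinning" item above does not name an API; the runtime call that appears to correspond to it is cudaHostRegister(), which page-locks a buffer the application already owns. A minimal sketch, assuming a POSIX host (posix_memalign), a 4 KiB page size, and a current toolkit where cudaHostRegisterDefault is defined; the buffer size is arbitrary:

```cuda
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <cuda_runtime.h>

int main() {
    const size_t bytes = 1 << 20;  // 1 MiB, a multiple of the assumed 4 KiB page size

    // Ordinary host allocation, page-aligned (POSIX call; an assumption about the platform).
    void* host = NULL;
    if (posix_memalign(&host, 4096, bytes) != 0) return 1;
    memset(host, 0, bytes);

    // Pin the existing buffer in place rather than copying it into
    // a separate cudaMallocHost() allocation.
    cudaError_t err = cudaHostRegister(host, bytes, cudaHostRegisterDefault);
    if (err != cudaSuccess) {
        printf("cudaHostRegister failed: %s\n", cudaGetErrorString(err));
        free(host);
        return 1;
    }

    // The registered buffer can now be used for asynchronous transfers.
    void* dev = NULL;
    cudaMalloc(&dev, bytes);
    cudaMemcpyAsync(dev, host, bytes, cudaMemcpyHostToDevice, 0);
    cudaDeviceSynchronize();

    // Unpin before handing the memory back to the host allocator.
    cudaFree(dev);
    cudaHostUnregister(host);
    free(host);
    return 0;
}
```

Page alignment matters here because early releases were strict about the address and size passed to cudaHostRegister(); newer toolkits are reportedly more lenient, but aligned buffers work in both cases.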

Faster Multi-GPU Programming

    * Unified Virtual Addressing
    * GPUDirect v2.0 support for Peer-to-Peer Communication
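As a rough illustration of the two items above, the following sketch drives two GPUs from a single host thread and copies a buffer from device 0 to device 1. It assumes a machine with at least two GPUs; the device indices 0 and 1 and the buffer size are arbitrary choices, and error checking is mostly omitted for brevity:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    int count = 0;
    cudaGetDeviceCount(&count);
    if (count < 2) {
        printf("This sketch needs at least two GPUs.\n");
        return 0;
    }

    // Ask whether device 0 can address device 1's memory directly.
    int canAccess = 0;
    cudaDeviceCanAccessPeer(&canAccess, 0, 1);

    // One host thread switches between devices with cudaSetDevice().
    const size_t bytes = 1 << 20;
    void* buf0 = NULL;
    void* buf1 = NULL;
    cudaSetDevice(0);
    cudaMalloc(&buf0, bytes);
    cudaMemset(buf0, 0, bytes);
    cudaSetDevice(1);
    cudaMalloc(&buf1, bytes);

    if (canAccess) {
        cudaSetDevice(0);
        cudaDeviceEnablePeerAccess(1, 0);  // flags argument must be 0
    }

    // Check whether both devices share the unified virtual address space.
    cudaDeviceProp p0, p1;
    cudaGetDeviceProperties(&p0, 0);
    cudaGetDeviceProperties(&p1, 1);

    if (p0.unifiedAddressing && p1.unifiedAddressing) {
        // With UVA the runtime infers source and destination from the pointers.
        cudaMemcpy(buf1, buf0, bytes, cudaMemcpyDefault);
    } else {
        // Without UVA the devices are named explicitly.
        cudaMemcpyPeer(buf1, 1, buf0, 0, bytes);
    }
    cudaDeviceSynchronize();
    printf("peer access: %s\n", canAccess ? "direct" : "not available");

    cudaSetDevice(0);
    cudaFree(buf0);
    cudaSetDevice(1);
    cudaFree(buf1);
    return 0;
}
```

When peer access is not available the copy should still complete, just without the direct GPU-to-GPU path.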

New & Improved Developer Tools

    * Automated Performance Analysis in Visual Profiler
    * C++ debugging in cuda-gdb
    * GPU binary disassembler for Fermi architecture (cuobjdump)

Public download: CUDA Toolkit 4.0 Overview