During the recent SC09 conference in Portland, Oregon, Intel finally reached its original performance goal for Larrabee. Back in 2006, when we got the first details about Larrabee, the performance goal was "1 TFLOPS @ 16 cores, 2.0 GHz clock, 150W TDP". During Justin Rattner's keynote, Intel demonstrated the performance of LRB as it stands today.
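
The 1 TFLOPS target is consistent with simple peak-throughput arithmetic. The sketch below assumes each core has a 16-lane single-precision vector unit and counts a multiply-add as two floating-point operations. Neither figure is stated in the item above, so treat both as assumptions; `peak_gflops` is a hypothetical helper name.

```python
def peak_gflops(cores, clock_ghz, simd_lanes=16, flops_per_lane_per_cycle=2):
    """Theoretical peak single-precision throughput in GFLOPS.

    simd_lanes and flops_per_lane_per_cycle are assumptions (16-wide
    vector unit, multiply-add counted as 2 operations), not figures
    taken from the news item.
    """
    return cores * simd_lanes * flops_per_lane_per_cycle * clock_ghz


if __name__ == "__main__":
    # 16 cores at 2.0 GHz, the configuration quoted in the 2006 goal
    print(peak_gflops(cores=16, clock_ghz=2.0))
```

Under those assumptions the quoted configuration lands just above 1000 GFLOPS, which is why the goal reads as a theoretical peak rather than a sustained figure.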

Full story at BSN

Have you been looking for a performance monitoring tool for your brand new and expensive Nvidia graphics card, and found nothing that works on a Windows 7 x64 system?

Get it from Acmelabs

HPMC is a small OpenGL/C/C++ library that extracts iso-surfaces of volumetric data directly on the GPU.

The library analyzes a lattice of scalar values describing a scalar field that is either stored in a Texture3D or can be accessed through an application-provided snippet of shader code. The output is a sequence of vertex positions and normals that form a triangulation of the iso-surface. HPMC provides traversal code to be included in an application vertex shader, which allows direct extraction in the vertex shader. Using the OpenGL transform feedback mechanism, the triangulation can be stored directly into a buffer object.   
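
To make the first step concrete, here is a minimal CPU-side sketch in Python of how a marching-cubes style extractor decides which lattice cells the iso-surface passes through. It is not HPMC's API or its GPU implementation; `cell_case` and `active_cells` are hypothetical helper names, and the corner bit order is an arbitrary choice for illustration.

```python
from itertools import product

# Corner offsets of a unit cell; the bit order is an arbitrary convention here.
CORNERS = [(dx, dy, dz) for dz in (0, 1) for dy in (0, 1) for dx in (0, 1)]


def cell_case(field, x, y, z, iso):
    """Return an 8-bit mask with bit i set when corner i lies below iso."""
    case = 0
    for bit, (dx, dy, dz) in enumerate(CORNERS):
        if field[z + dz][y + dy][x + dx] < iso:
            case |= 1 << bit
    return case


def active_cells(field, iso):
    """List the cells whose corners straddle the iso-value.

    Case 0 (all corners at or above iso) and case 255 (all corners below)
    produce no triangles, so those cells are skipped.
    """
    nz, ny, nx = len(field), len(field[0]), len(field[0][0])
    cells = []
    for z, y, x in product(range(nz - 1), range(ny - 1), range(nx - 1)):
        if cell_case(field, x, y, z, iso) not in (0, 255):
            cells.append((x, y, z))
    return cells


if __name__ == "__main__":
    # 3x3x3 lattice whose value equals the x index, so the surface x = 0.5
    # can only cut the cells in the first column.
    field = [[[float(x) for x in range(3)] for _ in range(3)] for _ in range(3)]
    print(active_cells(field, iso=0.5))
```

A GPU extractor would run the same per-cell test in parallel and then compact the active cells before emitting triangles; the loop above only illustrates the classification.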

3D-Tech News Around The Web / NVIDIA SceniX available
« on: December 01, 2009, 06:01:38 PM »
Changes/Improvements in SceniX v5.
* Introduced a size culling algorithm to cull objects that cover less
  than an optional user-specified size on screen. 
* Fixed a crash on converting triangles to triangle strips after
  optimizers have been applied.
* Primitives, when added to a Shape, did not add the Shape object as
  owner. This has been fixed. 
* Window dimensions were not passed to the ViewState prior to a window
  resize event. This has been fixed now.
* The supplied Cg runtime has been upgraded to version 2.2.0010
* Changing the GPU format of a TextureImage remained unconsidered if
  the change happened after image upload. This has been fixed.
* Skinned animations were not working anymore after applying
  optimizers. This has been fixed.   
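
As an illustration of what size culling involves, the following Python sketch estimates how many pixels an object's bounding sphere covers and compares that with a threshold. This is a generic approach under a perspective projection, not SceniX's algorithm or API; every function and parameter name here is hypothetical.

```python
import math


def projected_diameter_px(radius, distance, fov_y_deg, viewport_height_px):
    """Approximate on-screen diameter of a bounding sphere, in pixels.

    Small-object approximation for a perspective projection: an object of
    height h at distance d covers h / (2 * d * tan(fov / 2)) of the
    viewport height.
    """
    if distance <= 0.0:
        # Sphere centre at or behind the camera: report it as huge so it
        # is never culled by size.
        return float("inf")
    half_fov = math.radians(fov_y_deg) / 2.0
    fraction = (2.0 * radius) / (2.0 * distance * math.tan(half_fov))
    return fraction * viewport_height_px


def should_cull(radius, distance, fov_y_deg, viewport_height_px, min_size_px):
    """True when the object covers less than the user-specified size."""
    size = projected_diameter_px(radius, distance, fov_y_deg, viewport_height_px)
    return size < min_size_px


if __name__ == "__main__":
    # A sphere of radius 1 seen from 100 units away, 90 degree vertical
    # field of view, 1000 pixel tall viewport, 16 pixel threshold.
    print(should_cull(1.0, 100.0, 90.0, 1000, min_size_px=16))
```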

Changes/Improvements in SceniX v5. beta
* Added the wxPP sample, an example showing how to implement cumulative
  post-processing effects using FBOs and Cg shaders. Specifically, this
  example shows how to implement HDR using post-processing techniques in SceniX.
* Fixed an issue with shadow handling
* Improved the wxiRT sample, e.g. it now supports multiple GPUs.
* Added the PhysXViewer sample, a simple example showing one way to integrate
  PhysX with SceniX.
* Cg shaders, if auto-generated by the supplied COLLADA loader, will
  compile to the latest available profile now. Previously, these
  shaders were compiled using the gp4*p profiles, which caused even
  simple Phong shaders to not work on older hardware.
* Added CudaRT.dll so the CUDA toolkit does not have to be installed
  to run the wxiRT sample.

3D-Tech News Around The Web / Colin McRae: DiRT 2 DX11 demo available
« on: December 01, 2009, 05:08:56 PM »
Colin McRae: DiRT 2 features a roster of contemporary off-road events, taking players to the most diverse and challenging real-world environments. This World Tour has players competing in aggressive multi-car and intense solo races at extraordinary new locations, from canyon racing and jungle trails to city stadium-based events.

Powered by the third generation of the EGO™ Engine’s award-winning racing game technology, Colin McRae: DiRT 2 benefits from a tuned-up car-handling physics system and new damage engine effects. It also showcases a spectacular new level of visual fidelity, with cars and tracks twice as detailed as those seen in Race Driver: GRID.

Colin McRae: DiRT 2’s garage houses a best-in-class collection of officially licensed rally cars and off-road vehicles, specifically selected to deliver aggressive and fast-paced racing. Covering seven vehicle classes, players will be given the keys to powerful vehicles right from the off. In Colin McRae: DiRT 2 the opening drive is the Group N Subaru, essentially making the ultimate car from the original game the starting point in the sequel, and the rides just get even more impressive.

In addition to the World Tour, Colin McRae: DiRT 2 comes complete with full online functionality that will be core to the overall experience, with head-to-head competitive online play and new social features to engage the racing community. Prepare for mud, gravel, dust and dirt too in Colin McRae: DiRT 2!

Direct download

To run the internal benchmark use:
dirt2.exe -benchmark example_benchmark.xml

3D-Tech News Around The Web / TechPowerUp GPU-Z v0.3.8 available
« on: December 01, 2009, 05:02:13 PM »
GPU-Z 0.3.8

    * Added framework for translations in GPU-Z, to do that we need your help. Please go to to submit contributions in your language
    * Added sensors to monitor GPU load percentages on NVIDIA
    * Fixed startup on Windows 2000 (DLL not found)
    * Improved detection/added sensors for ATI M86
    * Fixed several NVAPI crashes
    * Fixed crash when PhysX not available
    * GPU-Z can now be set in its system menu to be always on top of other windows

3D-Tech News Around The Web / Notepad++ v5.6 available
« on: November 30, 2009, 11:10:03 AM »
Notepad++ v5.6 new features and fixed bugs (from v5.5.1) :

1.  Add language encodings - Chinese Traditional (BIG5), Chinese Simplified (GB2312), Japanese (Shift JIS), Korean (EUC), Thai (TIS-620), Hebrew (iso-8859-8), Hebrew (1255), Central European (1250), Cyrillic (1251), Cyrillic (KOI8-U), Cyrillic (KOI8-R), Cyrillic (Mac), Western European (1252), Greek (1253), Turkish (1254), Arabic (1256), Baltic (1257), Vietnamese (1258), ISO_8859-1 to ISO_8859-16 and many more.
2.  Add auto-detection of HTML and XML files encodings.
3.  Add COBOL, D, Gui4Cli, PowerShell and R language support.
4.  Add Marker Jumper feature (Jump down/up : Ctrl+Num/Ctrl+Shift+Num).
5.  Add indent guide line highlighting for html/xml tags.
6.  Add system tray context menu and new command argument "-systemtray".
7.  Fix Unicode to ANSI encoding bug.
8.  Fix last recent file list menu items localization encoding bug.
9.  Fix last recent file number goes to zero issue.
10. Add new command argument "--help".
11. Fix Calltip hint bug and add a new capacity in it.
12. Add the ability to add the second keyword group for user in both LISP and Scheme languages.
13. Fix the wrap symbol display problem.
14. Add SQL ESC symbol '\'.
15. Fix column editor insert number bug in virtual space mode.
16. Fix status bar displaying "-2 char" issue for an empty document.
17. Fix installation of NppShell64 failed issue in installer.

ATI Catalyst™ Driver Hotfix to improve Dirt 2 performance

Problem Description:

ATI CrossFireX™ is not being utilized when playing Dirt 2.


AMD has developed a Hotfix to resolve this issue. This Hotfix applies only to PCI-Express graphics cards in the Radeon™ HD-series.

3D-Tech News Around The Web / Re: SiSoftware Sandra to support OpenCL
« on: November 27, 2009, 04:28:08 PM »
Lite version already available at BenchmarkHQ

3D-Tech News Around The Web / Jacket GBENCH (v1.0) available
« on: November 26, 2009, 06:18:26 PM »
Jacket GBENCH allows users to gauge the GPU performance of their computer relative to equivalent benchmarks obtained from a variety of other computers, including the CPU of the same computer. Benchmarks include six different tasks, common to the technical computing community:

    1. LU:        LU decomposition of a 1024x1024 matrix
    2. FFT:       Fast Fourier Transform of a 2^20 x 1 vector
    3. BLAS:      Matrix multiplication of two 1024x1024 matrices
    4. 3D Conv:   Convolution of a 64x64x64 array with a 3x3x3 kernel
    5. FOR/GFOR:  Matrix-vector multiplication of a 1024x1024x32 array
    6. Equations: Solution of a system of 1024 equations

GBENCH is a practical application benchmark measured in real seconds and is not meant to be a scientific or theoretical benchmark measured in GFLOPs. Also note that for fairness, arithmetic precisions (e.g. double, single) have been matched on the CPU and GPU. Finally, the data sizes used in these computations are large enough to exploit data parallelism (e.g. no scalar arithmetic was attempted). This benchmark assumes a data parallel problem.
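
Because GBENCH reports wall-clock seconds, comparing its results with GFLOPS figures quoted elsewhere takes a conversion. Here is a rough sketch for the matrix-multiplication task, assuming the conventional count of 2n^3 floating-point operations for a dense n x n product. The helper name and the sample timing are made up for illustration and do not come from GBENCH.

```python
def matmul_gflops(n, seconds):
    """Convert the wall time of an n x n dense matrix product to GFLOPS.

    Assumes the textbook operation count of 2 * n**3 (one multiply and
    one add per inner-loop step); this is not a figure from GBENCH.
    """
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return 2 * n**3 / seconds / 1e9


if __name__ == "__main__":
    # Hypothetical timing: a 1024x1024 product finishing in 0.05 s
    print(round(matmul_gflops(1024, 0.05), 2))
```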

Note: GBENCH v1.0 was built using Jacket v1.2.1 and MATLAB R2009b. GBENCH requires CUDA 2.3 (driver and toolkit).

GBENCH will not work with older CUDA versions!

3D-Tech News Around The Web / SiSoftware Sandra to support OpenCL
« on: November 26, 2009, 05:51:47 PM »
London, UK, 30th November 2009
SiSoftware releases its suite of OpenCL GPGPU (General Purpose Graphics Processor Unit) benchmarks as part of SiSoftware Sandra 2010, the latest version of our award-winning utility, which includes remote analysis, benchmarking and diagnostic features for PCs, servers, and networks.

3D-Tech News Around The Web / Perlin Noise Terrain Raycasting on the GPU
« on: November 26, 2009, 05:37:49 PM »
Here is a first attempt at raycasting Perlin noise on the fly to achieve volumetric terrain rendering. In the demo, a 128^3 random data volume is used as the base for the scenes shown in the screenshots.

3D-Tech News Around The Web / round-up
« on: November 26, 2009, 05:36:26 PM »
was busy this week publishing many articles:

CfP: GPU-CFD Minisymposium at ECCOMAS-CFD 2010
A fast two-dimensional floodplain inundation model
Cellular Level Agent Based Modelling on the Graphics Processing Unit
CheCUDA: A Checkpoint/restart Tool for CUDA Applications
GPULib v1.2.2 released
PyCUDA: GPU Run-Time Code Generation for High-Performance Computing
NVIDIA Tesla GPUs to Communicate Faster Over Mellanox InfiniBand Networks
PGI CUDA Fortran Now Available from The Portland Group
Uncluttering Graph Layouts Using Anisotropic Diffusion and Mass Transport
OpenMM 1.0 beta Release
Monte Carlo Simulation of Photon Migration in 3D Turbid Media Accelerated by Graphics Processing Units

NVIDIA is pleased to announce the release of version 1.0 of the NVIDIA Performance Primitives (NPP) library.

Performance primitives are foundational building blocks for performing GPU accelerated processing. The initial set of functionality in the library focuses on functions for imaging and video processing and is widely applicable for developers in these areas. The NPP library is written to maximize flexibility, while maintaining high performance.

NPP can be easily integrated into existing applications and allows developers to take advantage of GPU acceleration without having to write code for the GPU.

The functions contained in this version span fundamental operations such as add, multiply and divide to advanced operations such as perspective warps, discrete cosine transformation, histogram and Canny filtering.

The library currently supports Linux 32/64, Windows 32/64 and Mac OS and is available at

Please note that the Linux version of this release has been tested only on CentOS 4.7 but is expected to run on other distros as well.

A short while ago I wrote about my work on DirectWrite usage in Firefox. Next to DirectWrite, Microsoft also published another new API with Windows 7 (and the Vista Platform Update), called Direct2D. Direct2D is designed as a replacement for GDI and functions as a vector graphics rendering engine, using GPU acceleration to give large performance boosts to transformations and blending operations.

Full story at

With its new Unreal Development Kit, Epic Games is bringing game development to the masses by offering some powerful tools for free. Ars talked to Epic VP Mark Rein to discuss the new initiative and how aspiring developers can get the most out of it.

Full story at ArsTechnica

3D-Tech News Around The Web / GPU Computing Collaboration Network launched
« on: November 25, 2009, 05:58:12 PM »
Last week (during SC09) the Coordinated Science Laboratory of the University of Illinois at Urbana-Champaign announced the GPU Computing Collaboration Network to foster collaboration among users of GPUs.

Full story at InsideHPC

Graphics core licensor Imagination Technologies Group plc (Kings Langley, England) is preparing compilers that will be able to assign tasks across both graphics and general-purpose processing units.

Full story at EE Times

3D-Tech News Around The Web / Monte Carlo eXtreme
« on: November 24, 2009, 06:17:14 PM »
Fast photon migration simulations powered by GPU-based parallel computing

    * Monte Carlo eXtreme, or MCX, is Monte Carlo simulation software for photon migration in 3D turbid media. It uses massively parallel computing techniques based on Graphics Processing Units (GPUs) and is extremely fast compared to traditional CPU-based simulations. On an nVidia 8800GT graphics card (14 MPs/112 cores), the acceleration is about 300x to 400x with over 1700 parallel threads; this ratio can be as high as 700x on a high-end GTX 295 GPU (multiply by another 2x if both GPUs on the GTX 295 are used).
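
At the core of a photon-migration Monte Carlo code is a random walk in which each photon travels a randomly sampled distance between interaction events. The sketch below shows only that sampling step on the CPU, using the standard exponential free-path distribution. It is not MCX's code; `mu_t` (total interaction coefficient, in 1/mm) and the function names are illustrative assumptions.

```python
import math
import random


def sample_step_length(mu_t, rng):
    """Draw a free-path length (mm) from an exponential distribution.

    Inverse-transform sampling: s = -ln(xi) / mu_t with xi uniform in (0, 1].
    """
    xi = 1.0 - rng.random()  # random() is in [0, 1), so xi is in (0, 1]
    return -math.log(xi) / mu_t


def mean_free_path(mu_t, n_photons, seed=0):
    """Estimate the mean free path by averaging n_photons samples."""
    rng = random.Random(seed)
    total = sum(sample_step_length(mu_t, rng) for _ in range(n_photons))
    return total / n_photons


if __name__ == "__main__":
    # With mu_t = 10 per mm the estimate should be close to 1 / mu_t = 0.1 mm
    print(mean_free_path(mu_t=10.0, n_photons=100_000))
```

Each photon's walk is independent of the others, which is what makes this kind of simulation map well onto thousands of GPU threads.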

3D-Tech News Around The Web / Beta Test GPUs with MathWorks Products
« on: November 23, 2009, 03:09:26 PM »
The MathWorks is working to provide features that will enable users to accelerate computations by taking advantage of GPUs (NVIDIA GPUs in the first release). We are looking for users who can help test these new capabilities and provide feedback.

Apply at Mathworks

