For around $10,500, you can have the new Quadro Plex 2200 D2, which features two Quadro FX 5800 GPUs and 8 GB of graphics memory (4 GB per GPU). The Quadro Plex line of visual co-processors for workstations is designed for processing large datasets and models.
More info on Quadro Plex: Quadro Plex 2200 D2 @ NVIDIA
Here is a thesis that discusses the usage of NVIDIA’s CUDA in two applications:
- Einstein@Home: a distributed computing application
- OpenSteer: a game-like application.
CUDA exposes the GPU processing power in the C programming language and can be integrated in existing applications with ease. But to exploit the power a GPU can deliver, one has to design the application’s data structures so that they are optimized for CUDA.
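To give an idea of what such a redesign can look like, here is a minimal sketch of one common restructuring: turning an array of records into one contiguous array per field, so that neighbouring GPU threads would read neighbouring memory addresses. It is not taken from the thesis. The agent fields and the helper names `to_soa` and `advance_x` are hypothetical, and Python is used only to show the layout, not to run anything on a GPU.

```python
from array import array

# Hypothetical agent records (array-of-structures layout).
agents_aos = [
    {"x": 0.0, "y": 1.0, "speed": 2.0},
    {"x": 3.0, "y": 4.0, "speed": 5.0},
    {"x": 6.0, "y": 7.0, "speed": 8.0},
]

def to_soa(records, fields):
    """Convert a list of records into one contiguous float array per field."""
    return {f: array("f", (r[f] for r in records)) for f in fields}

# Structure-of-arrays layout: all x values sit next to each other in memory.
agents_soa = to_soa(agents_aos, ("x", "y", "speed"))

def advance_x(soa, dt):
    """Element-wise update: each index i is independent, like one GPU thread."""
    xs, speeds = soa["x"], soa["speed"]
    for i in range(len(xs)):
        xs[i] += speeds[i] * dt

advance_x(agents_soa, 0.5)
```

In a real CUDA port the loop body would become the kernel and each array would be a device buffer; the point of the layout is only that element `i` of every field can be fetched by thread `i`.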
Download the thesis here: GPU usage and data structure design (1366)
The hardware.fr staff was invited by NVIDIA for an update on CUDA. Since this excellent article is written in French, I’ll try to highlight the interesting parts.
One of the new features in CUDA 2.0 is, according to hardware.fr, the addition to the CUDA compiler of an optimized profile for multi-core x86 CPUs. Currently, CUDA code is split in two parts: one processed by the CPU and the other by the GPU via the CUDA compiler.
What is new is that the GPU code can now be compiled explicitly for the CPU, to take advantage of the multi-core capabilities of the latest CPUs.
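The idea of running GPU-style code on CPU cores can be sketched as follows. This is only an illustration of the execution model, not how the CUDA compiler actually does it: the kernel body computes one element per index, and a hypothetical launcher (`launch_on_cpu`, a name invented here) splits the index range into one chunk per worker thread.

```python
from concurrent.futures import ThreadPoolExecutor

def saxpy_kernel(i, a, x, y, out):
    """Kernel body: computes one output element, as one GPU thread would."""
    out[i] = a * x[i] + y[i]

def launch_on_cpu(kernel, n, workers, *args):
    """Hypothetical launcher: split indices 0..n-1 into one chunk per worker."""
    def run_chunk(start, stop):
        for i in range(start, stop):
            kernel(i, *args)

    chunk = (n + workers - 1) // workers  # ceiling division
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(run_chunk, s, min(s + chunk, n))
                   for s in range(0, n, chunk)]
        for f in futures:
            f.result()  # re-raise any exception from a worker

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [10.0, 20.0, 30.0, 40.0, 50.0]
out = [0.0] * len(x)
launch_on_cpu(saxpy_kernel, len(x), 2, 2.0, x, y, out)
```

Because each index writes only its own output element, the chunks need no locking. Python threads will not give a real speedup here; the sketch only shows how one kernel body maps onto several workers.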
Another novelty is the Tesla 10 series. NVIDIA has equipped all Tesla 10 products with 4 GB of graphics memory per GPU (recall that the GeForce GTX 280 has 1 GB of memory). This extra memory is useful when the datasets to be processed are very large.
A Tesla 10 card needs only a 6-pin PCI-Express power connector (the 8-pin is optional, whereas a GeForce GTX 280 has one 6-pin and one 8-pin and both are required!). The reason is that in GPU computing the GPU has a lower power consumption, because some transistors dedicated to 3D graphics are not used.
The article also shows some practical cases where CUDA is used: financial analysis, medical imaging (3D scans) and password recovery.
Read the complete article HERE – in French only
This work describes the implementation of a real-time visual tracker that targets the position and 3D pose of objects (specifically faces) in video sequences. The use of GPUs for the computation and efficient sparse-template-based particle filtering allows real-time processing even when tracking multiple faces simultaneously in high-resolution video frames. Using a GPU and the NVIDIA CUDA technology, performance improvements as large as ten times compared to a similar CPU-only tracker are achieved.
- Real-time Visual Tracker by Stream Processing, by Oscar Mateo Lozano and Kazuhiro Otsuka. Journal of Signal Processing Systems.
Thanks to tho for the news.
This program was born as a parody of other *-Z utilities such as CPU-Z and GPU-Z. It displays some basic information about CUDA-enabled GPUs and GPGPUs.
CUDA-Z’s homepage: cuda-z.sourceforge.net
CUDA was announced along with G80 in November 2006, released as a public beta in February 2007, and then finally hit the Version 1.0 milestone in June 2007 along with the launch of the G80-based Tesla solutions for the HPC market. Today, Beyond3D looks at the next stage in the CUDA/Tesla journey: GT200-based solutions, CUDA 2.0, and the overall state of NVIDIA’s HPC business.
Read the article HERE.
Do you know what CUDA and OpenCL stand for and how they could make your computer 50 times faster? If so, you can safely jump to the “Ending the mess” section below. Otherwise read on for a gentle introduction.
A computer has two important processing units: the CPU and GPU. Think of them as the two brothers in Rain Man. The GPU is the ultimate autistic savant. He’s really, really good at counting stuff and doing a lot of complex math at the same time.
The CPU is your regular guy. He can do all kinds of stuff that the savant can’t. He goes along well with everybody, as long as they speak English. If he learns to take advantage of the savant, the two of them can do amazing things like count cards at Poker.
In other words, the GPU is natural at some operations that involve repetitive calculations, like those necessary for drawing 3D graphics and doing basic image manipulation.
Read the rest of this article HERE.
Benchmark Reviews has published an article about GPU computing with CUDA and GeForce-based graphics cards.
Read the complete GPU Compute FAQ HERE.
Terms such as “heterogeneous computing” and “parallel computing” are going to be used as often as the term “video card” is used in a product review. You won’t want to miss this evolution in graphics technology, because we are witness to a pivotal moment in time when computers are going to stop being filled with familiar single-purpose hardware. Benchmark Reviews offers this FAQ to help our readers understand what is happening, and help introduce them to what is coming. We don’t want anyone to be left in the cold when the rest of the world learns how the GPU is learning to be a CPU.
PhysX has been implemented on the GPU using CUDA. Thanks to CUDA, NVIDIA’s driver team was able to convert Ageia’s PhysX functions quickly. All GeForce 8, GeForce 9 and GeForce GTX 200 cards will be PhysX compliant. However, one thing won’t be GPU-accelerated for now: rigid bodies. According to Manju Hegde, former CEO of Ageia, the GeForce 8, 9 and GTX 200 are more powerful and faster than the current PhysX PPU.
Read the complete article HERE (in French).
The implementation was done with CUDA (an extension of C for using the GPU as a compute unit), which allowed a quick conversion of the API; it thereby becomes compatible with all GeForce 8 and 9 cards as well as with the upcoming GeForce GTX 200. From the outset, the GeForce GTX 260 and 280, and their predecessors, should therefore be compatible with the subset of the overall PhysX API, as the PPU is, with one difference however: rigid bodies have not been ported yet, so this feature cannot, for the moment at least, be accelerated by the GPU. According to Manju Hegde, former CEO of Ageia, the high-end GeForce 8 and 9 (and of course the upcoming GeForce GTX 200) are markedly faster than the PhysX processor.
Read the complete article HERE.
An ultrasound scanner developed by TechniScan uses four Tesla C870 boards and is programmed with CUDA.
The company continued to run tests with Pentium 4 and Core 2 generations of processors, but even with the fastest Core 2 Duo and Quad processors, the render time could not be cut under 45 minutes.
A possible solution popped up when TechniScan senior software engineer Jim Hardwick bought a GeForce 8-series card and discovered Nvidia’s CUDA SDK. Jim is an avid gamer, so he bought the card to enjoy the latest and upcoming games, but a quick run of his code on a GPU apparently lit up more than just one bulb. Fast forward to 2008 – today the code is ported to CUDA and utilizes four Tesla C870 boards. The render time was cut from 45 minutes on a 16-core Core 2 cluster to only 16 minutes.
The GPU scored major speedups in calculating 2D FFTs, in which a single 8800 GTX was eight times faster than a Core 2 Quad at 2.66 GHz, while complex exponentiation with 12 million elements ended up being accelerated by a factor of 320x. Complex exponentiation is usually run 50-60 times, so you can see how a 4-GPU setup was able to cut the total rendering time.
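A short calculation helps relate the 320x figure for one step to the overall gain of 45 / 16, or about 2.8x. The sketch below uses Amdahl’s law, which assumes that only a fraction of the original runtime is accelerated. The fractions in the loop are hypothetical and chosen only for illustration; the article does not give a breakdown of the runtime, and the two timings were measured on different hardware setups.

```python
def overall_speedup(fraction, step_speedup):
    """Amdahl's law: speedup of the whole job when only `fraction` of its
    original runtime is accelerated by `step_speedup`."""
    return 1.0 / ((1.0 - fraction) + fraction / step_speedup)

# Figures from the article: 45 minutes before, 16 minutes after.
measured = 45 / 16

# Hypothetical fractions of runtime spent in the accelerated step.
for fraction in (0.5, 0.65, 0.9):
    print(fraction, round(overall_speedup(fraction, 320), 2))
```

Even a very large speedup on one step gives a modest overall gain as long as a sizeable part of the job still runs at the old speed.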
Read the complete article HERE
Here is an introductory article that explains what CUDA, NVIDIA’s technology for GPU programming, is. Apologies to English readers: the article is in French.
Read the complete article HERE.