Show Posts

This section allows you to view all posts made by this member. Note that you can only see posts made in areas you currently have access to.

Messages - JeGX

Pages: 1 ... 18 19 [20] 21 22 ... 28

Ocelot is a dynamic compilation framework for heterogeneous systems, providing various backend targets for CUDA programs. Ocelot currently allows CUDA programs to be executed on NVIDIA GPUs and x86 CPUs at full speed without recompilation.

3D-Tech News Around The Web / NVIDIA delays Fermi to March 2010
« on: December 28, 2009, 05:34:10 PM »

Nvidia originally planned to launch Fermi in November 2009, but the launch was delayed until CES in January 2010 due to defects, according to market rumors. However, the company recently notified graphics card makers that the official launch will now take place in March 2010, the sources noted.

3D-Tech News Around The Web / Nvidia Fermi graphics architecture explained
« on: December 21, 2009, 04:48:16 PM »

There is an exception to this – high-power graphics cards, we love these. They make games sexy and that makes us sexy. At the heart of these is the GPU, and when Nvidia announces it has a new and wonderful one, it is time to take notice. It's codenamed Fermi, after the renowned nuclear physicist Enrico Fermi.


The silicon has been designed from the ground up to match the latest concepts in parallel computing. The basic features list reads thus: 512 CUDA cores, Parallel DataCache, Nvidia GigaThread and ECC support.

Clear? There are three billion transistors for starters, compared to 1.4 billion in a GT200 and a mere 681 million on a G80. There's shared, configurable L1 and L2 cache and support for up to 6GB of GDDR5 memory.

The block diagram of Fermi looks like the floor plan of a dystopian holiday camp. Sixteen rectangles, each with 32 smaller ones inside, all nice and regimented in neat rows. That's your 16 SM (Streaming Multiprocessor) blocks with 512 little execution units inside, called CUDA cores.

Each SM has local memory, register files, load/store units and a thread scheduler to run its 32 associated cores. Each of these can run a floating-point or an integer instruction every clock. It can also run double-precision floating-point operations at half that rate, which will please the maths department.


In all honesty, we can completely understand the rampant issue of piracy, and there is a worrisome number worth repeating. Crytek is a good example of a company that wanted to stay PC-only but simply could not sustain a business model, due to multi-million dollar losses caused by prospective buyers opting for a pirated copy of the game. When Epic Games released their Unreal Tournament III game, they recorded 40 million different installations trying to access online servers for multiplayer action.

In short, if those 40 million people had purchased the game instead of downloading it from The Pirate Bay and similar sites, Epic Games would have earned approximately two billion dollars [40M times the $49.95 recommended price, minus some at $39.95 in the US and plus some at Euro 49.99/$62.44 at the time in EU lands]. Now, imagine what Epic would be able to do with an influx in excess of a billion dollars. Would Unreal Engine 4 need 4-6 years to develop on a limited budget, or could Tim hire as many people as he needs and deliver an engine perfectly optimized for the whole spectrum of PC hardware?

3D-Tech News Around The Web / OpenCL path tracer / ray tracing demo
« on: December 21, 2009, 12:50:12 PM »

SmallptGPU is a small and simple demo written in OpenCL in order to test the performance of this new standard. It is based on Kevin Beason's Smallpt available at
SmallptGPU has been written using the ATI OpenCL SDK beta4 on Linux, but it should work on any platform/implementation (e.g. NVIDIA). Some discussion about this little toy can be found at Luxrender's forum.


General Discussion / Re: Accessing the depth buffer in GLSL
« on: December 21, 2009, 09:39:28 AM »
Oh no, your English is fine, better than mine.
I googled your nickname and found some messages in French, which explains my question  ;)


So welcome to the small GeeXLab community. There is also a French-language blog about GeeXLab and similar tools:
It is updated less often than geeks3d, but it's just getting started (things should pick up from January), so drop by from time to time. And if you'd like to be a contributor on the HackLAB, that's something that could be arranged...


Thank you for your nice feedback about GeeXLab.

I'm preparing a big update of GeeXLab with new features, I hope to release it in early January...

General Discussion / Re: Accessing the depth buffer in GLSL
« on: December 21, 2009, 08:43:17 AM »
OK, if the same texture can be used for input and output at the same time, then your technique works fine. I don't know why, but I thought a texture couldn't be used that way as a render target.

I will publish a post with your demo and add it to the GeeXLab code samples repository.

Do you have a blog or something like that?

PS: are you French?

General Discussion / Re: Accessing the depth buffer in GLSL
« on: December 18, 2009, 08:58:25 PM »

Maybe a little remark: I don't know if you can use a texture as source and destination at the same time, as you do here with blurBuffer:
Code: [Select]
<step name="_2" target="blurBuffer" gpu_shader="VerticalBlur" >
  <texture render_texture="blurBuffer" render_texture_type="COLOR" />
  <texture render_texture="sceneBuffer" render_texture_type="DEPTH" />
</step>

I think you should use another render texture and do some ping-pong between two render textures:
Code: [Select]
<render_texture name="blurBuffer1" type="COMPOSITE" />
<render_texture name="blurBuffer2" type="COMPOSITE" />

<step name="_1" target="blurBuffer2" gpu_shader="VerticalBlur" >
  <texture render_texture="blurBuffer1" render_texture_type="COLOR" />
  <texture render_texture="sceneBuffer" render_texture_type="DEPTH" />
</step>

<step name="_2" target="blurBuffer1" gpu_shader="HorizontalBlur" >
  <texture render_texture="blurBuffer2" render_texture_type="COLOR" />
  <texture render_texture="sceneBuffer" render_texture_type="DEPTH" />
</step>


OpenGL 3.2 and GLSL 1.5 are available, but there is a lack of both simple and complex example programs. On this webpage, I want to fill this gap by providing example programs using OpenGL 3.2 and GLSL 1.5 with GLEW. Please note that none of the example programs use deprecated OpenGL functions.

3D-Tech News Around The Web / Texture Tools 2.07 released
« on: December 17, 2009, 12:47:36 PM »

Texture Tools homepage and downloads:

The NVIDIA Texture Tools is a collection of image processing and texture manipulation tools, designed to be integrated into game tools and asset conditioning pipelines.

The primary features of the library are mipmap and normal map generation, format conversion and DXT compression.

DXT compression is based on Simon Brown's squish library. The library also contains an alternative GPU-accelerated compressor that uses CUDA and is an order of magnitude faster.

General Discussion / Re: Accessing the depth buffer in GLSL
« on: December 17, 2009, 09:28:08 AM »
Absolutely nice work man!
I think I'll publish your demo officially on Geeks3D front page.

I recommend handling window resizing with a SIZE script like this:
Code: [Select]
<script name="resize" run_mode="SIZE" >
  local w, h = HYP_Scene.GetWindowSize()
  id = HYP_GPUShader.GetId("HorizontalBlur")
  HYP_GPUShader.SetConstant_1f(id, "texWidth", w)
  id = HYP_GPUShader.GetId("VerticalBlur")
  HYP_GPUShader.SetConstant_1f(id, "texHeight", h)
</script>

3D-Tech News Around The Web / Qt Graphics and Performance - An Overview
« on: December 16, 2009, 04:26:42 PM »

We have two OpenGL based graphics systems in Qt. One for OpenGL 1.x, which is primarily implemented using the fixed functionality pipeline in combination with a few ARB fragment programs. It was written for desktops back in the Qt 4.0 days (2004-2005) and has grown quite a bit since. You can enable it by writing -graphicssystem opengl1 on the command line. It is currently in life-support mode, which means that we will fix critical things like crashes, but otherwise leave it be. It is not a focus for performance from our side, though it does perform quite nicely for many scenarios.

Our primary focus is the OpenGL/ES 2.0 graphics system, which is written to run on modern graphics hardware. It does not use a fixed functionality pipeline, only vertex shaders and fragment shaders. Since Qt 4.6, this is the default paint engine used for QGLWidget. Only when the required feature set is not available will we fall back to using the 1.x engine instead. When we refer to our OpenGL paint engine, it's the 2.0 engine we're talking about.


Last weekend, I got to play with an NVIDIA GT240 (around $100). Having read a lot of blogs about GPU programming, I downloaded the CUDA SDK and started reading some samples.

In less than one hour, I went from my rather complex SSE inline assembly to a simple, clear Mandelbrot implementation... that ran... 15 times faster!

Let me say this again: 1500% faster. Jaw dropping. Or put a different way: I went from 147fps at 320x240... to 210fps... at 1024x768!

I only have one comment for my fellow developers: it is clear that I was lucky - the algorithm in question was perfect for a CUDA implementation. You won't always get this kind of speedup (while at the same time ending up with clearer and significantly less code).

But what I am saying, is that you must start looking into these things: CUDA, OpenCL, etc.

Code: [Select]
__global__ void CoreLoop( int *p,
  float xld, float yld, /* Left-Down coordinates */
  float xru, float yru, /* Right-Up coordinates */
  int MAXX, int MAXY)   /* Window size */
{
    float re, im, rez, imz;
    float t1, t2, o1, o2;
    int k;
    unsigned result = 0;
    unsigned idx = blockIdx.x*blockDim.x + threadIdx.x;
    int y = idx / MAXX;
    int x = idx % MAXX;

    re = (float) xld + (xru-xld)*x/MAXX;
    im = (float) yld + (yru-yld)*y/MAXY;
    rez = 0.0f;
    imz = 0.0f;
    k = 0;
    while (k < ITERA) {
        o1 = rez * rez;
        o2 = imz * imz;
        t2 = 2 * rez * imz;
        t1 = o1 - o2;
        rez = t1 + re;
        imz = t2 + im;
        if (o1 + o2 > 4) {
            result = k;
            break; /* escaped: stop iterating */
        }
        k++;
    }
    p[y*MAXX + x] = lookup[result]; // Palettized lookup
}

3D-Tech News Around The Web / Re: How the 3D engine is changing the world
« on: December 16, 2009, 04:08:30 PM »

It's not the figurative beauty of yore – the iconic charm of Pac-Man, the elegiac simplicity of the vector-mapped space craft in Elite. Modern games are edging toward photo-realism; indeed, through technologies like mimetic interfaces and augmented reality, they are encroaching on reality itself. And at times they are breathtakingly close.

But here is the minor tragedy at the heart of modern games: no matter how astonishing they look, players will never see one of the most beautiful components: the 3D engine.

Nowadays, developers spend several years developing one engine which then powers all of their games. These technologies are so important that they have become brands in their own right. They're given exciting macho names like EGO, RAGE and OGRE, and whenever a new title is announced, the 3D engine will be listed among the key selling points.

"In other areas, we're still stuck in the Stone Age due to ingrained technologies. The C++ programming language, used in all modern games, was hastily conceived in the 1980s as an extension to the 1970s C programming language. Many of the problems that plague computers today - security vulnerabilities, viruses, and so on, can be traced to problems in this language."

General Discussion / Re: Accessing the depth buffer in GLSL
« on: December 16, 2009, 03:06:29 PM »

3D-Tech News Around The Web / How the 3D engine is changing the world
« on: December 16, 2009, 02:00:15 PM »

The Unreal Engine, created by Epic Games, contains a breathtaking 2.5m lines of code – as technical director Tim Sweeney puts it: "That's roughly comparable to the complexity of a whole operating system a decade ago."

"Game development is at the cutting edge in many disciplines," says Sweeney. "The physics in modern games includes rigid body dynamics and fluid simulation algorithms that are more advanced than the approaches described in research papers."

General Discussion / Re: Accessing the depth buffer in GLSL
« on: December 16, 2009, 09:38:53 AM »
I'm preparing a small demo to display the depth buffer...

3D-Tech News Around The Web / Sapphire Radeon HD 4860 in the US
« on: December 15, 2009, 01:55:03 PM »

The unofficial, unconfirmed but quite real Radeon HD 4860 made by Sapphire has debuted in the US and is currently up for grabs for $130. This model is powered by the 55nm RV790 GPU and has 640 Stream Processors, a 256-bit memory interface, a dual-slot cooler, CrossFireX support, plus DVI, HDMI and DisplayPort outputs.


Kaspersky uses an NVIDIA Tesla GPU to detect new viruses and achieves a 360-fold performance increase over a common CPU.
