« on: June 12, 2010, 08:14:41 AM »
DxProf is a real-time DirectX profiling tool for easy identification of performance bottlenecks on the GPU. DirectX 9, DirectX 10, and DirectX 11 are supported.
Acropora is a procedural voxel modeler for creating complex, organic mesh topologies that are useful for all types of 3D modeling applications. Acropora incorporates some of the latest advances in voxel modeling technology. Features include:
A variety of base shapes and compound shapes as primary surfaces
Splines, lofts, grooves, extrusions, chisels
Voxelisation of user-imported models
Over 50 volume modifiers
Automatic generation of up to 15 octaves of three-dimensional noise (volumetric, Perlin and fractal)
Extensive voxel and noise editing tools
Extensive post-processing of meshes (tessellation, smoothing, simplification, ...)
Multiple LOD support
Export to multiple mesh formats (FBX, OBJ, 3DS, DXF, DAE), DDS volume textures, and raw voxel data
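The octave-based noise mentioned in the feature list can be illustrated with a minimal sketch. This is not Acropora's implementation: it assumes simple lattice value noise (not true Perlin gradient noise), and the names `make_lattice`, `value_noise`, and `fractal_noise` as well as the `lacunarity` and `gain` parameters are hypothetical. Each octave doubles the frequency and halves the amplitude, and the sum is normalised back to the range of a single octave.

```python
import math
import random


def make_lattice(size=16, seed=0):
    """Hypothetical helper: random values in [0, 1) on a size^3 grid."""
    rng = random.Random(seed)
    return [[[rng.random() for _ in range(size)]
             for _ in range(size)]
            for _ in range(size)]


def _smooth(t):
    # Smoothstep fade so the interpolation has no visible grid creases.
    return t * t * (3.0 - 2.0 * t)


def _lerp(a, b, t):
    return a + (b - a) * t


def value_noise(lattice, x, y, z):
    """Trilinearly interpolated lattice noise; the grid wraps around."""
    n = len(lattice)
    xi, yi, zi = math.floor(x), math.floor(y), math.floor(z)
    tx, ty, tz = _smooth(x - xi), _smooth(y - yi), _smooth(z - zi)

    def corner(dx, dy, dz):
        return lattice[(xi + dx) % n][(yi + dy) % n][(zi + dz) % n]

    x00 = _lerp(corner(0, 0, 0), corner(1, 0, 0), tx)
    x10 = _lerp(corner(0, 1, 0), corner(1, 1, 0), tx)
    x01 = _lerp(corner(0, 0, 1), corner(1, 0, 1), tx)
    x11 = _lerp(corner(0, 1, 1), corner(1, 1, 1), tx)
    return _lerp(_lerp(x00, x10, ty), _lerp(x01, x11, ty), tz)


def fractal_noise(lattice, x, y, z, octaves=4, lacunarity=2.0, gain=0.5):
    """Sum several octaves of noise, normalised by the total amplitude."""
    total, norm = 0.0, 0.0
    amplitude, frequency = 1.0, 1.0
    for _ in range(octaves):
        total += amplitude * value_noise(
            lattice, x * frequency, y * frequency, z * frequency)
        norm += amplitude
        amplitude *= gain        # each octave contributes less...
        frequency *= lacunarity  # ...but adds finer detail
    return total / norm
```

A voxel modeler could sample something like `fractal_noise(lattice, x, y, z, octaves=15)` at every voxel centre and add the result to the density field of a base shape.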
It’s now been a week since the launch of The Wow Factor, and I’m happy to announce that thanks to YOU, I am now able to take Blender Guru FULL TIME!
What that means is you are going to be seeing a LOT more tutorials posted. In fact, I will be posting a new tutorial every Tuesday.
The unveiling of OpenCL on the ZMS processors highlights how the performance and flexibility of the underlying StemCell Computing architecture can now be leveraged by developers using an industry standard API to bring new levels of performance to applications targeting low-power platforms.
OpenCL (Open Computing Language) is the first open, royalty-free standard for general-purpose parallel programming of heterogeneous systems. OpenCL provides a uniform programming environment for software developers to write efficient, portable code for parallel processors such as the ZiiLABS ZMS processors.
"OpenCL enables developers to unlock the full potential of the underlying StemCell Computing architecture to deliver new levels of performance across a broad range of applications," said Tim Lewis, director of marketing of ZiiLABS. "The OpenCL-based ray-tracing and video filter demos provide a glimpse of the floating-point performance and flexibility that developers can exploit on ZMS-based platforms and products."
OpenCL Early Access Program
ZiiLABS is currently inviting developers with innovative ideas for using OpenCL on consumer-class handheld and connected platforms to join an OpenCL Early Access Program that will provide selected partners with an early release of the ZiiLABS OpenCL SDK for ZMS processors.
If you have a powerful OpenCL GPU available in a system, it is a good idea to understand its capabilities. You can do so by running algorithms in their CPU and GPU versions and comparing the results. You might be surprised by how much processing power most modern software wastes when it runs on a computer with an OpenCL GPU.
With the June 2010 DirectX SDK, one of our work items was to try out the various DirectX 11 samples against the NVIDIA DirectX 11 graphics parts (NVIDIA GeForce GTX 470/480) now that they are available. For the August 2009 and February 2010 releases, we only had the AMD/ATI DirectX 11 graphics cards available (ATI Radeon HD 5000 Series). Video cards have traditionally competed on a mix of features, performance, and price. These days they are increasingly also competing on power consumption--while this has always been true in the mobile & laptop space, it is becoming increasingly important even in desktops.
There has been a lot of focus in Direct3D 10, 10.1, and 11 on minimizing the 'feature fragmentation' problem in the Direct3D API (best demonstrated by the "sea of caps" in the Direct3D 9 Card Capabilities spreadsheet we ship in the DirectX SDK) to help simplify the programmer's job of using these APIs efficiently. This effort really started with Direct3D 9 Shader Model 3.0, which tried to tighten down the specification a bit more. This is also much of what the Feature Level concept introduced in Direct3D 10.1 and the '10level9' feature levels of DirectX 11 are trying to address in a more manageable way. Performance can still vary a great deal between vendors, and even between the same vendor's cards at different price points, but we hope this at least helps constrain the degrees of freedom the programmer has to be concerned with.
Our work with the NVIDIA hardware for this release has provided insight into some areas that programmers need to pay attention to with respect to different vendors' cards. The biggest difference I noticed was the number of MSAA quality levels exposed by AMD vs. NVIDIA. This information is obtained via the CheckMultisampleQualityLevels method in Direct3D 10.x and 11. The ATI Radeon HD 5000 Series provides only one quality level per sample count, while the NVIDIA GeForce GTX 470/480 exposes a number of fine-grained quality levels per sample count. This highlighted a few UI bugs in some of the samples as well as in DXUT/DXUT11 that were corrected in the June 2010 release. Be sure to test the behavior of any MSAA settings and quality levels in your DX10.x and DX11 programs on both vendors' hardware. Another area to pay close attention to is DirectCompute synchronization and timing behavior. DirectCompute, as a low-level exposure of the GPU's behavior, is more subject to architectural differences, so be sure to test any use of DirectCompute on hardware from multiple vendors.
Basically an A-buffer is a simple list of fragments per pixel. Previous methods to implement it on DX10-generation hardware required multiple passes to capture an interesting number of fragments per pixel. They were essentially based on depth peeling, with enhancements that capture more than one layer per geometry pass, like the k-buffer and stencil-routed k-buffer that suffers from read-modify-write hazards. Bucket sort depth peeling can capture up to 32 fragments per geometry pass, but with only 32 bits per fragment (just a depth) and at the cost of potential collisions.
All these techniques were complex and basically limited by the maximum of 8 render targets that were writable by the fragment shader.
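To make the "list of fragments per pixel" idea concrete, here is a minimal CPU-side sketch. It is not the GPU implementation discussed in the post: the class name `ABuffer`, its methods, and the choice of back-to-front alpha blending as the resolve step are all assumptions made for illustration.

```python
from collections import defaultdict


class ABuffer:
    """Hypothetical CPU model of an A-buffer: one fragment list per pixel."""

    def __init__(self):
        # (x, y) -> list of (depth, (r, g, b), alpha), in arrival order
        self.fragments = defaultdict(list)

    def add_fragment(self, x, y, depth, color, alpha):
        # Capture step: fragments are appended in whatever order the
        # geometry happens to be rasterised; nothing is discarded.
        self.fragments[(x, y)].append((depth, color, alpha))

    def resolve(self, x, y, background=(0.0, 0.0, 0.0)):
        # Resolve step: sort this pixel's list by depth (farthest first)
        # and blend back to front with the "over" operator.
        frags = sorted(self.fragments.get((x, y), []),
                       key=lambda f: f[0], reverse=True)
        r, g, b = background
        for _, (fr, fg, fb), a in frags:
            r = fr * a + r * (1.0 - a)
            g = fg * a + g * (1.0 - a)
            b = fb * a + b * (1.0 - a)
        return (r, g, b)
```

For example, adding a half-transparent red fragment at depth 0.2 and an opaque blue fragment at depth 0.8 to the same pixel, in either order, resolves to an even mix of red and blue, because the sort happens once all fragments have been captured.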
This technology preview is a snapshot of some internal research we have been working on and talking about at various conferences for the past couple years. The level of interest in GPU-accelerated AI has continued to grow, so we are making this (unsupported) snapshot available for developers who would like to experiment with the technology.
In Houdini 11, new Voronoi-based fracturing tools will make it easier to break up objects either before a simulation or automatically during a simulation...
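The core idea behind Voronoi-based fracturing can be sketched in a few lines. This is only an illustration of the partitioning principle, not Houdini's tool: the function name `voronoi_fracture` is hypothetical, and a real fracturing tool would also cut the mesh along cell boundaries rather than just grouping points.

```python
def voronoi_fracture(points, seeds):
    """Group points into pieces, one piece per seed (hypothetical helper).

    Each point goes to the piece of its nearest seed, which is exactly
    the definition of a Voronoi cell.
    """
    pieces = [[] for _ in seeds]
    for p in points:
        nearest = min(
            range(len(seeds)),
            # Squared Euclidean distance is enough for comparisons.
            key=lambda i: sum((a - b) ** 2 for a, b in zip(p, seeds[i])),
        )
        pieces[nearest].append(p)
    return pieces
```

Scattering more seed points near an impact location would, under this scheme, produce many small pieces there and larger chunks farther away.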
Our particle fluids are now up to 70 times faster with the new FLIP (Fluid Implicit Particle) solver as compared to Houdini 10’s SPH solver, making it ideal for generating multiple iterations. In addition, this new solver is seamlessly integrated with existing particle operations [POPs] making the results highly directable. New buoyancy controls make it easier to float rigid objects and you can even smash up an object by combining these fluid tools with the new fracturing tools...
Hardware Rendering has also been enhanced with high quality OpenGL shading of lights and shadows as well as GPU-assisted volumes, unlimited lights and support for diffuse, specular, opacity, environment, bump and normal maps. Houdini’s Flipbook tools now support all these FX and can capture high dynamic range beauty passes.
In addition, we have improved the lighting interface for Houdini 11. We have new light types such as Global Illumination, Portal, Sky, Indirect and Geometry. The Geometry Lights let you turn any 3D object into a light emitting surface then use a surface shader to control the light emission. The geometry can also be animating or deforming for even cooler results...
With every major release of thinkingParticles, new features are introduced that extend its power and flexibility by an order of magnitude compared to its predecessor. Release 4 represents a milestone in advancing the feature set.
In this class, we will introduce OpenCL™. We start with an overview of GPU compute, since the desire to take advantage of modern GPU computational power in general applications was a main motivator in the development of OpenCL™. The discussion includes some of the early APIs developed to harness the increasingly programmable computational power available in modern graphics processors.
We then introduce the anatomy and programming model of OpenCL™ and take you through some of the highlights of installing the ATI Stream SDK v2 which includes support for OpenCL™ 1.0 on x86 CPUs and AMD GPUs. Then, the practical portions of the OpenCL™ runtime and kernel specifications are discussed in detail.
At the end, we discuss optimization tips to help you avoid common pitfalls when coding your applications in OpenCL™. For students who may have existing code written for the proprietary interface, CUDA, we discuss the easy steps involved in porting that code to OpenCL™.