Topics - ljbade

1
3D-Tech News Around The Web / Internet Explorer 11 supports WebGL
« on: June 27, 2013, 03:03:03 PM »
An Internet Explorer 11 preview is included in the recently launched Windows 8.1 Preview.

One new feature of Internet Explorer 11 is support for WebGL.

This means Microsoft has backed down from their previous stance that allowing webpages programmatic access to the GPU was bad for system security.

At this stage the WebGL support is marked experimental and is only version 0.9. Unfortunately no GL extensions are listed so IE only supports stock WebGL, but hopefully Microsoft will add more features before the final release.

It is well known that IE uses Direct2D and Direct3D 11 to render webpages so it appears Microsoft have implemented something similar to the ANGLE project (used by default in Chrome and Firefox on Windows) to compile GLSL into HLSL shaders.

Microsoft have created two new WebGL demos on their IE Test Drive website:
http://ie.microsoft.com/testdrive/Performance/Levitation/
http://ie.microsoft.com/testdrive/Graphics/LookAround/

I have tested various other WebGL demos around the net and it is a bit hit or miss whether they work. Some don't load, and others have graphical artefacts. It appears Microsoft has a lot of work to do to get the WebGL support complete.

2
3D-Tech News Around The Web / ClusterGL
« on: January 11, 2012, 02:27:56 AM »
Always great to see a fellow Kiwi (New Zealander) get international recognition:
http://www.stuff.co.nz/technology/digital-living/6241261/Kiwis-software-wows-Google-Nasa

Quote
What began as a side project has gone global after a New Zealand student's software was picked up by Google and NASA.

"I never expected it to take off like this," said University of Waikato student Paul Hunkin.

"It started off as little afternoon project and it's gotten reasonably popular."

Hunkin might downplay his success, but NASA called the software an "innovative solution".

ClusterGL website: http://code.google.com/p/liquid-galaxy/wiki/GSoC2011_ClusterGL

NASA's cool eight screen Google Earth viewer: http://open.nasa.gov/blog/2011/09/26/nasas-liquid-galaxy-an-overview/

Seems similar to Equalizer.

I wonder how these systems compare to AMD Eyefinity or NVIDIA Surround when running multiple GPUs with a large number of screens.

3
General Discussion / My new HP AMD Fusion laptop
« on: September 03, 2011, 01:58:56 AM »
I just replaced my old Lenovo with a new laptop from HP that was only recently released here.

It is an HP Pavilion DV6-6102AX (might have a different model number in other countries) and the specs can be found here: http://www.harveynorman.com.au/product/1256978856446/hp-pavilion-dv-ax-laptop.

Interestingly the HP website has not yet been updated to include my model.

So far I like this laptop as it is fast with games without the weight, heat, and noise issues high end gaming laptops have.

It has one of AMD's new APU fusion processors: AMD A8-3530MX APU with Radeon(tm) HD Graphics
This APU contains four improved Phenom II/Stars architecture (codename Llano/BeaverCreek) cores running at 1.9GHz which can boost to 2.6GHz with Turbo Core. This is AMD's fastest mobile APU but is still a lot slower than the desktop A8 which can run up to 3GHz (but obviously uses more power and produces more heat). More info: http://www.notebookcheck.net/AMD-A-Series-A8-3530MX-Notebook-Processor.55760.0.html and http://en.wikipedia.org/wiki/AMD_Fusion#Current_platforms. Unfortunately the low clock speed and old K10 architecture mean the newer laptop Intel i5 processors are faster (but only come with Intel graphics).

Since DDR3 RAM prices are very cheap at the moment (and industry insiders are saying oversupply will cause RAM prices to hit rock bottom by the end of the year) HP put 8GB!!! of DDR3-10700 (667MHz) RAM in this laptop. That is twice as much RAM as my desktop gaming PC has!
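
As a rough sanity check on that memory figure, here is a small sketch of where the "10700" comes from: the theoretical peak bandwidth of one DDR channel at a 667MHz I/O clock. The function name and the 64-bit channel width are my assumptions for illustration, not numbers from HP's spec sheet.

```python
def ddr_peak_bandwidth_mb_s(io_clock_mhz, bus_width_bits=64):
    """Theoretical peak bandwidth in MB/s for one DDR memory channel."""
    # DDR transfers data on both clock edges: two transfers per clock.
    mega_transfers_per_second = io_clock_mhz * 2
    # Assumed 64-bit channel, so 8 bytes move per transfer.
    bytes_per_transfer = bus_width_bits // 8
    return mega_transfers_per_second * bytes_per_transfer


per_channel = ddr_peak_bandwidth_mb_s(667)
print(f"{per_channel} MB/s per channel")
```

This is a theoretical peak only, and on an APU the CPU cores and the integrated GPU have to share it.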

Now on the graphics side this laptop packs two GPUs.

The first GPU is the one in the A8 APU which is a Radeon HD 6620G. This is a Radeon 6800/Barts/VLIW5 architecture GPU with the newer UVD3 MPEG decoder. It has 400 unified shader cores running at 444MHz core clock. It supports all OpenGL 4.2/Direct3D 11/OpenCL 1.1 features.

On the memory side it shares the DDR3 memory controller (and L3 cache?) with the CPU so it uses the same 667MHz DDR3 RAM. AMD allocate only 512MB of my 8GB as dedicated device RAM for the GPU (would love to change this to 1GB). One nice thing about this APU is that OpenCL (and I presume the 3D driver too) can use pinned memory with zero copy overhead, as the GPU can directly address RAM allocated by a CPU program. This should make OpenCL apps more efficient and allow games to load a lot faster.

According to various benchmarks this GPU has the best performance of any integrated laptop graphics chip available (Intel graphics is so far behind it's funny) and can run most games on low graphics settings with reasonable performance.

But HP realised that 1 GPU was not enough for people with serious 3D workloads (like gamers and programmers) so they chucked in the AMD Radeon HD 6750M mobile GPU too. This is based on the same 6800 architecture as the APU so has the same features. This chip has 480 unified shader cores running at 600MHz core clock speed. The performance is similar to a desktop Radeon HD 6570 (you gotta love AMD/NVIDIA renaming their mobile parts with higher desktop part numbers); see http://www.tomshardware.com/reviews/radeon-hd-6570-radeon-hd-6670-turks,2925.html for rough performance. More info: http://www.notebookcheck.net/AMD-Radeon-HD-6750M.43958.0.html.

This GPU is the second fastest GPU HP could have picked that still supports Crossfire with the APU. The Radeon HD 6770M has a higher clock speed of 725MHz. I wish HP had chosen this one! Perhaps I can overclock my chip to that. It also uses 1600MHz RAM.

On the memory side HP have coupled the chip with 1GB of GDDR5 dedicated graphics RAM running at 900MHz. This is quite a lot of RAM for a laptop (actually the same as my desktop 5850) so every game should be able to load fine even with high AA (though performance wise that might not be a good choice). Definitely good for OpenCL apps that need lots of RAM.

Now by having an APU and a GPU this laptop falls under AMD's A8 VISION brand with Quad Core and Radeon Dual Graphics. This basically means the GPUs are Crossfired together to roughly double the performance (400 * 444MHz + 480 * 600MHz). When Crossfire is enabled the GPU is officially called a Radeon HD 6755G2 Dual Graphics (AMD's marketing department must be nearly out of model numbers in the 6xxx range, considering each unique APU and GPU combination has a different 6xxxG2 number!).
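
To put a number on the Crossfire estimate, here is a minimal sketch that multiplies shader cores by core clock for each GPU, using the figures quoted earlier in this post (400 cores at 444MHz for the APU, 480 cores at 600MHz for the 6750M). The cores-times-clock metric and the function name are my own simplification. It ignores memory bandwidth and real-world Crossfire scaling, so treat the result as a ballpark only.

```python
def shader_throughput(shader_cores, clock_mhz):
    """Crude throughput proxy: shader cores multiplied by core clock in MHz."""
    return shader_cores * clock_mhz


apu = shader_throughput(400, 444)       # Radeon HD 6620G inside the A8 APU
discrete = shader_throughput(480, 600)  # Radeon HD 6750M
combined = apu + discrete

print(f"APU only:      {apu}")
print(f"Discrete only: {discrete}")
print(f"Combined:      {combined} ({combined / discrete:.2f}x the discrete GPU alone)")
```

By this proxy the pair comes out at about 1.6x the 6750M on its own, and about 2.6x the APU on its own, so how close it gets to "double" depends on which single GPU you compare against.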

The Crossfire seems to work OK for all recent games (I am running the Catalyst 11.8 driver with the 11.8 CAP2 Catalyst profile). Deus Ex: Human Revolution has a few bugs though: the Steam overlay flickers, and the shadows/SSAO effect flickers when you get close to walls/doors. A Google search shows this is a common issue with this game and Crossfire, so I hope 11.9 will fix it. Crossfire is also the reason I wish the APU had 1GB of graphics memory, as I'm pretty sure that Crossfire exposes the smallest GPU RAM as the maximum for textures (as you need a copy of the same texture data in both GPUs for alternate frame rendering). I am not sure how to verify the RAM limits.

The other nice thing about Dual Graphics is that it supports AMD's Switchable Graphics, which is AMD's version of NVIDIA's Optimus. What this means is that when you are using desktop applications or low power 3D programs/2D accelerated games, the drivers will completely disconnect and power off the dedicated GPU to reduce power consumption, heat, and noise and to extend battery life. When you run a game or 3D program for the first time the AMD Catalyst driver will pop up and ask if you want the High Performance or Low Power config for this program. The driver remembers the setting and will not ask again. The control panel can be used to change this very easily, or you can manually switch between High Performance and Low Power system wide (for benchmarking). High Performance enables both GPUs in Crossfire while Low Power only uses the APU. Most older games and programs are preconfigured by AMD with the correct default setting in the driver.

The transition between the two modes is seamless, but I notice that it takes a few seconds for the driver to power up the second GPU after a High Performance program is started. Compared to NVIDIA Optimus AMD is far superior. AMD only needs one driver for both GPUs so you don't have the messy Intel/NVIDIA graphics driver update issues. Also both GPUs are compatible architecture wise so the driver doesn't need to play tricks to mask the fact that Intel and NVIDIA have different Direct3D and OpenGL feature sets. The performance will also be a lot higher as AMD can Crossfire the GPUs while NVIDIA can't. Also AMD has a superior memory interface between the two GPUs which makes sharing the front buffer across the two GPUs faster and uses less power (NVIDIA have to copy the front buffer each frame from NVIDIA GPU RAM to system RAM so the Intel GPU controller can copy it to the display controller). The other bonus is that low power mode still uses a real GPU and has much better performance than low power on NVIDIA Optimus, which relies on painfully slow Intel integrated graphics.

Performance wise I am able to run all the games I tested it on at native resolution (1366x768) with medium to high settings and get a playable 30-60FPS (depending on the game). Antialiasing does not seem to hurt the performance much, but anisotropic filtering can have a big impact.

My favorite game Deus Ex is not too demanding so I can run with all settings on maximum and still get 30FPS. Dropping settings to medium gets me 50FPS.

The one problem I have noticed is that most games are CPU bound by the low 1.9GHz CPU clock speed, as you can change graphics from low to high in a lot of games without affecting framerate very much. Deus Ex will not go above 50FPS in outdoor scenes even with everything low and low resolution. I think AMD could have used a better clock speed like 2.7-3GHz for gaming when the Switchable Graphics is in High Performance mode. I have yet to see Turbo Core work either (and I found a forum post complaining about this too on another HP laptop) so I hope it is not a BIOS issue, as this could make gaming a lot better if it works.
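
The reasoning behind "CPU bound" can be written down as a quick check: if dropping the graphics settings barely raises the framerate, the GPU was not the limit. This is only a sketch of that rule of thumb. The function name and the 10% tolerance are assumptions I picked for illustration, not anything from AMD's tools.

```python
def looks_cpu_bound(fps_low_settings, fps_high_settings, tolerance=0.10):
    """Return True when lowering graphics settings barely raises the framerate."""
    if fps_high_settings <= 0:
        raise ValueError("fps_high_settings must be positive")
    # Relative framerate gain from dropping the graphics settings.
    gain = (fps_low_settings - fps_high_settings) / fps_high_settings
    return gain <= tolerance


# Deus Ex figures from this post: 30FPS at maximum, 50FPS at medium,
# and still about 50FPS with everything on low.
medium_to_low = looks_cpu_bound(fps_low_settings=50, fps_high_settings=50)
max_to_medium = looks_cpu_bound(fps_low_settings=50, fps_high_settings=30)
print(medium_to_low, max_to_medium)
```

With those numbers the step from medium to low looks CPU bound, while the step from maximum to medium does not, which matches what I see in game.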

AMD provide a cool utility program called AMD System Monitor that shows realtime readouts, graphs, and log files of the CPU and GPU % usage and clock speeds, as well as RAM usage and how much of the APU's GPU vs CPU resources are being used.

Other features of this laptop:
The case is very sturdy with a hard metal screen cover. It uses magnets so it doesn't have an annoying screen lock clip. It has a cool Apple-like glowing white LED HP logo embedded in the top too. It is very bright and I wish I could turn it off at night time when I leave my laptop running. All the other LEDs on this laptop are also very bright. The weight is good too; it is reasonably light.

The battery charger is a very large 120W model so it recharges the battery very quickly even when the laptop is turned on. The battery life is very good for desktop use, about 5.5 hrs. I have not measured gaming battery life though.

The fan is very quiet on this laptop and can detect if your laptop is on your lap or on a table and change fan profiles automatically between balanced, quiet, or cool (you can assign the profile in HP's utility). The fan does get rather loud in a game, which is to be expected with 2 GPUs and 4 cores running at full clock speeds, but the speakers are loud enough to overcome this issue.

The keyboard is well laid out with a full separate numpad. The arrow keys are a bit weird as the up/down arrows are squeezed into one key space (2 1/2 keys). The keys have a large gap around them, which prevents dust and crumbs getting stuck beside keys and makes it a lot easier to clean. By default the F1-12 keys act as the function shortcuts, so you don't need to hold the Fn button to change volume etc. Holding the Fn button enables the F1-12 keys, but you can change a BIOS setting to make it work the traditional way.

The LCD is 1366x768, which is normal for a 15.6" screen. The LCD quality is good but some colours (like blue) produce a noticeable moire pattern at a normal viewing distance, which is a bit annoying but OK after a while.

The speakers are very good for laptop speakers: Altec Lansing speakers with HP's & Dr Dre's Beats Audio branded drivers which make music sound a lot better. They are also very loud at maximum volume.

It also comes with a Bluray reader (and DVD burner) which allows you to watch HD movies (though not quite in full HD due to the LCD resolution) without using much battery life thanks to AMD's UVD3 decoder. The AMD drivers have the best quality decoder I have ever seen thanks to the large number of quality options you can tweak, some of which enable OpenCL post processing filters. Youtube/Flash video also uses the UVD3 to reduce battery drain and heat. CPU load is very low when playing a Bluray.

The mouse trackpad is also very nice and large, can do multi touch for zooming and scrolling, and has a switch to disable the trackpad when you are typing or using an external mouse. It has a very bright white LED border which is turned on and off with Fn + Space. The LED turns faint orange when the trackpad is off.

It also has a decent fingerprint reader, WiFi n, and Bluetooth 3.

Output wise you get VGA + HDMI (not sure if this can be used with Eyefinity but I suspect you can), 2x USB 3.0!!! (unfortunately I don't have any USB 3 devices to test it with) and 2x USB 2.0, 1 Gb/s Ethernet, 1 microphone input, and 2!!! headphone outputs (so you and a mate can listen to the same music), or maybe 4.0 surround sound (untested).

I picked this up for NZ$1800 which is a real steal from Norman Ross compared to the Intel based laptops which have lower performance for same price. An NVIDIA GPU + Intel i5 costs around NZ$2300 to 2600. In New Zealand Norman Ross and Harvey Norman are the only two shops allowed to sell AMD APU based computers (I assume other countries may have similar deals with AMD). As such this laptop does not show up at any online retailers in NZ. Harvey Norman's price for this laptop is NZ$2000 (which is a rip off).

The only other manufacturer of APU laptops so far is Toshiba (I think HP and Toshiba got an exclusive launch deal with AMD). But the Toshiba has a worse case design (all plastic), worse keyboard, and a slower GPU. And what's worse is that the Toshiba was NZ$100 more than the HP.

The other cool thing is that AMD are giving away free Steam copies of Dirt 3 with all of their products (CPUs, GPUs, and APUs) or computers that are prebuilt with their chips (such as this laptop). So that also adds to the value as Dirt 3 is NZ$90 in shops. Dirt 3 is an awesome game, even better than Dirt 2 which was already awesome. In fact I got Dirt 2 thanks to AMD with my 5850 in 2009, so now I can't wait till AMD give me a free copy of Dirt 4!

Anyway I hope you find my review useful and I recommend this laptop to anyone who wants an AMD APU based system with reasonable 3D performance.


Note to JeGX:
A lot of GPU utility programs seem to not work correctly with switchable graphics.
GPU-Z fails to get past the splash screen.
GPU Caps Viewer does not list the codenames, shader core count and GPU RAM correctly.
But GPU Caps Viewer and MSI Kombustor correctly read the core frequency, RAM frequency, GPU temperature and fan speed.

4
Good news:
AMD have released the previously promised API that allows games to natively support stereoscopic 3D on recent AMD GPUs (5000/6000 series).
http://developer.amd.com/sdks/QuadBufferSDK/Pages/default.aspx

Bad news:
The API only supports Direct3D. OpenGL quad buffer requires an expensive professional card (FireGL), so we won't be seeing Quake in stereoscopic 3D anytime soon (unless you use NVIDIA's 3D Vision).
http://forums.amd.com/devforum/messageview.cfm?catid=392&threadid=153414

5
3D-Tech News Around The Web / Microsoft considers WebGL to be harmful
« on: June 16, 2011, 11:03:51 PM »
The Microsoft Security Research & Defense blog today published an article stating that WebGL is too dangerous to put into a browser:
Quote
One of the functions of MSRC Engineering is to analyze various technologies in order to understand how they can potentially affect Microsoft products and customers. As part of this charter, we recently took a look at WebGL. Our analysis has led us to conclude that Microsoft products supporting WebGL would have difficulty passing Microsoft’s Security Development Lifecycle requirements.

Is this a hint that Microsoft will never support WebGL in Internet Explorer? It is not surprising considering Microsoft's frosty relationship with OpenGL in recent years. Perhaps they will develop a competing solution that uses Direct3D and force everyone to develop cross-API engines...

Full article here

6
Google have released the best WebGL demo I have seen so far: 3 Dreams of Black

Quote
“3 Dreams of Black” is our newest music experience for the web browser, written and directed by Chris Milk and developed with a few folks here at Google. The song, “Black,” comes off the album ROME, presented by Danger Mouse & Daniele Luppi, featuring Jack White and Norah Jones on vocals and soon to be released on the record label Parlophone/EMI.

Quote
In “3 Dreams in Black”, the browser is transformed into a theater for these lucid virtual dreams through WebGL, a new technology which brings hardware-accelerated 3D graphics to the browser. With WebGL in modern browsers like Google Chrome, you can interact with 3D experiences with no need for additional software. For curious web developers out there, we’ve made all the code completely open and available so that you can dig in, have a look around and try it out for yourself.

There is even a 3D editor and model viewer! The full source code is also available on Google Code along with some mini-demos of the different GL tricks used in the video.

7
Today Google announced a new Chrome Experiment that uses WebGL to render a globe that shows search volumes around the world.

Quote
Every day, people come to Google Search to ask questions. Through Google, questions become answers, and answers lead to the next set of questions. These people come from around the world and all walks of life, speaking hundreds of different languages, typing in search queries every single day. Today we’re sharing the Search Globe, a new visual display representing one day of Google searches around the world—visualizing the curiosity of people around the globe.

Google released the code behind the demo so anyone can create similar visualizations of datasets.

Quote
We’ve also open sourced this platform so that developers can build their own globes using their own data, and we look forward to seeing other globes orbiting around the web.

Full blog on the Official Google Blog.

8
3D-Tech News Around The Web / NVIDIA CUDA Toolkit 4.0 RC
« on: April 07, 2011, 02:47:57 AM »
NVIDIA have announced the public availability of the CUDA Toolkit 4.0 RC which was previously only available to registered developers.
http://developer.nvidia.com/cuda-toolkit-40

Quote
Release Highlights
Easier Application Porting

    * Share GPUs across multiple threads
    * Use all GPUs in the system concurrently from a single host thread
    * No-copy pinning of system memory, a faster alternative to cudaMallocHost()
    * C++ new/delete and support for virtual functions
    * Support for inline PTX assembly
    * Thrust library of templated performance primitives such as sort, reduce, etc.
    * NVIDIA Performance Primitives (NPP) library for image/video processing
    * Layered Textures for working with same size/format textures at larger sizes and higher performance

Faster Multi-GPU Programming

    * Unified Virtual Addressing
    * GPUDirect v2.0 support for Peer-to-Peer Communication

New & Improved Developer Tools

    * Automated Performance Analysis in Visual Profiler
    * C++ debugging in cuda-gdb
    * GPU binary disassembler for Fermi architecture (cuobjdump)

9
General Discussion / ATI Releases OpenGL 3.3 and 4.0 Drivers
« on: March 25, 2010, 07:50:21 PM »
ATI has released a beta version of Catalyst that supports OpenGL 3.3 and 4.0.

Quote
The functionality introduced in OpenGL 3.3 is supported by all of our discrete graphics products – both consumer and professional graphics – released since the spring of 2007.  That means ATI Radeon™, ATI FirePro™ and ATI FireGL™ graphics cards released after that time provide hardware support for OpenGL 3.3, with today’s beta driver fully enabling the additional functionality introduced in the API. At the same time, our newest top-end graphics products, the  ATI Radeon HD 5900 and HD 5800 series, are fully compatible with the OpenGL 4.0 standard, including tessellation and integration with the OpenCL API, enabling GPU acceleration in future OpenGL applications. In addition, the driver enables all OpenGL 4.0 functionality on ATI Radeon HD 5400, HD 5500, HD 5600 and HD 5700 series graphics cards, with the exception of double precision support, a feature that will be enabled in these products at a later date. Again, the new features introduced in OpenGL 4.0 work immediately with ATI Radeon HD 5400 and higher cards, by way of today’s beta driver update.

It sounds like they are going to 'fake' double precision on the cheaper DX11 cards.

Full article here. Drivers here.

10
General Discussion / NVIDIA GTX 480 performance leaked
« on: February 24, 2010, 12:02:40 AM »
SemiAccurate has managed to get their hands on the tightly guarded GeForce GTX 480 performance numbers:
http://www.semiaccurate.com/2010/02/20/semiaccurate-gets-some-gtx480-scores/

12
3D-Tech News Around The Web / New ATI drivers - Catalyst 10.2 RC2
« on: February 10, 2010, 06:15:31 AM »
New ATI driver version: 10.2 RC2 from http://www.ati-forum.de/allgemein/downloads/treiber/p19566-catalyst-beta-8-70-rc1-rc2/#post19566

Code:
===================================================
GPU Caps Viewer v1.8.2
http://www.ozone3d.net/gpu_caps_viewer/
===================================================


===================================[ System / CPU ]
- CPU Name: AMD Phenom(tm) II X4 20 Processor
- CPU Core Speed: 3214 MHz
- CPU Num Cores: 4
- Family: 15 - Model: 4 - Stepping: 2
- Physical Memory Size: 4093 MB
- Operating System: Windows Server 2007 ver.6.1 build 7600 [No Service Pack]
- DirectX Version: 10.0
- PhysX Version: 9091112


===================================[ Graphics Adapter / GPU ]
- OpenGL Renderer: ATI Radeon HD 5800 Series
- Drivers Renderer: ATI Radeon HD 5800 Series
- DB Renderer: ATI Radeon HD 5850
- Device Description: ATI Radeon HD 5800 Series
- Adapter String: ATI Radeon HD 5800 Series
- Vendor: ATI Technologies Inc.
- Vendor ID: 0x1002
- Device ID: 0x6899
- Drivers Version: 8.700.0.0 (1-13-2010) - atig6pxx.dll
- ATI Catalyst Version String:
- ATI Catalyst Release Version String: 8.70-100113a-094252E
- GPU Codename: Cypress
- GPU Unified Shader Processors: 1440
- GPU Vertex Shader Processors: 0
- GPU Pixel Shader Processors: 0
- Video Memory Size: 1024 MB
- BIOS String: 113-585AZNB-10
- Current Display Mode: 1280x1024 @ 60 Hz - 32 bpp


===================================[ OpenGL GPU Capabilities ]
- OpenGL Version: 3.2.9405 Compatibility Profile Context
- GLSL (OpenGL Shading Language) Version: 1.50
- ARB Texture Units: 8
- Vertex Shader Texture Units: 16
- Pixel Shader Texture Units: 16
- Geometry Shader Texture Units: 32
- Max Texture Size: 16384x16384
- Max Anisotropic Filtering Value: X16.0
- Max Point Sprite Size: 8192.0
- Max Dynamic Lights: 8
- Max Viewport Size: 16384x16384
- Max Vertex Uniform Components: 1024
- Max Fragment Uniform Components: 1024
- Max Geometry Uniform Components: 4096
- Max Varying Float: 64
- Max Vertex Bindable Uniforms: 15
- Max Fragment Bindable Uniforms: 15
- Max Geometry Bindable Uniforms: 15
- Frame Buffer Objects (FBO) Support:[yes]
- Multiple Render Targets / Max draw buffers: 8
- Pixel Buffer Objects (PBO) Support:[yes]
- S3TC Texture Compression Support:[yes]
- ATI 3Dc Texture Compression Support:[yes]
- Texture Rectangle Support:[yes]
- Floating Point Textures Support:[no]
- MSAA: 1X
- MSAA: 2X
- MSAA: 4X
- MSAA: 8X
- OpenGL Extensions: 181 extensions
    <li>GL_AMDX_name_gen_delete</li>
    <li>GL_AMDX_random_access_target</li>
    <li>GL_AMDX_vertex_shader_tessellator</li>
    <li>GL_AMD_draw_buffers_blend</li>
    <li>GL_AMD_performance_monitor</li>
    <li>GL_AMD_seamless_cubemap_per_texture</li>
    <li>GL_AMD_shader_stencil_export</li>
    <li>GL_AMD_texture_compression_dxt6</li>
    <li>GL_AMD_texture_compression_dxt7</li>
    <li>GL_AMD_texture_cube_map_array</li>
    <li>GL_AMD_texture_texture4</li>
    <li>GL_AMD_vertex_shader_tessellator</li>
    <li>GL_ARB_blend_func_extended</li>
    <li>GL_ARB_color_buffer_float</li>
    <li>GL_ARB_copy_buffer</li>
    <li>GL_ARB_depth_buffer_float</li>
    <li>GL_ARB_depth_clamp</li>
    <li>GL_ARB_depth_texture</li>
    <li>GL_ARB_draw_buffers</li>
    <li>GL_ARB_draw_buffers_blend</li>
    <li>GL_ARB_draw_elements_base_vertex</li>
    <li>GL_ARB_draw_instanced</li>
    <li>GL_ARB_fragment_coord_conventions</li>
    <li>GL_ARB_fragment_program</li>
    <li>GL_ARB_fragment_program_shadow</li>
    <li>GL_ARB_fragment_shader</li>
    <li>GL_ARB_framebuffer_object</li>
    <li>GL_ARB_framebuffer_sRGB</li>
    <li>GL_ARB_geometry_shader4</li>
    <li>GL_ARB_half_float_pixel</li>
    <li>GL_ARB_half_float_vertex</li>
    <li>GL_ARB_instanced_arrays</li>
    <li>GL_ARB_map_buffer_range</li>
    <li>GL_ARB_multisample</li>
    <li>GL_ARB_multitexture</li>
    <li>GL_ARB_occlusion_query</li>
    <li>GL_ARB_pixel_buffer_object</li>
    <li>GL_ARB_point_parameters</li>
    <li>GL_ARB_point_sprite</li>
    <li>GL_ARB_provoking_vertex</li>
    <li>GL_ARB_sample_shading</li>
    <li>GL_ARB_seamless_cube_map</li>
    <li>GL_ARB_shader_objects</li>
    <li>GL_ARB_shader_texture_lod</li>
    <li>GL_ARB_shading_language_100</li>
    <li>GL_ARB_shadow</li>
    <li>GL_ARB_shadow_ambient</li>
    <li>GL_ARB_sync</li>
    <li>GL_ARB_texture_border_clamp</li>
    <li>GL_ARB_texture_buffer_object</li>
    <li>GL_ARB_texture_compression</li>
    <li>GL_ARB_texture_compression_rgtc</li>
    <li>GL_ARB_texture_cube_map</li>
    <li>GL_ARB_texture_cube_map_array</li>
    <li>GL_ARB_texture_env_add</li>
    <li>GL_ARB_texture_env_combine</li>
    <li>GL_ARB_texture_env_crossbar</li>
    <li>GL_ARB_texture_env_dot3</li>
    <li>GL_ARB_texture_float</li>
    <li>GL_ARB_texture_gather</li>
    <li>GL_ARB_texture_mirrored_repeat</li>
    <li>GL_ARB_texture_multisample</li>
    <li>GL_ARB_texture_non_power_of_two</li>
    <li>GL_ARB_texture_query_lod</li>
    <li>GL_ARB_texture_rectangle</li>
    <li>GL_ARB_texture_rg</li>
    <li>GL_ARB_texture_snorm</li>
    <li>GL_ARB_transpose_matrix</li>
    <li>GL_ARB_uniform_buffer_object</li>
    <li>GL_ARB_vertex_array_bgra</li>
    <li>GL_ARB_vertex_array_object</li>
    <li>GL_ARB_vertex_buffer_object</li>
    <li>GL_ARB_vertex_program</li>
    <li>GL_ARB_vertex_shader</li>
    <li>GL_ARB_window_pos</li>
    <li>GL_ATI_draw_buffers</li>
    <li>GL_ATI_envmap_bumpmap</li>
    <li>GL_ATI_fragment_shader</li>
    <li>GL_ATI_meminfo</li>
    <li>GL_ATI_separate_stencil</li>
    <li>GL_ATI_texture_compression_3dc</li>
    <li>GL_ATI_texture_env_combine3</li>
    <li>GL_ATI_texture_float</li>
    <li>GL_ATI_texture_mirror_once</li>
    <li>GL_EXT_abgr</li>
    <li>GL_EXT_bgra</li>
    <li>GL_EXT_bindable_uniform</li>
    <li>GL_EXT_blend_color</li>
    <li>GL_EXT_blend_equation_separate</li>
    <li>GL_EXT_blend_func_separate</li>
    <li>GL_EXT_blend_minmax</li>
    <li>GL_EXT_blend_subtract</li>
    <li>GL_EXT_compiled_vertex_array</li>
    <li>GL_EXT_copy_buffer</li>
    <li>GL_EXT_copy_texture</li>
    <li>GL_EXT_draw_buffers2</li>
    <li>GL_EXT_draw_instanced</li>
    <li>GL_EXT_draw_range_elements</li>
    <li>GL_EXT_fog_coord</li>
    <li>GL_EXT_framebuffer_blit</li>
    <li>GL_EXT_framebuffer_multisample</li>
    <li>GL_EXT_framebuffer_object</li>
    <li>GL_EXT_framebuffer_sRGB</li>
    <li>GL_EXT_geometry_shader4</li>
    <li>GL_EXT_gpu_program_parameters</li>
    <li>GL_EXT_gpu_shader4</li>
    <li>GL_EXT_histogram</li>
    <li>GL_EXT_multi_draw_arrays</li>
    <li>GL_EXT_packed_depth_stencil</li>
    <li>GL_EXT_packed_float</li>
    <li>GL_EXT_packed_pixels</li>
    <li>GL_EXT_pixel_buffer_object</li>
    <li>GL_EXT_point_parameters</li>
    <li>GL_EXT_provoking_vertex</li>
    <li>GL_EXT_rescale_normal</li>
    <li>GL_EXT_secondary_color</li>
    <li>GL_EXT_separate_specular_color</li>
    <li>GL_EXT_shadow_funcs</li>
    <li>GL_EXT_stencil_wrap</li>
    <li>GL_EXT_subtexture</li>
    <li>GL_EXT_texgen_reflection</li>
    <li>GL_EXT_texture3D</li>
    <li>GL_EXT_texture_array</li>
    <li>GL_EXT_texture_buffer_object</li>
    <li>GL_EXT_texture_buffer_object_rgb32</li>
    <li>GL_EXT_texture_compression_latc</li>
    <li>GL_EXT_texture_compression_rgtc</li>
    <li>GL_EXT_texture_compression_s3tc</li>
    <li>GL_EXT_texture_cube_map</li>
    <li>GL_EXT_texture_edge_clamp</li>
    <li>GL_EXT_texture_env_add</li>
    <li>GL_EXT_texture_env_combine</li>
    <li>GL_EXT_texture_env_dot3</li>
    <li>GL_EXT_texture_filter_anisotropic</li>
    <li>GL_EXT_texture_integer</li>
    <li>GL_EXT_texture_lod</li>
    <li>GL_EXT_texture_lod_bias</li>
    <li>GL_EXT_texture_mirror_clamp</li>
    <li>GL_EXT_texture_object</li>
    <li>GL_EXT_texture_rectangle</li>
    <li>GL_EXT_texture_sRGB</li>
    <li>GL_EXT_texture_shared_exponent</li>
    <li>GL_EXT_texture_snorm</li>
    <li>GL_EXT_texture_swizzle</li>
    <li>GL_EXT_timer_query</li>
    <li>GL_EXT_transform_feedback</li>
    <li>GL_EXT_vertex_array</li>
    <li>GL_EXT_vertex_array_bgra</li>
    <li>GL_IBM_texture_mirrored_repeat</li>
    <li>GL_KTX_buffer_region</li>
    <li>GL_NV_blend_square</li>
    <li>GL_NV_conditional_render</li>
    <li>GL_NV_copy_depth_to_color</li>
    <li>GL_NV_explicit_multisample</li>
    <li>GL_NV_primitive_restart</li>
    <li>GL_NV_texgen_reflection</li>
    <li>GL_SGIS_generate_mipmap</li>
    <li>GL_SGIS_texture_edge_clamp</li>
    <li>GL_SGIS_texture_lod</li>
    <li>GL_SUN_multi_draw_arrays</li>
    <li>GL_WIN_swap_hint</li>
    <li>WGL_ARB_extensions_string</li>
    <li>WGL_ARB_pixel_format</li>
    <li>WGL_ATI_pixel_format_float</li>
    <li>WGL_ARB_pixel_format_float</li>
    <li>WGL_ARB_multisample</li>
    <li>WGL_EXT_swap_control</li>
    <li>WGL_ARB_pbuffer</li>
    <li>WGL_ARB_render_texture</li>
    <li>WGL_ARB_make_current_read</li>
    <li>WGL_EXT_extensions_string</li>
    <li>WGL_ARB_buffer_region</li>
    <li>WGL_EXT_framebuffer_sRGB</li>
    <li>WGL_ATI_render_texture_rectangle</li>
    <li>WGL_EXT_pixel_format_packed_float</li>
    <li>WGL_I3D_genlock</li>
    <li>WGL_NV_swap_group</li>
    <li>WGL_ARB_create_context</li>
    <li>WGL_AMD_gpu_association</li>
    <li>WGL_AMDX_gpu_association</li>
    <li>WGL_ARB_create_context_profile</li>


===================================[ OpenCL Capabilities ]
- Num OpenCL platforms: 1
- Name: ATI Stream
- Version: OpenCL 1.0 ATI-Stream-v2.0.0
- Profile: FULL_PROFILE
- Vendor: Advanced Micro Devices, Inc.
- Num devices: 2

- CL_DEVICE_NAME: AMD Phenom(tm) II X4 20 Processor
- CL_DEVICE_VENDOR: AuthenticAMD
- CL_DRIVER_VERSION: 1.0
- CL_DEVICE_PROFILE: FULL_PROFILE
- CL_DEVICE_VERSION: OpenCL 1.0 ATI-Stream-v2.0.0
- CL_DEVICE_TYPE: CPU
- CL_DEVICE_VENDOR_ID: 0x1002
- CL_DEVICE_MAX_COMPUTE_UNITS: 4
- CL_DEVICE_MAX_CLOCK_FREQUENCY: 3214MHz
- CL_DEVICE_ADDRESS_BITS: 32
- CL_DEVICE_MAX_MEM_ALLOC_SIZE: 524288KB
- CL_DEVICE_GLOBAL_MEM_SIZE: 1024MB
- CL_DEVICE_MAX_PARAMETER_SIZE: 4096
- CL_DEVICE_GLOBAL_MEM_CACHELINE_SIZE: 64 Bytes
- CL_DEVICE_GLOBAL_MEM_CACHE_SIZE: 64KB
- CL_DEVICE_ERROR_CORRECTION_SUPPORT: NO
- CL_DEVICE_LOCAL_MEM_TYPE: Global
- CL_DEVICE_LOCAL_MEM_SIZE: 32KB
- CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE: 64KB
- CL_DEVICE_MAX_WORK_ITEM_DIMENSIONS: 3
- CL_DEVICE_MAX_WORK_ITEM_SIZES: [1024 ; 1024 ; 1024]
- CL_DEVICE_MAX_WORK_GROUP_SIZE: 1024
- CL_EXEC_NATIVE_KERNEL: 4628960
- CL_DEVICE_IMAGE_SUPPORT: NO
- CL_DEVICE_MAX_READ_IMAGE_ARGS: 0
- CL_DEVICE_MAX_WRITE_IMAGE_ARGS: 0
- CL_DEVICE_IMAGE2D_MAX_WIDTH: 0
- CL_DEVICE_IMAGE2D_MAX_HEIGHT: 0
- CL_DEVICE_IMAGE3D_MAX_WIDTH: 0
- CL_DEVICE_IMAGE3D_MAX_HEIGHT: 0
- CL_DEVICE_IMAGE3D_MAX_DEPTH: 0
- CL_DEVICE_MAX_SAMPLERS: 0
- CL_DEVICE_PREFERRED_VECTOR_WIDTH_CHAR: 16
- CL_DEVICE_PREFERRED_VECTOR_WIDTH_SHORT: 8
- CL_DEVICE_PREFERRED_VECTOR_WIDTH_INT: 4
- CL_DEVICE_PREFERRED_VECTOR_WIDTH_LONG: 2
- CL_DEVICE_PREFERRED_VECTOR_WIDTH_FLOAT: 4
- CL_DEVICE_PREFERRED_VECTOR_WIDTH_DOUBLE: 0
- CL_DEVICE_EXTENSIONS: 5
- Extensions:
- cl_khr_global_int32_base_atomics
- cl_khr_global_int32_extended_atomics
- cl_khr_local_int32_base_atomics
- cl_khr_local_int32_extended_atomics
- cl_khr_byte_addressable_store

- CL_DEVICE_NAME: Cypress
- CL_DEVICE_VENDOR: Advanced Micro Devices, Inc.
- CL_DRIVER_VERSION: CAL 1.4.553
- CL_DEVICE_PROFILE: FULL_PROFILE
- CL_DEVICE_VERSION: OpenCL 1.0 ATI-Stream-v2.0.0
- CL_DEVICE_TYPE: GPU
- CL_DEVICE_VENDOR_ID: 0x1002
- CL_DEVICE_MAX_COMPUTE_UNITS: 18
- CL_DEVICE_MAX_CLOCK_FREQUENCY: 765MHz
- CL_DEVICE_ADDRESS_BITS: 32
- CL_DEVICE_MAX_MEM_ALLOC_SIZE: 262144KB
- CL_DEVICE_GLOBAL_MEM_SIZE: 256MB
- CL_DEVICE_MAX_PARAMETER_SIZE: 1024
- CL_DEVICE_GLOBAL_MEM_CACHELINE_SIZE: 0 Bytes
- CL_DEVICE_GLOBAL_MEM_CACHE_SIZE: 0KB
- CL_DEVICE_ERROR_CORRECTION_SUPPORT: NO
- CL_DEVICE_LOCAL_MEM_TYPE: Local (scratchpad)
- CL_DEVICE_LOCAL_MEM_SIZE: 32KB
- CL_DEVICE_MAX_CONSTANT_BUFFER_SIZE: 64KB
- CL_DEVICE_MAX_WORK_ITEM_DIMENSIONS: 3
- CL_DEVICE_MAX_WORK_ITEM_SIZES: [256 ; 256 ; 256]
- CL_DEVICE_MAX_WORK_GROUP_SIZE: 256
- CL_EXEC_NATIVE_KERNEL: 4628960
- CL_DEVICE_IMAGE_SUPPORT: NO
- CL_DEVICE_MAX_READ_IMAGE_ARGS: 0
- CL_DEVICE_MAX_WRITE_IMAGE_ARGS: 0
- CL_DEVICE_IMAGE2D_MAX_WIDTH: 0
- CL_DEVICE_IMAGE2D_MAX_HEIGHT: 0
- CL_DEVICE_IMAGE3D_MAX_WIDTH: 0
- CL_DEVICE_IMAGE3D_MAX_HEIGHT: 0
- CL_DEVICE_IMAGE3D_MAX_DEPTH: 0
- CL_DEVICE_MAX_SAMPLERS: 0
- CL_DEVICE_PREFERRED_VECTOR_WIDTH_CHAR: 16
- CL_DEVICE_PREFERRED_VECTOR_WIDTH_SHORT: 8
- CL_DEVICE_PREFERRED_VECTOR_WIDTH_INT: 4
- CL_DEVICE_PREFERRED_VECTOR_WIDTH_LONG: 2
- CL_DEVICE_PREFERRED_VECTOR_WIDTH_FLOAT: 4
- CL_DEVICE_PREFERRED_VECTOR_WIDTH_DOUBLE: 0
- CL_DEVICE_EXTENSIONS: 4
- Extensions:
- cl_khr_global_int32_base_atomics
- cl_khr_global_int32_extended_atomics
- cl_khr_local_int32_base_atomics
- cl_khr_local_int32_extended_atomics


===================================[ Misc. ]


===================================[ Related Graphics Drivers ]
- http://www.geeks3d.com/?page_id=752
- http://downloads.guru3d.com/download.php?id=18
- http://www.tweakguides.com/ATICAT_1.html


===================================[ Related Graphics Cards Reviews ]

13
General Discussion / Sailfish OpenCL fluid simulation
« on: January 22, 2010, 11:23:05 AM »
Hi JeGX,

Have you looked at the OpenCL fluid simulation Sailfish before?
http://sailfish.us.edu.pl/index.html

It uses pyopencl!

I have been hacking away to get this running under Windows. I have installed Python and all the required modules, and compiled pyopencl against the ATI Stream 2.0 SDK.

I had to make a few modifications to Sailfish as it was designed for Linux. This included removing a hardcoded path, and adding support for the OpenCL ICD (selecting an OpenCL platform...).
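For anyone trying the same thing, the platform-selection step can be sketched roughly as below. This is not the actual Sailfish change: `pick_platform` and `make_context` are hypothetical names, and the only pyopencl calls assumed are `get_platforms()`, `Platform.get_devices()` and `Context(devices)`.

```python
def pick_platform(platforms, name_hint=None):
    """Return the first platform whose name contains name_hint.

    Hypothetical helper: `platforms` is any sequence of objects with a
    `.name` attribute, such as the list returned by pyopencl.get_platforms().
    The match is case-insensitive; with no hint the first platform is used.
    """
    if not platforms:
        raise RuntimeError("no OpenCL platforms found (is an ICD installed?)")
    if name_hint is None:
        return platforms[0]
    for platform in platforms:
        if name_hint.lower() in platform.name.lower():
            return platform
    raise LookupError("no OpenCL platform matching %r" % (name_hint,))


def make_context(name_hint=None):
    """Build a context on the chosen platform (needs pyopencl installed)."""
    import pyopencl as cl  # imported lazily so pick_platform works without it

    platform = pick_platform(cl.get_platforms(), name_hint)
    # All devices of the platform; on a CPU-only setup this is just the CPU.
    return cl.Context(platform.get_devices())
```

With the ATI Stream SDK installed, something like `make_context("ATI")` should then pick the "ATI Stream" platform, although I have only reasoned this through rather than run it against that SDK.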

I have had mixed results.
The example scripts compile fine and run with no exceptions, yet I get only black pixels, and the number of frames simulated per second is impossibly high. I have the feeling the kernel is not running, but no error seems to be generated...
Also, the ATI compiler always returns warnings that the -cl-single-precision-constant and -cl-fast-relaxed-math compiler options are not supported, despite both being in the OpenCL specification...

But I can get the simulation to work perfectly and produce the red pixels if I run it via clprofile.exe! It probably runs slower though thanks to all the logging to stdout and the CSV file...
Why does profiling make it work?!?

Also the 3D simulations always fail with the following error:
Code: [Select]
Traceback (most recent call last):
  File "c:\sailfish\examples\lbm_ldc_3d.py", line 54, in <module>
    sim.run()
  File "C:\python26\lib\site-packages\sailfish\lbm.py", line 715, in run
    self.vis.main()
  File "C:\python26\lib\site-packages\sailfish\vis2d.py", line 312, in main
    self.sim.sim_step(self._tracers)
  File "C:\python26\lib\site-packages\sailfish\lbm.py", line 532, in sim_step
    self.backend.run_kernel(kerns[1], self.kern_grid_size)
  File "C:\python26\lib\site-packages\sailfish\backend_opencl.py", line 54, in run_kernel
    cl.enqueue_nd_range_kernel(self.queue, kernel, global_size, kernel.block)
pyopencl.LogicError: enqueue_nd_range_kernel failed: invalid value - global/work work sizes have differing dimensions

I have traced this back, possibly, to line 517 in lbm.py:
Code: [Select]
self.kern_grid_size = (self.options.lat_nx/self.block_size * self.options.lat_ny, self.options.lat_nz)
but if I change this to:
Code: [Select]
self.kern_grid_size = (self.options.lat_nx/self.block_size, self.options.lat_ny, self.options.lat_nz)
the script runs but I get black pixels (even when inside the profiler).
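To make the dimension mismatch concrete, here is a minimal sketch of the check OpenCL applies to an ND-range: the global and local sizes need the same number of dimensions, and in OpenCL 1.0 each global size must be a multiple of the matching local size. `nd_range_problems` is a hypothetical helper, not part of Sailfish or pyopencl, and the three-element block `(64, 1, 1)` is only an assumption about what `kernel.block` might hold.

```python
def nd_range_problems(global_size, local_size):
    """List the reasons an OpenCL ND-range would be rejected.

    Hypothetical helper for illustration only. An empty list means the
    two sizes are consistent with each other.
    """
    problems = []
    if len(global_size) != len(local_size):
        problems.append(
            "global size has %d dimension(s), local size has %d"
            % (len(global_size), len(local_size)))
        return problems  # per-dimension checks are meaningless here
    for axis, (g, l) in enumerate(zip(global_size, local_size)):
        if l <= 0:
            problems.append("local size on axis %d must be positive" % axis)
        elif g % l != 0:
            problems.append(
                "global size %d on axis %d is not a multiple of local size %d"
                % (g, axis, l))
    return problems


# Assumed example values: a 64x64x64 lattice and a block of 64 work-items.
lat_nx, lat_ny, lat_nz, block_size = 64, 64, 64, 64
block = (block_size, 1, 1)  # assumption about kernel.block

# Two-element grid, as on the original line 517.
grid_2d = (lat_nx // block_size * lat_ny, lat_nz)
# Three-element grid, as in the modified line.
grid_3d = (lat_nx // block_size, lat_ny, lat_nz)

print(nd_range_problems(grid_2d, block))
print(nd_range_problems(grid_3d, block))
```

Under these assumed values the two-element grid fails the dimension check, while the three-element grid passes it but has an x size that is not a multiple of the block size. Whether Sailfish counts its grid in blocks or in work-items is not something I have confirmed, so treat the second result as a hint of where to look rather than a diagnosis.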

Could you try this out on your setup and see what you get?
I have attached the modified python files.

In case it is any help I have attached a GPU Caps Viewer XML file.

Also:
Latest 32-bit Python 2.6, and the latest versions of all required modules.
Compiled pyopencl with Visual Studio 2008 Professional SP1.
I am only running OpenCL on CPU as I have not got a graphics card for my new system yet.

Leith
