Great news for the OpenCL community: Intel has released an alpha version of its OpenCL SDK.
Intel® OpenCL* SDK is an Alpha software release. It is an implementation of the OpenCL* 1.1 standard optimized for Intel® Core™ processors, running on Microsoft* Windows* 7 and Windows Vista* operating systems. This Alpha software brings OpenCL* for the CPU in support of OpenCL developers desiring CPU advantages found on many OpenCL* workloads. OpenCL* language and Application Programming Interface (API) enables you to seamlessly take advantage of the Intel® Core™ processor benefits such as Intel® Streaming SIMD Extensions (Intel® SSE) utilization and Multi-Core scalability.
The SDK comes with four samples, including two graphics-related demos: GodRays and MedianFilter.
More information: Intel OpenCL SDK.
Download: Download Intel OpenCL SDK.
God Rays: this sample demonstrates how to use high dynamic range (HDR) rendering with a God Rays (crepuscular rays) effect in OpenCL*. This implementation optimizes rendering passes by sharing intermediate data between pixels during pixel processing, which improves the method's performance and reduces data loads.
Median filter: this sample demonstrates how to use a median filter in OpenCL*. This implementation optimizes the filtering process using implicit Single Instruction Multiple Data (SIMD) code vectorization performed by the built-in OpenCL compiler vectorizer. Data-level parallelism of the underlying algorithm results in additional performance gain. The sample improves the performance of the method and reduces data loads.
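To make the median filter idea concrete, here is a minimal scalar sketch in Python. It is not the code from Intel's sample, which is not reproduced here; the function name `median_filter_3x3`, the 3x3 window size, and the border handling are all assumptions chosen for illustration. It shows what the filter computes per pixel. Because each output pixel depends only on its own small neighborhood, the work is data-parallel, which is the property a vectorizing compiler can exploit.

```python
from statistics import median_low


def median_filter_3x3(image):
    """Return a new image where each pixel is the median of its 3x3 neighborhood.

    `image` is a list of equally long rows of numbers. At the borders the
    window is clipped to the image (an assumption made for this sketch).
    """
    height = len(image)
    width = len(image[0]) if height else 0
    result = []
    for y in range(height):
        row = []
        for x in range(width):
            # Gather the neighbors that actually exist around (x, y).
            window = [
                image[ny][nx]
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
            ]
            # median_low always returns a value present in the window,
            # even when a clipped window has an even number of samples.
            row.append(median_low(window))
        result.append(row)
    return result


# A flat image with one "hot" pixel: the filter removes the outlier.
noisy = [
    [10, 10, 10],
    [10, 255, 10],
    [10, 10, 10],
]
filtered = median_filter_3x3(noisy)
print(filtered)
```

Every iteration of the inner loop is independent of the others, so the same computation could be mapped to one OpenCL work-item per pixel.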
Quick test: Intel OpenCL SDK tested on NVIDIA and ATI platforms:
GPU Caps Viewer displaying Intel OpenCL information
well… wow! very nice!
wow is right!… I wasn't expecting this anytime soon!
but looking at Stefan's tests… I think I'll wait till it's actually working with NVIDIA OpenCL
Looks like OpenCL will spread widely, since most of us have Intel CPUs ;).
It does not work on the Intel Q6600 platform because that CPU does not have SSE4.1.
I wonder whether the lack of support for older processors, specifically the original Merom-generation Core 2 Duos with SSSE3, and for Windows XP, the constant elephant in the room, is only because the SDK is still in Alpha, or whether it is a permanent design decision. Certainly, the higher clock speed Conroe processors with 1333MHz FSB and Kentsfield Core 2 Quads still have plenty of life left and probably still account for a large installed base. And Windows XP still has greater than 50% market share.
It is good that even though they are slow to release OpenCL drivers, they are going directly to OpenCL 1.1. Hopefully, they are looking to do some interesting CPU/IGP integration for Sandy Bridge since having the CPU and GPU on the same die and sharing the L3 cache should be beneficial for OpenCL.
Well, unlike AMD’s implementation, it supports images.
SSE4.x contains some quite useful instructions for OpenCL programming, like DOT product, SAD, blending and WC register loading.
On top of this, all the SSE4.x processors have twice the hardware DIV throughput, compared to the previous generations from Intel.
AMD got OpenCL running on SSE2…
Seems like AMD are better at compiling OpenCL than Intel…
Of course this could just be Intel stopping people from using their compilers on AMD… (I bet they put a check in for Intel-only CPUID too…)
Anyone done a comparison bench of the AMD and Intel compiler on the same CPU yet?
“Seems like AMD are better at compiling OpenCL than Intel…”
Intel can do auto-vectorization. How is AMD's scalar SSE output better than that?
And when AVX lands in the consumer market, this will be of key importance.
@Leith Bade
But Intel doesn't hamstring its OpenCL implementation with artificially low global memory amounts like AMD does. Look above: 2GB for Intel but only 1GB for AMD. They do the same thing for OpenCL on the GPU as well: 512MB. WTH, my card has 1GB, why do I only get to use half?
I ran a benchmark in Sandra 11.
Intel:
SiSoftware Sandra
Benchmark Results
Aggregate Shader Performance : 14.28MPix/s
Native Float Shaders : 28MPix/s
Native Double Shaders : 7.27MPix/s
Interface : OpenCL
Results Interpretation : Higher scores are better.
Performance vs. Speed
Aggregate Shader Performance : 5.36kPix/s/MHz
Native Float Shaders : 10.53kPix/s/MHz
Native Double Shaders : 2.73kPix/s/MHz
Results Interpretation : Higher scores are better.
Performance Test Status
Result ID : GenuineIntel (4C 2.67GHz, 2GB)
Platform Compliance : x86
SMP (Multi-Processor) Benchmark : No
Total Test Threads : 1
System Timer : 2.6MHz
Rendered Image Size : 1920×1080
Graphics Processor
Model : GenuineIntel
Interface Version : 1.01
Cores per Processor : 1 Unit(s)
Shader Speed : 2.67GHz
Peak Processing Performance (PPP) : 21.33GFLOPS
Adjusted Peak Performance (APP) : 6.4WG
Total Memory : 2GB
Performance Enhancing Tips
Warning 338 : You may need to enable a monitor on secondary GPGPU adapters for multi-GPU support! This is a driver limitation.
Notice 5008 : To change benchmarks, click Options.
Notice 5004 : Synthetic benchmark. May not tally with ‘real-life’ performance.
Notice 5006 : Only compare the results with ones obtained using the same version!
Tip 2 : Double-click tip or press Enter while a tip is selected for more information about the tip.
AMD:
SiSoftware Sandra
Benchmark Results
Aggregate Shader Performance : 26.74MPix/s
Native Float Shaders : 35.86MPix/s
Native Double Shaders : 20MPix/s
Interface : OpenCL
Results Interpretation : Higher scores are better.
Performance vs. Speed
Aggregate Shader Performance : 10.03kPix/s/MHz
Native Float Shaders : 13.45kPix/s/MHz
Native Double Shaders : 7.48kPix/s/MHz
Results Interpretation : Higher scores are better.
Performance Test Status
Result ID : Intel(R) Core(TM)2 Quad CPU Q9400 @ 2.66GHz (4C 2.67GHz, 1GB)
Platform Compliance : x86
SMP (Multi-Processor) Benchmark : No
Total Test Threads : 1
System Timer : 2.67GHz
Rendered Image Size : 1920×1080
Graphics Processor
Model : Intel(R) Core(TM)2 Quad CPU Q9400 @ 2.66GHz
Interface Version : 1.01
Cores per Processor : 1 Unit(s)
Shader Speed : 2.67GHz
Peak Processing Performance (PPP) : 21.33GFLOPS
Adjusted Peak Performance (APP) : 6.4WG
Total Memory : 1GB
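For readers who want the two Sandra runs side by side, the short Python sketch below divides the AMD scores by the Intel scores. The numbers are copied from the two result listings above; the dictionary and function names are only illustrative. As Sandra itself notes, this is a synthetic benchmark, so the ratios say nothing certain about other workloads.

```python
# Scores copied from the two Sandra result listings (MPix/s, higher is better).
intel_scores = {"aggregate": 14.28, "float": 28.0, "double": 7.27}
amd_scores = {"aggregate": 26.74, "float": 35.86, "double": 20.0}


def ratios(numerator, denominator):
    """Return numerator[k] / denominator[k] for every shared key."""
    return {k: numerator[k] / denominator[k] for k in numerator if k in denominator}


amd_vs_intel = ratios(amd_scores, intel_scores)
for name, value in amd_vs_intel.items():
    print(f"{name}: AMD implementation is {value:.2f}x the Intel score")
# aggregate: AMD implementation is 1.87x the Intel score
# float: AMD implementation is 1.28x the Intel score
# double: AMD implementation is 2.75x the Intel score
```

On these numbers the gap is smallest for float shaders and largest for double shaders.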
No can do on 64bit, stupid Windows…
I did some more benchmarks and tests
on 2 devices: Intel Q9400 2.66 GHz
and NVIDIA GeForce GTX 280,
on 3 implementations (imp.): INTEL_CPU, AMD_CPU, Nvidia_GPU.
1) AMD OpenCL samples
-- mandelbrot
Intel imp. 146 FPS
AMD imp. 160 FPS
Nvidia imp. 290 FPS (uses only 50% of the GPU)
-- constantbandwidth
Intel imp.
single static index 45.3816 GB/s
dynamic index 30.5775 GB/s
linear 7.38537 GB/s
AMD imp.
single static index 52.0025 GB/s
dynamic index 51.9977 GB/s
linear 51.5341 GB/s
-- nbody
Nvidia 340 FPS (50% of the GPU)
Intel 390 FPS
AMD 290 FPS
-- simplegl
Intel 90 FPS
Nvidia 360 FPS
AMD 91 FPS
-- URNGNoiseGL
Intel 24 FPS
Nvidia 202 FPS
AMD 28 FPS
-- pcibandwidth
Intel 1.48693
Nvidia 1.61311
AMD 1.52506
2) ratGPU Version 0.4.5e for Windows (x86)
AMD
http://www.xnormal.net/ratGPU/verify.aspx?k=DC28CF10CF442F3D46FC5F0B1670FC1CC85695B3988B32FAC5CF9BA3A74D3575433A3C0B16ACA78EF6BE95976F3C771F671B0608EEB4F9F3E6D2D1536645516614382CAABB8C9D9FE6C43122031405765556BFACECF4B7F6DB2F231405727A5747BAAD830E7C6287A4603D333EB02A311675374F
INTEL
http://www.xnormal.net/ratGPU/verify.aspx?k=C4CD5C10CF442F3D46FC5F0B0E956F1C5B5695B3800794ADC55CFAAB9D70704A1707290F15CFD18E85F0A6976D614C582211070EEEF0858AE0D2C734250B0776635F503F2D13D695B1ECC2EF817B401675266A
NVIDIA
http://www.xnormal.net/ratGPU/verify.aspx?k=1CF71F70AF248F9DA61CFFABD6AF2C7C1836F5135E3C85D32520190C052D31CFD688FA9F9FADDD343F7724140C06E6F387FF9D82CBCE352E0773614D42B7A59698F9E9DACC25A89E8C24020243519812C4D3167520B9
classic CPU symphony of c++
http://www.xnormal.net/ratGPU/verify.aspx?k=1C280264BB288389920083BFD6703148053AE1EF58A3709A19036F44262C2E2ADEBCEE95E4C0CB225A4A41233B402BA3BBEFD2BCBAD8555F0A0665071219EDBAE8EC98FECF2031730A70655657A8D98A89E2EBD8B8585B1A777B5740B1AE8689FBEBDDD7BAA89E5B1034515F6A04F6CD16753735
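As a rough way to read the FPS figures from the AMD sample runs in part 1, the Python sketch below picks the fastest implementation per sample and expresses each result relative to the Intel implementation. The numbers are copied from the comment; the variable and function names are made up for this illustration, and the constantbandwidth and pcibandwidth results are left out because they are not FPS values.

```python
# FPS per sample and implementation, copied from the benchmark comment.
fps = {
    "mandelbrot": {"Intel": 146, "AMD": 160, "Nvidia": 290},
    "nbody": {"Intel": 390, "AMD": 290, "Nvidia": 340},
    "simplegl": {"Intel": 90, "AMD": 91, "Nvidia": 360},
    "URNGNoiseGL": {"Intel": 24, "AMD": 28, "Nvidia": 202},
}


def fastest(results):
    """Return the implementation name with the highest FPS."""
    return max(results, key=results.get)


def relative_to(results, baseline):
    """Return each implementation's FPS divided by the baseline's FPS."""
    return {name: value / results[baseline] for name, value in results.items()}


winners = {sample: fastest(results) for sample, results in fps.items()}
for sample, results in fps.items():
    rel = relative_to(results, "Intel")
    summary = ", ".join(f"{name} {value:.2f}x" for name, value in rel.items())
    print(f"{sample}: fastest = {winners[sample]} ({summary})")
```

On these figures the GPU leads in three of the four samples, while the Intel CPU implementation is fastest in nbody, where the commenter notes the GPU was only about 50% utilized.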