1 – oclHashcat Overview
A new version of oclHashcat is available. oclHashcat is the GPU-accelerated version of Hashcat, an MD5 password cracker, and it can use up to 16 GPUs. It comes in two versions: OpenCL (oclHashcat) and CUDA (cudaHashcat). The OpenCL version seems to be limited to Radeon cards: I tried it on the GTX 580 and here is the error message:
So there is no apples-to-apples comparison here, only OpenCL / Radeon vs CUDA / GeForce… I wonder why oclHashcat’s OpenCL support is not enabled on GeForce boards.
Here are the main features of Hashcat:
- Free
- Multi-GPU (up to 16 gpus)
- Multi-Hash (up to 24 million hashes)
- Multi-OS (Linux and Windows native binaries)
- Multi-Platform (OpenCL and CUDA support)
- Multi-Algo (MD4, MD5, SHA1, DCC, NTLM, MySQL, …)
- Fastest multihash MD5 cracker on NVidia cards
- Fastest multihash MD5 cracker on ATI 5xxx cards
- Supports wordlists (not limited to Brute-Force / Mask-Attack)
- Combines Dictionary-Attack with Mask-Attack to launch a Hybrid-Attack
- Runs very cautiously: you can still watch movies or play games while cracking
- Supports pause / resume
- The first and only GPU-based Fingerprint-Attack engine
- Includes hashcat’s entire rule engine to modify wordlists on start
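To make the Hybrid-Attack item in the list above more concrete, here is a minimal CPU-only sketch in Python. It is not how oclHashcat is implemented: the `CHARSETS` table, the `expand_mask` and `hybrid_attack` helpers, the tiny wordlist, and the choice to put the mask part before the dictionary word are all assumptions made for illustration. Only the `?l` (lowercase letter) and `?d` (digit) mask placeholders are modelled.

```python
import hashlib
from itertools import product
from string import ascii_lowercase

# Hypothetical charset table: only the ?l and ?d placeholders are modelled.
CHARSETS = {"l": ascii_lowercase, "d": "0123456789"}


def expand_mask(mask):
    """Yield every string that matches a mask such as '?l?d'."""
    # '?l?d'.split('?') gives ['', 'l', 'd']; drop the empty first item.
    keys = mask.split("?")[1:]
    pools = [CHARSETS[key] for key in keys]
    for combo in product(*pools):
        yield "".join(combo)


def hybrid_attack(target_md5, words, mask):
    """Try every mask-string + word candidate against one MD5 hex digest."""
    for prefix in expand_mask(mask):
        for word in words:
            candidate = prefix + word
            if hashlib.md5(candidate.encode()).hexdigest() == target_md5:
                return candidate
    return None


if __name__ == "__main__":
    words = ["password", "letmein", "dragon"]      # stand-in wordlist
    target = hashlib.md5(b"x7dragon").hexdigest()  # hash we pretend to recover
    print(hybrid_attack(target, words, "?l?d"))
```

The number of candidates is the mask keyspace multiplied by the wordlist size, so a four-letter `?l?l?l?l` mask alone contributes 26^4 = 456,976 combinations per word. That is the kind of workload a GPU spreads across its stream processors.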
2 – oclHashcat OpenCL / CUDA Tests
oclHashcat 0.2.4 requires ATI Stream v2.3 for Radeon HD 6000 Series support. Just install Catalyst 10.12 APP and you’re set.
Graphics drivers used:
– Radeon driver: Catalyst 10.12 (APP version for OpenCL support)
– GeForce driver: R266.58
Graphics cards tested:
– ATI Radeon HD 5870 reference board
– EVGA GTX 580 Superclocked
– SAPPHIRE HD 6970
– NVIDIA GeForce GTX 480 reference board
Tests
I launched cudaExample.cmd on the GeForce boards and oclExample.cmd on the Radeon boards.
Here are the results (GPU speed):
– one GTX 480: 1041M c/s
– one HD 5870: 1211M c/s
– one GTX 580: 1217M c/s
– two GTX 480: 1457M c/s
– one HD 6970: 1575M c/s
– two HD 6970: 2520M c/s
Radeon HD 6970 single GPU
The performance of the HD 6970 is very good (and I’m sure we’ll see a performance boost with future drivers). According to this test, one Cayman is faster than two GF100.
The Radeon HD 6970 seems to be the card of choice for password crackers 😉
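To put the multi-GPU numbers in perspective, here is a small Python sketch that computes the dual-GPU speedup and the scaling efficiency from the speeds listed above. The `RESULTS` table and the `scaling` helper are names made up for this example; "ideal" here simply means twice the single-card speed.

```python
# Single-GPU and dual-GPU speeds in M c/s, copied from the results above.
RESULTS = {
    "GTX 480": (1041, 1457),
    "HD 6970": (1575, 2520),
}


def scaling(single, dual):
    """Return (speedup, efficiency) of a two-GPU run versus one GPU."""
    speedup = dual / single
    efficiency = speedup / 2  # 1.0 would mean perfect 2x scaling
    return speedup, efficiency


if __name__ == "__main__":
    for card, (single, dual) in RESULTS.items():
        speedup, efficiency = scaling(single, dual)
        print(f"{card}: {speedup:.2f}x speedup, {efficiency:.0%} of ideal")
```

With these figures, the two GTX 480 reach about 1.40x (70% of ideal) and the two HD 6970 reach 1.60x (80% of ideal). Neither setup scales linearly in this test.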
For reference, here are some GFLOPS figures (source):
– GTX 580: 1581 GFLOPS
– HD 6970: 2703 GFLOPS
– HD 5870: 2720 GFLOPS
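One way to read these GFLOPS figures next to the oclHashcat results is to divide each card’s measured speed by its GFLOPS rating. The Python sketch below does only that arithmetic. The `FIGURES` table and the `speed_per_gflops` helper are names made up for this example, and the ratio is an illustrative number, not an established benchmark metric.

```python
# card: (oclHashcat speed in M c/s, GFLOPS figure), both taken from this article.
FIGURES = {
    "GTX 580": (1217, 1581),
    "HD 6970": (1575, 2703),
    "HD 5870": (1211, 2720),
}


def speed_per_gflops(speed_mcs, gflops):
    """Return million candidates per second delivered per GFLOPS."""
    return speed_mcs / gflops


if __name__ == "__main__":
    for card, (speed, gflops) in FIGURES.items():
        ratio = speed_per_gflops(speed, gflops)
        print(f"{card}: {ratio:.2f} M c/s per GFLOPS")
```

The ratios differ a lot between cards, which suggests that the GFLOPS rating alone is a poor predictor of hash-cracking speed. MD5 is integer and bitwise work, so floating-point throughput is only a rough proxy.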
And here is the GPU usage of the HD 6970 CF:
Radeon HD 6970 CrossFire – GPU usage under oclHashCat
This test of oclHashcat was interesting because it taught me that CrossFire must be enabled to take advantage of several Radeon GPUs in OpenCL. For regular 3D this is a normal requirement, but we are talking about GPU computing. In OpenCL, each GPU is a compute device, so a system with two Radeon HD 6970 should expose two compute devices. With NVIDIA this is the case: no matter the SLI state, if you have two GPUs (let’s say two GTX 480 in our case), NVIDIA OpenCL or CUDA will see two compute devices:
GPU Caps Viewer – two GTX 480, two OpenCL compute devices, SLI disabled
With AMD’s OpenCL implementation, here are the compute devices detected (two HD 6970 in the rig) when CrossFire is disabled:
GPU Caps Viewer – two HD 6970, one GPU OpenCL compute device, CF disabled
As you can see, AMD’s OpenCL sees only one GPU compute device (the second device is a CPU compute device; keep in mind that AMD offers both GPU and CPU OpenCL support). To see two GPU compute devices, CrossFire must be enabled:
GPU Caps Viewer – two HD 6970, two GPU OpenCL compute devices, CF enabled
That explains why oclHashcat used only one GPU when CrossFire was disabled.
I also did a test with a HD 6970 + HD 5870 but, as expected, only the Cayman compute device was recognized by AMD’s OpenCL…
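If you want to check how many compute devices your own OpenCL runtime exposes without GPU Caps Viewer, a short script can do it. This Python sketch assumes the third-party pyopencl package is installed. The `list_compute_devices` and `count_gpus` helpers are names made up for this example, and the output depends entirely on the driver in use.

```python
def list_compute_devices():
    """Return (platform name, device name, kind) for every OpenCL device."""
    import pyopencl as cl  # third-party package, assumed to be installed

    found = []
    for platform in cl.get_platforms():
        for device in platform.get_devices():
            # device.type is a bitfield; test the GPU bit.
            kind = "GPU" if device.type & cl.device_type.GPU else "other"
            found.append((platform.name, device.name, kind))
    return found


def count_gpus(devices):
    """Count the entries that were reported as GPU compute devices."""
    return sum(1 for _, _, kind in devices if kind == "GPU")


if __name__ == "__main__":
    try:
        devices = list_compute_devices()
    except Exception as exc:  # pyopencl missing, or no OpenCL platform found
        print(f"Could not query OpenCL: {exc}")
    else:
        for platform_name, device_name, kind in devices:
            print(f"{platform_name}: {device_name} ({kind})")
        print(f"GPU compute devices: {count_gpus(devices)}")
```

On the two-HD 6970 rig described here, a count of one GPU compute device with CrossFire disabled and two with it enabled would match the screenshots.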
Got 296.5 on my office’s 8800GT! xD
This shows the RAW performance of AMD GPU cards when used correctly and not via a crippled closed API like CUDA
162,7M/s on HD4670 with BOINC doing other things on CPU 😉
doesn’t show us opencl performance on nvidia tech though
1301.8M/s with my two gtx 460
– one GTX 480: 1041M c/s
– two GTX 480: 1457M c/s
-> 1.4 ratio
This seems to be very improvable.
So I don’t think that ATi/NV comparisons can be made with such results … (esp. cause of missing OpenCL support on NV, AFAIK tests showed NV can be faster there versus ATi)
By the way… since my second gtx 460 is 10-15 degrees lower in temps than my first, I overclocked the second card a bit more.
Testing individual cards I get 675M/s on the first
(clocked at 820/1640/2020)
and 754M on the second
(clocked at 880/1760/2020)
It would be nice to have a program that maximizes computing speed using the 2 cuda cards along with 8 cpu threads 🙂
@jK – actually at password recovery/cracking algorithms Radeon stream processor architecture is the most powerful. AES, SHA-1, MD5, all those are clear AMD victory. OpenCL or not.
Getting 1305M on my 570@850Mhz SLI, but the cards operate at ~50% each, so there’s something wrong really.
psolord: use -d 1,2 😉
Zibri, I edited the .cmd file to read like this
cudaHashcat64.exe -d 1,2 example.hash ?l?l?l?l example.dict
pause
Is this correct?
It still runs at 50% per gpu! :S
DirectCompute support would be nice so we wouldn’t have to mess up with stream sdk…
Got 1134M/s on my 4850X2 @ 725MHz
Try plugging a monitor into the second card and extending the desktop. Windows disables “unused” GPUs. Then you should see the second card, even in a 5970+6970 scenario.
Got a nice round 1.300M on my 5870 running the recently leaked Catalyst 11.1a Hotfix Update drivers. 🙂
Given that Cayman XT has a combined 24.2% processor/clock advantage, a 21% performance improvement in this test (comparing the posted results and my own) doesn’t seem that bad, but it isn’t earth-shattering either.
The app is nice, but OCL on Radeon isn’t the best choice here. A lot of horsepower is lost with OCL on Radeons, and there are faster apps. IGHASHGPU (written by Ivan Golubev) works with both CUDA (GF) and CAL (Radeon). Radeons are more powerful there as CAL is more efficient than OCL. The same applies to IGRARGPU, E.W.S.A. (Elcomsoft wifi soft) or Accent Office Password Recovery. I guess Radeons are the clear choice for cryptography tasks (and heavy arithmetic) while GF are better suited to general-purpose work.
yes but you compare apples and oranges. ighashgpu is a brute force only cracker.
oclHashcat is a wordlist based cracker that can optionally do brute force cracking.
modern hashing algorithms are salted and heavily iterated, which means the future of password cracking is wordlist based, especially in gpgpu. that is why the author of the article made a good choice in showing a modern cracker like oclHashcat.
@arturis
“yes but you compare apples and oranges.”
Just like comparing an OpenCL-based app for Radeon and a CUDA-based one for NV. We all know OpenCL-based code is slower than native solutions (on GPUs, not always on CPUs). All I wanted to say is that AMD architectures (even old ones like RV670 or RV770) are best suited to cryptography-related tasks, even when you compare apps written in a native, close-to-hardware API. The only test which might give a direct comparison is the GPCBenchmarkOCL SHA-1 test (cryptography), and IIRC it clearly shows a slight advantage for AMD there too. So what in my comment above do you disagree with?
Check the hashcat forum for more news and support guys.
It’s http://hashcat.net/forum/
Yes, AMD cards are way faster than NV cards for password cracking on GPU. My 5770 does 3.3 billion NTLM.