Download here.
This slide deck contains more comprehensive information about Kepler, such as:
New Instruction: SHFL
Data exchange between threads within a warp
Avoids use of shared memory
One 32-bit value per exchange
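As a rough illustration of the points above, the sketch below sums the 32 values held by one warp using only shuffles, with no shared memory. It assumes the CUDA C intrinsic `__shfl_down_sync`, which is how current toolkits expose SHFL (Kepler-era toolkits used the un-suffixed `__shfl_down`), and a device of compute capability 3.0 or higher. The names `warp_sum_kernel` and `warp_sum_demo` are hypothetical and not taken from the slides.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Sum the 32 lane values of one warp using register-to-register shuffles.
__global__ void warp_sum_kernel(const int *in, int *out) {
    int lane = threadIdx.x;          // launched with one warp, so thread index == lane
    int v = in[lane];
    // Each step adds the 32-bit value held by the lane `offset` positions higher.
    for (int offset = 16; offset > 0; offset >>= 1)
        v += __shfl_down_sync(0xffffffffu, v, offset);
    if (lane == 0) *out = v;         // lane 0 ends up holding the full sum
}

// Hypothetical host helper: fills 0..31, runs a single warp, returns the sum.
int warp_sum_demo() {
    int h_in[32], h_out = -1;
    for (int i = 0; i < 32; ++i) h_in[i] = i;
    int *d_in = nullptr, *d_out = nullptr;
    cudaMalloc(&d_in, sizeof(h_in));
    cudaMalloc(&d_out, sizeof(int));
    cudaMemcpy(d_in, h_in, sizeof(h_in), cudaMemcpyHostToDevice);
    warp_sum_kernel<<<1, 32>>>(d_in, d_out);
    cudaMemcpy(&h_out, d_out, sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);
    return h_out;
}

int main() {
    printf("warp sum = %d\n", warp_sum_demo());
    return 0;
}
```

Each shuffle moves one 32-bit value per thread, so exchanging a 64-bit quantity this way would take two shuffles (or one of the wider overloads in newer toolkits, which split the value internally).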
ATOM instruction enhancements
Added int64 functions to match existing int32
2 – 10x performance gains
Shorter processing pipeline
More atomic processors
Slowest 10x faster
Fastest 2x faster

Atom Op      int32  int64
add          x      x
cas          x      x
exch         x      x
min/max      x      X
and/or/xor   x      X
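To make the int64 column of the table concrete, here is a minimal sketch that exercises a 64-bit `atomicAdd` and a 64-bit `atomicMax` from CUDA C. It assumes the `unsigned long long` overloads of these functions; the 64-bit min/max overloads need a newer architecture than the 64-bit add, so check the programming guide for your toolkit and compile for a matching `-arch`. The kernel name, the `Result64` struct, and the launch shape are hypothetical and chosen only for illustration.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Every thread contributes one value larger than 32 bits to a shared sum and max.
__global__ void atomics64_kernel(unsigned long long *sum, unsigned long long *maxv) {
    unsigned int tid = blockIdx.x * blockDim.x + threadIdx.x;
    unsigned long long x = (1ULL << 32) + tid;   // does not fit in int32
    atomicAdd(sum, x);                           // 64-bit atomic add
    atomicMax(maxv, x);                          // 64-bit atomic max
}

struct Result64 {
    unsigned long long sum;
    unsigned long long maxv;
};

// Hypothetical host helper: 4 blocks of 64 threads, both accumulators start at 0.
Result64 atomics64_demo() {
    Result64 r = {0ULL, 0ULL};
    unsigned long long *d_sum = nullptr, *d_max = nullptr;
    cudaMalloc(&d_sum, sizeof(unsigned long long));
    cudaMalloc(&d_max, sizeof(unsigned long long));
    cudaMemset(d_sum, 0, sizeof(unsigned long long));
    cudaMemset(d_max, 0, sizeof(unsigned long long));
    atomics64_kernel<<<4, 64>>>(d_sum, d_max);
    cudaMemcpy(&r.sum, d_sum, sizeof(unsigned long long), cudaMemcpyDeviceToHost);
    cudaMemcpy(&r.maxv, d_max, sizeof(unsigned long long), cudaMemcpyDeviceToHost);
    cudaFree(d_sum);
    cudaFree(d_max);
    return r;
}

int main() {
    Result64 r = atomics64_demo();
    printf("sum = %llu, max = %llu\n", r.sum, r.maxv);
    return 0;
}
```

With 256 threads the sum should be 256 * 2^32 plus the sum of the thread indices 0..255, and the max should be 2^32 + 255; neither result is representable in a 32-bit integer, which is the case the added int64 functions cover.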