GT200: Nvidia GeForce GTX 280 analysis
bit-tech.net has published a 15-page analysis of the GT200 GPU architecture, with some coverage of CUDA and PhysX.
The complete article is available on bit-tech.net.
The 240 thread processors are split into ten thread processing clusters (TPCs), each of which is divided into three streaming multiprocessors (SMs), also called thread processing arrays (TPAs). Each SM contains eight thread (or stream) processors, giving 10 x 3 x 8 = 240 in total. Threads are assigned by the thread scheduler, which talks directly to each streaming multiprocessor through a dedicated instruction unit; that unit then assigns tasks to one of the SM's eight thread processors.
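The hierarchy described above can be checked with a little arithmetic. The following is a minimal sketch in Python, not anything taken from the article or from Nvidia's tools: the constant names, the `locate` helper and the flat processor numbering are all hypothetical and only illustrate how 240 processors break down into clusters, multiprocessors and processors.

```python
# Counts as given in the excerpt above for GT200.
TPCS = 10          # thread processing clusters
SMS_PER_TPC = 3    # streaming multiprocessors per cluster
SPS_PER_SM = 8     # thread (stream) processors per multiprocessor

TOTAL_SPS = TPCS * SMS_PER_TPC * SPS_PER_SM


def locate(sp_index):
    """Map a flat processor index to (tpc, sm, sp) coordinates.

    The flat numbering is an assumed convention for illustration;
    it is not how the hardware identifies its processors.
    """
    if not 0 <= sp_index < TOTAL_SPS:
        raise ValueError("processor index out of range")
    # Peel off the processor within its SM, then the SM within its TPC.
    sm_flat, sp = divmod(sp_index, SPS_PER_SM)
    tpc, sm = divmod(sm_flat, SMS_PER_TPC)
    return tpc, sm, sp


print(TOTAL_SPS)     # 240
print(locate(0))     # (0, 0, 0)
print(locate(239))   # (9, 2, 7)
```

The last call shows that the final processor sits in the tenth cluster, third multiprocessor, eighth processor, which matches the 10 x 3 x 8 layout.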