NVIDIA GF100 Architecture Details



Fermi GF100 - Architecture overview



GF100 tessellation demo


After a first high-level overview in September 2009, NVIDIA has released new details on its Fermi GF100 architecture. Here is a summary of the GF100 architecture features, written as equations:

  • The CUDA core is the primary working unit of the GF100 (Each CUDA core is fully IEEE 754-2008 compliant) – GF100 = 512 CUDA cores
  • Streaming Multiprocessor (SM) = 32 CUDA cores – GF100 = 16 SM
  • 4 SFUs per SM (SFU – Special Function Unit – executes transcendental instructions such as sine, cosine, …) – GF100 = 64 SFUs
  • Graphics Processing Cluster (GPC) = 4 SM – GF100 = 4 GPC
  • 1 Raster Engine per GPC (raster engine = rasterization, z-culling). A raster engine processes 8 pixels per clock – GF100 = 32 pixels per clock
  • 1 PolyMorph Engine per SM (PolyMorph Engine: execution unit that handles geometry for GF100: vertex fetch, tessellation, viewport transform, attribute setup, and stream output) – GF100 = 16 PolyMorph Engines
  • 4 Texture Units per SM – GF100 = 64 Texture Units
  • 6 partitions of 8 ROPs (ROPs perform blending or AA) – GF100 = 48 ROPs
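
The totals in the list above all follow from multiplying per-unit counts, so they are easy to cross-check. The short Python sketch below does that arithmetic; the per-unit numbers come from the list, while the constant names and the `gf100_totals` function are only illustrative and are not part of any NVIDIA tool or API.

```python
# Per-unit counts taken from the list above.
# All names here are hypothetical, chosen for this example only.
GPCS = 4                       # Graphics Processing Clusters on GF100
SMS_PER_GPC = 4                # Streaming Multiprocessors per GPC
CUDA_CORES_PER_SM = 32
SFUS_PER_SM = 4                # Special Function Units per SM
TEXTURE_UNITS_PER_SM = 4
POLYMORPH_ENGINES_PER_SM = 1
RASTER_ENGINES_PER_GPC = 1
PIXELS_PER_CLOCK_PER_RASTER_ENGINE = 8
ROP_PARTITIONS = 6
ROPS_PER_PARTITION = 8


def gf100_totals():
    """Return chip-wide unit counts derived from the per-unit counts."""
    sms = GPCS * SMS_PER_GPC
    raster_engines = GPCS * RASTER_ENGINES_PER_GPC
    return {
        "sm": sms,
        "cuda_cores": sms * CUDA_CORES_PER_SM,
        "sfu": sms * SFUS_PER_SM,
        "texture_units": sms * TEXTURE_UNITS_PER_SM,
        "polymorph_engines": sms * POLYMORPH_ENGINES_PER_SM,
        "raster_engines": raster_engines,
        "pixels_per_clock": raster_engines * PIXELS_PER_CLOCK_PER_RASTER_ENGINE,
        "rops": ROP_PARTITIONS * ROPS_PER_PARTITION,
    }


if __name__ == "__main__":
    # Print one "name: value" line per derived total.
    for name, value in gf100_totals().items():
        print(f"{name}: {value}")
```

Note that 4 texture units per SM across 16 SMs multiplies out to 64 texture units, and that the ROP count is independent of the SM count because it depends only on the number of ROP partitions.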

Fermi GF100 - Die

Fermi GF100 - SM (Streaming Multiprocessor) detail

Fermi GF100 - CUDA core detail

The tessellator of the PolyMorph Engine is THE BIG FEATURE of GF100.





