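Traditional (immediate-mode) GPU architecture on current PCs
This architecture accesses the framebuffer constantly, and pixels are heavily overdrawn, so bandwidth demand is high and much of the work is wasted. It also draws more power on mobile platforms.
e.g. NVIDIA Tegra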
TBR (Tile-Based Rendering)
Guarantees 100% cache efficiency for the color buffer and depth buffer, because they are written to on-chip memory instead of RAM. Objects that end up hidden are still overdrawn, though.
e.g. Arm Mali (small tiles), Qualcomm Adreno (large tiles)
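To compare the two designs above, here is a rough back-of-the-envelope sketch of external memory traffic. It is only a model, not a measurement: the struct `FrameConfig`, both function names, and every input value (resolution, bytes per fragment, overdraw factor, frame rate) are assumptions chosen for illustration.

```cpp
#include <cstdint>

// Hypothetical inputs for a rough traffic model (all values are assumptions).
struct FrameConfig {
    std::uint64_t width;
    std::uint64_t height;
    std::uint64_t bytesPerFragment; // e.g. 4 bytes color + 4 bytes depth
    std::uint64_t overdraw;         // average fragments written per pixel
    std::uint64_t framesPerSecond;
};

// Immediate-mode estimate: every overdrawn fragment touches external memory.
constexpr std::uint64_t immediateTrafficPerSecond(const FrameConfig& c) {
    return c.width * c.height * c.bytesPerFragment * c.overdraw * c.framesPerSecond;
}

// Tiled estimate: overdraw stays in on-chip tile memory, and each pixel's
// color is written to external memory once, when the tile is resolved.
constexpr std::uint64_t tiledResolveTrafficPerSecond(const FrameConfig& c,
                                                     std::uint64_t colorBytesPerPixel) {
    return c.width * c.height * colorBytesPerPixel * c.framesPerSecond;
}
```

With the assumed inputs 1920x1080, 8 bytes per fragment, 3x overdraw and 60 fps, this model gives about 2.99 GB/s for the immediate-mode case and about 0.50 GB/s for the tiled resolve with 4-byte color. Real hardware has caches and compression, so treat the numbers as orders of magnitude only.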
TBDR (Tile-Based Deferred Rendering)
Defers all texturing and shading operations until their visibility is known
HSR (Hidden Surface Removal)
e.g. Imagination's PowerVR
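The difference between shading as fragments arrive and deferring shading until visibility is known can be sketched by counting shader invocations for one tile. This is a simplified illustration for opaque fragments only, not how any particular GPU implements HSR; `Fragment` and both function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// One opaque fragment inside a tile; smaller depth means closer to the camera.
struct Fragment {
    std::size_t pixel; // index of the pixel within the tile
    float depth;
};

// Shade in submission order with a depth test: a fragment is shaded whenever
// it is closer than what the pixel holds so far, so back-to-front submission
// still shades hidden fragments (overdraw).
inline std::size_t shadedWithDepthTest(const std::vector<Fragment>& frags,
                                       std::size_t pixelCount) {
    std::vector<float> depth(pixelCount, std::numeric_limits<float>::infinity());
    std::size_t shaded = 0;
    for (const Fragment& f : frags) {
        assert(f.pixel < pixelCount);
        if (f.depth < depth[f.pixel]) {
            depth[f.pixel] = f.depth;
            ++shaded;
        }
    }
    return shaded;
}

// Deferred: visibility is resolved for the whole tile first, so each covered
// pixel is shaded exactly once, whatever the submission order was.
inline std::size_t shadedDeferred(const std::vector<Fragment>& frags,
                                  std::size_t pixelCount) {
    std::vector<bool> covered(pixelCount, false);
    std::size_t shaded = 0;
    for (const Fragment& f : frags) {
        assert(f.pixel < pixelCount);
        if (!covered[f.pixel]) {
            covered[f.pixel] = true;
            ++shaded;
        }
    }
    return shaded;
}
```

For a pixel that receives three fragments from back to front, the first function counts three shader invocations while the deferred one counts a single invocation.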
Types of tiles
- small (e.g. 16x16): SGX, Mali
- relatively large (e.g. 256K): Adreno
Tile memory is on chip, so it is fast
Once the GPU is done rendering a tile, the tile is "resolved": written out to slower RAM
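As a small sketch of what tiling means in practice, the helpers below compute how many tiles cover a render target and which tiles a pixel-space bounding box touches. The names `TileRange`, `tilesAlong` and `tilesTouched` are hypothetical, and the code assumes square tiles and a bounding box that overlaps the render target.

```cpp
#include <algorithm>
#include <cassert>

// Inclusive range of tile indices covered by a bounding box.
struct TileRange {
    int x0, y0, x1, y1;
};

// Number of tiles needed along one axis (rounds up for partial tiles).
inline int tilesAlong(int pixels, int tileSize) {
    assert(pixels > 0 && tileSize > 0);
    return (pixels + tileSize - 1) / tileSize;
}

// Tiles touched by the pixel-space box [minX, maxX] x [minY, maxY],
// clamped to the render target. Assumes the box overlaps the target.
inline TileRange tilesTouched(int minX, int minY, int maxX, int maxY,
                              int width, int height, int tileSize) {
    assert(minX <= maxX && minY <= maxY);
    const int lastTx = tilesAlong(width, tileSize) - 1;
    const int lastTy = tilesAlong(height, tileSize) - 1;
    TileRange r;
    r.x0 = std::min(std::max(minX / tileSize, 0), lastTx);
    r.y0 = std::min(std::max(minY / tileSize, 0), lastTy);
    r.x1 = std::min(std::max(maxX / tileSize, 0), lastTx);
    r.y1 = std::min(std::max(maxY / tileSize, 0), lastTy);
    return r;
}
```

For example, a 1920x1080 target with 16x16 tiles needs 120 x 68 tiles, since the last row of tiles is only partly covered.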
Sort opaque geometry differently for Traditional vs Tiled
Tiled - sort by material to reduce CPU draw calls
Traditional - sort roughly front-to-back to maximize ZCull efficiency, then by material
Tiled Deferred: render alpha-tested geometry after opaque (there is a high chance the expensive alpha-tested fragments will be occluded)
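The sorting rules above could be expressed as two sort orders over a draw-call list. This is a minimal sketch under stated assumptions: `DrawCall`, its fields, both function names, and the `bucketSize` parameter (used to make the front-to-back order "rough") are all hypothetical, not part of any engine's API.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <tuple>
#include <vector>

// Hypothetical draw-call record used only for this illustration.
struct DrawCall {
    std::uint32_t materialId;
    float viewDepth;  // distance from the camera, larger means farther
    bool alphaTested; // true for alpha-tested geometry
};

// Traditional: coarse front-to-back buckets first, then material inside a bucket.
inline void sortForTraditional(std::vector<DrawCall>& calls, float bucketSize) {
    assert(bucketSize > 0.0f);
    std::stable_sort(calls.begin(), calls.end(),
        [bucketSize](const DrawCall& a, const DrawCall& b) {
            const int bucketA = static_cast<int>(a.viewDepth / bucketSize);
            const int bucketB = static_cast<int>(b.viewDepth / bucketSize);
            return std::tie(bucketA, a.materialId) < std::tie(bucketB, b.materialId);
        });
}

// Tiled deferred: opaque before alpha-tested, and by material within each group.
// Depth is ignored because visibility is resolved per tile by the hardware.
inline void sortForTiledDeferred(std::vector<DrawCall>& calls) {
    std::stable_sort(calls.begin(), calls.end(),
        [](const DrawCall& a, const DrawCall& b) {
            return std::tie(a.alphaTested, a.materialId) <
                   std::tie(b.alphaTested, b.materialId);
        });
}
```

Using `std::stable_sort` keeps the submission order of draw calls that compare equal, which makes the result deterministic from frame to frame.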
Use the EXT_discard_framebuffer extension on Tiled
• will avoid copying data (color/depth/stencil) you're not planning to use
Clear the RenderTarget before rendering into it
• otherwise on Tiled the driver will copy color/depth/stencil back from RAM
• not clearing is not an optimization! (unlike traditional GPUs, where skipping the clear might reduce cost)
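The effect of clearing and discarding on a tiled GPU can be sketched as a simple traffic model. In OpenGL ES the corresponding calls would be `glClear` at the start of the pass and `glDiscardFramebufferEXT` at the end; the code below does not call them, it only models which attachments would be copied between RAM and tile memory. `Attachment`, `PassTraffic` and `estimateTraffic` are hypothetical names, and the model assumes full-size copies with no compression.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical description of one attachment of a render pass.
struct Attachment {
    std::uint64_t bytes;  // size of the attachment in external RAM
    bool clearedAtStart;  // the app clears it before drawing
    bool discardedAtEnd;  // the app discards it after drawing
};

struct PassTraffic {
    std::uint64_t loadBytes;  // RAM -> tile memory at the start of the pass
    std::uint64_t storeBytes; // tile memory -> RAM at resolve time
};

// Rough model: an attachment that was not cleared must be loaded from RAM,
// and an attachment that was not discarded must be written back to RAM.
inline PassTraffic estimateTraffic(const Attachment* attachments, int count) {
    assert(count == 0 || attachments != nullptr);
    PassTraffic t{0, 0};
    for (int i = 0; i < count; ++i) {
        if (!attachments[i].clearedAtStart) {
            t.loadBytes += attachments[i].bytes;
        }
        if (!attachments[i].discardedAtEnd) {
            t.storeBytes += attachments[i].bytes;
        }
    }
    return t;
}
```

In this model, a pass that clears color and depth and discards depth at the end loads nothing and stores only the color buffer, while a pass that does neither loads and stores both attachments.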
Benefits
• Tiled: MSAA is almost free (5-10% of rendering time)
• Tiled: AlphaBlending is significantly cheaper
• Tiled: fewer dithering artifacts for 16-bit framebuffers
Caveats
• TBDR: RenderTarget switch might be more expensive
• TBDR: Too much geometry will flush whole pipeline (ParameterBuffer overflow)
References:
http://withimagination.imgtec.com/index.php/powervr/understanding-powervr-series5xt-powervr-tbdr-and-architecture-efficiency-part-4
http://www.realtimerendering.com/downloads/MobileCrossPlatformChallenges_siggraph.pdf