TY - GEN
T1 - 2.5 A 28nm Physical-Based Ray-Tracing Rendering Processor for Photorealistic Augmented Reality with Inverse Rendering and Background Clustering for Mobile Devices
AU - Guo, Shiyu
AU - Sapatnekar, Sachin
AU - Gu, Jie
N1 - Publisher Copyright: © 2024 IEEE.
PY - 2024
Y1 - 2024
N2 - As the applications of Augmented Reality (AR) or Virtual Reality (VR) expand rapidly with growing demands for enhanced visual realism, photorealistic image generation and insertion has become an essential feature for emerging AR applications that provide real-time workplace/household visual assistance. Physical-Based Ray-Tracing (PBRT) is often used: synthesized images are generated by simulating the real environment and tracing light transport to achieve photorealistic effects such as reflection, refraction, and soft shadows. PBRT is widely used in product design, medical visualization, video games, and movie effects. To enable photorealistic rendering, there is a strong demand to support ray-tracing (RT) on mobile devices [1]. However, the challenges are: (1) unstructured memory access patterns and complex control flow lead to scheduling difficulty; (2) high memory requirements exhaust the limited SRAM space on edge devices; (3) low error tolerance requires high-precision computing; (4) complex computations, such as division and square root, require significant computing resources on edge devices. As a result, common rendering engines such as Apple ARKit and OpenGL are mainly based on the lower-cost rasterization rendering technique. Unfortunately, rasterization rendering fails to produce photorealistic synthesis, as shown in Fig. 2.5.1. Few ASICs have been fabricated so far as mobile photorealistic rendering solutions; however, they may not support RT [2] or may suffer from low efficiency [3]. This work has developed a ray-tracing processor that also supports inverse rendering (IR) for background extraction [4].
The key features of this work include: (1) an ASIC rendering processor that embeds an end-to-end PBRT solution with IR for AR on mobile devices; (2) a reconfigurable mixed-precision PE design supporting diverse computing tasks for both IR and RT; (3) background-clustered Field of View (FOV)-focused 3D construction that reduces conventional background scene complexity from O(n log n) to O(1); (4) a scalable partitioning scheme for complex 3D objects, with an average 13× speedup on test scenes; (5) a Global RT Scheduler (GRTS) and Global Memory Access Controller (GMAC) that overcome the challenges of irregular memory access patterns and varied PE run-time, with an overall 684× speedup compared with the baseline design. The 28nm test chip achieves 3.95× to 28.8× higher rendering efficiency compared with existing ASIC solutions, enabling real-time PBRT rendering on mobile edge devices.
AB - As the applications of Augmented Reality (AR) or Virtual Reality (VR) expand rapidly with growing demands for enhanced visual realism, photorealistic image generation and insertion has become an essential feature for emerging AR applications that provide real-time workplace/household visual assistance. Physical-Based Ray-Tracing (PBRT) is often used: synthesized images are generated by simulating the real environment and tracing light transport to achieve photorealistic effects such as reflection, refraction, and soft shadows. PBRT is widely used in product design, medical visualization, video games, and movie effects. To enable photorealistic rendering, there is a strong demand to support ray-tracing (RT) on mobile devices [1]. However, the challenges are: (1) unstructured memory access patterns and complex control flow lead to scheduling difficulty; (2) high memory requirements exhaust the limited SRAM space on edge devices; (3) low error tolerance requires high-precision computing; (4) complex computations, such as division and square root, require significant computing resources on edge devices. As a result, common rendering engines such as Apple ARKit and OpenGL are mainly based on the lower-cost rasterization rendering technique. Unfortunately, rasterization rendering fails to produce photorealistic synthesis, as shown in Fig. 2.5.1. Few ASICs have been fabricated so far as mobile photorealistic rendering solutions; however, they may not support RT [2] or may suffer from low efficiency [3]. This work has developed a ray-tracing processor that also supports inverse rendering (IR) for background extraction [4].
The key features of this work include: (1) an ASIC rendering processor that embeds an end-to-end PBRT solution with IR for AR on mobile devices; (2) a reconfigurable mixed-precision PE design supporting diverse computing tasks for both IR and RT; (3) background-clustered Field of View (FOV)-focused 3D construction that reduces conventional background scene complexity from O(n log n) to O(1); (4) a scalable partitioning scheme for complex 3D objects, with an average 13× speedup on test scenes; (5) a Global RT Scheduler (GRTS) and Global Memory Access Controller (GMAC) that overcome the challenges of irregular memory access patterns and varied PE run-time, with an overall 684× speedup compared with the baseline design. The 28nm test chip achieves 3.95× to 28.8× higher rendering efficiency compared with existing ASIC solutions, enabling real-time PBRT rendering on mobile edge devices.
UR - http://www.scopus.com/inward/record.url?scp=85188077121&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85188077121&partnerID=8YFLogxK
U2 - 10.1109/ISSCC49657.2024.10454394
DO - 10.1109/ISSCC49657.2024.10454394
M3 - Conference contribution
AN - SCOPUS:85188077121
T3 - Digest of Technical Papers - IEEE International Solid-State Circuits Conference
SP - 44
EP - 46
BT - 2024 IEEE International Solid-State Circuits Conference, ISSCC 2024
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2024 IEEE International Solid-State Circuits Conference, ISSCC 2024
Y2 - 18 February 2024 through 22 February 2024
ER -