We improve the interpolation accuracy and efficiency of the Delaunay tessellation field estimator (DTFE) for surface density field reconstruction by proposing an algorithm that takes advantage of the adaptive triangular mesh for line-of-sight integration. Our method completely avoids the costly computation of an intermediate 3D grid and computes only optimally chosen interpolation points, which significantly reduces the overall computational cost. The algorithm is implemented as a parallel shared-memory kernel for large-scale, grid-rendered field reconstructions in our distributed-memory framework designed for N-body gravitational lensing simulations in large volumes. We also introduce a load-balancing scheme to optimize the efficiency of processing a large number of field reconstructions. Our results show that our kernel outperforms existing software packages for volume-weighted density field reconstruction, achieving a ∼10× speedup, and that our load-balancing algorithm gains an additional ∼3.6× speedup at scales of ∼16k processes.
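As an illustration of why an intermediate 3D grid may be unnecessary, the following minimal sketch integrates a linearly interpolated field along one line of sight through a single tetrahedron. It relies on a standard fact: the integral of a linear function over a segment equals the segment length times the value at the segment midpoint, so one interpolation point per tetrahedron crossing is enough. The function name `integrate_through_tetrahedron`, the choice of a line of sight parallel to the z-axis, and the tolerance `eps` are assumptions made for this example; this is not the kernel described above.

```python
import numpy as np


def integrate_through_tetrahedron(vertices, values, x, y, eps=1e-12):
    """Integrate a linear field along the vertical line (x, y, z) inside one tetrahedron.

    vertices: (4, 3) array of tetrahedron corner coordinates (hypothetical input layout).
    values:   (4,) array of field values at those corners.
    Returns chord_length * field(midpoint), which is exact for a linear field.
    """
    vertices = np.asarray(vertices, dtype=float)
    values = np.asarray(values, dtype=float)

    # Barycentric coordinates lam satisfy T @ lam = [px, py, pz, 1].
    T = np.vstack([vertices.T, np.ones(4)])

    # Along the line of sight, lam(z) = a + b * z.
    a = np.linalg.solve(T, np.array([x, y, 0.0, 1.0]))
    b = np.linalg.solve(T, np.array([0.0, 0.0, 1.0, 0.0]))

    # The line is inside the tetrahedron where every lam_i(z) >= 0.
    z_in, z_out = -np.inf, np.inf
    for a_i, b_i in zip(a, b):
        if abs(b_i) < eps:
            if a_i < -eps:
                return 0.0  # line never satisfies this face constraint
        elif b_i > 0:
            z_in = max(z_in, -a_i / b_i)
        else:
            z_out = min(z_out, -a_i / b_i)

    if z_out <= z_in:
        return 0.0  # the line of sight misses this tetrahedron

    # Single interpolation point: the midpoint of the chord.
    z_mid = 0.5 * (z_in + z_out)
    field_mid = float(np.dot(a + b * z_mid, values))
    return (z_out - z_in) * field_mid


# Usage: unit tetrahedron with field f(x, y, z) = z.
unit_tet = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)]
column = integrate_through_tetrahedron(unit_tet, [0, 0, 0, 1], 0.25, 0.25)
```

Under these assumptions, a surface density value at a pixel would be the sum of such contributions over every tetrahedron that the line of sight crosses, with no 3D grid sampling in between.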