Caches are essential components of embedded processors and account for a significant fraction of chip area and power. Because of their relatively large size and infrequent activity, the leakage power of caches is becoming an important problem. A number of power density minimization schemes distribute activity evenly among computational entities, thereby lowering the temperature and, with it, the leakage power. In this paper, we first present several power density minimization schemes for highly associative caches in embedded processors that work by distributing accesses. We then argue that these schemes are more effective when used in conjunction with power-down techniques. We show that conventional power-down techniques for on-chip caches can be suboptimal if thermal effects are ignored, and we propose a thermal-aware power-down technique that minimizes the power density of the active parts. Simulations based on MediaBench, NetBench, and MiBench applications in a 70 nm technology show that the proposed thermal-aware schemes improve the leakage power savings of a conventional power-down technique by 8.5% on average and by up to 23%.