An image of a scene with occlusions can yield only partial knowledge about disconnected fragments of the scene. If this were the only knowledge available, a program attempting to interpret the scene would have to conclude that the fragments will collapse in a jumble. But they won't. We describe a program that exploits commonsense knowledge of naive physics to make sense of scenes with occlusion. Our causal analysis focuses on the static stability of structures: what supports what. Occluded connections in a link-and-junction scene are inferred by determining the stability of each subassembly in the scene and hypothesizing connections between parts wherever a subassembly would otherwise be unstable. The causal explanation that is generated reflects a deeper understanding of the scene than mere model matching; it allows the seeing agent to predict what will happen next in the scene and to determine how to interact with it.
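
As a rough illustration of the inference step described above, the following Python sketch groups visible parts into subassemblies and hypothesizes a hidden connection wherever a subassembly would otherwise be unsupported. It is not the program described here. The stability test is a deliberately crude stand-in (a subassembly counts as stable only if it contains a part resting on the ground), and the names `components`, `infer_occluded_links`, `grounded`, and `occluded_pairs` are hypothetical, introduced only for this example.

```python
from collections import defaultdict


def components(parts, links):
    """Group parts into subassemblies: connected components over known links."""
    adj = defaultdict(set)
    for a, b in links:
        adj[a].add(b)
        adj[b].add(a)
    seen, groups = set(), []
    for p in parts:
        if p in seen:
            continue
        stack, group = [p], set()
        while stack:
            q = stack.pop()
            if q in group:
                continue
            group.add(q)
            stack.extend(adj[q] - group)
        seen |= group
        groups.append(group)
    return groups


def infer_occluded_links(parts, links, grounded, occluded_pairs):
    """Return the hidden connections needed to keep every subassembly supported.

    Assumption (toy criterion, not a real statics analysis): a subassembly is
    stable if and only if at least one of its parts is in `grounded`.
    `occluded_pairs` lists pairs of parts whose junction could be hidden
    behind an occluder, so a connection there cannot be ruled out.
    """
    links = list(links)
    inferred = []
    changed = True
    while changed:
        changed = False
        groups = components(parts, links)
        stable = set().union(*(g for g in groups if g & grounded))
        for a, b in occluded_pairs:
            # Connect only across the stable/unstable boundary, so each
            # inferred link explains why an unsupported fragment stays up.
            if (a in stable) != (b in stable):
                links.append((a, b))
                inferred.append((a, b))
                changed = True
                break
    return inferred


parts = ["base", "post", "arm", "weight"]
visible = [("base", "post"), ("arm", "weight")]
hidden = infer_occluded_links(parts, visible, {"base"}, [("post", "arm")])
print(hidden)
```

In this toy scene the arm and weight form a fragment with no visible support, so the sketch hypothesizes a connection between the post and the arm behind the occluder. A faithful stability analysis would reason about forces and torques at each junction rather than mere connectivity to the ground.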