In this paper, we propose a deep multi-stage architecture for automated landmarking of craniomaxillofacial (CMF) CT images. Our model is composed of three subnetworks that first localize, on reduced-resolution images, the areas where landmarks may be found and then refine the search at full resolution through a hierarchical structure that progressively narrows the investigated region. The multi-stage pipeline is designed to handle full-resolution data and does not require any additional pre-processing step to reduce the search space, as opposed to existing methods, which can only be applied to landmarks located in well-defined anatomical structures (e.g., mandibles). The automated landmarking system is tested on identifying landmarks located in several CMF regions, achieving an average error of 0.8 mm, significantly lower than that of expert readings. On state-of-the-art benchmarks, the proposed model also outperforms baselines and is on par with existing models that employ additional upstream segmentation.
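The coarse-to-fine search described above can be illustrated with a minimal sketch. This is not the proposed model: each learned subnetwork is replaced here by a simple brightest-voxel search, and the function names, the downsampling factor, the crop sizes, and the split into one coarse stage followed by two full-resolution refinement stages are all assumptions made for illustration only.

```python
import numpy as np

def downsample(volume, factor):
    """Reduce resolution by keeping every `factor`-th voxel along each axis."""
    return volume[::factor, ::factor, ::factor]

def argmax_voxel(volume):
    """Stand-in for a subnetwork prediction: index of the brightest voxel."""
    return np.array(np.unravel_index(np.argmax(volume), volume.shape))

def crop_around(volume, center, half_size):
    """Crop a cube around `center`, clipped to the volume bounds.

    Returns the crop and the full-resolution coordinates of its origin.
    """
    lo = np.maximum(center - half_size, 0)
    hi = np.minimum(center + half_size + 1, volume.shape)
    crop = volume[lo[0]:hi[0], lo[1]:hi[1], lo[2]:hi[2]]
    return crop, lo

def coarse_to_fine(volume, factor=4, half_sizes=(8, 3)):
    """Hypothetical three-stage search: one coarse stage, two refinements."""
    # Stage 1: coarse localization on the reduced-resolution volume.
    coarse = argmax_voxel(downsample(volume, factor))
    estimate = coarse * factor  # map back to full-resolution coordinates
    # Stages 2 and 3: refine at full resolution in progressively smaller regions.
    for half in half_sizes:
        crop, origin = crop_around(volume, estimate, half)
        estimate = origin + argmax_voxel(crop)
    return estimate

# Synthetic volume with a single Gaussian peak acting as the "landmark".
zz, yy, xx = np.meshgrid(np.arange(64), np.arange(64), np.arange(64), indexing="ij")
volume = np.exp(-((zz - 37) ** 2 + (yy - 22) ** 2 + (xx - 50) ** 2) / (2 * 5.0 ** 2))
print(coarse_to_fine(volume))  # recovers the planted peak at (37, 22, 50)
```

The point of the sketch is the coordinate bookkeeping: the coarse estimate is only accurate to within the downsampling factor, so each refinement crop must be large enough to contain the true location given the error left by the previous stage.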