We address the problem of efficiently gathering correlated data from a wireless sensor network, with the aim of designing algorithms with provable optimality guarantees, and understanding how close we can get to the known theoretical lower bounds. Our proposed approach is based on finding an optimal or a near-optimal compression tree for a given sensor network: a compression tree is a directed tree over the sensor network nodes such that the value of a node is compressed using the value of its parent. We focus on broadcast communication model in this paper, but our results are more generally applicable to a unicast communication model as well. We draw connections between the data collection problem and a previously studied graph concept called weakly connected dominating sets, and we use this to develop novel approximation algorithms for the problem. We present comparative results on several synthetic and real-world datasets showing that our algorithms construct near-optimal compression trees that yield a significant reduction in the data collection cost.