We consider the capacity of a wideband fading channel with partial feedback, subject to an average power constraint. A doubly block Rayleigh fading model is assumed, with a finite coherence time (M channel uses) and a large number of independent coherence bands of finite width. Without feedback, it is known that spreading the signal power uniformly beyond a critical number of coherence bands decreases the capacity. Here we assume that a pilot sequence is transmitted during each coherence time for channel estimation, and that feedback is used to designate a subset of coherence bands on which to transmit. We jointly optimize the training length, the average training power, and the spreading bandwidth, taking the channel estimation error into account, by maximizing a lower bound on the ergodic capacity. This lower bound becomes tight for large M, and we show that it increases as O(log M). The capacity of the partial feedback scheme therefore exceeds the capacity of "flash" signaling when M exceeds a (positive) threshold value.
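The band-selection idea can be illustrated with a small Monte Carlo sketch. It is only a toy model and not the lower bound analyzed here: it assumes perfect channel knowledge (no training overhead and no estimation error), unit-mean exponential power gains for Rayleigh fading, an equal power split over the active bands, and feedback that designates the strongest bands. All function names and parameter values are hypothetical. Because the channel is assumed known, the sketch does not reproduce the loss from over-spreading without feedback; it only shows why concentrating a fixed power budget on the designated bands can help.

```python
import math
import random


def uniform_rate(gains, total_snr):
    """Sum rate (nats) when the power budget is split evenly over all bands."""
    n = len(gains)
    return sum(math.log1p(total_snr * g / n) for g in gains)


def selective_rate(gains, total_snr, num_active):
    """Sum rate (nats) when power is split evenly over the strongest bands only.

    Assumption: feedback identifies the num_active largest gains exactly.
    """
    best = sorted(gains, reverse=True)[:num_active]
    return sum(math.log1p(total_snr * g / num_active) for g in best)


def average_rates(num_bands=1000, num_active=10, total_snr=1.0,
                  trials=200, seed=0):
    """Average both rates over independent fading realizations."""
    rng = random.Random(seed)
    uniform_total = 0.0
    selective_total = 0.0
    for _ in range(trials):
        # Rayleigh fading: the power gain |h|^2 is exponential with mean 1.
        gains = [rng.expovariate(1.0) for _ in range(num_bands)]
        uniform_total += uniform_rate(gains, total_snr)
        selective_total += selective_rate(gains, total_snr, num_active)
    return uniform_total / trials, selective_total / trials


if __name__ == "__main__":
    uniform_avg, selective_avg = average_rates()
    print(f"uniform spreading: {uniform_avg:.3f} nats")
    print(f"strongest bands:   {selective_avg:.3f} nats")
```

In this idealized setting the uniform rate stays close to the total SNR however many bands are used, while the rate on the selected bands grows with the number of bands to choose from. The scheme studied here additionally pays for training and for estimation error, which the sketch ignores.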