Rehabilitation benefits from a careful understanding of people's movements, but few widely used tools support this beyond a goniometer or, in special cases, a gait laboratory. Easy-to-use and accurate kinematic tracking would allow better characterization of patients' impairments, enable quantitative tracking of improvements in response to treatment, and support research into more effective therapies. Two rapidly evolving technologies are poised to make such tracking more widely available: wearable inertial sensors and video-based motion tracking, which provide complementary information. However, both present challenges ranging from ease of use to the clinical relevance of their outputs. Video-based pose estimation frequently struggles with ambiguities in the image. Wearable sensors often require precise placement and calibration to produce meaningful outputs, and they often suffer from drift. This work addresses these challenges with a pipeline that fuses stereo video with inertial data to produce temporally smooth pose estimates. Biomechanical constraints applied during optimization allow the estimated pose trajectories to be mapped to joint angles. The fusion also finds the relative rotation of the wearable sensors on the subject, which allows joint angles to be estimated without video. The system is tested in a rehabilitation setting on a participant with spinal cord injury who uses a wheelchair. During wheelchair propulsion, it is able to track shoulder elevation, elevation plane, internal rotation, elbow flexion, and wrist pronation, both in the fused condition and from inertial sensors alone.