Significant Results With Insignificant Differences in Process Inputs

Okay, okay, the title is a mouthful and probably only makes sense to researchers, statisticians and Six Sigma Black Belts. Imagine a scenario where you have two production lines that make toy ponies. They're set up the same way, have similar equipment, work with similar materials, are operated by similarly skilled people, and show similar levels of variation and similar targeting on nominal values. All this "same" means there's no significant difference between the two. Yet off one production line come white ponies, and off the other come beige ponies. The input materials are the same. The process settings are the same. The results are different.

This kind of thing should drive any improvement team or task force crazy!

This is what happened in a JAMA-published study comparing groups of people who wore fitness tracking gear [à la Fitbit (tm), Misfit (tm), Moov (tm) and the like] with groups who didn't. There were no significant differences in exercise and diet between them. Yet the non-gear-wearing groups lost significantly more weight than the gear-wearing groups.
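For anyone who wants to see what "insignificant inputs, significant output" looks like in numbers, here is a minimal sketch. Everything in it is an assumption for illustration: the data are made up (they are not from the JAMA study), the helper name `welch_t` is my own, and the cutoff of 2.0 is only a rule-of-thumb stand-in for a proper critical value or p-value. It computes Welch's two-sample t statistic for a noisy input measure and for a tight output measure.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic for two independent samples (unequal variances allowed)."""
    standard_error = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error


# Made-up numbers for illustration only; NOT data from the JAMA study.
# Input measure: self-reported daily calories (wide spread, small gap in means).
calories_no_gear = [1800, 2100, 1650, 2300, 1950, 2050, 1700, 2200]
calories_gear = [1850, 2150, 1700, 2250, 2000, 2100, 1750, 2300]

# Output measure: weight lost in kg (tight spread, clear gap in means).
loss_no_gear = [6.1, 5.4, 6.8, 5.9, 6.3, 5.7, 6.5, 6.0]
loss_gear = [3.6, 3.1, 4.0, 3.3, 3.8, 3.2, 3.9, 3.5]

# Assumption: a rough rule-of-thumb threshold for |t|, not an exact critical value.
ROUGH_CUTOFF = 2.0

t_input = welch_t(calories_no_gear, calories_gear)
t_output = welch_t(loss_no_gear, loss_gear)

print(f"input  |t| = {abs(t_input):.2f}, flagged: {abs(t_input) > ROUGH_CUTOFF}")
print(f"output |t| = {abs(t_output):.2f}, flagged: {abs(t_output) > ROUGH_CUTOFF}")
```

With these invented numbers the input comparison stays well under the cutoff while the output comparison lands far above it. That only shows the pattern is possible on paper; it says nothing about why it showed up in the actual study.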

Here's the conclusion of an article that reported on this study:
Yet upon further review, what’s the most striking conclusion of the paper was that although a significant difference in weight loss was observed, “Differences between intervention groups for physical activity and dietary intake were not significant.”
Did you get that? The two groups apparently ate the same and exercised the same, and yet one group lost significantly more weight. To me, this is really the paradox of this paper, the observation that doesn’t make sense.
The authors acknowledge as much–in a somewhat understated fashion–in their conclusion:
In this study, the addition of wearable technology to a behavioral intervention was less effective for 24-month weight loss. This may be a result of the technology not being as effective for changing diet or physical activity behaviors compared with what was achieved with the standard intervention; however, the study found no significant difference in these measures between the standard intervention and enhanced intervention groups. Thus, the reason for this difference in weight loss between the standard intervention and enhanced intervention groups warrants further investigation. [Emphasis added.]
I’ll say!
The real mystery of this paper isn’t why the addition of wearable trackers doesn’t produce weight loss; rather, it’s what to make of data demonstrating differences in weight loss despite comparable amounts of physical activity and dietary intake.

