The mean function in the statistics module gives nonsensical results when the input contains boolean values, e.g.:
>>> mean([True, True, False, False])
0.25
>>> mean([True, 1027])
0.5
This is an issue with the module's internal _sum function that mean relies on. Other functions relying on _sum are affected more subtly, e.g.:
>>> variance([1, 1027, 0])
351234.3333333333
>>> variance([True, 1027, 0])
351234.3333333334
The problem with _sum is that it tries to coerce its result to any non-int type found in the input (bool in the examples above), but bool(1028) is just True, so information gets lost.
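To illustrate the failure mode in isolation, here is a minimal sketch that mimics the coercion described above. It is not the module's actual code: the name coerced_mean and the way it picks the target type are assumptions made for illustration only.

```python
from fractions import Fraction

def coerced_mean(data):
    """Hypothetical stand-in for mean(); NOT the statistics module's code."""
    # Sum exactly, so any error below comes from the coercion alone.
    total = sum(Fraction(x) for x in data)
    # Assumed rule: use the first non-int type found in the input.
    # type(True) is bool, which is not int, so bool gets selected.
    target = int
    for x in data:
        if type(x) is not int:
            target = type(x)
            break
    # bool(Fraction(1028)) is True, so the sum collapses to 1 here.
    return target(total) / len(data)

print(coerced_mean([True, 1027]))               # 0.5
print(coerced_mean([True, True, False, False])) # 0.25
print(coerced_mean([1, 1027]))                  # 514.0
```

The first two calls reproduce the values reported above, while the all-int input is unaffected.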
I've attached a patch preventing the type cast when it would be to bool.
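The idea of the fix can be sketched roughly as follows. This is only an illustration of the approach, not the attached patch itself; coerce_sum and its parameters are hypothetical names.

```python
def coerce_sum(total, target_type):
    """Hypothetical helper: cast total to target_type, but never to bool."""
    if target_type is bool:
        # Casting to bool would collapse any nonzero sum to True,
        # so fall back to int instead.
        return int(total)
    return target_type(total)

print(coerce_sum(1028, bool) / 2)   # 514.0
print(coerce_sum(1028, float) / 2)  # 514.0
```

With this guard, a bool in the input is treated like any other int, and other target types are cast as before.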
I don't have time to write a separate test though, so if somebody wants to take over ... :)