Well, I suppose “lapsed statistician” is a more accurate description of my current status in the field of statistics – I haven’t proven a theorem in a quarter of a century, the last time I tested a hypothesis was two decades ago, and as for data analysis, well, for that I now have SenseMaker Explorer!
When I started out as a statistician, there were no personal computers; we made very strong assumptions just to be able to calculate the results; non-parametric methods with fewer assumptions took all-night runs on the university mainframe for a simple hypothesis test. Then came the PC; suddenly exploratory statistics became possible – not that it was considered rigorous enough to be proper statistics back then. But I loved it – it was like being a detective, looking for structure in a mass of data, using dimension-reduction techniques to squash the information down into two or three dimensions that could be visualized and interpreted, looking for patterns in graphs and finally, when the data set gave up its secrets, finding out what the patterns I saw could actually correspond to in the real world! Unfortunately, in the first stage of analysis, the pattern often corresponded to a management decision nobody bothered to tell the poor statistician about, but that’s another story.
Sound familiar? When I first saw the scatter plot matrix in Explorer, I felt like I’d come home – even better, I could now do the same kind of analysis on “soft” data too!
I was taken aback initially when I was told that statistical analysis is best applied in the chaotic domain, though – I had always thought of it as a very rational, analytical tool to help make sense of uncertainty. Well, that’s not contradictory then, is it? Agents acting independently … independent observations as a basic assumption for statistical methods … OK, so statistics could make sense of chaos …
To my statistical mind, fitting linear (e.g. polynomial) regression models to data is an admission that you don’t know anything about the underlying mechanism, otherwise you would have used a more explanatory model. Linear models only represent correlations and no causality can be inferred from them; they only apply to the range covered by your experiments and cannot be extrapolated to different conditions, but they can indicate promising directions in which to search for better solutions – so again, isn’t this what one does in the chaotic domain?
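For the curious, here is a minimal sketch of the extrapolation point, using NumPy. Everything in it is an assumption made up for illustration: the saturating "true mechanism", the observed range of 0 to 3 and the choice of a quadratic. The polynomial model looks perfectly respectable inside the range it was fitted on and goes badly wrong when asked about conditions it has never seen.

```python
import numpy as np

# Hypothetical "true mechanism" (an assumption for this sketch):
# a response that rises quickly and then levels off near 10.
def true_mechanism(x):
    return 10.0 * (1.0 - np.exp(-x))

# The experiments only cover x between 0 and 3.
x_obs = np.linspace(0.0, 3.0, 50)
y_obs = true_mechanism(x_obs)

# Fit a quadratic: a model that is linear in its coefficients
# and knows nothing about the underlying mechanism.
model = np.polynomial.Polynomial.fit(x_obs, y_obs, deg=2)

# Worst-case error inside the range covered by the experiments.
in_range_err = np.max(np.abs(model(x_obs) - y_obs))

# Error when the same model is pushed far outside that range.
x_new = 10.0
extrap_err = abs(model(x_new) - true_mechanism(x_new))

print(f"largest error within the observed range: {in_range_err:.2f}")
print(f"error when extrapolating to x = {x_new}: {extrap_err:.2f}")
```

The fitted curve still has its uses within the observed range, for instance to suggest which direction looks promising for the next experiment, which is about as much as an empirical model can honestly offer.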
So I have to acknowledge that statistical methods are applicable in the chaotic domain, especially the empirical type of methods (functional models and many other techniques fit better in the complicated domain). But I’ve never, ever simply discarded an outlier: they are the most interesting species of data point!