Overfitting the Unknown Unknowns
Warning: this page has been flagged for excessive use of jargon. Prolonged exposure may lead to spontaneous combustion of cognitive function.
In the depths of the algorithmic underworld, there lies a phenomenon known as Overfitting the Unknown Unknowns. It's a condition in which a model becomes so convinced of its own superiority that it begins fitting the noise in the data and forgets the signal entirely.
Symptoms of Overfitting the Unknown Unknowns include:
- Unusually high R-squared values on the training data
- Excessive use of regularization techniques
- Models that are more complex than their own mothers
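The first symptom can be reproduced at home. A minimal sketch (all names and numbers here are illustrative, not from the article): fit an absurdly flexible polynomial to a few noisy points from a simple linear signal, and the training R-squared soars while the test R-squared sags.

```python
# Illustrative sketch of overfitting: a degree-12 polynomial on 15 noisy
# points from a linear signal. Training R-squared looks heroic; test
# R-squared tells the truth.
import numpy as np

rng = np.random.default_rng(0)

# True signal is linear; everything on top of it is noise to be ignored.
x_train = np.linspace(0, 1, 15)
y_train = 2 * x_train + rng.normal(scale=0.3, size=x_train.size)
x_test = np.linspace(0, 1, 50)
y_test = 2 * x_test + rng.normal(scale=0.3, size=x_test.size)

def r_squared(y, y_hat):
    # Coefficient of determination: 1 - residual SS / total SS.
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return 1 - ss_res / ss_tot

# A model more complex than its own mother: degree 12 for 15 points.
coeffs = np.polyfit(x_train, y_train, deg=12)
r2_train = r_squared(y_train, np.polyval(coeffs, x_train))
r2_test = r_squared(y_test, np.polyval(coeffs, x_test))

print(f"train R^2: {r2_train:.3f}")  # suspiciously close to 1
print(f"test  R^2: {r2_test:.3f}")   # noticeably worse
```

Swap `deg=12` for `deg=1` and the two scores converge, which is what a model that remembers the signal looks like.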
Causes of Overfitting the Unknown Unknowns:
- Insufficient data (or a data scientist's worst nightmare)
- Overly optimistic model assumptions
- Too much caffeine