Berkson’s Paradox: When selection creates correlation

Author

Eliott Kalfon

Published

May 20, 2025


You are hiring a team of Data Scientists. In doing so, you look for two critical skills:

  1. Programming skills
  2. Mathematics skills

In your recruiting criteria, both of these are equally important.

At the end of the recruitment process, you find that among the candidates you hired, the people with the strongest programming skills tend to have weaker maths abilities. In other words, among your teammates, programming and mathematics skills end up being negatively correlated: strength in one is associated with weakness in the other.

How can this happen?

The Paradox

This question can be better understood through the lens of Berkson’s paradox. The idea behind this paradox: two uncorrelated characteristics can appear negatively correlated once a selection process has filtered the population.

Let’s approach this problem with a simple simulation (yes, you are reading a blog written by a Data Scientist).

Let’s assume that programming skills and mathematics abilities are independent of each other and normally distributed in the starting population (in the simulation, with a mean of 70 and a standard deviation of 10). You can score both of these on a scale from 0 to 100. On the chart below, each dot represents a single candidate.

Each candidate is represented as a dot on a scatter plot
Code used to generate the chart
import numpy as np
import matplotlib.pyplot as plt

# Fix the seed so the simulated candidates are reproducible
np.random.seed(42)
n = 500
# Two independent skills, each drawn from a normal distribution (mean 70, sd 10)
programming = np.random.normal(70, 10, n)
maths = np.random.normal(70, 10, n)

# Scatter plot of the full candidate pool: one dot per candidate
plt.figure(figsize=(8,6))
plt.scatter(programming, maths, alpha=0.5)
plt.title("Candidates: Programming vs Maths Skills", fontsize=18)
plt.xlabel("Programming Skills", fontsize=16)
plt.ylabel("Maths Skills", fontsize=16)
plt.xticks(fontsize=14)
plt.yticks(fontsize=14)
plt.grid(True, alpha=0.3)
plt.show()

You rate each candidate by the average of their programming and maths skills:

$$\text{Candidate Score} = \frac{\text{Programming Skills} + \text{Maths Skills}}{2}$$

Hiring for a top-tier firm, you can afford to be selective and only hire candidates with an average of 75 or above. This constraint can be plotted as a line in the chart below.

Plotting the constraint
Code used to generate the chart
# Same scatter plot as before, with the hiring threshold drawn on top
plt.figure(figsize=(8,6))
plt.scatter(programming, maths, alpha=0.5)
x = np.linspace(40, 100, 100)
# Threshold line: (x + y) / 2 = 75  ->  x + y = 150
y = 150 - x
plt.plot(x, y, 'r--', linewidth=2, label="Candidate Score = 75")
plt.title("Selection Constraint: Only Candidates with Score ≥ 75", fontsize=18)
plt.xlabel("Programming Skills", fontsize=16)
plt.ylabel("Maths Skills", fontsize=16)
plt.xticks(fontsize=14)
plt.yticks(fontsize=14)
plt.legend(fontsize=14)
plt.grid(True, alpha=0.3)
plt.show()

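Among the candidates who satisfy this constraint, we see that programming and maths skills are indeed negatively correlated (!). Close to the threshold, a candidate who is weaker in one skill only makes the cut if they are stronger in the other.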

A negative correlation emerges
Code used to generate the chart
# Boolean mask: keep only candidates whose average score is 75 or above
selected = (programming + maths) / 2 >= 75
# Plot the hired candidates only
plt.figure(figsize=(8,6))
plt.scatter(programming[selected], maths[selected], alpha=0.7, c='green')
plt.title("Selected Candidates: Negative Correlation Emerges", fontsize=18)
plt.xlabel("Programming Skills", fontsize=16)
plt.ylabel("Maths Skills", fontsize=16)
plt.xticks(fontsize=14)
plt.yticks(fontsize=14)
plt.grid(True, alpha=0.3)
plt.show()
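
To put a number on what the chart shows, one can compare the Pearson correlation in the full pool with the one among the selected candidates. This is a minimal sketch, assuming the same seed, sample size and 75-point cut-off as the simulation above; the names `corr_all` and `corr_selected` are introduced here for illustration only.

```python
import numpy as np

# Recreate the simulated candidates (same seed and parameters as above)
np.random.seed(42)
n = 500
programming = np.random.normal(70, 10, n)
maths = np.random.normal(70, 10, n)

# Same hiring rule: average score of 75 or above
selected = (programming + maths) / 2 >= 75

# Pearson correlation before and after selection
corr_all = np.corrcoef(programming, maths)[0, 1]
corr_selected = np.corrcoef(programming[selected], maths[selected])[0, 1]

print(f"All candidates:      r = {corr_all:.2f}")
print(f"Selected candidates: r = {corr_selected:.2f}")
```

Since the two skills are drawn independently, the first value should sit close to zero, while the second should come out clearly negative.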

Other examples

In the business world, you sometimes find that communication skills are negatively correlated with technical skills. Berkson’s paradox offers an explanation: if an employee’s promotion depends on a combination (e.g., the average) of their communication and technical skills, then among the promoted individuals, communication and technical skills will tend to be negatively correlated, even if they are unrelated in the workforce as a whole. If in doubt, have another look at the charts of the previous section.
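
The combination does not have to be a plain average. Here is a rough sketch of the promotion example with a weighted score, assuming independent, normally distributed skills as in the hiring simulation. The weights (0.7 and 0.3), the cut-off of 75 and the variable names are hypothetical choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000  # large sample so the estimate is stable

# Independent skills, same distribution as in the hiring simulation
technical = rng.normal(70, 10, n)
communication = rng.normal(70, 10, n)

# Hypothetical promotion rule: weighted score where technical skills count more
w_tech, w_comm = 0.7, 0.3
promoted = w_tech * technical + w_comm * communication >= 75

# Correlation among promoted employees only
corr_promoted = np.corrcoef(technical[promoted], communication[promoted])[0, 1]
print(f"Promoted employees: r = {corr_promoted:.2f}")
```

Under these assumptions the correlation among promoted employees should again be negative, even though one skill weighs more than the other.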

In the dating world, you may find that humour and attractiveness (or any other two characteristics that you value) are negatively correlated among your previous partners. If you choose your partners based on any combination of these two characteristics, Berkson’s paradox strikes again. Not being aware of this paradox could lead one to unfortunate (and mistaken) generalisations about the overall population.
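
The same effect can be sketched with an either/or rule instead of an average. The snippet below assumes that both traits can be scored on a 0 to 100 scale and are independent in the population, and that someone is "chosen" as soon as they stand out on at least one trait. The cut-off of 80 and all variable names are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

# Independent traits in the overall population (assumed mean 70, sd 10)
humour = rng.normal(70, 10, n)
attractiveness = rng.normal(70, 10, n)

# Hypothetical rule: chosen if they stand out on either trait
chosen = (humour >= 80) | (attractiveness >= 80)

corr_population = np.corrcoef(humour, attractiveness)[0, 1]
corr_chosen = np.corrcoef(humour[chosen], attractiveness[chosen])[0, 1]

print(f"Overall population: r = {corr_population:.2f}")
print(f"Chosen partners:    r = {corr_chosen:.2f}")
```

The population-level correlation should stay near zero, while the one among chosen partners should turn negative, which is exactly the kind of pattern that invites a mistaken generalisation.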

Caveat: Love at first sight, emotional chemistry, the heat of the moment, not everything can be reduced to numbers.

Final Thoughts

Once you have seen Berkson’s paradox, you cannot unsee it. I wouldn’t be surprised if you saw it pop up everywhere around you. Studying examples of this paradox reminds us to carefully consider the selection process behind our data before making any judgement about the general population.

Going back to the example of the Data Science hiring manager: they could believe that they have hired a representative sample of candidates, and therefore conclude that, in general, programming and maths skills are negatively correlated. These mistakes happen.

Can you think of examples of Berkson’s paradox around you? Have you carefully considered the selection process applied to the data you are analysing?
