Most people don’t misunderstand the p-value because it’s hard.
They misunderstand it because it’s explained badly.

Usually, it’s introduced like this:

  • Choose 0.05

  • Compute p

  • If p < 0.05, reject the null

That tells you what to do, but not what is going on.

So let’s rebuild the idea slowly.

Everything starts with an assumption

In hypothesis testing, you deliberately assume something boring.

This assumption is called the null hypothesis.

Examples:

  • A coin is fair

  • A medicine has no effect

  • A student is average

  • A model is just guessing

At this stage, you are not claiming the assumption is actually true.
You are saying:

“Let me assume this is true and see how the data behaves.”

Now pretend this assumption is true

This step is non-negotiable.

You temporarily live in a world where the null hypothesis is true.
No doubts. No shortcuts.

Then you look at your data and ask one question:

If this assumption were correct, how often would I see data like this (or more extreme)?

That number is the p-value.

What the p-value really measures

The p-value measures surprise, but very specifically.

It measures:

How surprising the data is under the null hypothesis

  • Not surprising → large p-value

  • Very surprising → small p-value

That’s all it does.

A simple numerical example

Assume:

  • A coin is fair

You flip it 10 times.
You get 9 heads.

In a fair-coin world:

  • Probability of 9 or more heads ≈ 0.011

So the p-value is about 0.011.

Interpretation:

“If the coin were fair, I would see this result only about 1 time in 100.”

That makes the “fair coin” assumption uncomfortable.
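
If you want to check that number yourself, here is a minimal Python sketch (assuming numpy and scipy are available) that computes the exact binomial tail and also estimates it by simulating the fair-coin world:

```python
import numpy as np
from scipy.stats import binom

# Null world: the coin is fair, so the number of heads in 10 flips
# follows Binomial(n=10, p=0.5).
# p-value = P(9 or more heads) = survival function evaluated at 8.
p_exact = binom.sf(8, n=10, p=0.5)
print(p_exact)  # ≈ 0.0107

# The same idea by brute force: live in the fair-coin world and count
# how often a result at least this extreme shows up.
rng = np.random.default_rng(seed=0)
heads = rng.binomial(n=10, p=0.5, size=100_000)
print((heads >= 9).mean())  # ≈ 0.011
```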

More real-life examples with numbers

Exam scores

Assume:

  • Average score = 60

  • Standard deviation = 10

A student scores 95.

That’s 3.5 standard deviations above average.

Probability of a score this high or higher, if the student were truly average and scores are roughly normally distributed:

  • ≈ 0.0002

p-value ≈ 0.0002

Meaning:

Calling this student “average” does not explain the data well.
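
A quick check of that tail probability, under the assumption that scores follow a normal distribution (and that scipy is installed):

```python
from scipy.stats import norm

# Null world: the student is average, scores ~ Normal(mean=60, sd=10).
z = (95 - 60) / 10        # 3.5 standard deviations above the mean
p_value = norm.sf(z)      # upper-tail probability P(Z >= 3.5)
print(p_value)            # ≈ 0.00023
```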

Medicine trial

Assume:

  • Medicine has no effect

  • Normally, 50% recover in a week

Trial:

  • 100 patients

  • 70 recover in a week

Probability of 70 or more recoveries if the medicine truly did nothing:

  • ≈ 0.00004

Small p-value →
“No effect” struggles to explain the outcome.
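
Here is the same kind of scipy sketch for this trial, treating the number of recoveries as Binomial(100, 0.5) under the null:

```python
from scipy.stats import binom

# Null world: the medicine does nothing, so recoveries in 100 patients
# follow Binomial(n=100, p=0.5).
# p-value = P(70 or more recoveries) = P(X >= 70).
p_value = binom.sf(69, n=100, p=0.5)
print(p_value)  # ≈ 0.00004
```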

Machine learning model

Assume:

  • Model is guessing randomly

  • Accuracy should be around 50%

Test set:

  • 200 samples

  • Model gets 130 correct (65%)

Probability of 130 or more correct answers under random guessing:

  • ≈ 0.00001

Interpretation:

Random guessing is not a comfortable explanation anymore.

Note:
This does not mean the model is useful.
Only that random guessing does not explain this result well.
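
And the same kind of check works for the model, treating the number of correct answers as Binomial(200, 0.5) under the "random guessing" null (again assuming scipy):

```python
from scipy.stats import binom

# Null world: the model guesses at random, so correct answers out of 200
# follow Binomial(n=200, p=0.5).
# p-value = P(130 or more correct) = P(X >= 130).
p_value = binom.sf(129, n=200, p=0.5)
print(p_value)  # ≈ 0.00001
```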

What the p-value does NOT mean

A p-value is not:

  • The probability that the null hypothesis is true

  • The probability that the result is “due to chance”

  • A measure of how big or important the effect is

It only talks about:

Compatibility between data and an assumption.

About the famous 0.05 cutoff

Suppose you fix:

α = 0.05

This means:

“I agree to start doubting my assumption if the result would occur less than 5% of the time under it.”

So formally:

  • p = 0.049 → reject

  • p = 0.051 → do not reject

This rule is mathematically valid.
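
In code, the rule is nothing more than a comparison. A tiny sketch, using the two borderline values from above:

```python
alpha = 0.05  # the cutoff you committed to before looking at the data

for p in (0.049, 0.051):
    decision = "reject the null" if p < alpha else "do not reject the null"
    print(f"p = {p} -> {decision}")
```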

But here is the part that many courses don’t say clearly.

You are allowed to not reject at 0.049

P-values of 0.049 and 0.051 describe almost the same situation.

Nothing fundamental changes at 0.05.

The cutoff exists because decisions need rules, not because nature has sharp boundaries.

In many real situations—especially exploratory analysis, early research, or high-stakes decisions—you may reasonably say:

“This is borderline evidence. I will not reject yet.”

That is not incorrect statistics.
That is responsible judgment.

The most important insight

The p-value is continuous.
The decision is discrete.

The discomfort you feel around 0.05 is not a flaw.

It means you are thinking.

How to read a p-value correctly

Whenever you see a p-value, read it like this:

“If my assumption were true, data this extreme (or more extreme) would appear a fraction p of the time.”

The p-value measures how far into the tail the observed data falls, assuming the null is true.

That sentence prevents most misunderstandings.

Final takeaway

The p-value is not a truth machine.
It does not tell you what to believe.

It is a reality check.

It asks:

“Does this assumption comfortably explain what I just saw?”

The cutoff tells you when you agreed to act on that discomfort.
Not what is absolutely true.

Once you see this, the p-value stops feeling mysterious and starts feeling honest.
