Welch’s two-sample t-test — t-statistic, degrees of freedom and p-value
Enter your two groups below to run Welch’s two-sample t-test. The calculator reports the t-statistic, degrees of freedom, and two-tailed p-value, with a plain-English read on statistical significance.
A two-sample t-test asks whether the difference between two groups’ means is larger than you’d expect from random sampling variation alone. It converts the observed difference into a t-statistic, then translates that statistic into a p-value — the probability of seeing a difference this large (or larger) if the two groups actually had the same true mean.
The original (Student’s) t-test assumes both groups have equal variance. Welch’s t-test drops that assumption and adjusts the degrees of freedom accordingly (the Welch–Satterthwaite equation), which is why the reported degrees of freedom are often not a whole number. Because real-world groups rarely have exactly equal variance, Welch’s version is the safer general-purpose default. It is what R’s t.test function uses unless told otherwise; some other tools, such as SciPy’s ttest_ind, default to the equal-variance test instead.
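For reference, the Welch–Satterthwaite approximation is usually written as below. The symbol ν for the degrees of freedom is a notation choice made here; s₁², s₂² are the sample variances and n₁, n₂ the group sizes.

```latex
\nu \approx
\frac{\left( \dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2} \right)^{2}}
     {\dfrac{\left( s_1^2 / n_1 \right)^{2}}{n_1 - 1}
      + \dfrac{\left( s_2^2 / n_2 \right)^{2}}{n_2 - 1}}
```

When the two groups have equal sizes and equal sample variances, this reduces to n₁ + n₂ − 2, the degrees of freedom of Student’s test; otherwise it is smaller.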
A common threshold is α = 0.05: if the p-value is below 0.05, the result is usually called “statistically significant” at the 5% level, meaning a difference this large would be unlikely if the two groups truly had the same mean. A p-value above 0.05 doesn’t prove the groups are the same; it means the data don’t give strong enough evidence to say they differ. Statistical significance also isn’t the same as practical importance: check the actual mean difference, which the calculator also reports, alongside the p-value.
t = (mean₁ − mean₂) / SE, where SE = √(s₁²/n₁ + s₂²/n₂), s₁² and s₂² are the two groups’ sample variances, and n₁ and n₂ are the group sizes. Degrees of freedom come from the Welch–Satterthwaite equation, and the two-tailed p-value is computed from the exact Student-t cumulative distribution (via the regularized incomplete beta function), not a table lookup or approximation.
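The calculation described above can be sketched in Python using only the standard library. This is a minimal illustration, not the calculator’s actual source: the function names (`welch_t_test`, `reg_inc_beta`, `_betacf`) and the sample data are hypothetical, and the incomplete beta function is evaluated with a textbook continued-fraction (modified Lentz) scheme, which is an assumption about one reasonable way to do it.

```python
import math
from statistics import mean, variance


def _betacf(a, b, x, max_iter=300, eps=1e-14):
    """Continued fraction for the incomplete beta function (modified Lentz)."""
    tiny = 1e-300
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 - qab * x / qap
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        # Even and odd terms of the continued fraction for this m.
        even = m * (b - m) * x / ((qam + m2) * (a + m2))
        odd = -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2))
        for term in (even, odd):
            d = 1.0 + term * d
            d = 1.0 / (d if abs(d) > tiny else tiny)
            c = 1.0 + term / c
            if abs(c) < tiny:
                c = tiny
            delta = d * c
            h *= delta
        if abs(delta - 1.0) < eps:  # converged
            break
    return h


def reg_inc_beta(a, b, x):
    """Regularized incomplete beta function I_x(a, b) for 0 <= x <= 1."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    log_front = (math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                 + a * math.log(x) + b * math.log1p(-x))
    front = math.exp(log_front)
    # Use the symmetry I_x(a, b) = 1 - I_{1-x}(b, a) for faster convergence.
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _betacf(a, b, x) / a
    return 1.0 - front * _betacf(b, a, 1.0 - x) / b


def welch_t_test(group_a, group_b):
    """Return (t, degrees of freedom, two-tailed p) for Welch's t-test."""
    n1, n2 = len(group_a), len(group_b)
    if n1 < 2 or n2 < 2:
        raise ValueError("each group needs at least 2 values")
    v1 = variance(group_a) / n1  # s1^2 / n1 (sample variance, n - 1 divisor)
    v2 = variance(group_b) / n2  # s2^2 / n2
    if v1 + v2 == 0:
        raise ValueError("both groups have zero variance; t is undefined")
    t = (mean(group_a) - mean(group_b)) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite degrees of freedom.
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    # Two-tailed p-value: P(|T| >= |t|) = I_{df / (df + t^2)}(df / 2, 1 / 2).
    p = reg_inc_beta(df / 2.0, 0.5, df / (df + t * t))
    return t, df, p


# Hypothetical example data, for illustration only.
t, df, p = welch_t_test([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
print(f"t = {t:.4f}, df = {df:.4f}, two-tailed p = {p:.4f}")
```

For this example the degrees of freedom come out to 6.25 / 1.0625 ≈ 5.88 rather than the 8 that Student’s test would use, because the second group’s variance is four times the first’s.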
There is no hard minimum sample size: this calculator works with as few as 2 values per group, but t-tests are more reliable with at least 10–30 values per group. Very small samples produce wide, uncertain estimates even when the math is correct.
A low p-value is not, on its own, proof of a real or causal effect. It means the observed difference is unlikely under the assumption that both groups have the same true mean; it doesn’t rule out other explanations like confounding variables, non-random sampling, or measurement error.
This calculator reports the two-tailed p-value, which tests whether the groups differ in either direction. A one-tailed test (testing only whether one specific group is larger) halves the p-value when the observed difference lies in the predicted direction, but should only be used when the direction was decided before looking at the data.
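The relationship between the two kinds of p-value can be sketched as a small helper. This is an illustrative assumption, not part of the calculator: the function name `one_tailed_p` and its `alternative` parameter are hypothetical, and the conversion relies on the t distribution being symmetric around zero.

```python
def one_tailed_p(two_tailed_p, t_stat, alternative="greater"):
    """Convert a two-tailed p-value into a one-tailed one.

    alternative="greater" tests mean1 > mean2 (expects t > 0);
    alternative="less" tests mean1 < mean2 (expects t < 0).
    """
    if alternative not in ("greater", "less"):
        raise ValueError("alternative must be 'greater' or 'less'")
    half = two_tailed_p / 2.0
    if alternative == "greater":
        in_predicted_direction = t_stat > 0
    else:
        in_predicted_direction = t_stat < 0
    # If the data point the "wrong" way, the one-tailed p-value is large.
    return half if in_predicted_direction else 1.0 - half


print(one_tailed_p(0.04, 2.1, "greater"))
print(one_tailed_p(0.04, 2.1, "less"))
```

The second call shows why the direction must be fixed in advance: the same data that look significant under one alternative give a p-value close to 1 under the other.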