T-Test Calculator

Welch’s two-sample t-test — t-statistic, degrees of freedom and p-value

Enter your two groups below to run Welch’s two-sample t-test. The calculator reports the t-statistic, degrees of freedom, and two-tailed p-value, with a plain-English read on statistical significance.

Enter Your Data Points

Enter at least 2 values in both Group A and Group B to run the test.

Five Number Summary

Minimum: 76

Q1 (25th percentile): 81.5

Median: 88.5

Q3 (75th percentile): 91

Maximum: 95

Interquartile range (IQR): 9.5

Count (n): 8

Outliers (1.5 × IQR rule): None

Additional Statistics

Mean (average): 86.625

Mode: None

Range: 19

Sum: 693

Std deviation (sample): 6.6319

Std deviation (population): 6.2036

Variance (sample): 43.9821

Mean absolute deviation: 5.2188

Coefficient of variation: 0.0766

Standard error of the mean: 2.3447

10th percentile: 77.4

90th percentile: 92.9

Lower inner fence (Q1 − 1.5·IQR): 67.25

Upper inner fence (Q3 + 1.5·IQR): 105.25
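Several of the figures above can be re-derived from the others, which makes for a quick consistency check. The sketch below assumes only the values listed above (count, sum, sample standard deviation, Q1 and Q3); the variable names are illustrative and not taken from the calculator itself.

```python
import math

# Values copied from the summary above.
n = 8
total = 693
sample_sd = 6.6319
q1, q3 = 81.5, 91.0

mean = total / n                               # arithmetic mean
sem = sample_sd / math.sqrt(n)                 # standard error of the mean
cv = sample_sd / mean                          # coefficient of variation
pop_sd = sample_sd * math.sqrt((n - 1) / n)    # population SD from sample SD

iqr = q3 - q1
lower_fence = q1 - 1.5 * iqr                   # points below this count as outliers
upper_fence = q3 + 1.5 * iqr                   # points above this count as outliers

print(mean, round(sem, 4), round(cv, 4), round(pop_sd, 4))
print(iqr, lower_fence, upper_fence)
```

Because the minimum (76) and maximum (95) both lie inside the two fences, the 1.5 × IQR rule flags no outliers, which matches the summary.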

What does a t-test tell you?

A two-sample t-test asks whether the difference between two groups’ means is larger than you’d expect from random sampling variation alone. It converts the observed difference into a t-statistic, then translates that statistic into a p-value — the probability of seeing a difference this large (or larger) if the two groups actually had the same true mean.

Why Welch’s test instead of Student’s classic t-test?

The original (Student’s) t-test assumes both groups have equal variance. Welch’s t-test drops that assumption and adjusts the degrees of freedom accordingly (the Welch–Satterthwaite equation), which is why the reported degrees of freedom are often not a whole number. Because real-world groups rarely have exactly equal variance, Welch’s version is the safer general-purpose default. It is the default in R’s t.test function, though some other tools still default to Student’s test and need Welch’s version selected explicitly.
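For reference, the Welch–Satterthwaite approximation is usually written as below, where s₁² and s₂² are the two sample variances, n₁ and n₂ are the group sizes, and ν (a symbol introduced here for illustration) is the approximate degrees of freedom.

```latex
\nu \approx
\frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^{2}}
     {\dfrac{\left(s_1^2/n_1\right)^{2}}{n_1 - 1} + \dfrac{\left(s_2^2/n_2\right)^{2}}{n_2 - 1}}
```

As a made-up example, with s₁² = 2.5, s₂² = 10 and n₁ = n₂ = 5, the numerator is (0.5 + 2)² = 6.25 and the denominator is 0.25/4 + 4/4 = 1.0625, giving ν ≈ 5.88 rather than the 8 degrees of freedom Student’s test would use.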

Reading the result

A common threshold is α = 0.05: if the p-value is below 0.05, the result is usually called “statistically significant” at the 5% significance level, meaning a difference this large would be unlikely if the two groups really had the same true mean. A p-value above 0.05 doesn’t prove the groups are the same; it means this data doesn’t give strong enough evidence to say they differ. Statistical significance also isn’t the same as practical importance: check the actual mean difference (reported with the results) alongside the p-value.
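The decision rule described above can be sketched in a few lines. This is only an illustration: the function name `describe_result` and its wording are hypothetical, not the calculator’s actual output.

```python
def describe_result(p_value, mean_diff, alpha=0.05):
    """Return a short plain-English reading of a two-tailed p-value."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must lie between 0 and 1")
    if p_value < alpha:
        verdict = "statistically significant"
    else:
        # Not significant is not the same as "no difference".
        verdict = "not statistically significant"
    return (f"p = {p_value:.4f}: {verdict} at alpha = {alpha:g}; "
            f"observed mean difference = {mean_diff:g}")


print(describe_result(0.012, 4.5))
print(describe_result(0.31, 0.8))
```

Reporting the mean difference next to the verdict keeps practical importance in view: a tiny difference can still be significant with large samples.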

How it’s calculated

t = (mean₁ − mean₂) / SE, where SE = √(s₁²/n₁ + s₂²/n₂) using each group’s sample variance. Degrees of freedom come from the Welch–Satterthwaite equation, and the two-tailed p-value is computed from the exact Student-t cumulative distribution (via the regularized incomplete beta function), not a table lookup or approximation.
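The calculation above can be sketched with the Python standard library alone. This is a minimal illustration under stated assumptions, not the calculator’s own source: the function names (`welch_t_test`, `two_tailed_p`, `betainc_reg`, `_betacf`) are hypothetical, and the regularized incomplete beta function is evaluated with a standard continued-fraction expansion (modified Lentz method), which may differ from the method this page uses.

```python
import math


def _betacf(a, b, x, max_iter=200, eps=1e-14):
    """Continued fraction for the incomplete beta function (modified Lentz)."""
    tiny = 1e-300
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 - qab * x / qap
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        # Two terms of the fraction are applied per iteration.
        for aa in (m * (b - m) * x / ((qam + m2) * (a + m2)),
                   -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2))):
            d = 1.0 + aa * d
            d = 1.0 / (d if abs(d) > tiny else tiny)
            c = 1.0 + aa / c
            c = c if abs(c) > tiny else tiny
            delta = d * c
            h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h


def betainc_reg(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    front = math.exp(math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                     + a * math.log(x) + b * math.log1p(-x))
    # Use the symmetry relation where the fraction converges faster.
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _betacf(a, b, x) / a
    return 1.0 - front * _betacf(b, a, 1.0 - x) / b


def two_tailed_p(t, df):
    """Two-tailed p-value of a Student-t statistic with df degrees of freedom."""
    return betainc_reg(df / 2.0, 0.5, df / (df + t * t))


def welch_t_test(a, b):
    """Return (t, df, p) for Welch's two-sample t-test on sequences a and b."""
    n1, n2 = len(a), len(b)
    if n1 < 2 or n2 < 2:
        raise ValueError("each group needs at least 2 values")
    m1, m2 = sum(a) / n1, sum(b) / n2
    v1 = sum((x - m1) ** 2 for x in a) / (n1 - 1)   # sample variance, group A
    v2 = sum((x - m2) ** 2 for x in b) / (n2 - 1)   # sample variance, group B
    q1, q2 = v1 / n1, v2 / n2
    if q1 + q2 == 0.0:
        raise ValueError("both groups have zero variance; t is undefined")
    t = (m1 - m2) / math.sqrt(q1 + q2)
    # Welch-Satterthwaite degrees of freedom.
    df = (q1 + q2) ** 2 / (q1 ** 2 / (n1 - 1) + q2 ** 2 / (n2 - 1))
    return t, df, two_tailed_p(t, df)


# Made-up example data.
t, df, p = welch_t_test([1, 2, 3, 4, 5], [2, 4, 6, 8, 10])
print(f"t = {t:.4f}, df = {df:.4f}, p = {p:.4f}")
```

The p-value uses the identity p = I_x(df/2, 1/2) with x = df / (df + t²), which is how the regularized incomplete beta function yields the two-tailed tail area of the Student-t distribution.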

Frequently asked questions

What sample size do I need for a t-test?

There is no hard minimum — this calculator works with as few as 2 values per group — but t-tests are more reliable with at least 10–30 values per group. Very small samples produce wide, uncertain estimates even when the math is correct.

Does a low p-value prove my hypothesis?

No. A low p-value means the observed difference is unlikely under the assumption that both groups have the same true mean — it doesn’t rule out other explanations like confounding variables, non-random sampling, or measurement error.

What is the difference between a one-tailed and two-tailed test?

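This calculator reports the two-tailed p-value, which tests whether the groups differ in either direction. A one-tailed test asks only whether one specific group is larger. When the observed difference falls in the predicted direction, its p-value is exactly half the two-tailed value; when it falls in the opposite direction, the one-tailed p-value is 1 minus that half. Use a one-tailed test only when the direction was decided before looking at the data.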
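As a rough sketch, a two-tailed p-value can be converted to a one-tailed one as follows. The function name `one_tailed_p` and the `expect_a_greater` flag are hypothetical, and the code assumes the t-statistic was computed as Group A minus Group B.

```python
def one_tailed_p(two_tailed, t_stat, expect_a_greater=True):
    """Convert a two-tailed p-value to a one-tailed one.

    Assumes t_stat = (mean of A - mean of B) / SE and that the direction
    was chosen before the data were examined.
    """
    half = two_tailed / 2.0
    in_predicted_direction = (t_stat > 0) == expect_a_greater
    # A result in the wrong direction is evidence against the directional claim.
    return half if in_predicted_direction else 1.0 - half


print(one_tailed_p(0.08, 2.1))    # difference in the predicted direction
print(one_tailed_p(0.08, -2.1))   # difference in the opposite direction
```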