About the Statistics Calculator
The Statistics Calculator takes a pasted list of numbers and returns a full descriptive-statistics summary in one pass: count, sum, mean, median, mode, range, variance (sample and population), standard deviation (sample and population), quartiles, interquartile range, skewness, and kurtosis. It also reports the z-score and percentile rank of any value you supply, and the probability-distribution and hypothesis-testing modes (t-test, z-test, chi-square, linear regression) add the inferential workflow.
It is built for students working through introductory statistics homework, data analysts doing a quick QA on a CSV column, researchers checking that a sample matches expected parametric assumptions before running a test, A/B test reviewers confirming the variance ratio between control and treatment, and educators generating worked examples on the fly.
All computation happens locally in JavaScript. Whatever list you paste — experiment results, customer data, survey responses, proprietary metrics — never leaves your device. The page issues no network call after first load. Visualizations are rendered locally with inline SVG; the underlying data stays in memory and is cleared on page reload.
Descriptive statistics describe a sample; on their own they do not establish causation, generalize to a population, or test a hypothesis. For inferential work, pair the descriptive output with the hypothesis-test modes, and confirm distributional assumptions (normality, equal variance) before quoting a p-value. Outliers can dominate the mean, range, and variance; consider robust estimators such as the trimmed mean or the median absolute deviation (MAD) when the histogram shows heavy skew or extreme values.
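To illustrate the robust estimators mentioned above, here is a minimal Python sketch using only the standard library. The function names, the 10% trimming proportion, and the data are assumptions made for illustration; this is not the calculator's own code.

```python
import statistics

def trimmed_mean(values, proportion=0.1):
    """Mean after dropping `proportion` of the points from each tail."""
    ordered = sorted(values)
    k = int(len(ordered) * proportion)  # number of points dropped per tail
    kept = ordered[k:len(ordered) - k] if k else ordered
    return statistics.fmean(kept)

def median_abs_deviation(values):
    """Median of the absolute deviations from the median (unscaled MAD)."""
    center = statistics.median(values)
    return statistics.median(abs(v - center) for v in values)

# Hypothetical sample: nine ordinary readings and one extreme value.
data = [10, 12, 11, 13, 12, 11, 10, 12, 13, 95]

plain_mean = statistics.fmean(data)   # 19.9, pulled upward by the 95
robust_mean = trimmed_mean(data)      # 11.75, one point dropped per tail
mad = median_abs_deviation(data)      # 1.0
```

The single outlier nearly doubles the plain mean, while the trimmed mean and MAD stay close to the bulk of the readings.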
Descriptive Statistics Explained
Descriptive statistics summarize a dataset with a handful of numbers. The mean (arithmetic average) adds every value and divides by the count, giving you the balance point of the data. The median is the middle value when the data is sorted; it resists the pull of outliers that can distort the mean. The mode is the most frequently occurring value and is the only measure of central tendency that works with nominal (unordered) categorical data. Knowing when to report each measure matters: income data is almost always summarized with the median because a few extremely high earners skew the mean upward, while exam scores are often reported as means because they tend to follow a roughly symmetric distribution.
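The income example can be made concrete with a short Python sketch. The figures and category labels are invented for illustration, and the standard-library `statistics` module stands in for whatever the calculator does internally.

```python
import statistics

# Hypothetical annual incomes: six typical earners and one very high earner.
incomes = [32_000, 35_000, 38_000, 41_000, 45_000, 48_000, 400_000]

mean_income = statistics.mean(incomes)      # about 91,286
median_income = statistics.median(incomes)  # 41000

# The mode also works on categories, where mean and median are undefined.
colors = ["red", "blue", "blue", "green"]
modal_color = statistics.mode(colors)       # "blue"
```

Here the mean is more than twice the median even though six of the seven earners make less than 50,000, which is why the median is the more representative summary for this kind of data.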
Understanding Standard Deviation
Standard deviation quantifies the typical distance of data points from the mean. A small standard deviation means values cluster tightly; a large one means they are spread out. For data that follows a normal distribution, the 68-95-99.7 rule (or empirical rule) provides a quick mental model: roughly 68% of values fall within one standard deviation of the mean, 95% within two, and 99.7% within three. Variance is the square of the standard deviation. It is harder to interpret because it is expressed in squared units, but it is additive: the variance of a sum of independent variables equals the sum of their variances, which is why it appears so often in theoretical work.
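The 68-95-99.7 figures can be checked directly against the standard normal distribution. The sketch below uses Python's standard-library `NormalDist`; the helper name `share_within` is an assumption made for this example.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

within_one = share_within(1)    # about 0.6827
within_two = share_within(2)    # about 0.9545
within_three = share_within(3)  # about 0.9973
```

The rule holds only for approximately normal data; skewed or heavy-tailed samples can place far more or far fewer points inside each band.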
Probability Distributions Guide
A normal distribution is the classic bell curve defined by its mean and standard deviation. It arises naturally whenever many small, independent effects add together (the central limit theorem), which is why heights, measurement errors, and test scores often follow it approximately. The binomial distribution counts the number of successes in a fixed number of independent yes-or-no trials with the same success probability, such as heads in a series of coin flips, defective items in a batch, or opens among a set of sent emails. The Poisson distribution models the number of events occurring in a fixed interval of time or space when events happen independently at a constant average rate: think website hits per minute, accidents per month, or typos per page. Choosing the right distribution is the foundation of every probability calculation and hypothesis test.
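As a rough illustration of the three distributions, the following Python sketch evaluates one probability from each using the textbook formulas. The function names and all parameter values are hypothetical.

```python
import math
from statistics import NormalDist

def binomial_pmf(k, n, p):
    """P(exactly k successes in n independent trials with success probability p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k, lam):
    """P(exactly k events in an interval whose average event count is lam)."""
    return math.exp(-lam) * lam**k / math.factorial(k)

# Binomial: exactly 3 heads in 10 fair coin flips.
p_three_heads = binomial_pmf(3, 10, 0.5)             # 120/1024 = 0.1171875

# Poisson: exactly 2 typos on a page that averages 1.5 typos.
p_two_typos = poisson_pmf(2, 1.5)                    # about 0.251

# Normal: share of scores below 115 when mean = 100 and sd = 15.
p_below_115 = NormalDist(mu=100, sigma=15).cdf(115)  # about 0.8413
```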
Hypothesis Testing Step by Step
Hypothesis testing follows a five-step process. First, state the null hypothesis (H₀), which represents the status quo or no effect, and the alternative hypothesis (H₁), which is what you hope to demonstrate. Second, choose a significance level (α), most commonly 0.05. Third, calculate the test statistic from your sample data. Fourth, find the p-value, which is the probability of obtaining results at least as extreme as yours if H₀ were true. Fifth, compare the p-value to α: if p ≤ α, reject H₀ in favor of H₁; otherwise, fail to reject H₀. Failing to reject does not prove H₀ is true — it simply means your data did not provide sufficient evidence against it.
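The five steps can be sketched as a two-sided one-sample z-test in Python. This assumes the population standard deviation is known. The function name, sample values, hypothesized mean, and sigma are all invented for illustration, and this is not presented as the calculator's implementation.

```python
import math
from statistics import NormalDist, fmean

def one_sample_z_test(sample, mu0, sigma, alpha=0.05):
    """Two-sided z-test of H0: population mean == mu0, with sigma assumed known."""
    n = len(sample)
    # Step 3: test statistic = distance from mu0 in standard-error units.
    z = (fmean(sample) - mu0) / (sigma / math.sqrt(n))
    # Step 4: probability of a result at least this extreme if H0 were true.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    # Step 5: reject H0 when the p-value is at or below alpha.
    return z, p_value, p_value <= alpha

# Step 1: H0 says the mean is 50; H1 says it differs from 50.
# Step 2: alpha is left at the default of 0.05.
sample = [52, 58, 49, 61, 55, 47, 60, 53, 56, 50, 57, 54, 51, 59, 48, 54]
z, p_value, reject = one_sample_z_test(sample, mu0=50, sigma=8)
# The sample mean is 54, so z = 2.0, p is about 0.0455, and H0 is rejected.
```

With the same data and a stricter alpha of 0.01, the p-value would exceed alpha and the test would fail to reject H₀.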
Linear Regression Basics
Linear regression finds the straight line that best fits a scatter plot of X-Y data. The fitted line ŷ = mx + b uses the slope m and intercept b that minimize the sum of squared residuals (the vertical distances between each data point and the line). The R² value, or coefficient of determination, tells you what fraction of the variance in Y is explained by X. An R² of 0.85 means X accounts for 85% of the variation in Y. However, a high R² does not imply causation: two variables can be strongly correlated because they share a common cause, or purely by coincidence (spurious correlation).
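A minimal least-squares sketch in Python shows how m, b, and R² come out of the data. The function name `fit_line` and the five data points are assumptions made for this example.

```python
from statistics import fmean

def fit_line(xs, ys):
    """Ordinary least squares for y = m*x + b; returns (m, b, r_squared)."""
    x_bar, y_bar = fmean(xs), fmean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    m = sxy / sxx          # slope
    b = y_bar - m * x_bar  # intercept: the line passes through (x_bar, y_bar)
    ss_res = sum((y - (m * x + b)) ** 2 for x, y in zip(xs, ys))  # unexplained
    ss_tot = sum((y - y_bar) ** 2 for y in ys)                    # total
    return m, b, 1 - ss_res / ss_tot

# Hypothetical data.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
m, b, r_squared = fit_line(xs, ys)
# m is about 0.6, b about 2.2, and R² about 0.6.
```

For this data the line explains about 60% of the variation in y; the remaining 40% is the squared residual left over after fitting.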
Which Statistical Test Should I Use?
Choosing the right test depends on your data and research question. Use a one-sample t-test when you have one group and want to compare its mean to a known value. Use a two-sample t-test to compare the means of two independent groups. Use a z-test for proportions when comparing percentages or rates between two groups (e.g., conversion rates in an A/B test). For correlation and prediction with continuous variables, use regression. When your data is clearly not normally distributed, especially with a small sample, consider non-parametric alternatives. This calculator covers the most common scenarios encountered in coursework, business analytics, and research.
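For the A/B-test case, a two-proportion z-test might look like the Python sketch below. It uses the pooled standard error under the null hypothesis of equal rates; the function name and the conversion counts are hypothetical.

```python
import math
from statistics import NormalDist

def two_proportion_z_test(successes_a, n_a, successes_b, n_b):
    """Two-sided z-test of H0: both groups share the same underlying rate."""
    p_a, p_b = successes_a / n_a, successes_b / n_b
    # Under H0 the best estimate of the common rate pools both groups.
    pooled = (successes_a + successes_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical A/B test: control converts 200 of 2000, treatment 250 of 2000.
z, p_value = two_proportion_z_test(200, 2000, 250, 2000)
# z is about 2.5, so the p-value is below 0.05.
```

The normal approximation behind this test is usually considered reasonable only when each group has a fair number of both successes and failures.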
Frequently Asked Questions
What is the difference between mean, median, and mode?
The mean is the arithmetic average (sum divided by count). The median is the middle value when the data is sorted. The mode is the most frequent value. The mean is sensitive to outliers; the median is far less affected by them.
What is the formula for standard deviation?
Sample standard deviation s = sqrt(sum((x - mean)^2) / (n - 1)). Population standard deviation uses n in the denominator instead of n - 1 and is denoted sigma.
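The two formulas differ only in the denominator, as this Python sketch shows. The function name `std_dev` and the data are assumptions made for illustration; the results are checked against the standard library's `stdev` and `pstdev`.

```python
import math
import statistics

def std_dev(values, sample=True):
    """Standard deviation with n - 1 (sample) or n (population) in the denominator."""
    n = len(values)
    mean = sum(values) / n
    squared_dev = sum((x - mean) ** 2 for x in values)
    return math.sqrt(squared_dev / (n - 1 if sample else n))

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical; mean 5, squared deviations sum to 32

s = std_dev(data)                    # sqrt(32 / 7), about 2.138
sigma = std_dev(data, sample=False)  # sqrt(32 / 8) = 2.0

# Cross-check against the standard library.
matches_stdlib = (
    math.isclose(s, statistics.stdev(data))
    and math.isclose(sigma, statistics.pstdev(data))
)
```

The sample version is always slightly larger, and the gap shrinks as n grows.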
When should I use a t-test versus a z-test?
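Use a z-test when the population standard deviation is known, or when the sample is large enough (a common rule of thumb is n >= 30) that the sample standard deviation is a good stand-in for it. Use a t-test when the population standard deviation is unknown and must be estimated from the sample, which matters most for small samples; the one-sample test uses the t-distribution with n - 1 degrees of freedom.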
What does a p-value mean?
The p-value is the probability of observing a test statistic at least as extreme as the one obtained, assuming the null hypothesis is true. A p-value at or below the significance level (typically 0.05) is treated as sufficient evidence to reject the null hypothesis. It is not the probability that the null hypothesis is true.
What does R-squared measure in regression?
R^2 is the proportion of variance in the dependent variable explained by the regression model. For a least-squares fit with an intercept it ranges from 0 to 1. An R^2 of 0.85 means 85% of the variation in y is explained by x. In simple linear regression (one predictor), R^2 equals the square of the Pearson correlation r.