LINEST: Advanced Linear Regression Statistics in Excel

Learn how to use Excel's LINEST function for full linear regression analysis. This post covers slope, intercept, R², F-statistic, p-values, standard errors, multiple regression, polynomial curves, and residual analysis with eight real-world examples.

TREND gives you predictions. LINEST gives you the whole story. Where TREND returns a single column of fitted values, LINEST returns a table of statistics: slopes, intercept, R², standard errors, F-statistic, degrees of freedom, and the regression sum of squares. These are the numbers that tell you not just what the model predicts, but how reliable those predictions are. LINEST is Excel’s gateway to full linear regression analysis without leaving the spreadsheet.

This guide covers every row and column of the LINEST output array, the syntax for simple and multiple regression, and eight practical examples. Specifically, you will learn how to extract individual statistics, build a prediction with confidence, run a multiple regression with several predictors, and assess whether the model is statistically meaningful. No statistical software is required for any of these tasks.

Availability: LINEST is available in all Excel versions, including Microsoft 365, Excel 2024, 2021, 2019, 2016, and earlier. It also works in Google Sheets. In Excel 2019 and earlier, LINEST returns an array and requires Ctrl+Shift+Enter when used to return multiple statistics. In Excel 365, it spills automatically.

What Does LINEST Do?

LINEST fits a straight line through your data using ordinary least squares. For multiple predictors, it fits a hyperplane. The line minimises the sum of squared vertical distances between the actual data points and the fitted line. LINEST then returns a two-dimensional array of statistics: regression coefficients in row 1, standard errors in row 2, R² and standard error of the estimate in row 3, F-statistic and degrees of freedom in row 4, and the regression and residual sums of squares in row 5.

The key distinction from TREND is purpose. Specifically, each function serves a different analytical goal. TREND is for prediction — feed it new x-values and it returns predicted y-values. LINEST is for analysis — it returns the model parameters and their statistical properties. You can also build predictions from LINEST output, but the primary use is understanding the fit, which predictors matter, and whether results are statistically significant.

What Is the LINEST Syntax?

=LINEST(known_y, [known_x], [const], [stats])
Argument | Required? | What it does
known_y  | Required  | The dependent variable values (the outcome you are predicting). A single column of numbers.
known_x  | Optional  | The independent variable values (predictors). A single column for simple regression; multiple columns for multiple regression. If omitted, Excel uses {1, 2, 3, ...}.
const    | Optional  | TRUE (default) = fit the intercept normally. FALSE = force the intercept through zero (regression without a constant).
stats    | Optional  | TRUE = return the full 5-row statistics table. FALSE (default) = return only the coefficients in row 1. Always use TRUE for regression analysis.

What Does the LINEST Output Table Contain?

The full LINEST output is a 5-row by (k+1)-column array, where k is the number of predictor variables. For simple regression with one predictor, the output is 5 rows × 2 columns. For multiple regression with three predictors, it is 5 rows × 4 columns. The columns are ordered from the last predictor to the first, with the intercept always in the rightmost column.

Row   | Column 1 (last predictor / slope)         | Column 2 (intercept b)
Row 1 | Slope (m) — coefficient of x              | Intercept (b) — y when x = 0
Row 2 | Standard error of slope (SE_m)            | Standard error of intercept (SE_b)
Row 3 | R² — coefficient of determination         | Standard error of y estimate (SE_y)
Row 4 | F-statistic — overall model significance  | Degrees of freedom (df_residual)
Row 5 | SS_regression — explained variance        | SS_residual — unexplained variance
Always set stats = TRUE: Calling LINEST with stats = FALSE (the default) returns only row 1 — the coefficients. This is enough for making predictions, but it gives you no information about model quality. Set stats = TRUE for any serious analysis. Also use INDEX to extract individual statistics rather than working with the raw spilled array, especially when referencing LINEST results in other formulas.
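Outside Excel, every cell of this table can be reproduced with a few lines of arithmetic. The following Python sketch (with invented sample data, not any workbook's) computes all five rows for a simple regression, which makes the meaning of each cell concrete:

```python
import math

# Invented sample data (x = predictor, y = outcome)
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 8.1, 9.8]
n = len(xs)

mean_x = sum(xs) / n
mean_y = sum(ys) / n
sxx = sum((x - mean_x) ** 2 for x in xs)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

# Row 1: slope and intercept (ordinary least squares)
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Row 5: explained and unexplained sums of squares
ss_total = sum((y - mean_y) ** 2 for y in ys)
ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
ss_reg = ss_total - ss_res

# Row 4, col 2: residual degrees of freedom = n - k - 1 with k = 1
df_res = n - 2

# Row 3: R² and standard error of the y estimate
r2 = 1 - ss_res / ss_total
se_y = math.sqrt(ss_res / df_res)

# Row 2: standard errors of slope and intercept
se_slope = se_y / math.sqrt(sxx)
se_intercept = se_y * math.sqrt(sum(x * x for x in xs) / (n * sxx))

# Row 4, col 1: F-statistic for overall model significance
f_stat = (ss_reg / 1) / (ss_res / df_res)
```

For a single predictor, the F-statistic equals the square of the slope's t-statistic, which is a handy consistency check between rows 2 and 4.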

Examples 1–4: Core LINEST Operations

Example 1: Simple regression — slope, intercept, and R² from one predictor

The starting point for any LINEST analysis is simple regression: one predictor, one outcome. The output gives you the slope (how much y changes per unit of x), the intercept (the y value when x equals zero), and R² (what fraction of the variation in y the model explains). These three numbers summarise the entire linear relationship.

Ad spend (A) | Revenue (B) | Slope (m) | Intercept (b) | R²
10–90k       | Historical  | 4.82      | 12,450        | 0.934
A2:A11 = ad spend (predictor x). B2:B11 = revenue (outcome y). Enter LINEST with stats = TRUE. Excel 365: enter it in a free cell; the result spills 5 rows × 2 columns automatically. Excel 2019: select a 5×2 range, type the formula, then press Ctrl+Shift+Enter.

=LINEST(B2:B11, A2:A11, TRUE, TRUE) → Spills a 5×2 array. Row 1: {slope, intercept}.
Row 1, col 1 = 4.82 (slope — each £1k extra ad spend adds £4,820 revenue)
Row 1, col 2 = 12,450 (intercept — baseline revenue with no ad spend)

Extract individual statistics using INDEX (cleaner than referencing the raw spill):
Slope: =INDEX(LINEST(B2:B11,A2:A11,TRUE,TRUE), 1, 1) → 4.82 (slope m)
Intercept: =INDEX(LINEST(B2:B11,A2:A11,TRUE,TRUE), 1, 2) → 12,450 (intercept b)
R²: =INDEX(LINEST(B2:B11,A2:A11,TRUE,TRUE), 3, 1) → 0.934 (93.4% of revenue variation explained by ad spend)
Example 2: Making predictions — using LINEST coefficients to forecast y

Once you have the slope and intercept from LINEST, prediction is a matter of applying y = mx + b. Storing the coefficients in named cells avoids recalculating LINEST every time. The result matches TREND exactly — both use the same underlying least-squares fit. However, LINEST additionally gives you the standard error of the estimate, which allows you to build a prediction interval around the forecast.

Store LINEST results in fixed cells (or a named range):
D2 = slope: =INDEX(LINEST(B2:B11,A2:A11,TRUE,TRUE),1,1)
D3 = intercept: =INDEX(LINEST(B2:B11,A2:A11,TRUE,TRUE),1,2)
D4 = SE_y: =INDEX(LINEST(B2:B11,A2:A11,TRUE,TRUE),3,2)

Predict revenue for a new ad spend of £50k (E2 = 50,000): y = slope × x + intercept
=$D$2 * E2 + $D$3 → 4.82 × 50,000 + 12,450 = 253,450 (predicted revenue for £50k spend)

Confirm with TREND, which uses the same least-squares fit as LINEST:
=TREND(B2:B11, A2:A11, E2) → 253,450 (identical to the LINEST-based prediction)

Approximate 95% prediction interval — point estimate ± 2 × SE_y. SE_y is the standard error of the y estimate from row 3, col 2 of LINEST. This is a simplified interval; an exact prediction interval also accounts for how far the new x is from the mean of the historical x-values.
Lower: =($D$2*E2+$D$3) - 2*$D$4
Upper: =($D$2*E2+$D$3) + 2*$D$4
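The same prediction and rough interval can be sketched in Python. The numbers below are invented illustration data, not the article's £ figures:

```python
import math

xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 8.1, 9.8]
n = len(xs)

mean_x = sum(xs) / n
mean_y = sum(ys) / n
sxx = sum((x - mean_x) ** 2 for x in xs)
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
intercept = mean_y - slope * mean_x

# SE_y (LINEST row 3, col 2): residual standard error
ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
se_y = math.sqrt(ss_res / (n - 2))

# Point prediction for a new x, then the rough +/- 2 * SE_y band
x_new = 6
y_hat = slope * x_new + intercept
lower, upper = y_hat - 2 * se_y, y_hat + 2 * se_y
```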
Example 3: Model significance — F-statistic and p-value from LINEST

R² tells you how much variance the model explains, and the F-statistic tests whether that is statistically meaningful. Specifically, it tests whether the explanatory power could have arisen by chance. A p-value below 0.05 means the model is significant at the 5% level. Excel’s FDIST function converts the F-statistic and degrees of freedom from LINEST into a p-value directly.

LINEST output stored as a named range called "LinStats" (5 rows × 2 cols), or use INDEX inline as shown below.

F-statistic from LINEST (row 4, col 1):
=INDEX(LINEST(B2:B11,A2:A11,TRUE,TRUE), 4, 1) → 127.4 (large F = model explains significantly more than chance)

Residual degrees of freedom (row 4, col 2). df_regression = number of predictors (k); df_residual = n − k − 1:
=INDEX(LINEST(B2:B11,A2:A11,TRUE,TRUE), 4, 2) → 8 (df_residual = 10 observations − 1 predictor − 1 = 8)

P-value of the F-statistic — is the model significant? FDIST(F, df_regression, df_residual) returns the p-value (F.DIST.RT is the modern equivalent):
=FDIST(INDEX(LINEST(B2:B11,A2:A11,TRUE,TRUE),4,1), 1, INDEX(LINEST(B2:B11,A2:A11,TRUE,TRUE),4,2)) → 0.0000023 (p < 0.05: model is statistically significant)

Interpret: p < 0.05 → model is significant (reject the null hypothesis). p ≥ 0.05 → model is not significant (fail to reject the null).
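As a cross-check on what the F-statistic measures, here is a Python sketch computing F from the sums of squares, using invented data and a 5% critical value taken from a standard F table:

```python
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 8.1, 9.8]
n, k = len(xs), 1                     # k = number of predictors

mean_x = sum(xs) / n
mean_y = sum(ys) / n
sxx = sum((x - mean_x) ** 2 for x in xs)
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
intercept = mean_y - slope * mean_x

ss_total = sum((y - mean_y) ** 2 for y in ys)
ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
ss_reg = ss_total - ss_res
df_res = n - k - 1

# F = explained variance per predictor / unexplained variance per residual df
f_stat = (ss_reg / k) / (ss_res / df_res)

# Decision rule: compare with the 5% critical value from an F table.
# For (1, 3) degrees of freedom that critical value is about 10.13.
significant = f_stat > 10.13
```

Computing the exact p-value requires the F distribution's tail function, which is what Excel's FDIST supplies; the critical-value comparison above reaches the same accept/reject decision at the 5% level.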
Example 4: Standard errors — t-statistic and significance of individual coefficients

The F-statistic tests overall model significance. Standard errors from row 2, however, test whether each individual coefficient is significant. Dividing a coefficient by its standard error gives the t-statistic. Generally, an absolute t-statistic above 2 indicates significance. The TDIST function converts the t-statistic and degrees of freedom to a precise p-value.

Store the LINEST output in a 5×2 range D2:E6:
D2 = slope (m), E2 = intercept (b)
D3 = SE of slope, E3 = SE of intercept
E5 = df_residual (row 4, col 2)

T-statistic for the slope — is ad spend a significant predictor? t = coefficient / standard error:
=D2 / D3 → 4.82 / 0.43 = 11.21 (t-statistic for slope)

P-value of the slope t-statistic. TDIST(|t|, df_residual, tails) — use 2 tails for a two-sided test (T.DIST.2T is the modern equivalent):
=TDIST(ABS(D2/D3), E5, 2) → 0.0000018 (p < 0.05: ad spend is a significant predictor of revenue)

T-statistic for the intercept — does baseline revenue differ from zero?
=E2 / E3 → t-statistic for intercept

95% confidence interval for the slope: slope ± TINV(0.05, df_residual) × SE_slope. If the interval excludes zero, the predictor is significant at the 5% level.
=D2 - TINV(0.05, E5) * D3 ← lower bound of the 95% CI for the slope
=D2 + TINV(0.05, E5) * D3 ← upper bound of the 95% CI for the slope
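The t-statistic and confidence-interval arithmetic translates directly. A Python sketch with invented data; the critical value 3.182 is the two-sided 5% t for 3 degrees of freedom (what TINV(0.05, 3) returns):

```python
import math

xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 8.1, 9.8]
n = len(xs)

mean_x = sum(xs) / n
mean_y = sum(ys) / n
sxx = sum((x - mean_x) ** 2 for x in xs)
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
intercept = mean_y - slope * mean_x

ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
df_res = n - 2
se_y = math.sqrt(ss_res / df_res)
se_slope = se_y / math.sqrt(sxx)      # LINEST row 2, col 1

# t-statistic: coefficient divided by its standard error
t_slope = slope / se_slope

# 95% CI: slope +/- t_crit * SE_slope, t_crit from a standard t table
t_crit = 3.182
ci = (slope - t_crit * se_slope, slope + t_crit * se_slope)
zero_excluded = not (ci[0] <= 0 <= ci[1])  # significant at 5% if True
```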

Examples 5–8: Applied Regression Analysis

Multiple Regression, Polynomial Curves, and Residuals

Example 5: Multiple regression — two or more predictors in one model

Multiple regression extends the single-predictor model to include several independent variables simultaneously. For example, revenue might depend on both ad spend and headcount. LINEST handles this by passing a multi-column range as known_x. The output grows by one column per additional predictor. Importantly, coefficients appear in reverse predictor order — the rightmost column is always the intercept.

Ad spend (A) | Headcount (B) | Revenue (C) | m_headcount | m_adspend | Intercept
Historical   | Historical    | Outcome     | 3,210       | 4.15      | 8,720
A2:A11 = ad spend. B2:B11 = headcount. C2:C11 = revenue (outcome). Pass both predictors as a two-column range (A2:B11) to known_x.

Multiple regression — Excel 365 spills 5 rows × 3 columns:
=LINEST(C2:C11, A2:B11, TRUE, TRUE)

Output column order: [m_headcount, m_adspend, intercept] — columns are in REVERSE order of the predictor columns in known_x.
Row 1, col 1 = coefficient for the LAST predictor (headcount = col 2 of A2:B11)
Row 1, col 2 = coefficient for the FIRST predictor (ad spend = col 1 of A2:B11)
Row 1, col 3 = intercept

Extract the ad spend coefficient (first predictor → column 2 of LINEST row 1 because of the reversal):
=INDEX(LINEST(C2:C11,A2:B11,TRUE,TRUE), 1, 2) → 4.15 (each £1k extra ad spend adds £4,150 revenue, holding headcount fixed)

Predict revenue for ad spend = 50, headcount = 12: y = m_hc × headcount + m_ad × adspend + intercept
=INDEX(LINEST(C2:C11,A2:B11,TRUE,TRUE),1,1)*12 + INDEX(LINEST(C2:C11,A2:B11,TRUE,TRUE),1,2)*50 + INDEX(LINEST(C2:C11,A2:B11,TRUE,TRUE),1,3) → Predicted revenue for those inputs
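Under the hood, LINEST solves the normal equations (XᵀX)β = Xᵀy. A Python sketch of that computation for two predictors, using invented data generated exactly from y = 2 + 3·x1 + 0.5·x2 so the recovered coefficients are known in advance:

```python
# Two predictors plus intercept via the normal equations, the same
# least-squares solution LINEST computes.
x1 = [1, 2, 3, 4]
x2 = [5, 3, 8, 1]
y  = [7.5, 9.5, 15.0, 14.5]   # exactly 2 + 3*x1 + 0.5*x2

rows = [[1.0, a, b] for a, b in zip(x1, x2)]   # leading 1 = intercept column
p = len(rows[0])

# Build X'X and X'y
A = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
v = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]

# Solve by Gaussian elimination (fine for tiny systems like this)
for i in range(p):
    for j in range(i + 1, p):
        f = A[j][i] / A[i][i]
        for k in range(p):
            A[j][k] -= f * A[i][k]
        v[j] -= f * v[i]
beta = [0.0] * p
for i in range(p - 1, -1, -1):
    beta[i] = (v[i] - sum(A[i][j] * beta[j] for j in range(i + 1, p))) / A[i][i]

# beta = [intercept, coeff_x1, coeff_x2]; LINEST reports the same values
# in reverse predictor order, with the intercept in the rightmost column.
```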
Example 6: Polynomial regression — curved relationships using LINEST

LINEST fits linear models — but the predictors do not have to be raw x-values. By passing x² (x squared) as a second predictor alongside x, you fit a quadratic (parabolic) curve. This is polynomial regression implemented through LINEST’s multiple regression framework. The curve fits data that bends rather than following a straight line, making it useful for diminishing returns, U-shaped cost curves, and accelerating growth.

A2:A15 = x values. B2:B15 = y values (curve-shaped data). Polynomial regression fits y = a·x² + b·x + c by passing both x and x² as predictors.

Excel 365 — build the two-column predictor array inline with HSTACK; the result spills 5 rows × 3 columns:
=LINEST(B2:B15, HSTACK(A2:A15, A2:A15^2), TRUE, TRUE) → Returns [coeff_x², coeff_x, intercept] in row 1

Excel 2019 — HSTACK is unavailable, so arrange the data with the predictors adjacent: x in column A, x² in column B (=A2^2 copied down to B15), y in column C. Select a 5×3 range and press Ctrl+Shift+Enter:
{=LINEST(C2:C15, A2:B15, TRUE, TRUE)}
Row 1, col 1 = a (x² coefficient)
Row 1, col 2 = b (x coefficient)
Row 1, col 3 = c (intercept)

Predict y for a new x value in E2: y = a·x² + b·x + c
=INDEX(results,1,1)*E2^2 + INDEX(results,1,2)*E2 + INDEX(results,1,3) → Quadratic prediction for x = E2
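The same normal-equations approach fits a quadratic once x² joins x as a predictor. A Python sketch with invented data lying exactly on y = x² − 2x + 3:

```python
# Quadratic fit y = a*x^2 + b*x + c by least squares on columns [1, x, x^2]
xs = [0, 1, 2, 3, 4]
ys = [x * x - 2 * x + 3 for x in xs]   # 3, 2, 3, 6, 11

rows = [[1.0, x, x * x] for x in xs]
p = 3
A = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
v = [sum(r[i] * yi for r, yi in zip(rows, ys)) for i in range(p)]

# Gaussian elimination on the 3x3 normal equations
for i in range(p):
    for j in range(i + 1, p):
        f = A[j][i] / A[i][i]
        for k in range(p):
            A[j][k] -= f * A[i][k]
        v[j] -= f * v[i]
beta = [0.0] * p
for i in range(p - 1, -1, -1):
    beta[i] = (v[i] - sum(A[i][j] * beta[j] for j in range(i + 1, p))) / A[i][i]

c, b, a = beta                         # intercept, x coefficient, x^2 coefficient

# Predict y for a new x, mirroring the INDEX-based Excel formula
x_new = 5
y_hat = a * x_new ** 2 + b * x_new + c
```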
Example 7: Residual analysis — checking model fit with leftover errors

Residuals are the differences between the actual y-values and the values the model predicted. Specifically, examining residuals is the primary diagnostic tool in regression. Well-behaved residuals are randomly scattered around zero with no pattern. Systematic curves in residuals indicate that a linear model is not appropriate. Furthermore, a single large residual can signal an outlier or a data entry error.

A2:A11 = x (ad spend). B2:B11 = y (revenue). D2:D3 = slope and intercept stored from LINEST (via INDEX).

Fitted values — what the model predicts for each historical x. E2 = fitted value for the first observation; copy E2 down to E11:
=$D$2 * A2 + $D$3 → Predicted revenue for each historical ad spend value

Residuals — actual minus fitted (column F). F2 = actual − fitted for the first observation; copy down to F11:
=B2 - E2 → Positive = actual was above the prediction. Negative = below.

Standardised residuals — divide by SE_y to put residuals on a z-score scale. D4 = SE_y = INDEX(LINEST(...), 3, 2). A standardised residual above 2 or below −2 flags a potential outlier:
=F2 / $D$4 → Standardised residual for row 2

SS_residual cross-check — the sum of squared residuals should equal LINEST row 5, col 2:
=SUMPRODUCT(F2:F11^2) → Should equal INDEX(LINEST(B2:B11,A2:A11,TRUE,TRUE),5,2)
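The residual workflow above maps one-to-one onto a few list comprehensions. A Python sketch with invented data:

```python
import math

xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 8.1, 9.8]
n = len(xs)

mean_x = sum(xs) / n
mean_y = sum(ys) / n
sxx = sum((x - mean_x) ** 2 for x in xs)
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
intercept = mean_y - slope * mean_x

fitted = [slope * x + intercept for x in xs]          # column E
residuals = [y - f for y, f in zip(ys, fitted)]       # column F
ss_res = sum(r * r for r in residuals)                # SUMPRODUCT(F^2)

# Standardised residuals: z-scores against the residual standard error
se_y = math.sqrt(ss_res / (n - 2))
standardised = [r / se_y for r in residuals]
outliers = [i for i, z in enumerate(standardised) if abs(z) > 2]
```

A useful property to remember: when the model includes an intercept, the OLS residuals always sum to zero, which is a quick sanity check on the fitted-values column.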
Example 8: Full regression report — extracting every LINEST statistic in one dashboard

A self-updating regression report is one of the most practical uses of LINEST. Store the LINEST call in a named range, then extract each statistic with INDEX. This builds a self-updating dashboard. Every time the data changes, all statistics refresh simultaneously without any manual intervention.

Approach: calculate LINEST once and store it in a named range called "LS". In Name Manager: name = LS, Refers to: =LINEST(Sheet1!B2:B11,Sheet1!A2:A11,TRUE,TRUE). Then extract each statistic with INDEX on the named range — far more readable than embedding the full LINEST call in every formula.

Row 1 — coefficients:
Slope m: =INDEX(LS, 1, 1)
Intercept b: =INDEX(LS, 1, 2)

Row 2 — standard errors:
SE of slope: =INDEX(LS, 2, 1)
SE of intercept: =INDEX(LS, 2, 2)

Row 3 — model quality:
R²: =INDEX(LS, 3, 1)
SE of y: =INDEX(LS, 3, 2)

Row 4 — significance:
F-statistic: =INDEX(LS, 4, 1)
df residual: =INDEX(LS, 4, 2)
p-value: =FDIST(INDEX(LS,4,1), 1, INDEX(LS,4,2))

Row 5 — variance breakdown:
SS_regression: =INDEX(LS, 5, 1)
SS_residual: =INDEX(LS, 5, 2)

Adjusted R² — penalises the model for extra predictors. n = COUNT(B2:B11), k = number of predictors (1 for simple regression). Adjusted R² = 1 − (1 − R²) × (n − 1) / (n − k − 1):
=1 - (1 - INDEX(LS,3,1)) * (COUNT(B2:B11)-1) / (COUNT(B2:B11)-1-1)
Store LINEST in a Name Manager named range (e.g. LS) and reference it with INDEX(LS, row, col) throughout the dashboard. This avoids recalculating LINEST dozens of times and makes the formulas readable. Every statistic updates the moment you change the source data.
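The Adjusted R² formula from the dashboard is a one-liner anywhere. A Python sketch using the article's illustrative R² of 0.934 with n = 10 observations and k = 1 predictor:

```python
# Adjusted R^2 penalises extra predictors, making it a more honest
# comparison metric than raw R^2 across models of different sizes.
n = 10          # observations (COUNT of the y range)
k = 1           # number of predictors
r2 = 0.934      # raw R^2 from LINEST row 3, col 1

adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)
```

Because (n − 1)/(n − k − 1) is always at least 1, Adjusted R² never exceeds R², and the gap widens as predictors are added without explanatory payoff.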

Common Issues and How to Fix Them

LINEST only returns one number instead of the full table

Two things cause this. First, stats may be FALSE or omitted — set stats = TRUE to get all five rows. Second, in Excel 2019 and earlier, LINEST must be entered as an array formula: select the full output range (5 rows × (k+1) columns), type the formula, then press Ctrl+Shift+Enter. Without this, only the top-left value is returned. In Excel 365, LINEST spills automatically and no special entry is needed.

The column order in LINEST output is confusing

LINEST returns coefficients in reverse order of the predictor columns. For simple regression, column 1 is the slope and column 2 is the intercept — straightforward. For multiple regression with three predictors passed as columns A, B, C, the LINEST output columns are the coefficient for C, then B, then A, then the intercept. Always use INDEX(LINEST(...), 1, col) to extract specific coefficients rather than relying on visual column position in the spilled array.

R² is high but the model still gives poor predictions

High R² does not guarantee useful predictions. Several issues can inflate R² without producing a reliable model. Consequently, R² alone is not a reliable quality indicator. Overfitting is the most common issue. Adding more predictors always increases R², even if those predictors are irrelevant. Instead, use Adjusted R² (Example 8), which penalises extra predictors. Also check the residuals — if they show a systematic pattern, the linear model is misspecified and predictions will be unreliable regardless of R².

LINEST assumes linearity, independence, and constant variance: The ordinary least-squares method behind LINEST produces unbiased estimates only when these three assumptions hold. Specifically, the relationship between x and y must be linear, observations must be independent of each other, and the spread of residuals must be roughly constant across all fitted values (homoscedasticity). Violating any assumption degrades the reliability of p-values and confidence intervals even when R² appears acceptable. Always plot the residuals as a diagnostic step.

Frequently Asked Questions

  • What is the difference between LINEST and TREND?
    TREND returns predicted y-values for a set of x-values. It is a prediction tool. LINEST returns the regression coefficients and statistical diagnostics — slope, intercept, R², standard errors, F-statistic, and sums of squares. It is an analysis tool. Both use identical least-squares fitting, so a prediction from LINEST coefficients always matches TREND exactly. Use TREND when you only need predictions. Use LINEST when you need to understand and validate the model.
  • How do I use LINEST in Excel 2019 and earlier?
    In Excel 2019 and earlier, LINEST is an array function. To get the full 5-row output, select a 5×2 range (or 5×(k+1) for multiple regression), type the LINEST formula, and confirm with Ctrl+Shift+Enter instead of Enter. Curly braces appear in the formula bar to confirm array entry. If you only need one statistic — for example, just R² — you can wrap LINEST in INDEX and enter it in a single cell normally: =INDEX(LINEST(B2:B11,A2:A11,TRUE,TRUE),3,1) does not require Ctrl+Shift+Enter.
  • What is R² in LINEST and what is a good value?
    R² (the coefficient of determination) measures what proportion of the variation in y is explained by the model. It ranges from 0 to 1. An R² of 0.90 means 90% of the variation in y is explained. What counts as a good value depends entirely on the field. In physics, R² should be near 1.0. In economics or social science, 0.50 to 0.70 may be perfectly acceptable. A high R² does not mean the model is well-specified — always check residuals and use Adjusted R² when comparing models with different numbers of predictors.

More Questions About LINEST

  • How do I extract the slope and intercept from LINEST?
    Use INDEX to extract specific cells from the LINEST array. For simple regression, the slope is in row 1, column 1: =INDEX(LINEST(B2:B11,A2:A11,TRUE,TRUE),1,1). The intercept is in row 1, column 2: =INDEX(LINEST(B2:B11,A2:A11,TRUE,TRUE),1,2). For multiple regression, the column order reverses relative to the predictor order — the last predictor’s coefficient is in column 1, the first predictor’s coefficient is in column 2, and the intercept is always in the last column. Store the LINEST output in a named range for cleaner formulas.
  • What is the const argument in LINEST?
    The const argument controls whether the model includes a free intercept. When const = TRUE (the default), LINEST fits both the slope and intercept freely to best match the data. When const = FALSE, the intercept is forced to zero — the regression line is constrained to pass through the origin. Use FALSE only for theoretically justified models where a zero x-value must correspond to a zero y-value. For most business and scientific applications, leave const as TRUE. Forcing a zero intercept when it is not theoretically justified biases the slope estimate.
  • Can LINEST handle non-linear relationships?
    LINEST itself only fits linear models. However, you can fit non-linear curves by transforming the predictors before passing them to LINEST. For a quadratic curve, pass both x and x² as predictors (polynomial regression, shown in Example 6). For an exponential curve, take the natural log of y and fit a linear model to ln(y) vs x. For a power curve, take logs of both x and y. These transformations make the relationship linear in the transformed variables, which LINEST can then fit. The resulting model is still estimated by ordinary least squares, just applied to the transformed data.
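The log-transform trick in the last answer can be verified numerically. A Python sketch with invented data generated exactly from y = 2·e^(0.5x), so the fit should recover a = 2 and b = 0.5:

```python
import math

xs = [0.0, 1.0, 2.0, 3.0]
ys = [2 * math.exp(0.5 * x) for x in xs]   # exact exponential data

# Fit ln(y) = ln(a) + b*x with ordinary least squares
ln_ys = [math.log(y) for y in ys]
n = len(xs)
mean_x = sum(xs) / n
mean_ly = sum(ln_ys) / n
sxx = sum((x - mean_x) ** 2 for x in xs)
b = sum((x - mean_x) * (ly - mean_ly) for x, ly in zip(xs, ln_ys)) / sxx
ln_a = mean_ly - b * mean_x
a = math.exp(ln_a)
# Recovered curve: y = a * e^(b*x)
```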