
Stop Guessing AI Metrics: Regression Explained with MSE, RMSE, MAE, R² & MAPE


What is regression?

A regression task in machine learning is a type of supervised learning in which a model is trained on data labeled with a continuous value and learns to predict that value from one or more input features.

The key difference between regression and classification is that regression predicts continuous values (for example, house price, temperature, number of sales), while classification predicts categorical labels (for example, yes/no, red/blue/green).

In other words, a regression task predicts a number, while a classification task is like choosing an answer from multiple options in a test.


Example

Let’s imagine that we run a classifieds platform similar to Avito or CIAN. We want to suggest to users, directly in the interface, the best price at which they should list their apartment, based on many factors, such as:

  • Apartment location
  • Size
  • Floor
  • Renovation quality
  • Year the building was constructed

As a result, we show the user a recommended price in euros.

We predicted the prices for 10 apartments, and a month later we learned the actual prices at which they were sold.

Next, we will perform some simple calculations with these results:

  • Subtract the actual price from the predicted price (first column)
  • Square this difference (second column)
  • Take the square root of this value (third column)

This gives us the following results for our example:
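The three calculation steps above can be sketched in Python. The prices below are hypothetical stand-ins, since the article's 10-row table is not reproduced here:

```python
# Hypothetical predicted vs. actual sale prices (EUR) for three apartments.
predicted = [310_000, 245_000, 180_000]
actual = [300_000, 260_000, 175_000]

# First column: predicted minus actual
diff = [p - a for p, a in zip(predicted, actual)]

# Second column: the difference squared
sq_diff = [d ** 2 for d in diff]

# Third column: square root of the squared difference, i.e. the absolute error
abs_err = [s ** 0.5 for s in sq_diff]
```

Every metric below is just a different way of averaging one of these columns.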


MSE

If we take the second column from the green table above, add up all the values in it, and then divide by the number of values (i.e., take the average), we get MSE, or Mean Squared Error. In our case:

MSE = 3,353,809,295

That’s a big number! Because of its scale, it’s hard to interpret from a business perspective. MSE is more often used during model development, where it’s important to penalize large errors more heavily than small ones, since the error grows quadratically. This makes MSE sensitive to outliers. MSE is useful when large errors are unacceptable and should strongly affect the model.
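In code, MSE is one line. The prices here are hypothetical examples, not the values behind the 3,353,809,295 figure above:

```python
# Hypothetical predicted vs. actual sale prices (EUR).
predicted = [310_000, 245_000, 180_000]
actual = [300_000, 260_000, 175_000]

# Mean of the squared differences (the second column of the green table)
mse = sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)
```

Note how the single 15,000-euro miss contributes 225 million to the sum, far more than the two smaller misses combined; that is the quadratic penalty at work.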


RMSE

RMSE, or Root Mean Squared Error, is the younger brother of MSE. To calculate it, you simply take the square root of MSE.

In our case, it equals 57,912.

RMSE also penalizes large errors, but unlike MSE, the scale of the error is the same as the original data, which makes it easier to interpret. This makes RMSE a good choice for many practical tasks where interpretability matters.
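A minimal sketch, again on hypothetical prices, showing that RMSE is just one extra operation on top of MSE:

```python
# Hypothetical predicted vs. actual sale prices (EUR).
predicted = [310_000, 245_000, 180_000]
actual = [300_000, 260_000, 175_000]

mse = sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)
rmse = mse ** 0.5  # back on the same scale as the prices themselves
```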


MAE

MAE, or Mean Absolute Error, is calculated using the third column of the green table above. You take the absolute differences between the predicted and actual prices (the square roots of the squared differences), sum them, and divide by the number of observations. In simple terms, you take the average of the third column.

In our example, MAE = 49,243.

MAE is less sensitive to outliers compared to MSE and RMSE. This makes it a preferred option when outliers exist in the data but should not have a strong impact on the overall performance of the model.
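MAE on the same hypothetical prices. Because the errors enter linearly rather than squared, the 15,000-euro miss pulls the average up far less than it did for MSE:

```python
# Hypothetical predicted vs. actual sale prices (EUR).
predicted = [310_000, 245_000, 180_000]
actual = [300_000, 260_000, 175_000]

# Mean of the absolute differences (the third column of the green table)
mae = sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)
```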


Let’s make our green table a bit more complex

To understand how R-squared and MAPE are calculated, we need to add two more columns to our green table:

  • Subtract the mean predicted price from the predicted price and square the result (the fourth green column). P.S. Don’t ask why this is needed or what the practical meaning is, just do it 🙂

  • Divide the third green column by the predicted apartment price from the yellow table. In other words, divide the absolute difference between the predicted and actual price by the predicted apartment price (the fifth green column).
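The two new columns can be sketched like this, again on hypothetical prices (note that, following the article's recipe, both columns are built from the *predicted* values):

```python
# Hypothetical predicted vs. actual sale prices (EUR).
predicted = [310_000, 245_000, 180_000]
actual = [300_000, 260_000, 175_000]

mean_pred = sum(predicted) / len(predicted)

# Fourth column: squared deviation of each prediction from the mean prediction
col4 = [(p - mean_pred) ** 2 for p in predicted]

# Fifth column: absolute error divided by the predicted price
col5 = [abs(p - a) / p for p, a in zip(predicted, actual)]
```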


Coefficient of Determination (R-squared)

To calculate it, we subtract from 1 the ratio of the sum of the second green column to the sum of the fourth green column:

R-squared = 1 − (sum of column 2 / sum of column 4)

In our case, R-squared = 85.2%.

R-squared measures how much of the variability of the dependent variable is explained by the independent variables in the model. It’s a good way to evaluate how well the model fits the data: the closer the value is to 1, the better the model explains the data. R-squared is best suited for comparing models trained on the same dataset.
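The same calculation in Python, on the hypothetical prices used above. This follows the article's recipe exactly; be aware that the textbook definition of R-squared puts the variance of the *actual* values (around the mean of the actuals) in the denominator instead:

```python
# Hypothetical predicted vs. actual sale prices (EUR).
predicted = [310_000, 245_000, 180_000]
actual = [300_000, 260_000, 175_000]

mean_pred = sum(predicted) / len(predicted)

ss_col2 = sum((p - a) ** 2 for p, a in zip(predicted, actual))  # second column
ss_col4 = sum((p - mean_pred) ** 2 for p in predicted)          # fourth column

r_squared = 1 - ss_col2 / ss_col4
```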


MAPE

Mean Absolute Percentage Error (MAPE) is simply the average of the fifth green column.

In our case, MAPE = 14.2%.

MAPE measures how far predictions deviate from actual values in percentage terms and is a good choice when you need an easily interpretable error expressed as a percentage. However, MAPE can be unreliable when the data contains zero or very small values.
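MAPE on the same hypothetical prices. As with R-squared, this mirrors the article's recipe, which divides by the predicted price; the more common textbook definition divides by the actual price instead:

```python
# Hypothetical predicted vs. actual sale prices (EUR).
predicted = [310_000, 245_000, 180_000]
actual = [300_000, 260_000, 175_000]

# Average of the fifth column, expressed as a percentage
mape = sum(abs(p - a) / p for p, a in zip(predicted, actual)) / len(actual) * 100
```

Also notice the zero-division hazard the text warns about: if any price in the denominator is zero, this calculation blows up.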

Conclusion

Congratulations! You’ve learned about the core metrics used in regression problems.

Follow me — check my profile for links!

