# Metrics & Overfitting

Our machine learning workflow evaluates model performance with a fixed set of metrics for both regression and classification tasks. This document describes those metrics, how they are calculated, and how they are used to detect overfitting.

<figure><img src="/files/BnE31iAH7sDVdkl4VwNe" alt=""><figcaption></figcaption></figure>

The `_calculate_metrics` method computes these performance metrics. It handles both regression and classification targets, including multi-output scenarios.

***

## <mark style="color:blue;">Metrics Calculation Process</mark>

1. **Data Preparation**:
   * Ensures all inputs (`y_true`, `y_pred`, `y_train`, `y_train_pred`) are 2D numpy arrays.
   * Adjusts prediction arrays to match the shape of true values.
2. **Target-specific Metrics**:
   * Iterates through each target, calculating metrics based on the target type (categorical or numeric).
3. **Overall Metrics**:
   * Computes average metrics across all targets.
4. **Overfitting Detection**:
   * Calculates metrics to detect and quantify potential overfitting.
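
The data preparation step can be illustrated with a minimal sketch. The helper names `to_2d` and `align_predictions` are hypothetical and do not come from our codebase, and the use of `reshape` to align shapes is an assumption about how the adjustment might work, not a description of `_calculate_metrics` itself.

```python
import numpy as np


def to_2d(values):
    """Return values as a 2D array of shape (n_samples, n_targets)."""
    arr = np.asarray(values)
    if arr.ndim == 1:
        # A single target becomes one column.
        arr = arr.reshape(-1, 1)
    return arr


def align_predictions(y_true, y_pred):
    """Hypothetical helper: make y_pred match the shape of y_true."""
    y_true = to_2d(y_true)
    y_pred = to_2d(y_pred)
    if y_pred.shape != y_true.shape:
        # Assumption: same number of elements, different layout.
        # reshape raises ValueError if the sizes do not match.
        y_pred = y_pred.reshape(y_true.shape)
    return y_true, y_pred


y_true, y_pred = align_predictions([1.0, 2.0, 3.0], [[1.1, 1.9, 3.2]])
print(y_true.shape, y_pred.shape)
```

With every array in `(n_samples, n_targets)` form, each target can be read as one column, for example `y_true[:, i]`, when iterating per target.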

***

## <mark style="color:blue;">Classification Metrics</mark>

For categorical targets, the following metrics are calculated:

1. **Accuracy**:
   * Ratio of correct predictions to total predictions.
   * Calculated using `accuracy_score` from scikit-learn.
2. **Precision**:
   * Ratio of true positive predictions to all positive predictions (true positives plus false positives).
   * Calculated using `precision_score` with weighted averaging, which weights each class by its number of true instances.
3. **Recall**:
   * Ratio of true positive predictions to all actual positives (true positives plus false negatives).
   * Calculated using `recall_score` with weighted averaging.
4. **F1 Score**:
   * Harmonic mean of precision and recall.
   * Calculated using `f1_score` with weighted averaging.
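
As a rough illustration of the four metrics above, the following sketch calls the scikit-learn functions directly for a single categorical target. The wrapper name `classification_metrics`, the dictionary keys, and the `zero_division=0` setting are assumptions made for this example; they are not taken from `_calculate_metrics`.

```python
from sklearn.metrics import (
    accuracy_score,
    f1_score,
    precision_score,
    recall_score,
)


def classification_metrics(y_true, y_pred):
    """Hypothetical wrapper computing the four metrics for one target."""
    # zero_division=0 is an assumption: it avoids warnings when a class
    # is never predicted.
    kwargs = {"average": "weighted", "zero_division": 0}
    return {
        "accuracy": accuracy_score(y_true, y_pred),
        "precision": precision_score(y_true, y_pred, **kwargs),
        "recall": recall_score(y_true, y_pred, **kwargs),
        "f1": f1_score(y_true, y_pred, **kwargs),
    }


y_true = ["cat", "dog", "dog", "cat"]
y_pred = ["cat", "dog", "cat", "cat"]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

In this toy example three of four predictions are correct, so accuracy is 0.75. Weighted recall always equals accuracy, because weighting per-class recall by class support counts every correct prediction exactly once.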

***

## <mark style="color:blue;">Regression Metrics</mark>

For numeric targets, the following metrics are calculated:

1. **Mean Absolute Error (MAE)**:
   * Average absolute difference between predicted and actual values.
   * Calculated using `mean_absolute_error` from scikit-learn.
2. **Mean Squared Error (MSE)**:
   * Average squared difference between predicted and actual values.
   * Calculated using `mean_squared_error` from scikit-learn.
3. **R-squared (R2) Score**:
   * Proportion of variance in the dependent variable predictable from the independent variable(s).
   * Calculated using `r2_score` from scikit-learn.
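
The regression metrics can be sketched the same way for a single numeric target. The wrapper name `regression_metrics` and the dictionary keys are hypothetical, chosen only for this example.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score


def regression_metrics(y_true, y_pred):
    """Hypothetical wrapper computing MAE, MSE and R2 for one target."""
    return {
        "mae": float(mean_absolute_error(y_true, y_pred)),
        "mse": float(mean_squared_error(y_true, y_pred)),
        "r2": float(r2_score(y_true, y_pred)),
    }


y_true = np.array([3.0, 5.0, 7.0, 9.0])
y_pred = np.array([2.5, 5.0, 7.5, 9.0])
metrics = regression_metrics(y_true, y_pred)
print(metrics)
```

Here two predictions are off by 0.5 and two are exact, giving an MAE of 0.25 and an MSE of 0.125. The R2 score is 1 minus the ratio of the residual sum of squares (0.5) to the total sum of squares (20), which is 0.975.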

***

## <mark style="color:blue;">Overfitting Detection</mark>

To detect and quantify overfitting, we calculate:

1. **Performance Difference**:
   * Difference between average train performance and average validation performance.
2. **Performance Ratio**:
   * Ratio of average train performance to average validation performance.
3. **Overfitting Flag**:
   * Set to `True` if performance difference > 0.15 or performance ratio > 1.3.
4. **Overfitting Score**:
   * Maximum of performance difference and (performance ratio - 1).
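
The four quantities above can be combined in a short sketch. The function name `overfitting_summary`, the dictionary keys, and the handling of a zero validation score are assumptions for illustration only. The sketch also assumes that "performance" is a higher-is-better score such as accuracy or R2.

```python
def overfitting_summary(train_perf, val_perf,
                        diff_threshold=0.15, ratio_threshold=1.3):
    """Hypothetical helper applying the rules listed above."""
    difference = train_perf - val_perf
    # Assumption: treat a zero validation score as an infinite ratio.
    ratio = train_perf / val_perf if val_perf != 0 else float("inf")
    return {
        "performance_difference": difference,
        "performance_ratio": ratio,
        "is_overfitting": difference > diff_threshold or ratio > ratio_threshold,
        "overfitting_score": max(difference, ratio - 1),
    }


overfit = overfitting_summary(train_perf=0.95, val_perf=0.70)
healthy = overfitting_summary(train_perf=0.90, val_perf=0.85)
print(overfit["is_overfitting"], healthy["is_overfitting"])
```

In the first call the difference is 0.25 and the ratio is about 1.36, so both thresholds are exceeded and the flag is set. In the second call the difference is 0.05 and the ratio is about 1.06, so the flag stays unset.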

