Back Testing
Backtesting allows you to evaluate the performance of your prediction models using historical data. Inverse Watch provides tools to seamlessly analyze prediction results and implement basic trading strategies.
While prediction results are available through the predictions page, it's often more convenient to access them programmatically. Inverse Watch also provides an efficient way to retrieve prediction results using the Query Results query runner.
You can access prediction results using SQL-like queries in the Query Results data source. There are two main ways to retrieve predictions:
By Prediction ID:
Replace {prediction_id} with the specific ID of the prediction you want to retrieve.
Latest Prediction for a Model:
Replace {model_id} with the ID of your model. This query will return the most recent prediction for that model.
Let's say your model ID is 11 (as shown in the earlier Discord notification). To get the latest prediction results, you would use:
Correspondingly, if your generated prediction ID is 148, you would use:
After training and evaluating your models, backtesting helps analyze the prediction results and simulate trading strategies. Inverse Watch’s data visualization capabilities allow for insightful analysis of model performance.
For each model (Linear, Random Forest, Neural Network, Gradient Boosting, and AdaBoost), we create a CTE (Common Table Expression) to prepare the prediction data:
LEAD and LAG functions fetch the next and previous prices.
This CTE fetches actual prices, predicted prices, and predicted log price changes from the prediction table.
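The effect of LAG and LEAD on a price series can be sketched in plain Python. This is only an illustration of the window logic, not the query used on the platform; the function name lag_lead and the sample prices are hypothetical.

```python
from typing import List, Optional, Tuple


def lag_lead(prices: List[float]) -> List[Tuple[Optional[float], float, Optional[float]]]:
    """Return (previous, current, next) for each price, like LAG/LEAD with offset 1."""
    rows = []
    last = len(prices) - 1
    for i, price in enumerate(prices):
        prev_price = prices[i - 1] if i > 0 else None     # LAG: the first row has no previous price
        next_price = prices[i + 1] if i < last else None  # LEAD: the last row has no next price
        rows.append((prev_price, price, next_price))
    return rows


# Hypothetical sample prices, already ordered by timestamp.
rows = lag_lead([100.0, 102.0, 101.0])
```

As in SQL, the first and last rows have no neighbour on one side, so the missing value is represented as None (NULL in a query).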
Next, calculate predicted prices and generate trade signals for each model:
Trade Signals:
Simple buy and sell: If the predicted price is higher than the actual price, generate a buy signal (1 for buy, 0 for no action).
Long/Short strategy: Generate a long signal (1) when the predicted price is higher than the actual price, and a short signal (-1) when it is lower.
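The two signal rules above can be sketched as small Python functions. This is a minimal illustration under stated assumptions: the function names are hypothetical, the predicted log price change is assumed to be the natural log of next price over current price, and returning 0 when the two prices are equal is an assumption the text does not specify.

```python
import math


def price_from_log_change(actual_price: float, pred_log_change: float) -> float:
    """Turn a predicted log price change into a predicted price level.

    Assumes the log change is ln(next_price / current_price).
    """
    return actual_price * math.exp(pred_log_change)


def buy_signal(pred_price: float, actual_price: float) -> int:
    """Simple buy strategy: 1 = buy, 0 = no action."""
    return 1 if pred_price > actual_price else 0


def long_short_signal(pred_price: float, actual_price: float) -> int:
    """Long/short strategy: 1 = long, -1 = short.

    Returning 0 for equal prices is an assumption, not taken from the text.
    """
    if pred_price > actual_price:
        return 1
    if pred_price < actual_price:
        return -1
    return 0
```

For example, a predicted price of 101 against an actual price of 100 gives a buy signal of 1 and a long signal of 1, while a predicted price of 99 gives 0 and -1 respectively.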
You can join the results from different models into a unified result set:
full_results: This combines the predictions from all models (e.g., Linear Regression, Random Forest, Neural Network) into a single dataset for comparison.
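The idea of combining per-model predictions into one row per timestamp can be sketched in Python. This is an illustration only: the helper join_on_timestamp, the model suffixes, and the sample values are hypothetical, and keeping only timestamps present for every model (inner-join behaviour) is an assumption about how the results are joined.

```python
from typing import Dict, List


def join_on_timestamp(per_model: Dict[str, Dict[str, float]]) -> List[dict]:
    """Combine per-model predictions into one row per timestamp.

    per_model maps a model name to {timestamp: predicted price}.
    Only timestamps present for every model are kept (assumed inner join).
    """
    if not per_model:
        return []
    common = set.intersection(*(set(preds) for preds in per_model.values()))
    rows = []
    for ts in sorted(common):
        row = {"timestamp": ts}
        for model, preds in per_model.items():
            row[f"pred_price_{model}"] = preds[ts]
        rows.append(row)
    return rows


# Hypothetical predictions from two models.
full_results = join_on_timestamp({
    "linear": {"2024-01-01": 101.0, "2024-01-02": 103.0},
    "rf": {"2024-01-02": 102.5, "2024-01-03": 104.0},
})
```

Here only 2024-01-02 appears for both models, so full_results contains a single row with one pred_price_* column per model.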
The pnl_calculation CTE calculates the profit and loss for each trade:
PnL is calculated only when a trade signal was generated in the previous period.
The calculation assumes a fixed position size of 10,000 units.
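A minimal Python sketch of this PnL rule follows. The exact formula is an assumption: the fixed position size of 10,000 is read here as an amount invested per trade, so PnL is the previous period's signal times the position size times the period's price return. The function name pnl_series is hypothetical.

```python
from typing import List

POSITION_SIZE = 10_000  # fixed position size from the text, read here as an amount per trade


def pnl_series(prices: List[float], signals: List[int]) -> List[float]:
    """PnL per period: the previous period's signal applied to this period's return."""
    if not prices:
        return []
    pnl = [0.0]  # the first row has no previous signal, so no trade
    for i in range(1, len(prices)):
        prev_signal = signals[i - 1]
        if prev_signal == 0:
            pnl.append(0.0)  # no signal in the previous period means no PnL
            continue
        period_return = (prices[i] - prices[i - 1]) / prices[i - 1]
        pnl.append(prev_signal * POSITION_SIZE * period_return)
    return pnl


# Hypothetical prices and long/short signals.
pnl = pnl_series([100.0, 110.0, 99.0], [1, -1, 0])
```

In this sample, the long signal earns on the rise from 100 to 110, and the short signal earns on the fall from 110 to 99.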
Keep track of account balance over time using a cumulative sum of PnL:
Starting balance is assumed to be 10,000 units.
Running total is calculated using a cumulative sum of PnL.
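The running balance described above can be sketched as a cumulative sum in Python. The function name running_balance and the sample PnL values are hypothetical; the starting balance of 10,000 comes from the text.

```python
from itertools import accumulate
from typing import List

STARTING_BALANCE = 10_000  # starting balance stated in the text


def running_balance(pnl: List[float]) -> List[float]:
    """Account balance after each period: starting balance plus cumulative PnL."""
    return [STARTING_BALANCE + total for total in accumulate(pnl)]


# Hypothetical per-period PnL values.
balances = running_balance([0.0, 1000.0, -250.0])
```

This corresponds to a SQL window sum over PnL ordered by timestamp, offset by the starting balance.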
The final SELECT statement combines all the calculated fields and orders the results by timestamp:
This query allows us to compare the performance of different models:
Prediction Accuracy: Compare pred_price_* and pred_price_log_* with actual_price and next_actual_price.
Trade Signals: Analyze the trade_* and short_* columns to see how often each model generates buy and long/short signals.
PnL: The pnl_* columns show the profit or loss for each trade.
Overall Performance: The balance_* columns provide a running total of the account balance, indicating the overall performance of each model.
Prediction Accuracy: By comparing predicted prices and log price changes with actual data, you can determine the most accurate model.
Trade Signals: Analyze the frequency and success rate of trade signals generated by each model.
Profitability: Use the PnL and balance columns to evaluate the profitability of each trading strategy. This simplified backtest does not account for transaction costs, slippage, or market impact, so keep these factors in mind when interpreting results.
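One way to put numbers on accuracy and signal success is sketched below in Python. Both metrics are assumptions chosen for illustration, not metrics prescribed by Inverse Watch: mean absolute error for prediction accuracy, and the share of profitable trades as a success rate. Treating a non-zero PnL as an executed trade is an approximation, since a trade with exactly zero return would not be counted.

```python
from typing import List


def mean_absolute_error(predicted: List[float], actual: List[float]) -> float:
    """Average absolute gap between predicted and realised prices."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


def signal_hit_rate(pnl: List[float]) -> float:
    """Share of executed trades (approximated as non-zero PnL) that were profitable."""
    trades = [x for x in pnl if x != 0]
    if not trades:
        return 0.0
    return sum(1 for x in trades if x > 0) / len(trades)


# Hypothetical values for one model.
mae = mean_absolute_error([101.0, 99.0], [100.0, 100.0])
hit_rate = signal_hit_rate([0.0, 1000.0, -250.0, 500.0])
```

Computing the same two numbers for each model gives a simple side-by-side comparison, still subject to the cost and slippage caveats noted above.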