Machine Learning

Overview

This guide provides a comprehensive overview of the machine learning workflow implemented in Inverse Watch. The workflow is designed to be flexible, supporting several types of regression and both single- and multi-output scenarios.

Integration with Inverse Watch

Because Inverse Watch is forked from Redash, its query and visualization capabilities make this workflow a versatile way to feed machine learning models. Users can pull data from many different sources and prepare it for modeling without leaving the platform, which makes the machine learning process more efficient and provides a robust environment for data-driven decision-making.
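
As a rough illustration, a saved query's result can be pulled into a DataFrame before being handed to the ML workflow. The snippet below is a minimal sketch assuming a Redash-style results endpoint and payload (a `columns`/`rows` structure); the base URL, query id, and API key are placeholders.

```python
import pandas as pd
import requests

# Placeholders for illustration only.
API_BASE = "https://example.org/api"
QUERY_ID = 123
API_KEY = "user-api-key"

# Fetch the latest cached result of a saved query (Redash-style endpoint).
resp = requests.get(
    f"{API_BASE}/queries/{QUERY_ID}/results.json",
    headers={"Authorization": f"Key {API_KEY}"},
)
resp.raise_for_status()

# The payload nests the tabular data under query_result -> data.
data = resp.json()["query_result"]["data"]
df = pd.DataFrame(data["rows"])  # one dict per row, keyed by column name
print(df.head())
```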

Workflow Components

The ML workflow consists of the following main components (an illustrative preprocessing sketch follows the list):

  1. Data Preparation:

    • Query execution to fetch raw data

    • Data cleaning and structuring

    • Identification of feature types (numeric, categorical, timestamp)

    • Encoding of categorical variables

    • Scaling of numeric features

    • Extraction of time-based features from timestamps

    • Dimensionality reduction using autoencoders

  2. Feature Engineering:

    • Automatic detection and transformation of feature types

    • Application of cyclical encoding for time-based features

    • Use of autoencoders for dimensionality reduction

  3. Model Initialization:

    • Selection of appropriate regressor based on configuration

    • Initialization of model with default or user-specified parameters

  4. Model Training:

    • Splitting data into training and validation sets

    • Training the model on the training data

    • Hyperparameter tuning if auto_mode is enabled

  5. Model Evaluation and Tuning:

    • Evaluation of model performance on validation data

    • Selection of best hyperparameters (if in auto_mode)

    • Saving of best model and parameters

  6. Prediction:

    • Loading of trained model

    • Preprocessing of new data

    • Generation of predictions

    • Decoding of predictions into human-readable format
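
The data-preparation and feature-engineering steps above (items 1 and 2) can be pictured with a short, self-contained sketch. It uses pandas and scikit-learn directly rather than the platform's internal helpers, and all column names and values are made up for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Toy data standing in for a query result (columns are illustrative).
df = pd.DataFrame({
    "timestamp": pd.date_range("2024-01-01", periods=6, freq="D"),
    "market": ["DOLA", "INV", "DOLA", "INV", "DOLA", "INV"],
    "tvl": [1.2e6, 3.4e5, 1.3e6, 3.1e5, 1.25e6, 3.6e5],
    "target_price": [1.00, 30.2, 1.01, 29.8, 0.99, 31.0],
})

# Identify feature types.
numeric_cols = ["tvl"]
categorical_cols = ["market"]

# Encode categorical variables (one-hot) and scale numeric features.
cat_features = pd.get_dummies(df[categorical_cols]).to_numpy(dtype=float)
num_features = StandardScaler().fit_transform(df[numeric_cols])

# Extract cyclical time-based features from the timestamp.
day_of_week = df["timestamp"].dt.dayofweek
time_features = np.column_stack([
    np.sin(2 * np.pi * day_of_week / 7),
    np.cos(2 * np.pi * day_of_week / 7),
])

# Assemble the feature matrix and target, ready for model training.
# (Dimensionality reduction with an autoencoder would follow for wide feature sets.)
X = np.hstack([num_features, cat_features, time_features])
y = df["target_price"].to_numpy()
print(X.shape, y.shape)  # (6, 5) (6,)
```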

Supported Regressors

The system supports multiple types of regressors, each with its own initialization and training process (a short initialization sketch follows the list):

  • Linear/Logistic Regression: Offers simplicity and interpretability for both continuous and categorical targets. Key hyperparameters include fit_intercept for linear regression and C for logistic regression.

  • Random Forest: Utilizes ensemble learning to improve prediction accuracy and reduce overfitting. Key hyperparameters include n_estimators, max_depth, and max_features.

  • AdaBoost: Combines multiple weak learners to create a strong regressor, with support for both regression and classification tasks. Key hyperparameters include n_estimators and learning_rate.

  • Gradient Boosting: Sequentially builds models to correct errors from previous models, suitable for capturing complex patterns. Key hyperparameters include n_estimators, learning_rate, and max_depth.

  • Neural Networks: Employs deep learning techniques for handling complex, non-linear relationships in data. Key hyperparameters include epochs, batch_size, and learning_rate.
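
For orientation, the sketch below shows how such regressors are typically instantiated with the hyperparameters named above, using their scikit-learn equivalents. The values shown are illustrative defaults, not necessarily the ones Inverse Watch uses.

```python
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.ensemble import (
    RandomForestRegressor,
    AdaBoostRegressor,
    GradientBoostingRegressor,
)

# Illustrative hyperparameter values only.
regressors = {
    "linear": LinearRegression(fit_intercept=True),
    "logistic": LogisticRegression(C=1.0),
    "random_forest": RandomForestRegressor(
        n_estimators=200, max_depth=10, max_features="sqrt"
    ),
    "ada_boost": AdaBoostRegressor(n_estimators=100, learning_rate=0.1),
    "gradient_boosting": GradientBoostingRegressor(
        n_estimators=200, learning_rate=0.05, max_depth=3
    ),
}
# Neural-network hyperparameters (epochs, batch_size, learning_rate) are passed
# to the training loop instead; see the TuneableNNRegressor sketch below.
```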

Detailed Regressor Documentation

For more detailed information on each regressor type and its specific workflow, please refer to the individual regressor documentation:

  • Linear/Logistic Regression

  • Random Forest Regressor

  • AdaBoost Regressor

  • Gradient Boosting Regressor

  • Neural Network Regressor

Key Classes

MLModel

The main class that orchestrates the entire ML workflow. It handles data preparation, feature engineering, and manages the training and prediction processes.

TunedMultiOutputEstimator

A custom estimator that supports multi-output scenarios and hyperparameter tuning for traditional machine learning models.
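
The class itself lives in the Inverse Watch codebase and is not reproduced here; as a rough approximation of the pattern, the sketch below wraps a base regressor with scikit-learn's MultiOutputRegressor so that one copy of the estimator is fitted per target column.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.multioutput import MultiOutputRegressor

# Toy multi-output problem: two target columns predicted from the same features.
X, y = make_regression(n_samples=200, n_features=8, n_targets=2, random_state=0)

# One independent copy of the base regressor is fitted per output column,
# which is the general pattern a multi-output wrapper follows.
model = MultiOutputRegressor(GradientBoostingRegressor(n_estimators=100))
model.fit(X, y)
print(model.predict(X[:3]))  # shape (3, 2): one prediction per target column
```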

TuneableNNRegressor

A custom class for neural network models that supports hyperparameter tuning and handles both single and multi-output scenarios.
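
The real class defines its own architecture; the sketch below is only a generic Keras LSTM regressor of the kind described, with epochs, batch_size, and learning_rate as the tunable knobs. The layer sizes and data shapes are illustrative assumptions.

```python
import numpy as np
from tensorflow import keras

# Toy sequence data: 100 samples, 10 time steps, 4 features, 2 output targets.
X = np.random.rand(100, 10, 4).astype("float32")
y = np.random.rand(100, 2).astype("float32")

def build_lstm_regressor(n_outputs, learning_rate=1e-3):
    # Minimal LSTM regressor; the width of the final Dense layer handles
    # single- or multi-output targets in the same way.
    model = keras.Sequential([
        keras.layers.Input(shape=(10, 4)),
        keras.layers.LSTM(32),
        keras.layers.Dense(n_outputs),
    ])
    model.compile(optimizer=keras.optimizers.Adam(learning_rate), loss="mse")
    return model

model = build_lstm_regressor(n_outputs=2, learning_rate=1e-3)
model.fit(X, y, epochs=5, batch_size=16, verbose=0)
print(model.predict(X[:3]).shape)  # (3, 2)
```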

Auto Mode

The system supports an "auto mode" for each regressor type, enabling automated hyperparameter tuning to optimize model performance.
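
A common way to implement this kind of auto mode, sketched below with scikit-learn's GridSearchCV as a stand-in, is to wrap the chosen regressor in a hyperparameter search when the flag is enabled. The grid shown is illustrative, not the one Inverse Watch actually searches.

```python
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

def make_model(auto_mode: bool):
    base = RandomForestRegressor(random_state=0)
    if not auto_mode:
        return base  # fixed, user-specified (or default) parameters

    # Auto mode: search a small illustrative grid and keep the best estimator.
    param_grid = {
        "n_estimators": [100, 300],
        "max_depth": [5, 10, None],
        "max_features": ["sqrt", 1.0],
    }
    return GridSearchCV(base, param_grid, cv=3, scoring="neg_mean_squared_error")
```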

Multi-output Support

The workflow is designed to handle both single-output and multi-output scenarios, using appropriate wrappers or custom implementations.

Serialization and Storage

Trained models are serialized and stored in the database, facilitating easy retrieval and deployment.
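
The storage layer is part of the platform itself, but the general pattern is sketched below: the fitted estimator is pickled to bytes that can be written to a database column and later loaded back for prediction.

```python
import pickle

from sklearn.linear_model import LinearRegression

# Train a tiny model, then serialize it to bytes for storage.
model = LinearRegression().fit([[0.0], [1.0], [2.0]], [0.0, 1.0, 2.0])
blob = pickle.dumps(model)          # bytes suitable for a BLOB / bytea column

# ... store `blob` in the database alongside the model's metadata ...

# Later, retrieve the bytes and restore the estimator for prediction.
restored = pickle.loads(blob)
print(restored.predict([[3.0]]))    # [3.]
```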

ML Workflow (diagram)