How I Built a Cloud Time Series Forecaster with Datadog Toto-Open-Base-1.0

by Ayush Kumar | June 6, 2025


Toto is a powerful open-source foundation model built specifically for multivariate time-series forecasting, especially in observability scenarios like server metrics, system telemetry, and operational data.

What makes Toto special?

High-Dimensional Forecasting: It can handle multiple variables at once — like CPU, memory, disk, or any custom metrics — and predict future behavior in parallel.
Decoder-Only Transformer: Toto uses a decoder-style transformer, optimized with proportional factorized space-time attention — which simply means it’s highly efficient at dealing with both long input sequences and variable-length forecasts.
Zero-Shot Forecasting: You don’t need to fine-tune it on your data. Just plug in your time-series and it will generate point forecasts and uncertainty bands instantly.
Quantile-Aware Predictions: It returns not just a median forecast but also lower and upper quantiles (for example, the 10th and 90th percentiles), so you can see both the expected value and the spread of likely outcomes around it.
Trained on 2 Trillion Data Points: Toto is trained on massive volumes of observability, synthetic, and public benchmark data, making it robust even in tough, real-world datasets.
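To make the quantile idea concrete, here is a minimal sketch of how a median and a 10%-90% band can be derived from a set of sampled forecasts for one timestep. It uses only the Python standard library. The helper names empirical_quantile and summarize_samples are hypothetical and are not part of Toto, which computes its quantiles internally on tensors.

```python
import statistics


def empirical_quantile(values, q):
    """Return the q-th quantile (0 <= q <= 1) using linear interpolation."""
    ordered = sorted(values)
    position = q * (len(ordered) - 1)
    lower_index = int(position)
    upper_index = min(lower_index + 1, len(ordered) - 1)
    fraction = position - lower_index
    return ordered[lower_index] * (1 - fraction) + ordered[upper_index] * fraction


def summarize_samples(samples, lower_q=0.1, upper_q=0.9):
    """Collapse sampled forecasts for one timestep into (lower, median, upper)."""
    return (
        empirical_quantile(samples, lower_q),
        statistics.median(samples),
        empirical_quantile(samples, upper_q),
    )


# Five made-up sampled values for a single future timestep.
print(summarize_samples([1.0, 2.0, 3.0, 4.0, 5.0]))
```

The more samples you draw, the smoother these bands become, which is why the scripts later in this post request 256 samples per forecast.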

Whether you’re forecasting infrastructure load, predicting IoT sensor signals, or running simulations over raw CSV logs — Toto is a foundation model purpose-built for serious time series forecasting tasks.

The average rank of Toto compared to the runner-up models on both the GIFT-Eval and BOOM benchmarks (as of May 19, 2025).

Overview of Toto-Open-Base-1.0 architecture.

Available Checkpoints

Checkpoint	Parameters	Config	Size	Notes
Toto-Open-Base-1.0	151M	Config	605 MB	Initial release with SOTA performance
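As a quick sanity check on the table above, the 605 MB checkpoint size is roughly what you would expect if the 151M parameters are stored as 32-bit floats. The sketch below does that arithmetic. The float32 storage format is an assumption made for illustration, and 151M is itself a rounded figure, so treat the result as an estimate.

```python
def weights_size_mb(num_parameters, bytes_per_parameter=4):
    """Approximate size of the raw weights in megabytes (1 MB = 10**6 bytes)."""
    return num_parameters * bytes_per_parameter / 1_000_000


# Assumption: 4 bytes per parameter (float32).
print(weights_size_mb(151_000_000))
# Hypothetical half-precision copy, for comparison only.
print(weights_size_mb(151_000_000, bytes_per_parameter=2))
```

Actual GPU memory use during inference is higher than the weight size alone, because activations and the sampled forecasts also live in VRAM.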

GPU Configuration Table for Toto-Open-Base-1.0

GPU Model	vCPUs	RAM (GB)	VRAM (GB)	Use Case	Recommended For
RTX A6000	48	96	48	Real-time forecasting, batch inference	Best balance of speed and cost
A100 40GB	64	160	40	Concurrent batch forecasts, large data	Heavy workloads and production deployment
A100 80GB	96	192–256	80	Maximum forecast window and sample size	Enterprise-grade scalable deployments
RTX 4090	32	64	24	CSV-based forecasting, app prototyping	Interactive Gradio apps and testing
T4	16	32	16	Lightweight usage, minimal concurrency	Cost-effective development or testing

Step-by-Step Process to Build a Cloud Time Series Forecaster with Datadog Toto

For the purpose of this tutorial, we will use a GPU-powered Virtual Machine offered by NodeShift; however, you can replicate the same steps with any other cloud provider of your choice. NodeShift provides the most affordable Virtual Machines at a scale that meets GDPR, SOC2, and ISO27001 requirements.

Step 1: Sign Up and Set Up a NodeShift Cloud Account

Visit the NodeShift Platform and create an account. Once you’ve signed up, log into your account.

Follow the account setup process and provide the necessary details and information.

Step 2: Create a GPU Node (Virtual Machine)

GPU Nodes are NodeShift’s GPU Virtual Machines, on-demand resources equipped with diverse GPUs ranging from H100s to A100s. These GPU-powered VMs provide enhanced environmental control, allowing configuration adjustments for GPUs, CPUs, RAM, and Storage based on specific requirements.

Navigate to the menu on the left side and select the GPU Nodes option. In the Dashboard, click the Create GPU Node button to deploy your first Virtual Machine.

Step 3: Select a Model, Region, and Storage

In the “GPU Nodes” tab, select a GPU Model and Storage according to your needs and the geographical region where you want to launch your model.

We will use 1 x RTX A6000 GPU for this tutorial to achieve the fastest performance. However, you can choose a more affordable GPU with less VRAM if that better suits your requirements.

Step 4: Select Authentication Method

There are two authentication methods available: Password and SSH Key. SSH keys are a more secure option. To create them, please refer to our official documentation.

Step 5: Choose an Image

Next, you will need to choose an image for your Virtual Machine. We will deploy Datadog Toto on an NVIDIA CUDA image. CUDA is NVIDIA's proprietary, closed-source parallel computing platform, and it is what allows Toto to run on the GPU of your node.

After choosing the image, click the ‘Create’ button, and your Virtual Machine will be deployed.

Step 6: Virtual Machine Successfully Deployed

You will get visual confirmation that your node is up and running.

Step 7: Connect to GPUs using SSH

NodeShift GPUs can be connected to and controlled through a terminal using the SSH key provided during GPU creation.

Once your GPU Node deployment is successfully created and has reached the ‘RUNNING’ status, you can navigate to the page of your GPU Deployment Instance. Then, click the ‘Connect’ button in the top right corner.

Now open your terminal and paste the proxy SSH IP or direct SSH IP.

Step 8: Install CUDA Toolkit (if not already installed)

If your VM uses an NVIDIA image (nvidia/cuda), CUDA will be pre-installed.

Check with the following commands:

nvcc --version
nvidia-smi
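If you prefer to run this check from a script, the sketch below looks for the same two executables on the PATH using only the standard library. The function name find_cuda_tools is hypothetical. Finding a binary only shows that it is installed, not that the driver and GPU are working, so it complements the commands above rather than replacing them.

```python
import shutil


def find_cuda_tools(tools=("nvcc", "nvidia-smi")):
    """Map each tool name to its full path, or None when it is not on PATH."""
    return {tool: shutil.which(tool) for tool in tools}


for tool, path in find_cuda_tools().items():
    print(f"{tool}: {path or 'not found on PATH'}")
```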

Step 9: Check the Available Python version and Install the new version

Run the following command to check the available Python version:

python3 --version

On this image, the system ships with Python 3.8.1 by default. To install a newer version of Python, you’ll need to use the deadsnakes PPA.

Run the following commands to add the deadsnakes PPA:

sudo apt update
sudo apt install -y software-properties-common
sudo add-apt-repository -y ppa:deadsnakes/ppa
sudo apt update

Step 10: Install Python 3.11

Now, run the following command to install Python 3.11 or another desired version:

sudo apt install -y python3.11 python3.11-venv python3.11-dev

Step 11: Update the Default `Python3` Version

Now, run the following command to link the new Python version as the default python3:

sudo update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.8 1
sudo update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.11 2
sudo update-alternatives --config python3

Then, run the following command to verify that the new Python version is active:

python3 --version
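The same verification can be done from inside Python, which is handy at the top of a script that should refuse to run on the old interpreter. This is a minimal sketch. The target of 3.11 simply mirrors the version installed in this tutorial, and is_at_least is a hypothetical helper name.

```python
import sys

TARGET_VERSION = (3, 11)  # the version installed in this tutorial


def is_at_least(version, minimum=TARGET_VERSION):
    """Compare the (major, minor) part of a version against a minimum."""
    return tuple(version[:2]) >= tuple(minimum)


print("Running Python", ".".join(map(str, sys.version_info[:3])))
print("Meets tutorial target:", is_at_least(sys.version_info))
```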

Step 12: Install and Update Pip

Run the following commands to install and update pip:

curl -O https://bootstrap.pypa.io/get-pip.py
python3.11 get-pip.py

Then, run the following command to check the version of pip:

pip --version

Step 13: Clone the DataDog Toto Repo

Run the following commands to clone the DataDog Toto repository:

git clone https://github.com/DataDog/toto.git
cd toto

Step 14: Set Up Python Virtual Environment

Run the following commands to set up the Python virtual environment:

python3 -m venv venv
source venv/bin/activate
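A common stumbling block is installing packages into the system interpreter because the virtual environment was not activated in the current shell. The sketch below is one way to confirm which interpreter is in use. It relies on the standard behavior that sys.prefix differs from sys.base_prefix inside a venv, and in_virtualenv is a hypothetical helper name.

```python
import sys


def in_virtualenv():
    """Return True when the running interpreter belongs to a virtual environment."""
    return sys.prefix != sys.base_prefix


print("Virtual environment active:", in_virtualenv())
print("Interpreter:", sys.executable)
```

If this prints False, run the activate command again in the same terminal before installing anything.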

Step 15: Install Required Libraries

Run the following command to install the required libraries:

pip install -r requirements.txt

Step 16: Connect to your GPU VM using Remote SSH

Open VS Code on your Mac.
Press Cmd + Shift + P, then choose Remote-SSH: Connect to Host.
Select your configured host.
Once connected, you’ll see SSH: 38.29.145.28 (your VM’s IP) in the bottom-left status bar.

Step 17: Open the Project Folder on VM

Click on “Open Folder”
Choose the directory where your script is located:

/root/toto

VS Code will reload the window inside the remote environment.
In the /root/toto folder, right-click → New File
Name it:

run_toto.py

Step 18: Paste This Full Code into `run_toto.py`

import sys
import os

# Add the root path so Python can find 'toto'
sys.path.append(os.path.abspath(os.path.dirname(__file__)))

import torch
from toto.data.util.dataset import MaskedTimeseries
from toto.inference.forecaster import TotoForecaster
from toto.model.toto import Toto

DEVICE = 'cuda' if torch.cuda.is_available() else 'cpu'

# Load the model
toto = Toto.from_pretrained('Datadog/Toto-Open-Base-1.0').to(DEVICE)

# Optional: compile for better speed
toto.compile()

forecaster = TotoForecaster(toto.model)

# Dummy data: 7 variables, 4096 time steps
input_series = torch.randn(7, 4096).to(DEVICE)
timestamp_seconds = torch.zeros(7, 4096).to(DEVICE)
time_interval_seconds = torch.full((7,), 60 * 15).to(DEVICE)  # 15-minute intervals

inputs = MaskedTimeseries(
    series=input_series,
    padding_mask=torch.full_like(input_series, True, dtype=torch.bool),
    id_mask=torch.zeros_like(input_series),
    timestamp_seconds=timestamp_seconds,
    time_interval_seconds=time_interval_seconds,
)

# Forecast for 336 future timesteps
forecast = forecaster.forecast(
    inputs,
    prediction_length=336,
    num_samples=256,
    samples_per_batch=256,
)

# Output results
print("Median Forecast Shape:", forecast.median.shape)
print("Forecast Sample Shape:", forecast.samples.shape)
print("10% Quantile:", forecast.quantile(0.1).shape)
print("90% Quantile:", forecast.quantile(0.9).shape)

Step 19: Run the File

Open the VS Code Terminal (Ctrl + ` or View → Terminal)
Type:

python3 run_toto.py

You should see output like:

Median Forecast Shape: torch.Size([1, 7, 336])
Forecast Sample Shape: torch.Size([1, 7, 336, 256])
10% Quantile: torch.Size([1, 7, 336])
90% Quantile: torch.Size([1, 7, 336])
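The shapes above follow a simple pattern: a leading batch dimension, then the number of variables, then the prediction length, with the raw samples carrying one extra trailing dimension. The sketch below reproduces that pattern with plain tuples so you can predict the shapes for other settings. It is inferred from the output shown here rather than from Toto's source, and expected_forecast_shapes is a hypothetical helper.

```python
def expected_forecast_shapes(num_variables, prediction_length, num_samples, batch_size=1):
    """Return the tensor shapes the script above is expected to print."""
    point = (batch_size, num_variables, prediction_length)
    return {
        "median": point,
        "quantile": point,
        "samples": point + (num_samples,),
    }


shapes = expected_forecast_shapes(num_variables=7, prediction_length=336, num_samples=256)
print(shapes)
```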

Step by Step Process to Run the Forecasting Gradio App on Your GPU VM

Step 1: Install Required Python Packages

Run the following command to install Gradio, Matplotlib, and pandas:

pip install gradio matplotlib pandas

Step 2: Create Your Gradio App File

Open VS Code (Remote-SSH into your GPU VM)
Navigate to the folder:

/root/toto

Create a new file:

gradio_toto.py

Step 3: Paste the Full Code Below into `gradio_toto.py`

import gradio as gr
import torch
import pandas as pd
import matplotlib.pyplot as plt
import tempfile
import os

from toto.model.toto import Toto
from toto.inference.forecaster import TotoForecaster
from toto.data.util.dataset import MaskedTimeseries

DEVICE = 'cuda' if torch.cuda.is_available() else 'cpu'
MODEL = Toto.from_pretrained("Datadog/Toto-Open-Base-1.0").to(DEVICE)
MODEL.compile()
FORECASTER = TotoForecaster(MODEL.model)

def forecast_from_csv(file, prediction_length, num_samples, input_length):
    # Read and clean CSV
    df = pd.read_csv(file, header=None, skiprows=1, on_bad_lines='skip')
    df = df.apply(pd.to_numeric, errors='coerce').dropna()

    if df.shape[0] == 0 or df.shape[1] == 0:
        raise gr.Error("❌ CSV has no usable numeric data. Please upload a clean CSV with only numbers.")

    # Preview (first 5 rows)
    preview = df.head()

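    # Slider values may arrive as floats, so cast them before use
    prediction_length = int(prediction_length)
    num_samples = int(num_samples)

    # Truncate to input_length
    df = df.tail(int(input_length))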

    # Convert to tensor [n_vars, timesteps]
    series = torch.tensor(df.values.T, dtype=torch.float32).to(DEVICE)

    timestamp_seconds = torch.zeros_like(series)
    time_interval_seconds = torch.full((series.shape[0],), 60 * 15).to(DEVICE)

    inputs = MaskedTimeseries(
        series=series,
        padding_mask=torch.full_like(series, True, dtype=torch.bool),
        id_mask=torch.zeros_like(series),
        timestamp_seconds=timestamp_seconds,
        time_interval_seconds=time_interval_seconds,
    )

    forecast = FORECASTER.forecast(
        inputs,
        prediction_length=prediction_length,
        num_samples=num_samples,
        samples_per_batch=num_samples,
    )

    median = forecast.median[0].cpu().numpy()
    lower = forecast.quantile(0.1)[0].cpu().numpy()
    upper = forecast.quantile(0.9)[0].cpu().numpy()

    fig, ax = plt.subplots(figsize=(10, 4))
    for i in range(median.shape[0]):
        ax.plot(median[i], label=f"Var {i+1}")
        ax.fill_between(range(prediction_length), lower[i], upper[i], alpha=0.2)
    ax.set_title("Forecast (Median with 10%-90% Quantiles)")
    ax.set_xlabel("Timestep")
    ax.legend()

    # Save CSV
    df_out = pd.DataFrame()
    for i in range(median.shape[0]):
        df_out[f'Var_{i+1}_Median'] = median[i]
        df_out[f'Var_{i+1}_Lower'] = lower[i]
        df_out[f'Var_{i+1}_Upper'] = upper[i]

    tmp_file = tempfile.NamedTemporaryFile(delete=False, suffix=".csv")
    df_out.to_csv(tmp_file.name, index=False)

    return preview, fig, tmp_file.name

# Gradio UI
gr.Interface(
    fn=forecast_from_csv,
    inputs=[
        gr.File(label="Upload CSV File (rows = timesteps, columns = variables)", file_types=[".csv"]),
        gr.Slider(10, 336, value=96, label="Prediction Length"),
        gr.Slider(64, 512, step=64, value=256, label="Number of Forecast Samples"),
        gr.Slider(256, 4096, step=256, value=1024, label="Input Length (Last N Timesteps)"),
    ],
    outputs=[
        gr.Dataframe(label="📄 CSV Preview (first 5 rows)"),
        gr.Plot(label="📈 Forecast Plot"),
        gr.File(label="📥 Download Forecast CSV"),
    ],
    title="🧠 Toto Time Series Forecaster",
    description="Upload multivariate time series CSV and get forecasted trends with tuning + preview + CSV export.",
    theme="soft"
).launch(server_name="0.0.0.0", server_port=7860)

Step 4: Run the Gradio App

In your terminal (inside the virtual environment):

python3 gradio_toto.py

You’ll see:

Running on local URL: http://0.0.0.0:7860

Step 5: Access the Web App

Option A: Open in browser on Mac via port forwarding

If you set up SSH tunneling:

ssh -L 7860:localhost:7860 root@38.29.145.28 -p 40087

Then go to:

http://localhost:7860

Option B: Open it directly using VM’s public IP (if firewall open):

http://38.29.145.28:7860
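If the page does not load, it helps to find out whether the port is reachable at all before debugging the app itself. The following sketch attempts a plain TCP connection using the standard library. The function name is_port_open is hypothetical, and the host and port are placeholders that you would replace with your own tunnel endpoint or VM address.

```python
import socket


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# 127.0.0.1:7860 matches Option A (an SSH tunnel to the Gradio port).
print("Gradio reachable:", is_port_open("127.0.0.1", 7860))
```

A False result with the tunnel running usually means the app is not listening yet. A False result against the public IP usually points to a firewall rule.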

Step 6: Generate Forecast Plot and Download CSV

Now that your app is running, here’s how to use it step by step:

Step 6.1: Upload a Time-Series CSV File

Click on the “Upload CSV File” button.
Select a .csv file structured as:

Rows   = Time steps  
Columns = Variables

Example: 4096 rows × 7 variables (e.g., sample_7x4096.csv)
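If you do not have a suitable file at hand, the sketch below generates a synthetic one in the expected layout using only the standard library. The values are random noise, so the resulting forecast is useful only for checking that the app works. The function name write_sample_csv and the column names var_1 to var_7 are made up for this example. A header row is written on purpose, because the app skips the first row of every upload.

```python
import csv
import os
import random
import tempfile


def write_sample_csv(path, num_rows=4096, num_variables=7, seed=0):
    """Write a header row plus num_rows rows of Gaussian noise to path."""
    rng = random.Random(seed)
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow([f"var_{i + 1}" for i in range(num_variables)])
        for _ in range(num_rows):
            writer.writerow([round(rng.gauss(0.0, 1.0), 4) for _ in range(num_variables)])
    return path


# Written to the system temp directory here; choose any location you like.
sample_path = write_sample_csv(os.path.join(tempfile.gettempdir(), "sample_7x4096.csv"))
print("Wrote", sample_path)
```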

Step 6.2: Configure Forecast Parameters

Adjust the sliders as needed:

Prediction Length:
Number of future steps to forecast (e.g., 96)
Number of Forecast Samples:
Controls uncertainty sampling (e.g., 256)
Input Length:
Truncates input to recent N timesteps (e.g., 1024)
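Because the app hard-codes a 15-minute interval between timesteps, the Prediction Length slider translates directly into wall-clock time. The sketch below does the conversion. forecast_horizon_hours is a hypothetical helper, and the 900-second default mirrors the 60 * 15 value in gradio_toto.py.

```python
def forecast_horizon_hours(prediction_length, interval_seconds=15 * 60):
    """Convert a number of forecast steps into hours of wall-clock time."""
    return prediction_length * interval_seconds / 3600


print(forecast_horizon_hours(96))   # 24.0, the default slider value
print(forecast_horizon_hours(336))  # 84.0, the slider maximum
```

If your data is sampled at a different rate, the interval in the app should be changed to match; otherwise the horizon computed here will not correspond to your real timeline.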

Step 6.3: Click “Submit” to Run the Forecast

Once submitted:

You’ll see the first 5 rows of your CSV previewed
A forecast plot will be generated:
- Each variable forecast line is shown
- Shaded bands = 10%–90% uncertainty quantiles

Step 6.4: Download Forecast CSV

Below the plot:

Click Download Forecast CSV
This file contains:
- Var_1_Median, Var_1_Lower, Var_1_Upper
- Var_2_Median, … and so on for all variables
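For post-processing, it is convenient to know the exact column names and their order in advance. The sketch below rebuilds the header that gradio_toto.py writes, following the naming pattern in the app's code. forecast_columns is a hypothetical helper.

```python
def forecast_columns(num_variables):
    """Return the forecast CSV column names in the order the app writes them."""
    columns = []
    for index in range(1, num_variables + 1):
        for suffix in ("Median", "Lower", "Upper"):
            columns.append(f"Var_{index}_{suffix}")
    return columns


print(forecast_columns(2))
```

Note that the numbering follows column position in the uploaded file, not the original header names, since the app discards the header row.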

You’ve now:

Uploaded CSV
Tuned your forecast settings
Generated a beautiful plot
Downloaded usable forecast output for post-processing

Step 7: Preview the First 5 Rows of Your CSV

Alongside the forecast, the app shows a cleaned preview table so you can verify that the uploaded data was read correctly.

Step 7.1: Upload a CSV File

Click on Upload CSV File
Select your file, e.g., sample_7x4096.csv
Once uploaded, the app:
- Skips non-numeric rows if needed
- Cleans up any messy values
- Displays the first 5 rows

Step 7.2: Preview Panel

Right below the input section, you’ll see a labeled section:

CSV Preview (first 5 rows)

This helps confirm:

Your data is loaded correctly
It has the expected shape
The values are all numeric

Bonus: Auto-Cleaning Behind the Scenes

The backend automatically:

Converts all values to float
Drops rows with missing or invalid entries
Skips the first row, which is assumed to be a header

So even if your CSV isn’t perfect — the app gives you a usable preview without crashing.
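The cleaning rules above can be approximated without pandas, which makes them easier to reason about. The sketch below is a standard-library stand-in, not the app's actual code path: it skips the first row unconditionally, converts every cell to float, and drops any row containing a value that fails to parse. clean_numeric_rows is a hypothetical name.

```python
import csv
import io
import math


def clean_numeric_rows(csv_text):
    """Return rows of floats, dropping the first row and any row with bad cells."""
    reader = csv.reader(io.StringIO(csv_text))
    next(reader, None)  # the app always skips the first row (skiprows=1)
    cleaned = []
    for row in reader:
        if not row:
            continue  # blank line
        try:
            values = [float(cell) for cell in row]
        except ValueError:
            continue  # at least one cell is empty or not a number
        if any(math.isnan(value) for value in values):
            continue
        cleaned.append(values)
    return cleaned


raw = "cpu,mem\n1,2\noops,3\n4,\n5,6\n"
print(clean_numeric_rows(raw))  # [[1.0, 2.0], [5.0, 6.0]]
```

One consequence worth remembering: if your file has no header, its first row of data is still discarded.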

Example Output:

0	1	2	3	4	5	6
-0.0842	0.221	-1.005	0.674	1.438	0.022	-0.487
…	…	…	…	…	…	…

Each column = a variable
Each row = a timestep

Step 8: Real-Time System Metrics Forecaster (CPU, RAM, Disk)

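With this step, you’ll turn your app into a live system metrics forecaster, reading your VM’s real-time stats and predicting what’s about to happen. The script relies on the psutil package, so install it with pip install psutil if it is not already present in your environment.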

Step 8.1: Create the Forecasting Script

Open VS Code (Remote-SSH into your GPU VM)
Navigate to the folder:

/root/toto

Create a new file:

gradio_realtime.py

Step 8.2: Paste the Full Code Below into `gradio_realtime.py`

import gradio as gr
import torch
import matplotlib.pyplot as plt
import numpy as np
import psutil
import time

from toto.model.toto import Toto
from toto.inference.forecaster import TotoForecaster
from toto.data.util.dataset import MaskedTimeseries

DEVICE = 'cuda' if torch.cuda.is_available() else 'cpu'
MODEL = Toto.from_pretrained("Datadog/Toto-Open-Base-1.0").to(DEVICE)
MODEL.compile()
FORECASTER = TotoForecaster(MODEL.model)

# Rolling buffer
BUFFER_SIZE = 128
METRIC_BUFFER = []

def get_system_metrics():
    """Returns [CPU, RAM, Disk] in percentages."""
    return [
        psutil.cpu_percent(),
        psutil.virtual_memory().percent,
        psutil.disk_usage('/').percent
    ]

def forecast_live_metrics(prediction_length):
    # Maintain global buffer
    global METRIC_BUFFER
    METRIC_BUFFER = []

    # Fill initial buffer
    while len(METRIC_BUFFER) < BUFFER_SIZE:
        METRIC_BUFFER.append(get_system_metrics())
        time.sleep(1)

    while True:
        # Update buffer
        METRIC_BUFFER.append(get_system_metrics())
        if len(METRIC_BUFFER) > BUFFER_SIZE:
            METRIC_BUFFER = METRIC_BUFFER[-BUFFER_SIZE:]

        # Prepare input
        arr = np.array(METRIC_BUFFER, dtype=np.float32)  # shape [T, 3]
        series = torch.tensor(arr.T, dtype=torch.float32).to(DEVICE)  # shape [3, T]
        timestamp_seconds = torch.zeros_like(series)
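        # NOTE: 60 tells the model the samples are one minute apart, but this loop
        # samples every few seconds; set this to match your real sampling interval.
        time_interval_seconds = torch.full((series.shape[0],), 60).to(DEVICE)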

        inputs = MaskedTimeseries(
            series=series,
            padding_mask=torch.full_like(series, True, dtype=torch.bool),
            id_mask=torch.zeros_like(series),
            timestamp_seconds=timestamp_seconds,
            time_interval_seconds=time_interval_seconds,
        )

        forecast = FORECASTER.forecast(
            inputs,
            prediction_length=prediction_length,
            num_samples=256,
            samples_per_batch=256,
        )

        median = forecast.median[0].cpu().numpy()

        # Plot
        fig, ax = plt.subplots(figsize=(10, 4))
        for i in range(median.shape[0]):
            ax.plot(median[i], label=["CPU", "RAM", "Disk"][i])
        ax.set_title("Forecast from Real-Time System Metrics")
        ax.set_xlabel("Future Time Steps")
        ax.legend()

        yield fig
        plt.close(fig)  # release the figure so repeated refreshes do not leak memory
        time.sleep(3)  # refresh forecast every 3 seconds

# Launch Gradio interface
gr.Interface(
    fn=forecast_live_metrics,
    inputs=[
        gr.Slider(10, 100, step=1, value=30, label="Prediction Length"),
    ],
    outputs=gr.Plot(label="Live Forecast Plot"),
    title="📡 Real-Time System Metrics Forecaster",
    description="Streams CPU, RAM, and Disk usage and forecasts forward using Datadog's Toto model.",
    theme="soft",
    live=True
).launch(server_name="0.0.0.0", server_port=7861)

Step 8.3: What This App Does

Streams your live system usage:
- CPU %
- RAM %
- Disk %
Maintains a rolling time-series buffer
Sends it to Datadog’s Toto-Open-Base-1.0 model
Forecasts forward for the next N timesteps
Shows real-time predictions as a live chart
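The rolling buffer is the piece that keeps memory use constant while the app runs indefinitely. The script implements it with a list that is sliced back to the last 128 entries. The sketch below shows an alternative way to get the same behavior with collections.deque, which drops the oldest entry automatically once it is full. make_buffer is a hypothetical helper and the readings are made-up numbers.

```python
from collections import deque


def make_buffer(max_length=128):
    """Create a fixed-size buffer that discards its oldest item when full."""
    return deque(maxlen=max_length)


# A tiny buffer of three [CPU, RAM, Disk] readings to show the eviction.
buffer = make_buffer(max_length=3)
for reading in ([10, 40, 70], [11, 41, 70], [12, 42, 70], [13, 43, 71], [14, 44, 71]):
    buffer.append(reading)

print(list(buffer))  # only the three most recent readings remain
```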

Step 8.4: Run the App

Just launch it via:

python3 gradio_realtime.py

This will expose:

Running on local URL: http://0.0.0.0:7861

Step 8.5: Open the Web App

Visit in your browser:

If an SSH tunnel is enabled:

http://localhost:7861

If the VM's public IP is open:

http://38.29.145.28:7861

Step 8.6: Interact

Use the slider to control how far into the future to forecast (e.g. 10–100 steps)
Watch the live plot update on each forecast
Observe how the model predicts your machine’s future load in real time

Conclusion: Build Your Own Cloud Time Series Forecaster

By now, you’ve built something incredible — a full GPU-powered forecasting app, capable of both:

Upload-based forecasting (with CSV preview, tuning, and downloadable output)
Real-time forecasting of live system metrics (CPU, RAM, Disk)

You’ve done it using:

A NodeShift GPU VM
A clean Python stack with Gradio + Torch + Matplotlib
And a reliable, well-trained forecasting model — Toto by Datadog

This app is more than a demo: it is a working base that you can extend toward production use. You can now:

Embed it in your dashboard
Run weekly forecasts on logs
Build alerts based on thresholds
Monitor system pressure in advance
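As one illustration of threshold-based alerting, the sketch below scans an upper forecast band (for example, a Var_1_Upper column from the exported CSV) and reports the first future step at which it reaches a limit. The function name first_breach, the 90% threshold, and the sample values are all assumptions chosen for the example.

```python
def first_breach(upper_band, threshold):
    """Return the index of the first value at or above threshold, else None."""
    for step, value in enumerate(upper_band):
        if value >= threshold:
            return step
    return None


# Made-up upper-band values for CPU usage in percent.
print(first_breach([62.0, 75.5, 91.2, 95.0], threshold=90.0))  # 2
```

Alerting on the upper band rather than the median is the more cautious choice, since it fires when a breach is plausible rather than only when it is the central expectation.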

Whether you’re an engineer, researcher, or ops-minded developer, this setup turns raw data into useful, actionable insight — without massive overhead or vendor lock-in.

Now go deploy it, share the UI with your team, and predict the future — one timestep at a time.
