Toto is a powerful open-source foundation model built specifically for multivariate time-series forecasting, especially in observability scenarios like server metrics, system telemetry, and operational data.
What makes Toto special?
- High-Dimensional Forecasting: It can handle multiple variables at once — like CPU, memory, disk, or any custom metrics — and predict future behavior in parallel.
- Decoder-Only Transformer: Toto uses a decoder-style transformer, optimized with proportional factorized space-time attention — which simply means it’s highly efficient at dealing with both long input sequences and variable-length forecasts.
- Zero-Shot Forecasting: You don’t need to fine-tune it on your data. Just plug in your time-series and it will generate point forecasts and uncertainty bands instantly.
- Quantile-Aware Predictions: It returns not just a median forecast, but also lower and upper quantiles (this tutorial uses the 10% and 90% quantiles), so you can understand both expected values and risk.
- Trained on 2 Trillion Data Points: Toto is trained on massive volumes of observability, synthetic, and public benchmark data, making it robust even in tough, real-world datasets.
Whether you’re forecasting infrastructure load, predicting IoT sensor signals, or running simulations over raw CSV logs — Toto is a foundation model purpose-built for serious time series forecasting tasks.
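To make the quantile idea concrete, here is a minimal standard-library sketch of how a median and a 10%–90% band can be derived from a set of forecast samples. It is an illustration only and not Toto's own implementation: the function name summarize_samples and the randomly generated samples are assumptions made for this example.

```python
import random
import statistics


def summarize_samples(samples):
    """Summarize sampled forecasts.

    samples[t] is a list of sampled values for future timestep t.
    Returns one dict per timestep with the 10% quantile, median, and 90% quantile.
    """
    summary = []
    for step_samples in samples:
        # n=10 yields the nine decile cut points: index 0 is 10%, index -1 is 90%
        deciles = statistics.quantiles(step_samples, n=10, method="inclusive")
        summary.append({
            "lower": deciles[0],
            "median": statistics.median(step_samples),
            "upper": deciles[-1],
        })
    return summary


# Hypothetical data: 3 future timesteps, 256 samples each
random.seed(0)
fake_samples = [[random.gauss(50.0, 5.0) for _ in range(256)] for _ in range(3)]
bands = summarize_samples(fake_samples)
for t, band in enumerate(bands):
    print(t, round(band["lower"], 2), round(band["median"], 2), round(band["upper"], 2))
```

The wider the gap between the lower and upper values, the less certain the forecast is for that timestep.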
The average rank of Toto compared to the runner-up models on both the GIFT-Eval and BOOM benchmarks (as of May 19, 2025).
Overview of Toto-Open-Base-1.0 architecture.
Available Checkpoints
GPU Configuration Table for Toto-Open-Base-1.0
GPU Model | vCPUs | RAM (GB) | VRAM (GB) | Use Case | Recommended For |
---|---|---|---|---|---|
RTX A6000 | 48 | 96 | 48 | Real-time forecasting, batch inference | Best balance of speed and cost |
A100 40GB | 64 | 160 | 40 | Concurrent batch forecasts, large data | Heavy workloads and production deployment |
A100 80GB | 96 | 192–256 | 80 | Maximum forecast window and sample size | Enterprise-grade scalable deployments |
RTX 4090 | 32 | 64 | 24 | CSV-based forecasting, app prototyping | Interactive Gradio apps and testing |
T4 | 16 | 32 | 16 | Lightweight usage, minimal concurrency | Cost-effective development or testing |
Step-by-Step Process to Build a Cloud Time Series Forecaster with Datadog Toto
For the purpose of this tutorial, we will use a GPU-powered Virtual Machine offered by NodeShift; however, you can replicate the same steps with any other cloud provider of your choice. NodeShift provides the most affordable Virtual Machines at a scale that meets GDPR, SOC2, and ISO27001 requirements.
Step 1: Sign Up and Set Up a NodeShift Cloud Account
Visit the NodeShift Platform and create an account. Once you’ve signed up, log into your account.
Follow the account setup process and provide the necessary details and information.
Step 2: Create a GPU Node (Virtual Machine)
GPU Nodes are NodeShift’s GPU Virtual Machines, on-demand resources equipped with diverse GPUs ranging from H100s to A100s. These GPU-powered VMs provide enhanced environmental control, allowing configuration adjustments for GPUs, CPUs, RAM, and Storage based on specific requirements.
Navigate to the menu on the left side, select the GPU Nodes option, and click the Create GPU Node button in the Dashboard to create your first Virtual Machine deployment.
Step 3: Select a Model, Region, and Storage
In the “GPU Nodes” tab, select a GPU Model and Storage according to your needs and the geographical region where you want to launch your model.
We will use 1 x RTX A6000 GPU for this tutorial, which the table above lists as the best balance of speed and cost. However, you can choose a more affordable GPU with less VRAM if that better suits your requirements.
Step 4: Select Authentication Method
There are two authentication methods available: Password and SSH Key. SSH keys are a more secure option. To create them, please refer to our official documentation.
Step 5: Choose an Image
Next, you will need to choose an image for your Virtual Machine. We will deploy Datadog Toto on an NVIDIA CUDA Virtual Machine. CUDA is NVIDIA’s proprietary parallel computing platform, and it lets Toto run on the GPU of your node.
After choosing the image, click the ‘Create’ button, and your Virtual Machine will be deployed.
Step 6: Virtual Machine Successfully Deployed
You will get visual confirmation that your node is up and running.
Step 7: Connect to GPUs using SSH
NodeShift GPUs can be connected to and controlled through a terminal using the SSH key provided during GPU creation.
Once your GPU Node deployment is successfully created and has reached the ‘RUNNING’ status, you can navigate to the page of your GPU Deployment Instance. Then, click the ‘Connect’ button in the top right corner.
Now open your terminal and paste the proxy SSH IP or direct SSH IP.
Step 8: Install CUDA Toolkit (if not already installed)
If your VM uses an NVIDIA image (nvidia/cuda), CUDA will be pre-installed.
Check with the following commands:
nvcc --version
nvidia-smi
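If you prefer to script this check, the sketch below looks for both tools on the PATH using only the Python standard library. The helper name find_gpu_tools is an assumption for this example; it only reports whether the executables can be found and does not verify that the driver or toolkit works.

```python
import shutil


def find_gpu_tools(tools=("nvcc", "nvidia-smi")):
    """Map each tool name to its full path, or None if it is not on the PATH."""
    return {tool: shutil.which(tool) for tool in tools}


status = find_gpu_tools()
for tool, path in status.items():
    print(f"{tool}: {path or 'not found on PATH'}")
```

If nvcc is missing but nvidia-smi is present, the driver is likely installed without the CUDA toolkit.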
Step 9: Check the Available Python version and Install the new version
Run python3 --version to check the available Python version.
On this image, the system has Python 3.8.1 available by default. To install a higher version of Python, you’ll need to use the deadsnakes PPA.
Run the following commands to add the deadsnakes PPA:
sudo apt update
sudo apt install -y software-properties-common
sudo add-apt-repository -y ppa:deadsnakes/ppa
sudo apt update
Step 10: Install Python 3.11
Now, run the following command to install Python 3.11 or another desired version:
sudo apt install -y python3.11 python3.11-venv python3.11-dev
Step 11: Update the Default Python3 Version
Now, run the following commands to link the new Python version as the default python3:
sudo update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.8 1
sudo update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.11 2
sudo update-alternatives --config python3
Then, run the following command to verify that the new Python version is active:
python3 --version
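The same check can be done from inside Python, which is handy at the top of a script. This is a small sketch; the helper name meets_minimum and the 3.11 threshold simply mirror the version installed in this tutorial and are assumptions, not a stated requirement of Toto.

```python
import sys


def meets_minimum(version_info=sys.version_info, minimum=(3, 11)):
    """Return True when the (major, minor) version is at least `minimum`."""
    return tuple(version_info[:2]) >= tuple(minimum)


print("Running Python", ".".join(str(part) for part in sys.version_info[:3]))
print("Meets minimum:", meets_minimum())
```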
Step 12: Install and Update Pip
Run the following commands to install and update pip:
curl -O https://bootstrap.pypa.io/get-pip.py
python3.11 get-pip.py
Then, run the following command to check the version of pip:
pip --version
Step 13: Clone the DataDog Toto Repo
Run the following commands to clone the Datadog Toto repository and move into it:
git clone https://github.com/DataDog/toto.git
cd toto
Step 14: Set Up Python Virtual Environment
Run the following commands to set up and activate the Python virtual environment:
python3 -m venv venv
source venv/bin/activate
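To confirm that the environment is really active before installing anything, you can compare the interpreter's prefixes. This is a minimal sketch that relies on standard venv behavior; the helper name in_virtualenv is an assumption for this example.

```python
import sys


def in_virtualenv():
    """Inside a venv, sys.prefix points at the venv while sys.base_prefix points at the base install."""
    return sys.prefix != sys.base_prefix


print("Virtual environment active:", in_virtualenv())
print("Interpreter prefix:", sys.prefix)
```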
Step 15: Install Required Libraries
Run the following command to install required libraries:
pip install -r requirements.txt
Step 16: Connect to your GPU VM using Remote SSH
- Open VS Code on your Mac.
- Press Cmd + Shift + P, then choose Remote-SSH: Connect to Host.
- Select your configured host.
- Once connected, you’ll see SSH: 38.29.145.28 (your VM IP) in the bottom-left status bar.
Step 17: Open the Project Folder on VM
- Click on “Open Folder”
- Choose the directory where your script is located: /root/toto
- VS Code will reload the window inside the remote environment.
- In the /root/toto folder, right-click → New File
- Name it: run_toto.py
Step 18: Paste This Full Code into run_toto.py
import sys
import os
# Add the root path so Python can find 'toto'
sys.path.append(os.path.abspath(os.path.dirname(__file__)))
import torch
from toto.data.util.dataset import MaskedTimeseries
from toto.inference.forecaster import TotoForecaster
from toto.model.toto import Toto
DEVICE = 'cuda' if torch.cuda.is_available() else 'cpu'
# Load the model
toto = Toto.from_pretrained('Datadog/Toto-Open-Base-1.0').to(DEVICE)
# Optional: compile for better speed
toto.compile()
forecaster = TotoForecaster(toto.model)
# Dummy data: 7 variables, 4096 time steps
input_series = torch.randn(7, 4096).to(DEVICE)
timestamp_seconds = torch.zeros(7, 4096).to(DEVICE)
time_interval_seconds = torch.full((7,), 60 * 15).to(DEVICE) # 15-minute intervals
inputs = MaskedTimeseries(
    series=input_series,
    padding_mask=torch.full_like(input_series, True, dtype=torch.bool),
    id_mask=torch.zeros_like(input_series),
    timestamp_seconds=timestamp_seconds,
    time_interval_seconds=time_interval_seconds,
)
# Forecast for 336 future timesteps
forecast = forecaster.forecast(
    inputs,
    prediction_length=336,
    num_samples=256,
    samples_per_batch=256,
)
# Output results
print("Median Forecast Shape:", forecast.median.shape)
print("Forecast Sample Shape:", forecast.samples.shape)
print("10% Quantile:", forecast.quantile(0.1).shape)
print("90% Quantile:", forecast.quantile(0.9).shape)
Step 19: Run the File
- Open the VS Code Terminal (Ctrl + ` or View → Terminal)
- Type:
python3 run_toto.py
You should see output like:
Median Forecast Shape: torch.Size([1, 7, 336])
Forecast Sample Shape: torch.Size([1, 7, 336, 256])
10% Quantile: torch.Size([1, 7, 336])
90% Quantile: torch.Size([1, 7, 336])
Step-by-Step Process to Run the Forecasting Gradio App on Your GPU VM
Step 1: Install Required Python Packages
Run the following command to install the required packages:
pip install gradio matplotlib pandas
Step 2: Create Your Gradio App File
- Open VS Code (Remote-SSH into your GPU VM)
- Navigate to the folder: /root/toto
- Create a new file named gradio_toto.py
Step 3: Paste the Full Code Below into gradio_toto.py
import gradio as gr
import torch
import pandas as pd
import matplotlib.pyplot as plt
import tempfile
import os

from toto.model.toto import Toto
from toto.inference.forecaster import TotoForecaster
from toto.data.util.dataset import MaskedTimeseries

DEVICE = 'cuda' if torch.cuda.is_available() else 'cpu'
MODEL = Toto.from_pretrained("Datadog/Toto-Open-Base-1.0").to(DEVICE)
MODEL.compile()
FORECASTER = TotoForecaster(MODEL.model)

def forecast_from_csv(file, prediction_length, num_samples, input_length):
    # Slider values may arrive as floats; range() and tail() need integers
    prediction_length = int(prediction_length)
    num_samples = int(num_samples)
    input_length = int(input_length)

    # Read and clean CSV (the first row is always skipped as a header)
    df = pd.read_csv(file, header=None, skiprows=1, on_bad_lines='skip')
    df = df.apply(pd.to_numeric, errors='coerce').dropna()
    if df.shape[0] == 0 or df.shape[1] == 0:
        raise gr.Error("❌ CSV has no usable numeric data. Please upload a clean CSV with only numbers.")

    # Preview (first 5 rows)
    preview = df.head()

    # Truncate to input_length
    df = df.tail(input_length)

    # Convert to tensor [n_vars, timesteps]
    series = torch.tensor(df.values.T, dtype=torch.float32).to(DEVICE)
    timestamp_seconds = torch.zeros_like(series)
    time_interval_seconds = torch.full((series.shape[0],), 60 * 15).to(DEVICE)

    inputs = MaskedTimeseries(
        series=series,
        padding_mask=torch.full_like(series, True, dtype=torch.bool),
        id_mask=torch.zeros_like(series),
        timestamp_seconds=timestamp_seconds,
        time_interval_seconds=time_interval_seconds,
    )

    forecast = FORECASTER.forecast(
        inputs,
        prediction_length=prediction_length,
        num_samples=num_samples,
        samples_per_batch=num_samples,
    )

    median = forecast.median[0].cpu().numpy()
    lower = forecast.quantile(0.1)[0].cpu().numpy()
    upper = forecast.quantile(0.9)[0].cpu().numpy()

    # Plot the median of each variable with its 10%-90% band
    fig, ax = plt.subplots(figsize=(10, 4))
    for i in range(median.shape[0]):
        ax.plot(median[i], label=f"Var {i+1}")
        ax.fill_between(range(prediction_length), lower[i], upper[i], alpha=0.2)
    ax.set_title("Forecast (Median with 10%-90% Quantiles)")
    ax.set_xlabel("Timestep")
    ax.legend()

    # Save CSV
    df_out = pd.DataFrame()
    for i in range(median.shape[0]):
        df_out[f'Var_{i+1}_Median'] = median[i]
        df_out[f'Var_{i+1}_Lower'] = lower[i]
        df_out[f'Var_{i+1}_Upper'] = upper[i]
    tmp_file = tempfile.NamedTemporaryFile(delete=False, suffix=".csv")
    df_out.to_csv(tmp_file.name, index=False)

    return preview, fig, tmp_file.name

# Gradio UI
gr.Interface(
    fn=forecast_from_csv,
    inputs=[
        gr.File(label="Upload CSV File (rows = timesteps, columns = variables)", file_types=[".csv"]),
        gr.Slider(10, 336, value=96, label="Prediction Length"),
        gr.Slider(64, 512, step=64, value=256, label="Number of Forecast Samples"),
        gr.Slider(256, 4096, step=256, value=1024, label="Input Length (Last N Timesteps)"),
    ],
    outputs=[
        gr.Dataframe(label="📄 CSV Preview (first 5 rows)"),
        gr.Plot(label="📈 Forecast Plot"),
        gr.File(label="📥 Download Forecast CSV"),
    ],
    title="🧠 Toto Time Series Forecaster",
    description="Upload multivariate time series CSV and get forecasted trends with tuning + preview + CSV export.",
    theme="soft"
).launch(server_name="0.0.0.0", server_port=7860)
Step 4: Run the Gradio App
In your terminal (inside the virtual environment):
python3 gradio_toto.py
You’ll see:
Running on local URL: http://0.0.0.0:7860
Step 5: Access the Web App
Option A: Open in browser on Mac via port forwarding
If you set up SSH tunneling:
ssh -L 7860:localhost:7860 root@38.29.145.28 -p 40087
Then go to:
http://localhost:7860
Option B: Open it directly using the VM’s public IP (only works if the firewall allows inbound traffic on port 7860):
http://38.29.145.28:7860
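If the page does not load, it helps to know whether the port is reachable at all before debugging the app itself. The sketch below attempts a plain TCP connection with the standard library; the helper name port_is_open is an assumption for this example, and the host and port in the usage lines are the tunnel values from this step.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts
        return False


if __name__ == "__main__":
    # With the SSH tunnel from Option A running, the app should answer on localhost
    print("localhost:7860 reachable:", port_is_open("localhost", 7860))
```

A False result for the public IP but True through the tunnel usually points to a firewall rule rather than a problem with Gradio.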
Step 6: Generate Forecast Plot and Download CSV
Now that your app is running, here’s how to use it step by step:
Step 6.1: Upload a Time-Series CSV File
- Click on the “Upload CSV File” button.
- Select a .csv file structured as:
  - Rows = time steps
  - Columns = variables
  - Example: 4096 rows × 7 variables (e.g., sample_7x4096.csv)
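If you do not have a suitable file yet, the following sketch generates a synthetic one with the layout described above. Everything in it is an assumption for testing purposes: the helper name write_sample_csv, the var_N column names, and the sine-plus-noise values are made up, and the file is written to the system temp directory. A header row is included because the app skips the first row of the upload.

```python
import csv
import math
import os
import random
import tempfile


def write_sample_csv(path, n_rows=4096, n_vars=7, seed=0):
    """Write a header row followed by n_rows of synthetic values, one column per variable."""
    rng = random.Random(seed)
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow([f"var_{i + 1}" for i in range(n_vars)])
        for t in range(n_rows):
            # Each variable is a sine wave with its own period plus a little noise
            writer.writerow([
                round(math.sin(2 * math.pi * t / (24 * (i + 1))) + rng.gauss(0, 0.1), 4)
                for i in range(n_vars)
            ])
    return path


out_path = write_sample_csv(os.path.join(tempfile.gettempdir(), "sample_7x4096.csv"))
print("Wrote", out_path)
```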
Step 6.2: Configure Forecast Parameters
Adjust the sliders as needed:
- Prediction Length: Number of future steps to forecast (e.g., 96)
- Number of Forecast Samples: Controls uncertainty sampling (e.g., 256)
- Input Length: Truncates input to the most recent N timesteps (e.g., 1024)
Step 6.3: Click “Submit” to Run the Forecast
Once submitted:
- You’ll see the first 5 rows of your CSV previewed
- A forecast plot will be generated:
  - Each variable’s median forecast is drawn as a line
  - Shaded bands show the 10%–90% quantile range
Step 6.4: Download Forecast CSV
Below the plot:
- Click Download Forecast CSV
- This file contains three columns per variable: Var_1_Median, Var_1_Lower, Var_1_Upper, Var_2_Median, … and so on for all variables
You’ve now:
- Uploaded CSV
- Tuned your forecast settings
- Generated a beautiful plot
- Downloaded usable forecast output for post-processing
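As one example of post-processing, the sketch below scans a forecast CSV for the first timestep at which a chosen column exceeds a threshold. The column names match the export format described above; the helper name first_breach, the threshold, and the three sample rows are invented for illustration.

```python
import csv
import io


def first_breach(csv_text, column, threshold):
    """Return the index of the first forecast step where `column` exceeds `threshold`, else None."""
    reader = csv.DictReader(io.StringIO(csv_text))
    for step, row in enumerate(reader):
        if float(row[column]) > threshold:
            return step
    return None


# Made-up forecast rows in the exported column layout
example = (
    "Var_1_Median,Var_1_Lower,Var_1_Upper\n"
    "70,60,80\n"
    "75,65,88\n"
    "82,70,95\n"
)
step = first_breach(example, "Var_1_Upper", 90.0)
print("First step where the upper band exceeds 90:", step)
```

Checking the upper band rather than the median gives an earlier, more conservative warning.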
Step 7: Preview the First 5 Rows of Your CSV
Alongside the forecast, your app helps you verify the uploaded data by showing a clean preview table. The preview is returned together with the plot, so it appears after you click Submit.
Step 7.1: Upload a CSV File
- Click on Upload CSV File
- Select your file, e.g., sample_7x4096.csv
- Once you submit, the app:
  - Skips the first row, which it assumes is a header
  - Drops rows that contain missing or non-numeric values
  - Displays the first 5 cleaned rows
Step 7.2: Preview Panel
Right below the input section, you’ll see a labeled section:
CSV Preview (first 5 rows)
This helps confirm:
- Your data is loaded correctly
- It has the expected shape
- The values are all numeric
Bonus: Auto-Cleaning Behind the Scenes
The backend automatically:
- Converts all values to float
- Drops rows with missing or invalid entries
- Always skips the first row, assuming it is a header (a CSV without a header loses its first data row)
So even if your CSV isn’t perfect — the app gives you a usable preview without crashing.
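The cleaning rules can be sketched without pandas to show what happens to each row. This is an approximation of the behavior described above, not the app's actual code path: the helper name clean_rows is an assumption, and it uses the csv module instead of pandas.

```python
import csv
import io
import math


def clean_rows(csv_text):
    """Skip the first row, then keep only rows whose cells all parse as non-NaN floats."""
    rows = list(csv.reader(io.StringIO(csv_text)))[1:]  # first row is always skipped
    cleaned = []
    for row in rows:
        if not row:
            continue  # blank line
        try:
            values = [float(cell) for cell in row]
        except ValueError:
            continue  # at least one cell is empty or not numeric
        if any(math.isnan(v) for v in values):
            continue  # treat NaN as missing
        cleaned.append(values)
    return cleaned


messy = "cpu,mem\n1,2\nx,3\n4,5\n,7\nnan,1\n"
print(clean_rows(messy))
```

Only the rows 1,2 and 4,5 survive here, because every other row has a cell that is empty, non-numeric, or NaN.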
Example Output:
0 | 1 | 2 | 3 | 4 | 5 | 6 |
---|---|---|---|---|---|---|
-0.0842 | 0.221 | -1.005 | 0.674 | 1.438 | 0.022 | -0.487 |
… | … | … | … | … | … | … |
Each column = a variable
Each row = a timestep
Step 8: Real-Time System Metrics Forecaster (CPU, RAM, Disk)
With this step, you’ll turn your app into a live system metrics forecaster, reading your VM’s real-time stats and predicting what’s about to happen.
Step 8.1: Create the Forecasting Script
- Open VS Code (Remote-SSH into your GPU VM)
- Navigate to the folder: /root/toto
- Create a new file named gradio_realtime.py
Step 8.2: Paste the Full Code Below into gradio_realtime.py
import gradio as gr
import torch
import matplotlib.pyplot as plt
import numpy as np
import psutil
import time

from toto.model.toto import Toto
from toto.inference.forecaster import TotoForecaster
from toto.data.util.dataset import MaskedTimeseries

DEVICE = 'cuda' if torch.cuda.is_available() else 'cpu'
MODEL = Toto.from_pretrained("Datadog/Toto-Open-Base-1.0").to(DEVICE)
MODEL.compile()
FORECASTER = TotoForecaster(MODEL.model)

# Rolling buffer
BUFFER_SIZE = 128
METRIC_BUFFER = []

def get_system_metrics():
    """Returns [CPU, RAM, Disk] in percentages."""
    return [
        psutil.cpu_percent(),
        psutil.virtual_memory().percent,
        psutil.disk_usage('/').percent
    ]

def forecast_live_metrics(prediction_length):
    # Slider values may arrive as floats; the forecaster needs an integer
    prediction_length = int(prediction_length)

    # Maintain global buffer
    global METRIC_BUFFER
    METRIC_BUFFER = []

    # Fill initial buffer (takes about BUFFER_SIZE seconds before the first plot)
    while len(METRIC_BUFFER) < BUFFER_SIZE:
        METRIC_BUFFER.append(get_system_metrics())
        time.sleep(1)

    while True:
        # Update buffer
        METRIC_BUFFER.append(get_system_metrics())
        if len(METRIC_BUFFER) > BUFFER_SIZE:
            METRIC_BUFFER = METRIC_BUFFER[-BUFFER_SIZE:]

        # Prepare input
        arr = np.array(METRIC_BUFFER, dtype=np.float32)  # shape [T, 3]
        series = torch.tensor(arr.T, dtype=torch.float32).to(DEVICE)  # shape [3, T]
        timestamp_seconds = torch.zeros_like(series)
        # Declared as 60 s, although samples are actually taken every 1-3 seconds
        time_interval_seconds = torch.full((series.shape[0],), 60).to(DEVICE)

        inputs = MaskedTimeseries(
            series=series,
            padding_mask=torch.full_like(series, True, dtype=torch.bool),
            id_mask=torch.zeros_like(series),
            timestamp_seconds=timestamp_seconds,
            time_interval_seconds=time_interval_seconds,
        )

        forecast = FORECASTER.forecast(
            inputs,
            prediction_length=prediction_length,
            num_samples=256,
            samples_per_batch=256,
        )
        median = forecast.median[0].cpu().numpy()

        # Plot
        fig, ax = plt.subplots(figsize=(10, 4))
        for i in range(median.shape[0]):
            ax.plot(median[i], label=["CPU", "RAM", "Disk"][i])
        ax.set_title("Forecast from Real-Time System Metrics")
        ax.set_xlabel("Future Time Steps")
        ax.legend()

        yield fig
        plt.close(fig)  # release the figure so long runs do not accumulate open plots
        time.sleep(3)  # refresh forecast every 3 seconds

# Launch Gradio interface
gr.Interface(
    fn=forecast_live_metrics,
    inputs=[
        gr.Slider(10, 100, value=30, label="Prediction Length"),
    ],
    outputs=gr.Plot(label="Live Forecast Plot"),
    title="📡 Real-Time System Metrics Forecaster",
    description="Streams CPU, RAM, and Disk usage and forecasts forward using Datadog's Toto model.",
    theme="soft",
    live=True
).launch(server_name="0.0.0.0", server_port=7861)  # 7861 so it can run alongside the CSV app on 7860
What This App Does
- Streams your live system usage (CPU, RAM, Disk)
- Maintains a rolling time-series buffer
- Sends it to Datadog’s Toto-Open-Base-1.0 model
- Forecasts forward for the next N timesteps
- Shows real-time predictions as a live chart
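The rolling buffer in the list above can also be expressed with collections.deque, which discards the oldest reading automatically once it is full. This is an alternative sketch, not the code used in the script: the class name RollingBuffer and the sample readings are assumptions for illustration.

```python
from collections import deque


class RollingBuffer:
    """Fixed-size window of metric rows, exposed as one series per variable."""

    def __init__(self, size):
        # deque with maxlen drops the oldest row when a new one is appended to a full buffer
        self._rows = deque(maxlen=size)

    def append(self, row):
        self._rows.append(list(row))

    def is_full(self):
        return len(self._rows) == self._rows.maxlen

    def as_series(self):
        """Transpose [T][n_vars] rows into [n_vars][T] series, the layout the model input expects."""
        return [list(column) for column in zip(*self._rows)]


# Hypothetical readings: two variables, window of three rows
buffer = RollingBuffer(size=3)
for reading in [[1, 10], [2, 20], [3, 30], [4, 40]]:
    buffer.append(reading)

series = buffer.as_series()
print(buffer.is_full(), series)
```

The first reading has been dropped, so each series holds only the three most recent values.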
Step 8.3: Run the App
Just launch it via:
python3 gradio_realtime.py
This will expose:
Running on local URL: http://0.0.0.0:7861
Step 8.4: Open the Web App
Visit in your browser:
http://localhost:7861
http://38.29.145.28:7861
Step 8.5: Interact
- Use the slider to control how far into the future to forecast (e.g. 10–100 steps)
- Watch the live plot update on each forecast
- Observe how the model predicts your machine’s future load in real time
Conclusion: Build Your Own Cloud Time Series Forecaster
By now, you’ve built something incredible — a full GPU-powered forecasting app, capable of both:
- Upload-based forecasting (with CSV preview, tuning, and downloadable output)
- Real-time forecasting of live system metrics (CPU, RAM, Disk)
You’ve done it using:
- A NodeShift GPU VM
- A clean Python stack with Gradio + Torch + Matplotlib
- And a reliable, well-trained forecasting model — Toto by Datadog
This app goes beyond a toy demo and gives you a solid starting point for production use. You can now:
- Embed it in your dashboard
- Run weekly forecasts on logs
- Build alerts based on thresholds
- Monitor system pressure in advance
Whether you’re an engineer, researcher, or ops-minded developer, this setup turns raw data into useful, actionable insight — without massive overhead or vendor lock-in.
Now go deploy it, share the UI with your team, and predict the future — one timestep at a time.