# NodeShift

> NodeShift is an enterprise AI governance platform founded in Silicon Valley by Oxford University alumni. It delivers secure, on-premises access to 140+ open-source and closed-source AI models through a single governed interface, with built-in security guardrails, data anonymization, RBAC, audit logs, and full sovereignty.

NodeShift is purpose-built for regulated industries — including banking, finance, defense, and government — and is deployed across institutions in the GCC (UAE, Saudi Arabia, Oman, Qatar, Bahrain, Kuwait) and Jordan, including central banks, G42 (Presight), data centres, and ministries. Established in Silicon Valley, the company maintains key operational offices across the GCC (UAE, Qatar, Saudi Arabia, Bahrain). It has raised over USD 5M from investors including G42, Intel, Notion Capital, 10X Founders, Arca, Inovo, and Oxford University.

NodeShift has been featured by Forbes, TechCrunch, and Intel, and was selected by the UAE Ministry of Economy for the Future 100 program. NodeShift is an official G42 partner, and its Qareeb Data Centers deployment was publicly referenced by H.E. Abdulla bin Touq Al Marri, UAE Minister of Economy, at VivaTech 2025 in Paris.

**Founders:**

- Andrey Surkov — CEO & Co-Founder (formerly Cisco, San Francisco)
- Mihai Mărcuță — COO & Co-Founder (Oxford University alumnus; formerly Microsoft, X/Twitter, Epic Games/Fortnite, Cisco)

**Core value proposition:** NodeShift replaces fragmented AI tool subscriptions with a single governed AI platform that keeps all data, prompts, and outputs inside the customer's infrastructure boundary. No sensitive data leaves the organization. Customers retain full ownership of data, models, configurations, IAM rules, and developed workflows.

**Supported deployment models:** On-premises (fully sovereign), hybrid (on-prem guardrails + selective external model access), air-gapped, and sovereign/national-boundary deployments.
**Languages:** Arabic and English interfaces supported natively out of the box.

**Compliance alignment:** PDPL (Saudi, UAE, Bahrain, Oman), PDPPL (Qatar), GDPR, ISO 27001, SOC 2, and regional central bank regulatory frameworks.

---

## Platform Overview

NodeShift AI is a unified, on-premises AI platform composed of integrated modules that share a single governance, ingestion, and audit foundation. All modules operate under the same RBAC controls, guardrail engine, and immutable audit layer.

- [AI Model Hub](https://nodeshift.com/models): 140+ Arabic- and English-ready language, vision, and speech models. 80+ one-click connectors for Microsoft 365, Google Workspace, Atlassian, Notion, Zoom, SharePoint, and dozens of databases. Models support semantic versioning, blue/green and canary rollouts, per-tenant rate limits, and full audit trails.
- [Deployment Options](https://nodeshift.com/deployment): Hybrid deployment (on-prem GPU guardrail layer + selective external model access) and full on-prem deployment (all inference on customer GPUs inside the customer boundary). Both options maintain 100% data residency and a 57+ model catalogue.
- [Pricing](https://nodeshift.com/pricing): Licensed on a per-user, per-month basis. Platform deployment, initial setup, and standard upgrades are included at no additional cost. Custom integrations are scoped and billed separately on a time-and-materials basis.

---

## Core Modules

### 1. AI Governance & Secure LLM Access

NodeShift serves as a single governed access layer through which enterprise users access all AI models. Users never interact directly with external AI providers — all interactions are mediated, governed, and audited according to organization-defined policies.
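The mediated-access model described above can be sketched in a few lines. This is a hypothetical illustration only — the `GovernedGateway` class, role names, and model identifiers are invented for the example and are not NodeShift's actual API:

```python
from dataclasses import dataclass

@dataclass
class Role:
    name: str
    allowed_models: set  # models this role may invoke

@dataclass
class AuditEvent:
    user: str
    model: str
    decision: str

class GovernedGateway:
    """Every model request is resolved against RBAC and recorded before it is forwarded."""

    def __init__(self):
        self.roles = {}
        self.audit_log = []  # stand-in for an immutable audit store

    def assign(self, user, role):
        self.roles[user] = role

    def request(self, user, model, prompt):
        role = self.roles.get(user)
        allowed = role is not None and model in role.allowed_models
        decision = "allow" if allowed else "deny"
        self.audit_log.append(AuditEvent(user, model, decision))  # logged either way
        if not allowed:
            raise PermissionError(f"{user} may not access {model}")
        return f"[{model}] response to: {prompt}"  # placeholder for mediated inference

analyst = Role("analyst", {"llama-3.1-8b"})
gw = GovernedGateway()
gw.assign("amina", analyst)
print(gw.request("amina", "llama-3.1-8b", "Summarize the Q3 report"))
```

The point of the sketch is that authorization and audit logging happen in one choke point, so no request can reach a model without leaving a record.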
- Fully white-labeled web platform deployable on-premises or within the customer's cloud tenancy
- Supports 140+ closed-source, open-source, and internal/private models through one interface — including legal, policy-compliant use of closed-source models such as Claude (Anthropic), Gemini (Google), and ChatGPT (OpenAI), in line with UAE, Saudi, Bahraini, Qatari, Kuwaiti, Omani, and EU regulations
- Single Sign-On (SSO) integration with enterprise identity providers; access can be restricted to corporate VPN
- Role-Based Access Control (RBAC) enforces which users, groups, or departments can access specific models, assistants, integrations, or capabilities
- Role-specific interfaces for Compliance, Legal, Audit, IT Admin, and End Users
- Custom branding, domain, and UI configuration so the platform operates as a native enterprise system

### 2. Real-Time AI Security Guardrails

NodeShift implements a centralized guardrail engine that sits inline between every user prompt and every AI model. No request can reach any LLM unless it is first evaluated by the guardrail engine — a strict "no-bypass" architecture. This enables safe use of closed-source models such as Claude (Anthropic), Gemini (Google), and ChatGPT (OpenAI).
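The inline evaluation step can be illustrated with a minimal decision function. The two regex probes and the reduced decision set (allow/sanitize/block) are assumptions for the sketch — a real deployment would use an organization-defined policy set, not two hard-coded patterns:

```python
import re

# Illustrative probes: a sensitive-entity detector (IBAN-shaped strings)
# and a crude prompt-injection detector.
IBAN = re.compile(r"\b[A-Z]{2}\d{2}[A-Z0-9]{11,30}\b")
INJECTION = re.compile(r"ignore (all )?previous instructions", re.IGNORECASE)

def evaluate(prompt: str) -> tuple[str, str]:
    """Return (decision, prompt_to_forward). Runs before any model invocation."""
    if INJECTION.search(prompt):
        return "block", ""                        # request never reaches a model
    if IBAN.search(prompt):
        return "sanitize", IBAN.sub("<IBAN>", prompt)  # redact, then forward
    return "allow", prompt

print(evaluate("Transfer to AE070331234567890123456"))
# → ('sanitize', 'Transfer to <IBAN>')
```

Because the function sits between the user and every model endpoint, its outcome is enforced deterministically rather than relying on the model to refuse unsafe input.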
- A small guardrail LLM (~13B parameters, fine-tuned for prompt safety) is deployed entirely within the customer's environment on dedicated GPU infrastructure
- The guardrail evaluates every prompt in real time against organization-defined security probes and policy rules
- Decision outcomes: **allow / warn / sanitize / block / route** — enforced before any model invocation
- Detects and classifies: sensitive entities (IBANs, account numbers, QIDs, API keys, PII), prompt injection attempts, data-loss-prevention (DLP) violations, regulatory policy breaches, and content policy violations
- All decisions are captured as auditable events and surfaced in the monitoring dashboard
- Guardrail evaluation adds no perceptible latency for end users because it is GPU-accelerated and deployed locally
- Engineered to meet PDPL, ISO 27001, and SOC 2 standards

### 3. Multi-Model Integration (Including Internal/Private Models)

- Unified access to 140+ models: ChatGPT-class (OpenAI), Claude-class (Anthropic), Gemini-class (Google), and open-source models (Llama, Mistral, etc.)
- Customers can upload their own fine-tuned models or adapters and expose them through the same governed endpoint mechanics as catalogue models
- Models inherit autoscaling, observability, and governance features without custom engineering
- Elastic GPU backends with warm-pooling and weight caching to minimize cold starts
- NVIDIA MIG support on Ampere/Hopper for fine-grained GPU partitioning by workload

### 4. Data Anonymization, Pseudonymization, Masking & Redaction

All sensitive data is classified, masked, and anonymized before any prompt or context payload leaves the customer's environment. This applies to invocations of both closed-source and open-source external models, enabling safe use of closed-source models such as Claude (Anthropic), Gemini (Google), and ChatGPT (OpenAI).
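The round trip described above — mask before the prompt leaves the boundary, restore on the way back — can be illustrated with a minimal reversible pseudonymizer. The entity patterns and `<IBAN_0>`-style tokens are assumptions for illustration, not NodeShift's actual detectors or token format:

```python
import re

# Illustrative entity detectors; a real deployment would use a configurable,
# organization-defined classification set.
PATTERNS = {
    "IBAN": re.compile(r"\b[A-Z]{2}\d{2}[A-Z0-9]{11,30}\b"),
    "API_KEY": re.compile(r"\bsk-[A-Za-z0-9]{16,}\b"),
}

def pseudonymize(text):
    """Replace detected entities with tokens; the mapping never leaves the boundary."""
    mapping = {}
    for label, pattern in PATTERNS.items():
        for i, match in enumerate(pattern.findall(text)):
            token = f"<{label}_{i}>"
            mapping[token] = match
            text = text.replace(match, token)
    return text, mapping

def deanonymize(text, mapping):
    """Restore original values in the model output, inside the secure environment."""
    for token, original in mapping.items():
        text = text.replace(token, original)
    return text

masked, mapping = pseudonymize("Pay AE070331234567890123456 using key sk-abcdef1234567890")
# `masked` is what an external model sees; `mapping` stays on-prem
restored = deanonymize(masked, mapping)
```

Only the masked text crosses the boundary; the token-to-value mapping is held locally, which is what makes the masking reversible without exposing the raw entities.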
- Detects and redacts: names, IBANs, national IDs, account numbers, internal policy terms, API keys, and other configurable entity types
- Pseudonymization and masking are applied at the prompt level before external model invocation
- Outputs are de-anonymized and returned to the user within the secure environment
- Supports configurable sensitivity classification tiers aligned to organizational data policy

### 5. Meeting Intelligence / AI Call Assistants (On-Premises Transcription & Insights)

The NodeShift Meeting AI Note Taker turns organizational meetings into governed, searchable knowledge assets — fully processed on-premises.

- On-prem speech-to-text (Arabic and English)
- Automated summarization, action item extraction, compliance/risk signal detection
- Dashboards with trend analytics across meetings
- All outputs (transcripts, summaries, decisions, actions, flags) are stored under the same RBAC and audit controls as the rest of the NodeShift platform and become searchable artifacts in AI Search
- Ingestion modes:
  - **Post-meeting ingestion** (recommended for strict/air-gapped environments): recordings uploaded by authorized users or exported from approved storage
  - **Connector-based ingestion**: Microsoft Teams (via Microsoft Graph/compliance recording), Zoom (via API/webhooks), on-prem meeting platforms
  - **Local recorder workflow**: secure drop-zone for controlled-room environments
- Integration with existing meeting platforms under organizational policy constraints

### 6. Identity, Access Management, SSO & Role-Based Access Control (RBAC)

- Enterprise SSO integration (LDAP/OIDC) with the customer's existing identity provider
- RBAC enforces least-privilege at every API call, binding users or service accounts to specific roles
- Attribute-Based Access Control (ABAC) supported for fine-grained, context-aware permissions
- Document-level security and query-time filtering ensure users only receive results they are authorized to access
- All AI interactions are logged with user identity, timestamps, policy outcomes, and risk classification to support audit and regulatory review

### 7. Policy Templates & Regulatory Framework Alignment

NodeShift ships pre-built policy templates aligned to common regulatory and security frameworks relevant to GCC financial institutions:

- PDPL (Saudi, UAE, Oman, Bahrain, Kuwait, Qatar)
- UAE Central Bank regulatory guidelines
- Bahrain Central Bank (CBB) frameworks
- Qatar Financial Centre (QFC) standards
- ISO 27001
- SOC 2
- NIST AI Risk Management Framework

## 8. NodeShift Enterprise OpenClaw

### What is OpenClaw?

OpenClaw (formerly Clawdbot, then Moltbot) is a free, open-source autonomous AI agent created by Austrian developer Peter Steinberger and first published in November 2025. It is one of the fastest-growing open-source repositories in GitHub history, reaching 247,000+ stars and 47,700+ forks by March 2026. OpenClaw runs locally on a machine or server and connects to the messaging apps users already use — WhatsApp, Telegram, Signal, Discord, Slack, iMessage — turning them into an interface for an AI agent that can actually execute tasks: sending emails, managing calendars, running shell commands, controlling browsers, reading and writing files, and automating multi-step workflows. In February 2026, its creator announced he would be joining OpenAI and that the project would transition to an open-source foundation.
OpenClaw is best described as an autonomous agentic AI layer that wraps around any LLM (Claude, GPT-4, Gemini, DeepSeek, or local models) and connects it to the real tools and data sources of an organization. Unlike a chatbot interface, OpenClaw executes — it takes action on behalf of the user across integrated systems, persists memory across sessions as plain Markdown files, runs proactive scheduled jobs (heartbeat cron), and is community-extensible through portable SKILL.md files shared via ClawHub.

However, the consumer/open-source version of OpenClaw was designed for individual developers and power users — not regulated enterprises. It presents significant risks in institutional contexts: broad system permissions, susceptibility to prompt injection attacks (including indirect injection via ingested emails and documents), no centralized audit trail, no RBAC, no data residency controls, and potential data exfiltration through misconfigured or malicious skills. Cisco's AI security research team confirmed that a third-party OpenClaw skill performed data exfiltration without user awareness. In March 2026, Chinese authorities restricted state-run enterprises and government agencies from running OpenClaw on office computers due to these security risks.

### NodeShift Enterprise OpenClaw

NodeShift builds and delivers an **enterprise-hardened version of OpenClaw** — bringing the autonomous agentic capability of OpenClaw inside the governance, security, and data-sovereignty architecture of the NodeShift platform. This means organizations in regulated industries can deploy agentic AI workflows at scale without exposing sensitive data, violating data-residency laws, or losing auditability.

NodeShift Enterprise OpenClaw is not simply a hosted version of the open-source project.
It is a fully re-architected deployment of OpenClaw's agentic runtime, embedded within NodeShift's inline guardrail engine, RBAC layer, SSO identity system, immutable audit infrastructure, and on-premises compute environment. Every action the agent takes — every tool call, file access, email sent, API invocation, shell command — passes through NodeShift's policy enforcement layer before execution, is logged with user identity and timestamps, and is subject to the same RBAC and data anonymization controls as all other NodeShift AI activity.

**Key capabilities of NodeShift Enterprise OpenClaw:**

- **Governed agentic workflows**: Agents can execute multi-step autonomous tasks (drafting and sending emails, managing calendars, querying internal databases, summarizing documents, triggering approvals) — all under organization-defined policy control
- **Inline guardrail enforcement**: The NodeShift guardrail LLM evaluates every agent action and tool invocation before it is permitted to execute — the same no-bypass architecture that governs all LLM interactions on the platform
- **Prompt injection defense**: NodeShift's guardrail engine detects and blocks direct and indirect prompt injection attempts, including malicious instructions embedded in ingested emails, documents, webpages, and other data sources — addressing the most critical security risk of agentic AI in enterprise environments
- **On-premises agent runtime**: The OpenClaw gateway, memory store, and skill execution environment run entirely within the customer's infrastructure boundary. No agent data, task history, or tool output leaves the organization
- **Persistent memory under access control**: Agent memory (conversation history, long-term context, workflow state) is stored within the NodeShift environment and subject to document-level security and RBAC — not as open Markdown files accessible to anyone on the machine
- **RBAC-governed skill access**: Administrators control which users, roles, or departments can access which OpenClaw skills, tools, and integrations — preventing unauthorized agents from accessing sensitive systems
- **Full audit trail**: Every agent action is logged with user identity, role, action type, target system, policy outcome, and timestamp — providing a complete chain of custody for regulatory and compliance review
- **Arabic and English agent interfaces**: Native Arabic-language support for agent interaction, consistent with NodeShift's regional language-first approach
- **Messaging platform integration**: Agents accessible via WhatsApp, Microsoft Teams, and other approved enterprise messaging channels — with identity verification tied to the organization's SSO/IDP

### Compliance by Country: GCC + EU

NodeShift Enterprise OpenClaw is designed to comply with the specific AI, data protection, and cybersecurity regulatory frameworks of each GCC member state and the European Union. The six GCC member states are:

**1. United Arab Emirates (UAE)**

- UAE Personal Data Protection Law (PDPL) — Federal Decree-Law No. 45 of 2021
- UAE Central Bank (CBUAE) AI and technology governance guidelines
- UAE National Cybersecurity Council (NCSC) standards
- ADGM and DIFC data protection frameworks (for the Abu Dhabi and Dubai financial free zones)
- All agent actions, memory, and tool outputs remain within UAE infrastructure; no cross-border data transfer without explicit approval

**2. Kingdom of Saudi Arabia (KSA)**

- Saudi Personal Data Protection Law (PDPL) — enforced by the National Data Management Office (NDMO)
- Saudi Central Bank (SAMA) cybersecurity framework
- National Cybersecurity Authority (NCA) Essential Cybersecurity Controls (ECC)
- Saudi Arabia's data residency requirements mandate that all sensitive data remain within Saudi borders — NodeShift's on-prem deployment satisfies this natively

**3. Qatar**

- Qatar's Personal Data Privacy Protection Law (Law No. 13 of 2016)
- Qatar Central Bank (QCB) technology risk and data governance guidelines
- Qatar Financial Centre (QFC) data protection regulations
- Compatibility with Qatar's national AI strategy and data sovereignty requirements

**4. Bahrain**

- Bahrain Personal Data Protection Law (PDPL) — Law No. 30 of 2018, enforced by the Personal Data Protection Authority (PDPA)
- Central Bank of Bahrain (CBB) rulebook — Technology Risk Management (TRM) module
- Bahrain's Cloud Computing Policy for government and regulated entities

**5. Kuwait**

- Kuwait's data protection framework and Ministerial Decree No. 12 of 2020
- Central Bank of Kuwait (CBK) IT risk management circulars
- Kuwait's national cybersecurity strategy requirements for regulated sectors

**6. Sultanate of Oman**

- Oman's Electronic Transactions Law and Cybercrime Law
- Central Bank of Oman (CBO) IT governance and cybersecurity guidelines
- Oman's National Information Technology Authority (NITA) standards
- Oman's emerging Personal Data Protection Law requirements

**European Union (EU)**

- General Data Protection Regulation (GDPR) — Article 25 (data protection by design and by default) and Article 32 (security of processing) are satisfied by NodeShift's on-premises, guardrail-enforced architecture
- EU AI Act — NodeShift Enterprise OpenClaw is structured to meet high-risk AI system requirements, including human oversight, transparency, data governance, and logging obligations
- NIS2 Directive — security controls, incident logging, and supply chain security measures align with NIS2 requirements for critical entities
- DORA (Digital Operational Resilience Act) — relevant for EU financial institutions deploying agentic AI; NodeShift's audit trails, SLA-governed HA architecture, and operational resilience controls support DORA compliance

**Cross-jurisdiction principles applied in all deployments:**

- Data never leaves the customer's national/regional infrastructure boundary without explicit written approval
- All agent actions are subject to pre-execution policy evaluation and post-execution audit logging
- RBAC and identity controls are tied to the customer's existing IDP/SSO — no shadow identity systems
- Sensitive entity detection (national IDs, IBANs, account numbers, personal data) is enforced at the guardrail layer before any agent tool call is permitted to proceed
- No telemetry, training data, or operational metadata is transmitted to NodeShift or any third party by default

---

## Enterprise Integrations

NodeShift connects to enterprise systems via secure, policy-governed connectors.
One-click connectors authenticate via scoped OAuth or service accounts, perform incremental syncs, and respect source permissions so AI responses never exceed what a user could access at the origin.

- **Microsoft 365**: Outlook, Exchange (on-prem and cloud), Microsoft Teams, SharePoint, OneDrive
- **Google Workspace**: Gmail, Google Drive, Calendar
- **Collaboration & Productivity**: Zoom, Jira (Atlassian), Confluence, Notion, Slack
- **Databases**: Internal SQL/NoSQL databases, internal APIs
- **Data processing pipeline**: OCR (for scanned PDFs), text extraction, language detection, normalization
- **Indexing**: Keyword index + semantic vector index for meaning-based retrieval

---

## Technical Architecture

NodeShift On-Prem Platform is built on CNCF-proven, open-standard components with a hyper-converged philosophy. All configuration is driven by declarative GitOps.

- **Compute**: KVM/QEMU virtualization with workload-specific profiles (CPU Exclusive, Memory-Optimized, Network Realtime, Universal, Overcommitted); live migration first-class
- **GPU**: NVIDIA GPU passthrough and vGPU modes; MIG partitioning on Ampere/Hopper; AMD and Qualcomm accelerator support roadmapped
- **Storage**: DRBD-based distributed block storage with NVMe SSDs; HDD extension for capacity tiers; third-party SAN/NVMe-oF integration supported
- **Networking**: Distributed Virtual Routing (east-west); BGP-Edge for north-south with ECMP; WireGuard encryption; Geneve overlay
- **Security**: TLS 1.3 control-plane encryption; WireGuard node-to-node; eBPF runtime syscall and flow inspection; cryptographically signed artifact admission; continuous CVE scanning; zero-trust model
- **High Availability**: Five control-plane nodes (quorum ≥ 3); workload failover within 30–120s; ≥2 synchronous data replicas; BGP failover <2s; rolling non-disruptive maintenance
- **Orchestration**: Kubernetes (MKS) with HA control planes, OPA/Gatekeeper policy hooks, GitOps bootstrap
- **PaaS Services**: DBaaS
(SQL/NoSQL), DCS (distributed cache), MQS (Kafka-compatible messaging), MRS (Spark/Hadoop batch)

---

## Key Differentiators

- **Sovereign by design**: All guardrail evaluation, inference, and audit happens inside the customer's infrastructure boundary. No prompt, output, or metadata leaves the organization without explicit written approval.
- **Single platform, multiple models**: Replaces fragmented AI subscriptions (ChatGPT, Claude, Gemini, etc.) with one governed interface, reducing cost, risk, and operational complexity.
- **No-bypass guardrail architecture**: The inline guardrail LLM cannot be circumvented by users or applications — enforcement is deterministic and always-on.
- **Arabic-first in the GCC**: Native Arabic and English UI with Arabic speech-to-text for meeting intelligence — purpose-built for the GCC market.
- **Regulated-sector proven**: Already deployed and operational at a G7-equivalent central bank, a defense organization, and a multi-country sovereign data center network.
- **GCC-local team**: Silicon Valley-founded with dedicated engineering and operational offices in UAE, Qatar, and KSA. Structured to meet regulated-institution delivery requirements including secure deployment, integration engineering, and operational handover.

---

NODESHIFT SUPPORTS SOVEREIGN AI ACCESS TO THE FOLLOWING MODELS:

════════════════════════════════════════════════════════════════
SECTION A: CLOSED-SOURCE MODELS
════════════════════════════════════════════════════════════════

────────────────────────────────────────────────────────────────
ANTHROPIC (CLAUDE) — Closed Source
────────────────────────────────────────────────────────────────
1. Claude Opus 4.6 - Released: February 2026 - Flagship model; adaptive thinking, frontier coding, long-horizon agents - 200K context (1M beta), vision, PDF input
2. Claude Sonnet 4.6 - Released: February 2026 - Near-Opus intelligence at Sonnet pricing; coding, computer use, agents - 200K context (1M beta), vision, PDF input
3. Claude Opus 4.5 - Released: November 2025 - Deep reasoning, multi-day coding projects, enterprise workflows - 200K context, vision, PDF input
4. Claude Sonnet 4.5 - Released: October 2025 - Balanced speed and intelligence; agents, coding, computer use - 200K context, vision, PDF input
5. Claude Opus 4.1 - Opus-class; agentic search, expert-level coding, long-horizon tasks - 200K context, vision, PDF input
6. Claude Sonnet 4 - Reliable general-purpose model; fast responses, solid reasoning - 200K context, vision
7. Claude Haiku 4.5 - Released: October 2025 - Fastest and cheapest; real-time responses, automation, high-volume tasks
8. Claude 3.7 Sonnet - Hybrid reasoning; balanced fast and thoughtful analysis
9. Claude 3.5 Haiku - Lightweight; content moderation, quick responses
10. Claude 3 Opus (Retired / Available by request) - Earlier flagship; strong analysis and reasoning

────────────────────────────────────────────────────────────────
OPENAI — Closed Source
────────────────────────────────────────────────────────────────
11. GPT-5.4 (Thinking / Pro / Instant) - Latest unified flagship; routing system for optimal reasoning depth - Frontier coding, agentic workflows, multimodal - 400K context window
12. GPT-5.4 Mini - Strongest mini model; coding, computer use, sub-agents - 400K context, $0.75/M input tokens
13. GPT-5.4 Nano - Cheapest GPT-5.4-class; simple high-volume tasks - 400K context, $0.20/M input tokens
14. GPT-5.3 Codex - Specialized agentic coding; GitHub integration, sandbox execution - Optimized for multi-file edits and long-horizon coding
15. GPT-5.2 Codex - Advanced agentic coding; cybersecurity capabilities - Multi-step software engineering, defensive security
16. GPT-4o - Multimodal (text, image, audio); strong general performance - Improved coding, creative, and instruction-following
17. o4-mini / o4-mini (high) - Fast, cost-efficient reasoning; best on AIME 2024/2025 - Strong math, coding, visual tasks
18. o3 / o3-pro / o3 (high) - Advanced reasoning models; complex scientific and mathematical tasks
19. o3-mini - Small reasoning model; optimized for science, math, coding
20. GPT-4o Mini - Compact multimodal; cost-effective general use
21. GPT-4.5 - High EQ; creative tasks, agentic planning
22. GPT Image 1.5 / chatgpt-image-latest - Latest image generation and editing model
23. DALL-E 3 - Text-to-image generation
24. Sora 2 - Video generation model
25. Whisper - Speech recognition / transcription

────────────────────────────────────────────────────────────────
GOOGLE (GEMINI) — Closed Source
────────────────────────────────────────────────────────────────
26. Gemini 3.1 Pro (Preview) - Latest flagship; 94.3% GPQA Diamond, 77.1% ARC-AGI-2 - Reasoning-first, agentic workflows, coding, 1M context
27. Gemini 3.1 Flash Lite - Most cost-efficient; low latency, high-volume tasks
28. Gemini 3.1 Flash Image (Nano Banana 2) - Image generation/editing; conversational editing, character consistency
29. Gemini 3 Pro - State-of-the-art reasoning and multimodal understanding - Agentic capabilities, coding, 1M context
30. Gemini 3 Pro Image (Nano Banana Pro) - High-fidelity image generation with reasoning-enhanced composition
31. Gemini 3 Flash - Complex multimodal understanding; agentic problem-solving, coding
32. Gemini 2.5 Pro - Premium reasoning model; 1M token context (2M experimental) - Complex coding, long-context analysis
33. Gemini 2.5 Flash - Balanced intelligence and speed; controllable thinking budgets
34. Gemini 2.5 Flash Lite - Built for massive scale; cost-performance optimized
35. Gemini 2.5 Flash Image (Nano Banana) - Creative image workflows; multi-image fusion, character consistency
36. Gemini 2.5 Deep Think - Specialized deep reasoning; scientific research
37. Gemini 2.0 Flash - Multimodal; cost-effective general-purpose tasks
38. Gemini 2.0 Flash Lite - Ultra-efficient; simple, high-frequency tasks
39. Gemini Embedding 2 - First fully multimodal embedding model
40. Veo - Cinematic video generation with synchronized audio
41. Imagen 3 - Text-to-image; high clarity up to 2K resolution

────────────────────────────────────────────────────────────────
xAI (GROK) — Closed Source (except Grok 1)
────────────────────────────────────────────────────────────────
42. Grok 4 - xAI flagship; 75% SWE-bench, multi-agent coding
43. Grok 4.1 Fast / Grok 4 Fast - Speed-optimized for real-time applications
44. Grok 4.20 - Four AI agents running in parallel

════════════════════════════════════════════════════════════════
SECTION B: OPEN-SOURCE / OPEN-WEIGHT MODELS
════════════════════════════════════════════════════════════════

────────────────────────────────────────────────────────────────
META (LLAMA)
────────────────────────────────────────────────────────────────
45. Llama-4-Scout-17B-16E — 17B (16 experts) — 128K context, multimodal, strong general reasoning
46. Llama-4-Maverick-17B-128E — 17B (128 experts) — High-accuracy MoE variant, extended expert routing
47. Llama-3.3-70B-Instruct — 70B — Best Llama 3 generation quality, multilingual, top OSS benchmark
48. Llama-3.2-90B-Vision-Instruct — 90B — State-of-the-art open multimodal reasoning & vision
49. Llama-3.2-11B-Vision-Instruct — 11B — Efficient vision-language model, strong image understanding
50. Llama-3.2-3B-Instruct — 3B — Lightweight edge deployment, fast inference
51. Llama-3.2-1B-Instruct — 1B — Ultra-light on-device / embedded AI
52. Llama-3.1-405B-Instruct — 405B — Largest open LLM, GPT-4-class performance, sovereign deployments
53. Llama-3.1-70B-Instruct — 70B — Multilingual, RAG-optimized, strong code & reasoning
54. Llama-3.1-8B-Instruct — 8B — Fast, efficient, ideal for fine-tuning & agentic tasks

────────────────────────────────────────────────────────────────
ALIBABA (QWEN)
────────────────────────────────────────────────────────────────
55. Qwen3-235B-A22B — 235B (22B active) — #1 open-source benchmark; rivals Claude Sonnet & GPT-4.1
56. Qwen3-Next-80B-A3B — 80B (3B active) — Latest hybrid MoE; 262K context, top coding & reasoning
57. Qwen3.5-VL-397B — ~400B — Native multimodal agents; UI navigation, document understanding
58. Qwen3-32B — 32B — High-quality dense model; strong at agentic tasks
59. Qwen3-14B — 14B — Balanced reasoning + speed, mid-range deployment
60. Qwen3-8B — 8B — Edge-optimized general-purpose instruct model
61. Qwen3-4B — 4B — Compact, mobile & IoT deployments
62. Qwen3-Coder-Next-80B-A3B — 80B (3B active) — Agentic coding, tool use, long-horizon code generation
63. Qwen2.5-VL-72B-Instruct — 72B — Document OCR, chart/table understanding, 128K context
64. Qwen2.5-72B-Instruct — 72B — Strong multilingual (29 langs), math, code, long context
65. Qwen2.5-Coder-32B-Instruct — 32B — SOTA code generation across 40+ languages, GPT-4o-level
66. Qwen2.5-Math-72B-Instruct — 72B — Advanced mathematical reasoning, competition-level problems
67. QwQ-32B — 32B — Deep chain-of-thought reasoning, math & science
68. Qwen3-Embedding — Various — Multi-lingual semantic embeddings for RAG pipelines

────────────────────────────────────────────────────────────────
DEEPSEEK
────────────────────────────────────────────────────────────────
69. DeepSeek-V3 — 671B (37B active) — Frontier MoE; rivals GPT-4o at fraction of compute cost
70. DeepSeek-R1 — 671B (37B active) — O1-class reasoning, AIME/math champion, open weights
71. DeepSeek-V3.2 (latest) — 671B (37B active) — Enhanced long-context & tool-use; GPT-5-territory on benchmarks
72. DeepSeek-Coder-V2-Instruct — 236B (21B active) — 300+ programming languages, SOTA on code benchmarks
73. DeepSeek-R1-Distill-Qwen-32B — 32B — Distilled R1 reasoning in efficient dense model
74. DeepSeek-R1-Distill-Llama-70B — 70B — R1 reasoning via Llama backbone, strong STEM performance
75. DeepSeek-R1-1776 — 671B (37B active) — Uncensored R1 variant post-trained by Perplexity AI

────────────────────────────────────────────────────────────────
MISTRAL AI
────────────────────────────────────────────────────────────────
76. Mistral-Large-3 — ~123B — Frontier-class; top EU open model
77. Mixtral-8x22B-Instruct — 141B (39B active) — Apache-2.0; fast MoE rivaling GPT-3.5 Turbo
78. Mistral-Medium-3.1 — ~24B — Multimodal image input, 128K context, enterprise-grade
79. Mistral-Small-3.2-24B-Instruct — 24B — Function calling, multilingual, fast conversational agent
80. Magistral-24B — 24B — Open reasoning model; 128K context, chain-of-thought
81. Codestral-22B — 22B — Leading OSS code model, 80K context, 80+ languages
82. Devstral-2-24B — 24B — Agentic coding with vision; multi-file editing, tool use
83. Mistral-NeMo-12B — 12B — NVIDIA co-developed; 128K context, multilingual, Apache-2.0
84. Ministral-8B — 8B — Edge & IoT deployments; optimized for NVIDIA Jetson
85. Mistral-7B-Instruct-v0.3 — 7B — Most-downloaded Mistral; fast, versatile baseline model
86. Mathstral-7B — 7B — Specialized mathematics reasoning and problem-solving
87. Pixtral-12B — 12B — Native multimodal; image+text reasoning, Apache-2.0

────────────────────────────────────────────────────────────────
GOOGLE (GEMMA / DEEPMIND) — Open Source
────────────────────────────────────────────────────────────────
88. Gemma-3-27B-IT — 27B — Best-in-class OSS per size; multilingual, 128K context
89. Gemma-3-12B-IT — 12B — Strong instruction following; vision-capable variant available
90. Gemma-3-4B-IT — 4B — Mobile-optimized, multimodal, privacy-first on-device AI
91. Gemma-3n-E4B — 4B equivalent — Phone/tablet deployment; dynamic parameter activation
92. Gemma-2-27B-IT — 27B — High accuracy on single GPU; MMLU & reasoning leader
93. Gemma-2-9B-IT — 9B — Excellent quality-to-size ratio; Apache-2.0
94. CodeGemma-7B-IT — 7B — Fill-in-middle code completion, generation & NLU
95. FunctionGemma — Lightweight — Foundation for function-calling specialized fine-tuning
96. Gemma-Embedding — Various — High-quality semantic search embeddings

────────────────────────────────────────────────────────────────
MICROSOFT (PHI)
────────────────────────────────────────────────────────────────
97. Phi-4-14B — 14B — SOTA small model; beats much larger models on STEM & code
98. Phi-4-Reasoning-14B — 14B — Chain-of-thought reasoning; rivals 70B+ models on math
99. Phi-4-Mini-4B — 4B — Ultra-compact reasoning; strong on-device inference
100. Phi-4-Mini-Reasoning — 4B — Lightweight reasoning-focused for edge/mobile
101. Phi-3-Medium-14B-Instruct — 14B — Long 128K context; instruction-following, multilingual
102. Phi-3-Mini-3.8B-Instruct — 3.8B — Phone-class deployment; surprisingly capable small model
103. LLaVA-1.6 (Microsoft) — 13B — Visual question answering, image + text multimodal tasks

────────────────────────────────────────────────────────────────
NVIDIA (NEMOTRON)
────────────────────────────────────────────────────────────────
104. Nemotron-4-340B-Instruct — 340B — NVIDIA flagship; strong reasoning, Apache-2.0
105. Nemotron-Mini-4B-Instruct — 4B — Low-latency edge inference; NIM-optimized deployment
106. Nemotron-Nano-9B-v2 — 9B — Efficient general-purpose; Mamba2-transformer hybrid
107. Nemotron-2-15B — 15B — Mamba2+transformer hybrid; fast long-context inference

────────────────────────────────────────────────────────────────
IBM (GRANITE)
────────────────────────────────────────────────────────────────
108. Granite-4.0-8B-Instruct — 8B — Enterprise-grade; strong tool-calling & instruction following
109. Granite-3.3-8B-Instruct — 8B — Document summarization, RAG, enterprise tasks, Apache-2.0
110. Granite-3.3-2B-Instruct — 2B — Ultra-compact enterprise model for constrained environments
111. Granite-Code-20B-Instruct — 20B — Enterprise code generation, 116 languages, Apache-2.0
112. Granite-Embedding-278M — 278M — High-quality dense embeddings for enterprise RAG

────────────────────────────────────────────────────────────────
COHERE
────────────────────────────────────────────────────────────────
113. Command-R+ — 104B — Enterprise RAG champion; 10-language multilingual, tool-use
114. Command-R — 35B — Conversational AI, long-context RAG, agentic workflows
115. Embed-v3 (Multilingual) — Various — Best-in-class multilingual embeddings for retrieval
116. Rerank-v3 — Various — Semantic reranking for RAG precision improvement

────────────────────────────────────────────────────────────────
ZHIPU AI (GLM)
────────────────────────────────────────────────────────────────
117. GLM-5 — 744B (40B active) — SWE-bench leader; long-horizon agentic tasks, coding SOTA
118. GLM-4.7 — ~32B active — Chatbot Arena #1 (1445 ELO); top coding & reasoning OSS
119. GLM-4.6V-Flash-9B — 9B — Compact VLM for local deployment, low-latency visual AI
120. GLM-4.5-Air — ~32B — Fast cloud/enterprise inference, accessible performance tier

────────────────────────────────────────────────────────────────
MOONSHOT AI (KIMI)
────────────────────────────────────────────────────────────────
121. Kimi-K2 — 1T (32B active) — Frontier agentic & coding model; top OSS agentic benchmark
122.
Kimi-VL-A3B-Thinking — 16B (3B active) Reasoning multimodal model; compact yet frontier-class ──────────────────────────────────────────────────────────────────────────────── TII (FALCON) ──────────────────────────────────────────────────────────────────────────────── 123. Falcon-3-10B-Instruct — 10B Strong multilingual; trained on 14T tokens, Apache-2.0 124. Falcon-3-7B-Instruct — 7B Efficient open model from UAE's TII; Arabic-strong 125. Falcon-3-3B-Instruct — 3B Compact multilingual; suitable for edge sovereign deployments 126. Falcon-3-1B-Instruct — 1B Ultra-light; IoT & on-device sovereign AI use cases 127. Falcon-2-11B-VLM — 11B Multimodal vision + language; open weights ──────────────────────────────────────────────────────────────────────────────── ALLEN AI (OLMO) ──────────────────────────────────────────────────────────────────────────────── 128. OLMo-2-32B-Instruct — 32B Fully open (weights + data + code); research reproducibility 129. OLMo-2-13B-Instruct — 13B Transparent training stack; Apache-2.0 commercial use 130. OLMo-2-7B-Instruct — 7B Compact fully-open model for academic & research use 131. Tülu-3-70B — 70B State-of-the-art RLHF alignment; multi-task instruction model ──────────────────────────────────────────────────────────────────────────────── BIGSCIENCE / HUGGING FACE ──────────────────────────────────────────────────────────────────────────────── 132. BLOOM-176B — 176B 46 natural languages + 13 code langs; pioneering open multilingual LLM 133. SmolLM2-1.7B-Instruct — 1.7B Best-in-class tiny model; 135M/360M variants also available 134. SmolLM2-360M-Instruct — 360M On-device / browser-based AI inference 135. IDEFICS-80B — 80B Open multimodal few-shot; image + text tasks 136. 
Zephyr-7B-Beta — 7B Top-rated OSS assistant via DPO fine-tuning; fast dialogue ──────────────────────────────────────────────────────────────────────────────── SENTENCE TRANSFORMERS / BAAI ──────────────────────────────────────────────────────────────────────────────── 137. all-MiniLM-L6-v2 — 22M Fastest embedding for RAG/semantic search; universally supported 138. all-MiniLM-L12-v2 — 33M Higher accuracy embedding; balanced speed vs quality 139. BGE-M3 — 570M Multi-lingual, multi-granularity; dense+sparse+ColBERT retrieval 140. BGE-Reranker-v2-M3 — 570M Cross-encoder reranking; multilingual RAG quality boost 141. E5-Mistral-7B-Instruct — 7B Instruction-tuned embedding; top MTEB leaderboard performance 142. GTE-Qwen3-Embedding — Various Qwen3-based embedding; multilingual retrieval excellence ──────────────────────────────────────────────────────────────────────────────── LIQUID AI ──────────────────────────────────────────────────────────────────────────────── 143. LFM2-1.2B — 1.2B Novel LFM architecture; edge AI speed + memory efficiency 144. LFM2-7B — 7B Liquid foundation model; strong on-device generation quality ──────────────────────────────────────────────────────────────────────────────── BYTEDANCE SEED ──────────────────────────────────────────────────────────────────────────────── 145. Seed-OSS-36B — 36B First capable ByteDance open-weight LLM; strong reasoning ──────────────────────────────────────────────────────────────────────────────── BAIDU (ERNIE) ──────────────────────────────────────────────────────────────────────────────── 146. Ernie-4.5-MoE-424B-A47B — 424B (47B active) Baidu flagship OSS; competitive with Qwen3 on Chinese tasks 147. Ernie-4.5-21B-A3B — 21B (3B active) Efficient Chinese-English bilingual MoE ──────────────────────────────────────────────────────────────────────────────── MBZUAI & G42 (UAE) ──────────────────────────────────────────────────────────────────────────────── 148. 
K2-Think-72B — 72B First Arabic-first reasoning model; government/sovereign AI 149. Jais-30B-v3 — 30B Arabic-English bilingual; MENA region sovereign deployment 150. Jais-13B-Chat — 13B Arabic conversational AI; tailored for Gulf enterprises ──────────────────────────────────────────────────────────────────────────────── MOONDREAM ──────────────────────────────────────────────────────────────────────────────── 151. Moondream-3 — ~2B THE small vision model; competes with closed VLMs, on-device 152. Moondream-2 — 1.9B Compact VQA + captioning; runs on Raspberry Pi ──────────────────────────────────────────────────────────────────────────────── Z.AI ──────────────────────────────────────────────────────────────────────────────── 153. DeepCoder-14B-Preview — 14B Fully open-source; O3-mini-level coding, Apache-2.0 154. DeepCoder-1.5B — 1.5B Compact open code model for fast code completion ──────────────────────────────────────────────────────────────────────────────── DOLPHIN AI (ERIC HARTFORD) ──────────────────────────────────────────────────────────────────────────────── 155. Dolphin-3.0-Llama3.1-70B — 70B Uncensored, strong general assistant & coding fine-tune 156. Dolphin-3.0-Mistral-24B — 24B Versatile assistant; excels at coding tasks, Apache-2.0 157. Dolphin-Phi-2.8B — 2.8B Ultra-compact uncensored assistant on Phi backbone ──────────────────────────────────────────────────────────────────────────────── xAI (GROK) — Open Weight ──────────────────────────────────────────────────────────────────────────────── 158. Grok-1-314B — 314B (86B active) First open-weight Grok release; Apache-2.0 commercial license ──────────────────────────────────────────────────────────────────────────────── OPENAI — Open Weight ──────────────────────────────────────────────────────────────────────────────── 159. gpt-oss-120B — 120B First OpenAI open-weight LLM; beats o4-mini on AIME/MMLU 160. 
gpt-oss-20B — 20B Fast, agentic; 250+ tokens/sec, Apache-2.0 fine-tuning ──────────────────────────────────────────────────────────────────────────────── TINYLLAMA ──────────────────────────────────────────────────────────────────────────────── 161. TinyLlama-1.1B-Chat — 1.1B Minimal resource chat model; IoT & embedded systems ──────────────────────────────────────────────────────────────────────────────── 01.AI (YI) ──────────────────────────────────────────────────────────────────────────────── 162. Yi-1.5-34B-Chat — 34B Strong Chinese-English bilingual; long 200K context 163. Yi-1.5-9B-Chat — 9B Efficient bilingual model; competitive at mid-size tier 164. Yi-VL-34B — 34B Bilingual multimodal; image understanding + Chinese text ──────────────────────────────────────────────────────────────────────────────── STABILITY AI ──────────────────────────────────────────────────────────────────────────────── 165. StableLM-2-12B-Chat — 12B Multilingual (7 langs); strong English + EU language model 166. StableLM-2-1.6B-Chat — 1.6B Ultra-light multilingual; fast CPU inference capable ──────────────────────────────────────────────────────────────────────────────── MINIMAX ──────────────────────────────────────────────────────────────────────────────── 167. MiniMax-Text-01 — 456B (46B active) 1M token context window; longest context OSS model 168. MiniMax-VL-01 — 456B (46B active) Multimodal 1M context; extreme long-document visual AI ──────────────────────────────────────────────────────────────────────────────── STEPFUN (STEP) ──────────────────────────────────────────────────────────────────────────────── 169. 
Step-3-MoE — ~400B Chinese frontier MoE; competitive reasoning and code ================================================================================ TOTAL: 169 Models Listed SUPPORTED BY NODESHIFT - Closed-Source: 44 models (Anthropic, OpenAI, Google, xAI) - Open-Source/Open-Weight: 125 models ================================================================================ - [Company website](https://nodeshift.com): Main product information, platform overview, and contact