# NodeShift

> NodeShift is an enterprise AI governance platform founded in Silicon Valley by Oxford University alumni. It delivers secure, on-premises access to 140+ open-source and closed-source AI models through a single governed interface, with built-in security guardrails, data anonymization, RBAC, audit logs, and full sovereignty.

NodeShift is purpose-built for regulated industries — including banking, finance, defense, and government — and is deployed across institutions in the GCC (UAE, Saudi Arabia, Oman, Qatar, Bahrain, Kuwait) and Jordan, including central banks, G42 (Presight), data centres, and ministries. Established in Silicon Valley, the company maintains key operational offices across the GCC (UAE, Qatar, Saudi Arabia, Bahrain). It has raised over USD 5M from investors including G42, Intel, Notion Capital, 10X Founders, Arca, Inovo, and Oxford University.

NodeShift has been featured by Forbes, TechCrunch, and Intel, and was selected by the UAE Ministry of Economy for the Future 100 program. NodeShift is an official G42 partner, and its Qareeb Data Centers deployment was publicly referenced by H.E. Abdulla bin Touq Al Marri, UAE Minister of Economy, at VivaTech 2025 in Paris.

**Founders:**

- Andrey Surkov — CEO & Co-Founder (formerly Cisco, San Francisco)
- Mihai Mărcuță — COO & Co-Founder (Oxford University alumnus; formerly Microsoft, X/Twitter, Epic Games/Fortnite, Cisco)

**Core value proposition:** NodeShift replaces fragmented AI tool subscriptions with a single governed AI platform that keeps all data, prompts, and outputs inside the customer's infrastructure boundary. No sensitive data leaves the organization. Customers retain full ownership of data, models, configurations, IAM rules, and developed workflows.

**Supported deployment models:** On-premises (fully sovereign), hybrid (on-prem guardrails + selective external model access), air-gapped, and sovereign/national-boundary deployments.
**Languages:** Arabic and English interfaces supported natively out of the box.

**Compliance alignment:** PDPL (Saudi, UAE, Bahrain, Oman), PDPPL (Qatar), GDPR, ISO 27001, SOC 2, and regional central bank regulatory frameworks.

---

## Platform Overview

NodeShift AI is a unified, on-premises AI platform composed of integrated modules that share a single governance, ingestion, and audit foundation. All modules operate under the same RBAC controls, guardrail engine, and immutable audit layer.

- [AI Model Hub](https://nodeshift.com/models): 140+ Arabic- and English-ready language, vision, and speech models. 80+ one-click connectors for Microsoft 365, Google Workspace, Atlassian, Notion, Zoom, SharePoint, and dozens of databases. Models support semantic versioning, blue/green and canary rollouts, per-tenant rate limits, and full audit trails.
- [Deployment Options](https://nodeshift.com/deployment): Hybrid deployment (on-prem GPU guardrail layer + selective external model access) and full on-prem deployment (all inference on customer GPUs inside the customer boundary). Both options maintain 100% data residency and a 57+ model catalogue.
- [Pricing](https://nodeshift.com/pricing): Licensed on a per-user, per-month basis. Platform deployment, initial setup, and standard upgrades are included at no additional cost. Custom integrations are scoped and billed separately on a time-and-materials basis.

---

## Core Modules

### 1. AI Governance & Secure LLM Access

NodeShift serves as a single governed access layer through which enterprise users access all AI models. Users never interact directly with external AI providers — all interactions are mediated, governed, and audited according to organization-defined policies.
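The mediated-access model described above can be sketched in a few lines. This is a hypothetical illustration only — the `GovernedGateway` class, role names, and model identifiers are invented for the example and are not NodeShift's actual API:

```python
from dataclasses import dataclass

@dataclass
class Role:
    name: str
    allowed_models: set  # models this role may invoke

@dataclass
class AuditEvent:
    user: str
    model: str
    decision: str

class GovernedGateway:
    """Every model request is resolved against RBAC and recorded before it is forwarded."""

    def __init__(self):
        self.roles = {}
        self.audit_log = []  # stand-in for an immutable audit store

    def assign(self, user, role):
        self.roles[user] = role

    def request(self, user, model, prompt):
        role = self.roles.get(user)
        allowed = role is not None and model in role.allowed_models
        decision = "allow" if allowed else "deny"
        self.audit_log.append(AuditEvent(user, model, decision))  # logged either way
        if not allowed:
            raise PermissionError(f"{user} may not access {model}")
        return f"[{model}] response to: {prompt}"  # placeholder for mediated inference

analyst = Role("analyst", {"llama-3.1-8b"})
gw = GovernedGateway()
gw.assign("amina", analyst)
print(gw.request("amina", "llama-3.1-8b", "Summarize the Q3 report"))
```

The point of the sketch is that authorization and audit logging happen in one choke point, so no request can reach a model without leaving a record.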
- Fully white-labeled web platform deployable on-premises or within the customer's cloud tenancy
- Supports 140+ closed-source, open-source, and internal/private models through one interface — including legal, policy-compliant use of closed-source models such as Claude (Anthropic), Gemini (Google), and ChatGPT (OpenAI), in line with UAE, Saudi, Bahraini, Qatari, Kuwaiti, Omani, and EU regulations
- Single Sign-On (SSO) integration with enterprise identity providers; access can be restricted to corporate VPN
- Role-Based Access Control (RBAC) enforces which users, groups, or departments can access specific models, assistants, integrations, or capabilities
- Role-specific interfaces for Compliance, Legal, Audit, IT Admin, and End Users
- Custom branding, domain, and UI configuration so the platform operates as a native enterprise system

### 2. Real-Time AI Security Guardrails

NodeShift implements a centralized guardrail engine that sits inline between every user prompt and every AI model. No request can reach any LLM unless it is first evaluated by the guardrail engine — a strict "no-bypass" architecture. This enables safe use of closed-source models such as Claude (Anthropic), Gemini (Google), and ChatGPT (OpenAI).
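The inline evaluation step can be illustrated with a minimal decision function. The two regex probes and the reduced decision set (allow/sanitize/block) are assumptions for the sketch — a real deployment would use an organization-defined policy set, not two hard-coded patterns:

```python
import re

# Illustrative probes: a sensitive-entity detector (IBAN-shaped strings)
# and a crude prompt-injection detector.
IBAN = re.compile(r"\b[A-Z]{2}\d{2}[A-Z0-9]{11,30}\b")
INJECTION = re.compile(r"ignore (all )?previous instructions", re.IGNORECASE)

def evaluate(prompt: str) -> tuple[str, str]:
    """Return (decision, prompt_to_forward). Runs before any model invocation."""
    if INJECTION.search(prompt):
        return "block", ""                        # request never reaches a model
    if IBAN.search(prompt):
        return "sanitize", IBAN.sub("<IBAN>", prompt)  # redact, then forward
    return "allow", prompt

print(evaluate("Transfer to AE070331234567890123456"))
# → ('sanitize', 'Transfer to <IBAN>')
```

Because the function sits between the user and every model endpoint, its outcome is enforced deterministically rather than relying on the model to refuse unsafe input.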
- A small guardrail LLM (~13B parameters, fine-tuned for prompt safety) is deployed entirely within the customer's environment on dedicated GPU infrastructure
- The guardrail evaluates every prompt in real time against organization-defined security probes and policy rules
- Decision outcomes: **allow / warn / sanitize / block / route** — enforced before any model invocation
- Detects and classifies: sensitive entities (IBANs, account numbers, QIDs, API keys, PII), prompt injection attempts, data-loss-prevention (DLP) violations, regulatory policy breaches, and content policy violations
- All decisions are captured as auditable events and surfaced in the monitoring dashboard
- Guardrail evaluation adds no perceptible latency for end users because it is GPU-accelerated and deployed locally
- Engineered to meet PDPL, ISO 27001, and SOC 2 standards

### 3. Multi-Model Integration (Including Internal/Private Models)

- Unified access to 140+ models: ChatGPT-class (OpenAI), Claude-class (Anthropic), Gemini-class (Google), and open-source models (Llama, Mistral, etc.)
- Customers can upload their own fine-tuned models or adapters and expose them through the same governed endpoint mechanics as catalogue models
- Models inherit autoscaling, observability, and governance features without custom engineering
- Elastic GPU backends with warm-pooling and weight caching to minimize cold starts
- NVIDIA MIG support on Ampere/Hopper for fine-grained GPU partitioning by workload

### 4. Data Anonymization, Pseudonymization, Masking & Redaction

All sensitive data is classified, masked, and anonymized before any prompt or context payload leaves the customer's environment. This applies to invocations of both closed-source and open-source external models, enabling safe use of closed-source models such as Claude (Anthropic), Gemini (Google), and ChatGPT (OpenAI).
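The round trip described above — mask before the prompt leaves the boundary, restore on the way back — can be illustrated with a minimal reversible pseudonymizer. The entity patterns and `<IBAN_0>`-style tokens are assumptions for illustration, not NodeShift's actual detectors or token format:

```python
import re

# Illustrative entity detectors; a real deployment would use a configurable,
# organization-defined classification set.
PATTERNS = {
    "IBAN": re.compile(r"\b[A-Z]{2}\d{2}[A-Z0-9]{11,30}\b"),
    "API_KEY": re.compile(r"\bsk-[A-Za-z0-9]{16,}\b"),
}

def pseudonymize(text):
    """Replace detected entities with tokens; the mapping never leaves the boundary."""
    mapping = {}
    for label, pattern in PATTERNS.items():
        for i, match in enumerate(pattern.findall(text)):
            token = f"<{label}_{i}>"
            mapping[token] = match
            text = text.replace(match, token)
    return text, mapping

def deanonymize(text, mapping):
    """Restore original values in the model output, inside the secure environment."""
    for token, original in mapping.items():
        text = text.replace(token, original)
    return text

masked, mapping = pseudonymize("Pay AE070331234567890123456 using key sk-abcdef1234567890")
# `masked` is what an external model sees; `mapping` stays on-prem
restored = deanonymize(masked, mapping)
```

Only the masked text crosses the boundary; the token-to-value mapping is held locally, which is what makes the masking reversible without exposing the raw entities.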
- Detects and redacts: names, IBANs, national IDs, account numbers, internal policy terms, API keys, and other configurable entity types
- Pseudonymization and masking are applied at the prompt level before external model invocation
- Outputs are de-anonymized and returned to the user within the secure environment
- Supports configurable sensitivity classification tiers aligned to organizational data policy

### 5. Meeting Intelligence / AI Call Assistants (On-Premises Transcription & Insights)

The NodeShift Meeting AI Note Taker turns organizational meetings into governed, searchable knowledge assets — fully processed on-premises.

- On-prem speech-to-text (Arabic and English)
- Automated summarization, action item extraction, compliance/risk signal detection
- Dashboards with trend analytics across meetings
- All outputs (transcripts, summaries, decisions, actions, flags) are stored under the same RBAC and audit controls as the rest of the NodeShift platform and become searchable artifacts in AI Search
- Ingestion modes:
  - **Post-meeting ingestion** (recommended for strict/air-gapped environments): recordings uploaded by authorized users or exported from approved storage
  - **Connector-based ingestion**: Microsoft Teams (via Microsoft Graph/compliance recording), Zoom (via API/webhooks), on-prem meeting platforms
  - **Local recorder workflow**: secure drop-zone for controlled-room environments
- Integration with existing meeting platforms under organizational policy constraints

### 6. Identity, Access Management, SSO & Role-Based Access Control (RBAC)

- Enterprise SSO integration (LDAP/OIDC) with the customer's existing identity provider
- RBAC enforces least-privilege at every API call, binding users or service accounts to specific roles
- Attribute-Based Access Control (ABAC) supported for fine-grained, context-aware permissions
- Document-level security and query-time filtering ensure users only receive results they are authorized to access
- All AI interactions are logged with user identity, timestamps, policy outcomes, and risk classification to support audit and regulatory review

### 7. Policy Templates & Regulatory Framework Alignment

NodeShift ships pre-built policy templates aligned to common regulatory and security frameworks relevant to GCC financial institutions:

- PDPL (Saudi, UAE, Oman, Bahrain, Kuwait, Qatar)
- UAE Central Bank regulatory guidelines
- Bahrain Central Bank (CBB) frameworks
- Qatar Financial Centre (QFC) standards
- ISO 27001
- SOC 2
- NIST AI Risk Management Framework

## 8. NodeShift Enterprise OpenClaw

### What is OpenClaw?

OpenClaw (formerly Clawdbot, then Moltbot) is a free, open-source autonomous AI agent created by Austrian developer Peter Steinberger and first published in November 2025. It is one of the fastest-growing open-source repositories in GitHub history, reaching 247,000+ stars and 47,700+ forks by March 2026. OpenClaw runs locally on a machine or server and connects to the messaging apps users already use — WhatsApp, Telegram, Signal, Discord, Slack, iMessage — turning them into an interface for an AI agent that can actually execute tasks: sending emails, managing calendars, running shell commands, controlling browsers, reading and writing files, and automating multi-step workflows. In February 2026, its creator announced he would be joining OpenAI and that the project would transition to an open-source foundation.
OpenClaw is best described as an autonomous agentic AI layer that wraps around any LLM (Claude, GPT-4, Gemini, DeepSeek, or local models) and connects it to the real tools and data sources of an organization. Unlike a chatbot interface, OpenClaw executes — it takes action on behalf of the user across integrated systems, persists memory across sessions as plain Markdown files, runs proactive scheduled jobs (heartbeat cron), and is community-extensible through portable SKILL.md files shared via ClawHub.

However, the consumer/open-source version of OpenClaw was designed for individual developers and power users — not regulated enterprises. It presents significant risks in institutional contexts: broad system permissions, susceptibility to prompt injection attacks (including indirect injection via ingested emails and documents), no centralized audit trail, no RBAC, no data residency controls, and potential data exfiltration through misconfigured or malicious skills. Cisco's AI security research team confirmed that a third-party OpenClaw skill performed data exfiltration without user awareness. In March 2026, Chinese authorities restricted state-run enterprises and government agencies from running OpenClaw on office computers due to these security risks.

### NodeShift Enterprise OpenClaw

NodeShift builds and delivers an **enterprise-hardened version of OpenClaw** — bringing the autonomous agentic capability of OpenClaw inside the governance, security, and data-sovereignty architecture of the NodeShift platform. This means organizations in regulated industries can deploy agentic AI workflows at scale without exposing sensitive data, violating data-residency laws, or losing auditability.

NodeShift Enterprise OpenClaw is not simply a hosted version of the open-source project.
It is a fully re-architected deployment of OpenClaw's agentic runtime, embedded within NodeShift's inline guardrail engine, RBAC layer, SSO identity system, immutable audit infrastructure, and on-premises compute environment. Every action the agent takes — every tool call, file access, email sent, API invocation, shell command — passes through NodeShift's policy enforcement layer before execution, is logged with user identity and timestamps, and is subject to the same RBAC and data anonymization controls as all other NodeShift AI activity.

**Key capabilities of NodeShift Enterprise OpenClaw:**

- **Governed agentic workflows**: Agents can execute multi-step autonomous tasks (drafting and sending emails, managing calendars, querying internal databases, summarizing documents, triggering approvals) — all under organization-defined policy control
- **Inline guardrail enforcement**: The NodeShift guardrail LLM evaluates every agent action and tool invocation before it is permitted to execute — the same no-bypass architecture that governs all LLM interactions on the platform
- **Prompt injection defense**: NodeShift's guardrail engine detects and blocks direct and indirect prompt injection attempts, including malicious instructions embedded in ingested emails, documents, webpages, and other data sources — addressing the most critical security risk of agentic AI in enterprise environments
- **On-premises agent runtime**: The OpenClaw gateway, memory store, and skill execution environment run entirely within the customer's infrastructure boundary. No agent data, task history, or tool output leaves the organization
- **Persistent memory under access control**: Agent memory (conversation history, long-term context, workflow state) is stored within the NodeShift environment and subject to document-level security and RBAC — not as open Markdown files accessible to anyone on the machine
- **RBAC-governed skill access**: Administrators control which users, roles, or departments can access which OpenClaw skills, tools, and integrations — preventing unauthorized agents from accessing sensitive systems
- **Full audit trail**: Every agent action is logged with user identity, role, action type, target system, policy outcome, and timestamp — providing a complete chain of custody for regulatory and compliance review
- **Arabic and English agent interfaces**: Native Arabic-language support for agent interaction, consistent with NodeShift's regional language-first approach
- **Messaging platform integration**: Agents accessible via WhatsApp, Microsoft Teams, and other approved enterprise messaging channels — with identity verification tied to the organization's SSO/IDP

### Compliance by Country: GCC + EU

NodeShift Enterprise OpenClaw is designed to comply with the specific AI, data protection, and cybersecurity regulatory frameworks of each GCC member state and the European Union. The six GCC member states are:

**1. United Arab Emirates (UAE)**

- UAE Personal Data Protection Law (PDPL) — Federal Decree-Law No. 45 of 2021
- UAE Central Bank (CBUAE) AI and technology governance guidelines
- UAE National Cybersecurity Council (NCSC) standards
- ADGM and DIFC data protection frameworks (for the Abu Dhabi and Dubai financial free zones)
- All agent actions, memory, and tool outputs remain within UAE infrastructure; no cross-border data transfer without explicit approval

**2. Kingdom of Saudi Arabia (KSA)**

- Saudi Personal Data Protection Law (PDPL) — enforced by the National Data Management Office (NDMO)
- Saudi Central Bank (SAMA) cybersecurity framework
- National Cybersecurity Authority (NCA) Essential Cybersecurity Controls (ECC)
- Saudi Arabia's data residency requirements mandate that all sensitive data remain within Saudi borders — NodeShift's on-prem deployment satisfies this natively

**3. Qatar**

- Qatar's Personal Data Privacy Protection Law (Law No. 13 of 2016)
- Qatar Central Bank (QCB) technology risk and data governance guidelines
- Qatar Financial Centre (QFC) data protection regulations
- Compatibility with Qatar's national AI strategy and data sovereignty requirements

**4. Bahrain**

- Bahrain Personal Data Protection Law (PDPL) — Law No. 30 of 2018, enforced by the Personal Data Protection Authority (PDPA)
- Central Bank of Bahrain (CBB) rulebook — Technology Risk Management (TRM) module
- Bahrain's Cloud Computing Policy for government and regulated entities

**5. Kuwait**

- Kuwait's data protection framework and Ministerial Decree No. 12 of 2020
- Central Bank of Kuwait (CBK) IT risk management circulars
- Kuwait's national cybersecurity strategy requirements for regulated sectors

**6. Sultanate of Oman**

- Oman's Electronic Transactions Law and Cybercrime Law
- Central Bank of Oman (CBO) IT governance and cybersecurity guidelines
- Oman's National Information Technology Authority (NITA) standards
- Oman's emerging Personal Data Protection Law requirements

**European Union (EU)**

- General Data Protection Regulation (GDPR) — Article 25 (data protection by design and by default) and Article 32 (security of processing) are satisfied by NodeShift's on-premises, guardrail-enforced architecture
- EU AI Act — NodeShift Enterprise OpenClaw is structured to meet high-risk AI system requirements, including human oversight, transparency, data governance, and logging obligations
- NIS2 Directive — security controls, incident logging, and supply chain security measures align with NIS2 requirements for critical entities
- DORA (Digital Operational Resilience Act) — relevant for EU financial institutions deploying agentic AI; NodeShift's audit trails, SLA-governed HA architecture, and operational resilience controls support DORA compliance

**Cross-jurisdiction principles applied in all deployments:**

- Data never leaves the customer's national/regional infrastructure boundary without explicit written approval
- All agent actions are subject to pre-execution policy evaluation and post-execution audit logging
- RBAC and identity controls are tied to the customer's existing IDP/SSO — no shadow identity systems
- Sensitive entity detection (national IDs, IBANs, account numbers, personal data) is enforced at the guardrail layer before any agent tool call is permitted to proceed
- No telemetry, training data, or operational metadata is transmitted to NodeShift or any third party by default

---

## Enterprise Integrations

NodeShift connects to enterprise systems via secure, policy-governed connectors.
One-click connectors authenticate via scoped OAuth or service accounts, perform incremental syncs, and respect source permissions so AI responses never exceed what a user could access at the origin.

- **Microsoft 365**: Outlook, Exchange (on-prem and cloud), Microsoft Teams, SharePoint, OneDrive
- **Google Workspace**: Gmail, Google Drive, Calendar
- **Collaboration & Productivity**: Zoom, Jira (Atlassian), Confluence, Notion, Slack
- **Databases**: Internal SQL/NoSQL databases, internal APIs
- **Data processing pipeline**: OCR (for scanned PDFs), text extraction, language detection, normalization
- **Indexing**: Keyword index + semantic vector index for meaning-based retrieval

---

## Technical Architecture

NodeShift On-Prem Platform is built on CNCF-proven, open-standard components with a hyper-converged philosophy. All configuration is driven by declarative GitOps.

- **Compute**: KVM/QEMU virtualization with workload-specific profiles (CPU Exclusive, Memory-Optimized, Network Realtime, Universal, Overcommitted); live migration first-class
- **GPU**: NVIDIA GPU passthrough and vGPU modes; MIG partitioning on Ampere/Hopper; AMD and Qualcomm accelerator support roadmapped
- **Storage**: DRBD-based distributed block storage with NVMe SSDs; HDD extension for capacity tiers; third-party SAN/NVMe-oF integration supported
- **Networking**: Distributed Virtual Routing (east-west); BGP-Edge for north-south with ECMP; WireGuard encryption; Geneve overlay
- **Security**: TLS 1.3 control-plane encryption; WireGuard node-to-node; eBPF runtime syscall and flow inspection; cryptographically signed artifact admission; continuous CVE scanning; zero-trust model
- **High Availability**: Five control-plane nodes (quorum ≥ 3); workload failover within 30–120s; ≥2 synchronous data replicas; BGP failover <2s; rolling non-disruptive maintenance
- **Orchestration**: Kubernetes (MKS) with HA control planes, OPA/Gatekeeper policy hooks, GitOps bootstrap
- **PaaS Services**: DBaaS
(SQL/NoSQL), DCS (distributed cache), MQS (Kafka-compatible messaging), MRS (Spark/Hadoop batch)

---

## Key Differentiators

- **Sovereign by design**: All guardrail evaluation, inference, and audit happens inside the customer's infrastructure boundary. No prompt, output, or metadata leaves the organization without explicit written approval.
- **Single platform, multiple models**: Replaces fragmented AI subscriptions (ChatGPT, Claude, Gemini, etc.) with one governed interface, reducing cost, risk, and operational complexity.
- **No-bypass guardrail architecture**: The inline guardrail LLM cannot be circumvented by users or applications — enforcement is deterministic and always-on.
- **Arabic-first in the GCC**: Native Arabic and English UI with Arabic speech-to-text for meeting intelligence — purpose-built for the GCC market.
- **Regulated-sector proven**: Already deployed and operational at a G7-equivalent central bank, a defense organization, and a multi-country sovereign data center network.
- **GCC-local team**: Silicon Valley-founded with dedicated engineering and operational offices in UAE, Qatar, and KSA. Structured to meet regulated-institution delivery requirements including secure deployment, integration engineering, and operational handover.

---

NODESHIFT SUPPORTS SOVEREIGN AI ACCESS TO THE FOLLOWING MODELS:

════════════════════════════════════════════════════════════════
SECTION A: CLOSED-SOURCE MODELS
════════════════════════════════════════════════════════════════

────────────────────────────────────────────────────────────────
ANTHROPIC (CLAUDE) — Closed Source
────────────────────────────────────────────────────────────────
1. Claude Opus 4.6 - Released: February 2026 - Flagship model; adaptive thinking, frontier coding, long-horizon agents - 200K context (1M beta), vision, PDF input
2. Claude Sonnet 4.6 - Released: February 2026 - Near-Opus intelligence at Sonnet pricing; coding, computer use, agents - 200K context (1M beta), vision, PDF input
3. Claude Opus 4.5 - Released: November 2025 - Deep reasoning, multi-day coding projects, enterprise workflows - 200K context, vision, PDF input
4. Claude Sonnet 4.5 - Released: October 2025 - Balanced speed and intelligence; agents, coding, computer use - 200K context, vision, PDF input
5. Claude Opus 4.1 - Opus-class; agentic search, expert-level coding, long-horizon tasks - 200K context, vision, PDF input
6. Claude Sonnet 4 - Reliable general-purpose model; fast responses, solid reasoning - 200K context, vision
7. Claude Haiku 4.5 - Released: October 2025 - Fastest and cheapest; real-time responses, automation, high-volume tasks
8. Claude 3.7 Sonnet - Hybrid reasoning; balanced fast and thoughtful analysis
9. Claude 3.5 Haiku - Lightweight; content moderation, quick responses
10. Claude 3 Opus (Retired / Available by request) - Earlier flagship; strong analysis and reasoning

────────────────────────────────────────────────────────────────
OPENAI — Closed Source
────────────────────────────────────────────────────────────────
11. GPT-5.4 (Thinking / Pro / Instant) - Latest unified flagship; routing system for optimal reasoning depth - Frontier coding, agentic workflows, multimodal - 400K context window
12. GPT-5.4 Mini - Strongest mini model; coding, computer use, sub-agents - 400K context, $0.75/M input tokens
13. GPT-5.4 Nano - Cheapest GPT-5.4-class; simple high-volume tasks - 400K context, $0.20/M input tokens
14. GPT-5.3 Codex - Specialized agentic coding; GitHub integration, sandbox execution - Optimized for multi-file edits and long-horizon coding
15. GPT-5.2 Codex - Advanced agentic coding; cybersecurity capabilities - Multi-step software engineering, defensive security
16. GPT-4o - Multimodal (text, image, audio); strong general performance - Improved coding, creative, and instruction-following
17. o4-mini / o4-mini (high) - Fast, cost-efficient reasoning; best on AIME 2024/2025 - Strong math, coding, visual tasks
18. o3 / o3-pro / o3 (high) - Advanced reasoning models; complex scientific and mathematical tasks
19. o3-mini - Small reasoning model; optimized for science, math, coding
20. GPT-4o Mini - Compact multimodal; cost-effective general use
21. GPT-4.5 - High EQ; creative tasks, agentic planning
22. GPT Image 1.5 / chatgpt-image-latest - Latest image generation and editing model
23. DALL-E 3 - Text-to-image generation
24. Sora 2 - Video generation model
25. Whisper - Speech recognition / transcription

────────────────────────────────────────────────────────────────
GOOGLE (GEMINI) — Closed Source
────────────────────────────────────────────────────────────────
26. Gemini 3.1 Pro (Preview) - Latest flagship; 94.3% GPQA Diamond, 77.1% ARC-AGI-2 - Reasoning-first, agentic workflows, coding, 1M context
27. Gemini 3.1 Flash Lite - Most cost-efficient; low latency, high-volume tasks
28. Gemini 3.1 Flash Image (Nano Banana 2) - Image generation/editing; conversational editing, character consistency
29. Gemini 3 Pro - State-of-the-art reasoning and multimodal understanding - Agentic capabilities, coding, 1M context
30. Gemini 3 Pro Image (Nano Banana Pro) - High-fidelity image generation with reasoning-enhanced composition
31. Gemini 3 Flash - Complex multimodal understanding; agentic problem-solving, coding
32. Gemini 2.5 Pro - Premium reasoning model; 1M token context (2M experimental) - Complex coding, long-context analysis
33. Gemini 2.5 Flash - Balanced intelligence and speed; controllable thinking budgets
34. Gemini 2.5 Flash Lite - Built for massive scale; cost-performance optimized
35. Gemini 2.5 Flash Image (Nano Banana) - Creative image workflows; multi-image fusion, character consistency
36. Gemini 2.5 Deep Think - Specialized deep reasoning; scientific research
37. Gemini 2.0 Flash - Multimodal; cost-effective general-purpose tasks
38. Gemini 2.0 Flash Lite - Ultra-efficient; simple, high-frequency tasks
39. Gemini Embedding 2 - First fully multimodal embedding model
40. Veo - Cinematic video generation with synchronized audio
41. Imagen 3 - Text-to-image; high clarity up to 2K resolution

────────────────────────────────────────────────────────────────
xAI (GROK) — Closed Source (except Grok 1)
────────────────────────────────────────────────────────────────
42. Grok 4 - xAI flagship; 75% SWE-bench, multi-agent coding
43. Grok 4.1 Fast / Grok 4 Fast - Speed-optimized for real-time applications
44. Grok 4.20 - Four AI agents running in parallel

════════════════════════════════════════════════════════════════
SECTION B: OPEN-SOURCE / OPEN-WEIGHT MODELS
════════════════════════════════════════════════════════════════

────────────────────────────────────────────────────────────────
META (LLAMA)
────────────────────────────────────────────────────────────────
45. Llama-4-Scout-17B-16E — 17B (16 experts) — 128K context, multimodal, strong general reasoning
46. Llama-4-Maverick-17B-128E — 17B (128 experts) — High-accuracy MoE variant, extended expert routing
47. Llama-3.3-70B-Instruct — 70B — Best Llama 3 generation quality, multilingual, top OSS benchmark
48. Llama-3.2-90B-Vision-Instruct — 90B — State-of-the-art open multimodal reasoning & vision
49. Llama-3.2-11B-Vision-Instruct — 11B — Efficient vision-language model, strong image understanding
50. Llama-3.2-3B-Instruct — 3B — Lightweight edge deployment, fast inference
51. Llama-3.2-1B-Instruct — 1B — Ultra-light on-device / embedded AI
52. Llama-3.1-405B-Instruct — 405B — Largest open LLM, GPT-4-class performance, sovereign deployments
53. Llama-3.1-70B-Instruct — 70B — Multilingual, RAG-optimized, strong code & reasoning
54. Llama-3.1-8B-Instruct — 8B — Fast, efficient, ideal for fine-tuning & agentic tasks

────────────────────────────────────────────────────────────────
ALIBABA (QWEN)
────────────────────────────────────────────────────────────────
55. Qwen3-235B-A22B — 235B (22B active) — #1 open-source benchmark; rivals Claude Sonnet & GPT-4.1
56. Qwen3-Next-80B-A3B — 80B (3B active) — Latest hybrid MoE; 262K context, top coding & reasoning
57. Qwen3.5-VL-397B — ~400B — Native multimodal agents; UI navigation, document understanding
58. Qwen3-32B — 32B — High-quality dense model; strong at agentic tasks
59. Qwen3-14B — 14B — Balanced reasoning + speed, mid-range deployment
60. Qwen3-8B — 8B — Edge-optimized general-purpose instruct model
61. Qwen3-4B — 4B — Compact, mobile & IoT deployments
62. Qwen3-Coder-Next-80B-A3B — 80B (3B active) — Agentic coding, tool use, long-horizon code generation
63. Qwen2.5-VL-72B-Instruct — 72B — Document OCR, chart/table understanding, 128K context
64. Qwen2.5-72B-Instruct — 72B — Strong multilingual (29 langs), math, code, long context
65. Qwen2.5-Coder-32B-Instruct — 32B — SOTA code generation across 40+ languages, GPT-4o-level
66. Qwen2.5-Math-72B-Instruct — 72B — Advanced mathematical reasoning, competition-level problems
67. QwQ-32B — 32B — Deep chain-of-thought reasoning, math & science
68. Qwen3-Embedding — Various — Multi-lingual semantic embeddings for RAG pipelines

────────────────────────────────────────────────────────────────
DEEPSEEK
────────────────────────────────────────────────────────────────
69. DeepSeek-V3 — 671B (37B active) — Frontier MoE; rivals GPT-4o at fraction of compute cost
70. DeepSeek-R1 — 671B (37B active) — O1-class reasoning, AIME/math champion, open weights
71. DeepSeek-V3.2 (latest) — 671B (37B active) — Enhanced long-context & tool-use; GPT-5-territory on benchmarks
72. DeepSeek-Coder-V2-Instruct — 236B (21B active) — 300+ programming languages, SOTA on code benchmarks
73. DeepSeek-R1-Distill-Qwen-32B — 32B — Distilled R1 reasoning in efficient dense model
74. DeepSeek-R1-Distill-Llama-70B — 70B — R1 reasoning via Llama backbone, strong STEM performance
75. DeepSeek-R1-1776 — 671B (37B active) — Uncensored R1 variant post-trained by Perplexity AI

────────────────────────────────────────────────────────────────
MISTRAL AI
────────────────────────────────────────────────────────────────
76. Mistral-Large-3 — ~123B — Frontier-class; top EU open model
77. Mixtral-8x22B-Instruct — 141B (39B active) — Apache-2.0; fast MoE rivaling GPT-3.5 Turbo
78. Mistral-Medium-3.1 — ~24B — Multimodal image input, 128K context, enterprise-grade
79. Mistral-Small-3.2-24B-Instruct — 24B — Function calling, multilingual, fast conversational agent
80. Magistral-24B — 24B — Open reasoning model; 128K context, chain-of-thought
81. Codestral-22B — 22B — Leading OSS code model, 80K context, 80+ languages
82. Devstral-2-24B — 24B — Agentic coding with vision; multi-file editing, tool use
83. Mistral-NeMo-12B — 12B — NVIDIA co-developed; 128K context, multilingual, Apache-2.0
84. Ministral-8B — 8B — Edge & IoT deployments; optimized for NVIDIA Jetson
85. Mistral-7B-Instruct-v0.3 — 7B — Most-downloaded Mistral; fast, versatile baseline model
86. Mathstral-7B — 7B — Specialized mathematics reasoning and problem-solving
87. Pixtral-12B — 12B — Native multimodal; image+text reasoning, Apache-2.0

────────────────────────────────────────────────────────────────
GOOGLE (GEMMA / DEEPMIND) — Open Source
────────────────────────────────────────────────────────────────
88. Gemma-3-27B-IT — 27B — Best-in-class OSS per size; multilingual, 128K context
89. Gemma-3-12B-IT — 12B — Strong instruction following; vision-capable variant available
90. Gemma-3-4B-IT — 4B — Mobile-optimized, multimodal, privacy-first on-device AI
91. Gemma-3n-E4B — 4B equivalent — Phone/tablet deployment; dynamic parameter activation
92. Gemma-2-27B-IT — 27B — High accuracy on single GPU; MMLU & reasoning leader
93. Gemma-2-9B-IT — 9B — Excellent quality-to-size ratio; Apache-2.0
94. CodeGemma-7B-IT — 7B — Fill-in-middle code completion, generation & NLU
95. FunctionGemma — Lightweight — Foundation for function-calling specialized fine-tuning
96. Gemma-Embedding — Various — High-quality semantic search embeddings

────────────────────────────────────────────────────────────────
MICROSOFT (PHI)
────────────────────────────────────────────────────────────────
97. Phi-4-14B — 14B — SOTA small model; beats much larger models on STEM & code
98. Phi-4-Reasoning-14B — 14B — Chain-of-thought reasoning; rivals 70B+ models on math
99. Phi-4-Mini-4B — 4B — Ultra-compact reasoning; strong on-device inference
100. Phi-4-Mini-Reasoning — 4B — Lightweight reasoning-focused for edge/mobile
101. Phi-3-Medium-14B-Instruct — 14B — Long 128K context; instruction-following, multilingual
102. Phi-3-Mini-3.8B-Instruct — 3.8B — Phone-class deployment; surprisingly capable small model
103. LLaVA-1.6 (Microsoft) — 13B — Visual question answering, image + text multimodal tasks

────────────────────────────────────────────────────────────────
NVIDIA (NEMOTRON)
────────────────────────────────────────────────────────────────
104. Nemotron-4-340B-Instruct — 340B — NVIDIA flagship; strong reasoning, Apache-2.0
105. Nemotron-Mini-4B-Instruct — 4B — Low-latency edge inference; NIM-optimized deployment
106. Nemotron-Nano-9B-v2 — 9B — Efficient general-purpose; Mamba2-transformer hybrid
107. Nemotron-2-15B — 15B — Mamba2+transformer hybrid; fast long-context inference

────────────────────────────────────────────────────────────────
IBM (GRANITE)
────────────────────────────────────────────────────────────────
108. Granite-4.0-8B-Instruct — 8B — Enterprise-grade; strong tool-calling & instruction following
109. Granite-3.3-8B-Instruct — 8B — Document summarization, RAG, enterprise tasks, Apache-2.0
110. Granite-3.3-2B-Instruct — 2B — Ultra-compact enterprise model for constrained environments
111. Granite-Code-20B-Instruct — 20B — Enterprise code generation, 116 languages, Apache-2.0
112. Granite-Embedding-278M — 278M — High-quality dense embeddings for enterprise RAG

────────────────────────────────────────────────────────────────
COHERE
────────────────────────────────────────────────────────────────
113. Command-R+ — 104B — Enterprise RAG champion; 10-language multilingual, tool-use
114. Command-R — 35B — Conversational AI, long-context RAG, agentic workflows
115. Embed-v3 (Multilingual) — Various — Best-in-class multilingual embeddings for retrieval
116. Rerank-v3 — Various — Semantic reranking for RAG precision improvement

────────────────────────────────────────────────────────────────
ZHIPU AI (GLM)
────────────────────────────────────────────────────────────────
117. GLM-5 — 744B (40B active) — SWE-bench leader; long-horizon agentic tasks, coding SOTA
118. GLM-4.7 — ~32B active — Chatbot Arena #1 (1445 ELO); top coding & reasoning OSS
119. GLM-4.6V-Flash-9B — 9B — Compact VLM for local deployment, low-latency visual AI
120. GLM-4.5-Air — ~32B — Fast cloud/enterprise inference, accessible performance tier

────────────────────────────────────────────────────────────────
MOONSHOT AI (KIMI)
────────────────────────────────────────────────────────────────
121. Kimi-K2 — 1T (32B active) — Frontier agentic & coding model; top OSS agentic benchmark
122.
Kimi-VL-A3B-Thinking — 16B (3B active) Reasoning multimodal model; compact yet frontier-class ──────────────────────────────────────────────────────────────────────────────── TII (FALCON) ──────────────────────────────────────────────────────────────────────────────── 123. Falcon-3-10B-Instruct — 10B Strong multilingual; trained on 14T tokens, Apache-2.0 124. Falcon-3-7B-Instruct — 7B Efficient open model from UAE's TII; Arabic-strong 125. Falcon-3-3B-Instruct — 3B Compact multilingual; suitable for edge sovereign deployments 126. Falcon-3-1B-Instruct — 1B Ultra-light; IoT & on-device sovereign AI use cases 127. Falcon-2-11B-VLM — 11B Multimodal vision + language; open weights ──────────────────────────────────────────────────────────────────────────────── ALLEN AI (OLMO) ──────────────────────────────────────────────────────────────────────────────── 128. OLMo-2-32B-Instruct — 32B Fully open (weights + data + code); research reproducibility 129. OLMo-2-13B-Instruct — 13B Transparent training stack; Apache-2.0 commercial use 130. OLMo-2-7B-Instruct — 7B Compact fully-open model for academic & research use 131. Tülu-3-70B — 70B State-of-the-art RLHF alignment; multi-task instruction model ──────────────────────────────────────────────────────────────────────────────── BIGSCIENCE / HUGGING FACE ──────────────────────────────────────────────────────────────────────────────── 132. BLOOM-176B — 176B 46 natural languages + 13 code langs; pioneering open multilingual LLM 133. SmolLM2-1.7B-Instruct — 1.7B Best-in-class tiny model; 135M/360M variants also available 134. SmolLM2-360M-Instruct — 360M On-device / browser-based AI inference 135. IDEFICS-80B — 80B Open multimodal few-shot; image + text tasks 136. 
Zephyr-7B-Beta — 7B Top-rated OSS assistant via DPO fine-tuning; fast dialogue ──────────────────────────────────────────────────────────────────────────────── SENTENCE TRANSFORMERS / BAAI ──────────────────────────────────────────────────────────────────────────────── 137. all-MiniLM-L6-v2 — 22M Fastest embedding for RAG/semantic search; universally supported 138. all-MiniLM-L12-v2 — 33M Higher accuracy embedding; balanced speed vs quality 139. BGE-M3 — 570M Multi-lingual, multi-granularity; dense+sparse+ColBERT retrieval 140. BGE-Reranker-v2-M3 — 570M Cross-encoder reranking; multilingual RAG quality boost 141. E5-Mistral-7B-Instruct — 7B Instruction-tuned embedding; top MTEB leaderboard performance 142. GTE-Qwen3-Embedding — Various Qwen3-based embedding; multilingual retrieval excellence ──────────────────────────────────────────────────────────────────────────────── LIQUID AI ──────────────────────────────────────────────────────────────────────────────── 143. LFM2-1.2B — 1.2B Novel LFM architecture; edge AI speed + memory efficiency 144. LFM2-7B — 7B Liquid foundation model; strong on-device generation quality ──────────────────────────────────────────────────────────────────────────────── BYTEDANCE SEED ──────────────────────────────────────────────────────────────────────────────── 145. Seed-OSS-36B — 36B First capable ByteDance open-weight LLM; strong reasoning ──────────────────────────────────────────────────────────────────────────────── BAIDU (ERNIE) ──────────────────────────────────────────────────────────────────────────────── 146. Ernie-4.5-MoE-424B-A47B — 424B (47B active) Baidu flagship OSS; competitive with Qwen3 on Chinese tasks 147. Ernie-4.5-21B-A3B — 21B (3B active) Efficient Chinese-English bilingual MoE ──────────────────────────────────────────────────────────────────────────────── MBZUAI & G42 (UAE) ──────────────────────────────────────────────────────────────────────────────── 148. 
K2-Think-72B — 72B First Arabic-first reasoning model; government/sovereign AI 149. Jais-30B-v3 — 30B Arabic-English bilingual; MENA region sovereign deployment 150. Jais-13B-Chat — 13B Arabic conversational AI; tailored for Gulf enterprises ──────────────────────────────────────────────────────────────────────────────── MOONDREAM ──────────────────────────────────────────────────────────────────────────────── 151. Moondream-3 — ~2B THE small vision model; competes with closed VLMs, on-device 152. Moondream-2 — 1.9B Compact VQA + captioning; runs on Raspberry Pi ──────────────────────────────────────────────────────────────────────────────── Z.AI ──────────────────────────────────────────────────────────────────────────────── 153. DeepCoder-14B-Preview — 14B Fully open-source; O3-mini-level coding, Apache-2.0 154. DeepCoder-1.5B — 1.5B Compact open code model for fast code completion ──────────────────────────────────────────────────────────────────────────────── DOLPHIN AI (ERIC HARTFORD) ──────────────────────────────────────────────────────────────────────────────── 155. Dolphin-3.0-Llama3.1-70B — 70B Uncensored, strong general assistant & coding fine-tune 156. Dolphin-3.0-Mistral-24B — 24B Versatile assistant; excels at coding tasks, Apache-2.0 157. Dolphin-Phi-2.8B — 2.8B Ultra-compact uncensored assistant on Phi backbone ──────────────────────────────────────────────────────────────────────────────── xAI (GROK) — Open Weight ──────────────────────────────────────────────────────────────────────────────── 158. Grok-1-314B — 314B (86B active) First open-weight Grok release; Apache-2.0 commercial license ──────────────────────────────────────────────────────────────────────────────── OPENAI — Open Weight ──────────────────────────────────────────────────────────────────────────────── 159. gpt-oss-120B — 120B First OpenAI open-weight LLM; beats o4-mini on AIME/MMLU 160. 
gpt-oss-20B — 20B Fast, agentic; 250+ tokens/sec, Apache-2.0 fine-tuning ──────────────────────────────────────────────────────────────────────────────── TINYLLAMA ──────────────────────────────────────────────────────────────────────────────── 161. TinyLlama-1.1B-Chat — 1.1B Minimal resource chat model; IoT & embedded systems ──────────────────────────────────────────────────────────────────────────────── 01.AI (YI) ──────────────────────────────────────────────────────────────────────────────── 162. Yi-1.5-34B-Chat — 34B Strong Chinese-English bilingual; long 200K context 163. Yi-1.5-9B-Chat — 9B Efficient bilingual model; competitive at mid-size tier 164. Yi-VL-34B — 34B Bilingual multimodal; image understanding + Chinese text ──────────────────────────────────────────────────────────────────────────────── STABILITY AI ──────────────────────────────────────────────────────────────────────────────── 165. StableLM-2-12B-Chat — 12B Multilingual (7 langs); strong English + EU language model 166. StableLM-2-1.6B-Chat — 1.6B Ultra-light multilingual; fast CPU inference capable ──────────────────────────────────────────────────────────────────────────────── MINIMAX ──────────────────────────────────────────────────────────────────────────────── 167. MiniMax-Text-01 — 456B (46B active) 1M token context window; longest context OSS model 168. MiniMax-VL-01 — 456B (46B active) Multimodal 1M context; extreme long-document visual AI ──────────────────────────────────────────────────────────────────────────────── STEPFUN (STEP) ──────────────────────────────────────────────────────────────────────────────── 169. 
Step-3-MoE — ~400B Chinese frontier MoE; competitive reasoning and code ================================================================================ TOTAL: 169 Models Listed SUPPORTED BY NODESHIFT - Closed-Source: 44 models (Anthropic, OpenAI, Google, xAI) - Open-Source/Open-Weight: 125 models ================================================================================ - [Company website](https://nodeshift.com): Main product information, platform overview, and contact