QwQ-32B-Preview-Abliterated is a modified build of QwQ-32B-Preview, the experimental reasoning model developed by the Qwen team. The abliteration, applied by huihui-ai, reduces the model's built-in response restrictions, allowing for more open-ended interactions. With its architecture featuring 32 billion parameters, it aims to tackle complex analytical tasks while maintaining a focus on philosophical inquiry and deep reasoning.
The abliterated version removes certain guardrails present in the standard model, enabling it to provide more direct answers without the usual refusals. This makes it particularly useful for applications that require a high degree of flexibility in language processing. Despite its strengths, users should be aware of known issues inherited from the base model, such as unexpected language mixing and recursive reasoning loops, which can affect the clarity of its outputs. Overall, QwQ-32B-Preview-Abliterated offers both enhanced capabilities and new challenges in practical use.
Model Resource
Hugging Face
Link: https://huggingface.co/huihui-ai/QwQ-32B-Preview-abliterated
Ollama
Link: https://ollama.com/huihui_ai/qwq-abliterated
Prerequisites for deploying QwQ-32B-Preview-abliterated Model
- GPUs: 1x RTX A6000 (for smooth execution).
- Disk Space: 40 GB free.
- RAM: 48 GB (24 GB also works, but we use 48 GB for smooth execution).
- CPU: 48 cores (24 cores also work, but we use 48 for smooth execution).
Note: These prerequisites vary widely across use cases. For a large-scale deployment, choose a higher-end configuration.
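Once you have a machine, you can sanity-check it against these prerequisites with a few standard Linux commands:

```shell
# Quick sanity checks against the prerequisites above.
nproc        # number of CPU cores
free -g      # total and available RAM, in GB
df -h /      # free disk space on the root filesystem
```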
Step 1: Sign Up and Set Up a NodeShift Cloud Account
Visit the NodeShift Platform and create an account. Once you’ve signed up, log into your account.
Follow the account setup process and provide the necessary details and information.
Step 2: Create a GPU Node (Virtual Machine)
GPU Nodes are NodeShift’s GPU Virtual Machines, on-demand resources equipped with diverse GPUs ranging from H100s to A100s. These GPU-powered VMs provide enhanced environmental control, allowing configuration adjustments for GPUs, CPUs, RAM, and Storage based on specific requirements.
Navigate to the menu on the left side, select the GPU Nodes option in the Dashboard, and click the Create GPU Node button to create your first Virtual Machine deployment.
Step 3: Select a Model, Region, and Storage
In the “GPU Nodes” tab, select a GPU Model and Storage according to your needs and the geographical region where you want to launch your model.
We will use 1x RTX A6000 GPU for this tutorial to achieve the fastest performance. However, you can choose a more affordable GPU with less VRAM if that better suits your requirements.
Step 4: Select Authentication Method
There are two authentication methods available: Password and SSH Key. SSH keys are a more secure option. To create them, please refer to our official documentation.
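As a quick sketch (our official documentation remains the authoritative guide), a key pair can be generated locally with `ssh-keygen`; the file name and comment below are placeholders:

```shell
# Generate an Ed25519 key pair (placeholder file name and comment).
# In real use, consider protecting the key with a passphrase instead of -N "".
mkdir -p ~/.ssh
ssh-keygen -t ed25519 -N "" -f ~/.ssh/nodeshift_key -C "nodeshift-gpu-node"
# Paste the contents of the PUBLIC key into NodeShift when creating the node:
cat ~/.ssh/nodeshift_key.pub
```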
Step 5: Choose an Image
Next, you will need to choose an image for your Virtual Machine. We will deploy QwQ-32B-Preview-abliterated on an NVIDIA CUDA Virtual Machine. CUDA, NVIDIA's proprietary parallel computing platform, will allow you to install the QwQ-32B-Preview-abliterated model on your GPU Node.
After choosing the image, click the ‘Create’ button, and your Virtual Machine will be deployed.
Step 6: Virtual Machine Successfully Deployed
You will get visual confirmation that your node is up and running.
Step 7: Connect to GPUs using SSH
NodeShift GPUs can be connected to and controlled through a terminal using the SSH key provided during GPU creation.
Once your GPU Node deployment is successfully created and has reached the ‘RUNNING’ status, you can navigate to the page of your GPU Deployment Instance. Then, click the ‘Connect’ button in the top right corner.
Now open your terminal and paste the proxy SSH IP or direct SSH IP.
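The connection itself is a standard `ssh` invocation. The IP and key path below are placeholders for the values shown in your deployment's Connect dialog:

```shell
# Placeholders: substitute the real SSH IP and the key created earlier.
NODE_IP="203.0.113.10"              # proxy or direct SSH IP (placeholder)
KEY_PATH="$HOME/.ssh/nodeshift_key" # key selected at node creation
# Remove the leading `echo` to actually connect:
echo ssh -i "$KEY_PATH" "root@$NODE_IP"
```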
Next, if you want to check the GPU details, run the command below:
nvidia-smi
Note: We will be running QwQ-32B-Preview-Abliterated on both Open WebUI and Terminal.
What is Open WebUI?
Open WebUI is a versatile web-based platform designed to integrate smoothly with a range of language processing interfaces, like Ollama and other tools compatible with OpenAI-style APIs. It offers a suite of features that streamline managing and interacting with language models, adaptable for both server and personal use, transforming your setup into an advanced workstation for language tasks.
This platform lets you manage and communicate with language models through an easy-to-use graphical interface, accessible on both desktops and mobile devices. It even incorporates a voice interaction feature, making it as natural as having a conversation.
How to set up Open WebUI?
We have a separate blog post on Open WebUI. In this blog post, we provide a step-by-step and detailed guide on setting up Open WebUI. If you want to run this model on Open WebUI, check out the blog using the link below:
Link: https://nodeshift.com/blog/running-ai-models-with-open-webui
Step 8: Install Ollama
After setting up Open WebUI, it's time to install Ollama from the Ollama website.
Website Link: https://ollama.com/
Run the following command to install Ollama:
curl -fsSL https://ollama.com/install.sh | sh
Step 9: Serve Ollama
Run the following command to start the Ollama server so the model can be accessed by other tools:
ollama serve
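To confirm the server came up, you can query Ollama's `/api/version` REST endpoint. The sketch below assumes Ollama's default address, `http://localhost:11434`:

```python
# Minimal health check for a local Ollama server.
# Assumes the default address http://localhost:11434; adjust if yours differs.
import json
import urllib.request


def ollama_version(base_url="http://localhost:11434", timeout=3.0):
    """Return the server's version string, or None if it is unreachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/version", timeout=timeout) as resp:
            return json.load(resp).get("version")
    except OSError:  # URLError (connection refused, timeout) subclasses OSError
        return None


if __name__ == "__main__":
    version = ollama_version()
    if version:
        print(f"Ollama is running (version {version})")
    else:
        print("Ollama is not reachable - is `ollama serve` running?")
```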
Now, both Open WebUI and Ollama are running.
Step 10: Pull QwQ-32B-Preview-abliterated Model
Run the following command to pull the QwQ-32B-Preview-abliterated model:
ollama pull huihui_ai/qwq-abliterated
Step 11: Check Available Models
Run the following command to check if the downloaded models are available:
ollama list
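The same check can be scripted against Ollama's `/api/tags` endpoint, which returns the locally pulled models as JSON (default server address assumed):

```python
# List locally available Ollama models via the REST API.
# Assumes the default address; returns [] if the server is down.
import json
import urllib.request


def installed_models(base_url="http://localhost:11434", timeout=3.0):
    """Return the names of locally pulled models, or [] if unreachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            return [m["name"] for m in json.load(resp).get("models", [])]
    except OSError:
        return []


if __name__ == "__main__":
    names = installed_models()
    print("qwq-abliterated pulled:",
          any(n.startswith("huihui_ai/qwq-abliterated") for n in names))
```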
Step 12: Run Model on Open WebUI
Now, refresh the Open WebUI interface to ensure that the QwQ-32B-Preview-abliterated model is available.
Then, select the model and start interacting with it.
Check the screenshots below for the output.
Note: This is a step-by-step guide for interacting with your model on Open WebUI. If you prefer not to set up Open WebUI, you can simply install Ollama, pull the model, and start interacting with it directly in the terminal.
Step-by-Step Guide for Using QwQ-32B-Preview-Abliterated Model
Option 1: Using Open WebUI
- Set Up Open WebUI: Follow the setup guide to configure Open WebUI.
- Download the Model: Ensure QwQ-32B-Preview-Abliterated is downloaded.
- Refresh the Interface: Refresh the Open WebUI interface to make the model visible.
- Select the Model: Choose QwQ-32B-Preview-Abliterated from the list.
- Start Interaction: Enter your prompts and interact with the model through the interface.
Option 2: Using Terminal
- Download the Model: Use the appropriate command to download QwQ-32B-Preview-Abliterated.
ollama pull huihui_ai/qwq-abliterated
- Run the Model: Start the model in the terminal with:
ollama run huihui_ai/qwq-abliterated
- Start Interaction: Input your prompts in the terminal to begin interacting with the model.
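Beyond the interactive terminal, the model can also be driven programmatically through Ollama's `/api/generate` REST endpoint. The sketch below only builds the request; actually sending it requires `ollama serve` to be running with the model pulled:

```python
# Build a request for Ollama's /api/generate endpoint.
# The model tag matches the one pulled above; the server address is
# Ollama's default and may differ in your setup.
import json
import urllib.request


def build_generate_request(prompt, model="huihui_ai/qwq-abliterated"):
    """Return a POST request for a single (non-streaming) completion."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        "http://localhost:11434/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# With the server running, the completion text is in the "response" field:
# with urllib.request.urlopen(build_generate_request("Why is the sky blue?")) as r:
#     print(json.load(r)["response"])
```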
Conclusion
The QwQ-32B-Preview-abliterated model from huihui-ai offers advanced, less restricted reasoning capabilities to developers and researchers. By following this step-by-step guide, you can easily deploy QwQ-32B-Preview-abliterated on a cloud-based virtual machine using a GPU-powered setup from NodeShift to maximize its potential. NodeShift provides a user-friendly, secure, and cost-effective platform to run your models efficiently. It's an ideal choice for those exploring QwQ-32B-Preview-abliterated and other cutting-edge models.