Machine learning has transformed numerous industries; however, integrating machine-learning models into user-friendly web applications remains a daunting task for many developers. Traditional approaches often mean juggling web frameworks, backend logic, and frontend interfaces, each of which requires specialized knowledge. Fortunately, Gradio simplifies this process by letting you create and deploy fully interactive machine-learning web apps in just a few minutes. Whether you’re showcasing a model, collecting user inputs, or building a proof of concept, Gradio streamlines the workflow for developers and non-technical users alike.
With Gradio, you can go from model to web app with minimal effort, focusing more on innovation and less on infrastructure. Let’s explore how you can set up your own machine-learning web app in no time.
Prerequisites
- A virtual machine (GPU or CPU, such as the ones provided by NodeShift) with at least:
  - 2 vCPUs
  - 10 GB RAM
  - 50 GB SSD
- Ubuntu 22.04 as the OS image
Note: The prerequisites vary widely across use cases; a large-scale deployment would call for a higher-end configuration.
Step-by-step process to set up a Gradio web app on Ubuntu
For this tutorial, we’ll use a CPU-powered Virtual Machine by NodeShift, which provides high-compute Virtual Machines at a very affordable cost on a scale that meets GDPR, SOC2, and ISO27001 requirements. It also offers an intuitive and user-friendly interface, making it easier for beginners to get started with Cloud deployments. However, feel free to use any cloud provider you choose and follow the same steps for the rest of the tutorial.
Step 1: Setting up a NodeShift Account
Visit app.nodeshift.com and create an account by filling in basic details, or continue signing up with your Google/GitHub account.
If you already have an account, log in straight to your dashboard.
Step 2: Create a Compute Node (CPU Virtual Machine)
After accessing your account, you should see a dashboard. Now:
- Navigate to the menu on the left side.
- Click on the Compute Nodes option.
- Click on Start to create your very first compute node.
These compute nodes are CPU-powered virtual machines from NodeShift. They are highly customizable and let you control configuration options such as vCPUs, RAM, and storage according to your needs.
Step 3: Select configuration for VM
- The first option you see is the Reliability dropdown. This option lets you choose the uptime guarantee level you seek for your VM (e.g., 99.9%).
- Next, select a geographical region from the Region dropdown where you want to launch your VM (e.g., United States).
- Most importantly, select the correct specifications for your VM according to your workload requirements by sliding the bars for each option.
Step 4: Choose VM Configuration and Image
- After selecting your required configuration options, you’ll see the available VMs in your region that match (or come very close to) your configuration. In our case, we’ll choose a ‘4vCPUs/16GB/100GB SSD’ compute node.
- Next, you’ll need to choose an image for your Virtual Machine. For the scope of this tutorial, we’ll select Ubuntu.
Step 5: Choose the Billing cycle and Authentication Method
- Two billing cycle options are available: Hourly, ideal for short-term usage, offering pay-as-you-go flexibility, and Monthly for long-term projects with a consistent usage rate and potentially lower cost.
- Next, you’ll need to select an authentication method. Two methods are available: Password and SSH Key. We recommend using SSH keys, as they are a more secure option. To create one, head over to our official documentation.
Step 6: Finalize Details and Create Deployment
Finally, you can also add a VPC (Virtual Private Cloud), which provides an isolated section to launch your cloud resources (Virtual machine, storage, etc.) in a secure, private environment. We’re keeping this option as the default for now, but feel free to create a VPC according to your needs.
Also, you can deploy multiple nodes at once using the Quantity option.
That’s it! You are now ready to deploy the node. Finalize the configuration summary; if it looks good, go ahead and click Create to deploy the node.
Step 7: Connect to active Compute Node using SSH
As soon as you create the node, it will be deployed within a few seconds to a minute. Once deployed, the status will show Running in green, meaning the compute node is ready to use!
Once your node shows this status, follow the steps below to connect to the running VM via SSH:
- Open your terminal and run the SSH command below, replacing root with your username and ip with your VM’s IP address (copied from the dashboard):
ssh root@ip
2. In some cases, your terminal may ask for confirmation before connecting. Enter ‘yes’.
3. A prompt will request a password. Type the SSH password, and you should be connected.
Output:
Step 8: Install libraries and dependencies
- Update the Ubuntu package lists.
apt update
Output:
2. Upgrade all the packages.
apt upgrade -y
Output:
3. Confirm the Python version installed.
python3 -V
Output:
4. Install pip.
apt install -y python3-pip
Output:
5. Confirm the jinja2 version.
pip show jinja2
Output:
If the version is < 3.1.x, upgrade jinja2 with the following command.
pip install --upgrade jinja2
Output:
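If you prefer to check the version from Python rather than pip show, a small sketch like this works (assumes Python 3.8+ for importlib.metadata):

```python
from importlib.metadata import version

# Parse the installed jinja2 version and flag anything older than 3.1.
ver = version("jinja2")
major, minor = (int(part) for part in ver.split(".")[:2])
needs_upgrade = (major, minor) < (3, 1)
print(f"jinja2 {ver} -> {'upgrade needed' if needs_upgrade else 'OK'}")
```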
6. Finally, install all the required libraries one by one. You could install them all in a single command, but that can take a long time and may stall if the connection is unstable.
pip3 install realesrgan
pip3 install gfpgan
pip3 install basicsr
pip3 install gradio
- realesrgan: A package for high-quality image super-resolution using Real-ESRGAN.
- gfpgan: A package for face restoration in images using GFPGAN.
- basicsr: A foundational library for image processing tasks, supporting super-resolution and restoration.
- gradio: A library to create interactive web-based interfaces for machine learning models with minimal coding.
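Once the installs finish, a quick sanity check confirms all four packages are importable (a small sketch; it only reports, it installs nothing):

```python
import importlib.util

# find_spec returns None when a package is not importable in this environment.
required = ["realesrgan", "gfpgan", "basicsr", "gradio"]
missing = [name for name in required if importlib.util.find_spec(name) is None]

if missing:
    print("Still missing:", ", ".join(missing))
else:
    print("All four packages are importable.")
```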
7. Install some other dependencies.
apt install libgl1-mesa-glx libglib2.0-0 -y
Output:
Step 9: Create a Gradio web application
- Create a project directory.
mkdir -p /opt/gradio-app/
2. Next, set the ownership and permissions of the directory.
chown -R root:root /opt/gradio-app/
chmod -R 775 /opt/gradio-app/
3. Navigate to the project directory and create a Python file named app.py to write the code.
cd /opt/gradio-app/
nano app.py
4. Write code similar to the following snippet in app.py.
import gradio as gr
from gfpgan import GFPGANer
from basicsr.archs.rrdbnet_arch import RRDBNet
from realesrgan import RealESRGANer
import numpy as np

def enhance_image(input_image):
    arch = 'clean'
    gfpgan_checkpoint = 'https://github.com/TencentARC/GFPGAN/releases/download/v1.3.4/GFPGANv1.4.pth'
    realesrgan_checkpoint = 'https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.1/RealESRGAN_x2plus.pth'

    # Background upsampler: Real-ESRGAN with an RRDBNet backbone (2x upscale)
    rrdbnet = RRDBNet(num_in_ch=3, num_out_ch=3, num_feat=64, num_block=23, num_grow_ch=32, scale=2)
    bg_upsampler = RealESRGANer(
        scale=2,
        model_path=realesrgan_checkpoint,
        model=rrdbnet,
        tile=400,       # process the background in 400px tiles to limit memory use
        tile_pad=10,
        pre_pad=0,
        half=True
    )

    # Face restorer: GFPGAN for faces, Real-ESRGAN for the background
    restorer = GFPGANer(
        model_path=gfpgan_checkpoint,
        upscale=2,
        arch=arch,
        channel_multiplier=2,
        bg_upsampler=bg_upsampler
    )

    # Gradio passes the uploaded image as a numpy array; ensure 8-bit pixels
    input_image = input_image.astype(np.uint8)
    cropped_faces, restored_faces, restored_img = restorer.enhance(input_image)
    return restored_faces[0], restored_img

interface = gr.Interface(
    fn=enhance_image,
    inputs=gr.Image(),
    outputs=[gr.Image(), gr.Image()],
    live=True,
    title="Enhance your image with GFPGAN",
    description="Upload an image of a face and see it enhanced using GFPGAN. Two outputs will be displayed: restored_faces and restored_img."
)

# Listen on all interfaces so the app is reachable from outside the VM
interface.launch(server_name="0.0.0.0", server_port=8080)
Here’s how the code looks in the file:
Save and close the editor (Ctrl+O > ENTER > Ctrl+X).
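One detail worth noting in the snippet above: gr.Image() hands enhance_image a NumPy array, and the astype(np.uint8) cast ensures the 8-bit 0–255 pixel format the models expect. A tiny sketch of that conversion, with a random stand-in image:

```python
import numpy as np

# Stand-in for what gr.Image() passes in: an H x W x 3 array of pixel values.
img = np.random.rand(8, 8, 3) * 255   # float64 values in [0, 255)
img_u8 = img.astype(np.uint8)         # 8-bit image, as GFPGAN expects

print(img_u8.dtype, img_u8.shape)     # uint8 (8, 8, 3)
```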
5. Run the code with the following command.
python3 app.py
Troubleshooting Errors
Now, when you run the above command, you may come across an error like this:
It essentially says that the current torchvision version is incompatible with the code. Let’s fix that with the command below:
pip3 install torch==1.12.0+cu113 torchvision==0.13.0+cu113 --extra-index-url https://download.pytorch.org/whl/cu113
After this, when we run python3 app.py again, one more error appears:
It’s caused by an incompatible NumPy version, so we’ll need to downgrade it from 2.x.x to 1.x.x. Run the commands below one by one to downgrade NumPy.
pip3 uninstall numpy
pip3 install numpy==1.23.5
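Before relaunching, you can confirm the downgrade took effect from inside python3 (a small sketch):

```python
import numpy as np

# The GFPGAN/basicsr stack used in this tutorial expects a 1.x NumPy release.
major = int(np.__version__.split(".")[0])
print(f"NumPy {np.__version__} -> major version {major}")
```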
Run the python3 app.py command again, and this time the app should launch without any issues:
Press Ctrl+C
to stop the process.
Step 10: Set up the Gradio app as a system service
To manage the Gradio app as a system service, we will create a systemd unit file and register a new service.
- Create a gradio.service file using Nano.
nano /etc/systemd/system/gradio.service
2. Add the following system configuration to the file.
[Unit]
Description=My Gradio Web Application
[Service]
ExecStart=/usr/bin/python3 /opt/gradio-app/app.py
WorkingDirectory=/opt/gradio-app/
Restart=always
User=root
Environment=PATH=/usr/bin:/usr/local/bin
Environment=PYTHONUNBUFFERED=1
[Install]
WantedBy=multi-user.target
Here’s how the file looks:
3. Reload the system daemon to save the changes.
systemctl daemon-reload
4. Start and enable the Gradio service.
systemctl start gradio
systemctl enable gradio
5. Confirm if the Gradio service is running correctly.
systemctl status gradio
Output:
Step 11: Configure NGINX to expose the app
To expose the Gradio web interface on the internet and make it accessible to others, we’ll configure an NGINX reverse proxy to handle requests on HTTP port 80.
- Install the NGINX package.
apt install nginx -y
2. Create an NGINX configuration file for the app.
nano /etc/nginx/conf.d/gradio.conf
3. Add a configuration similar to the snippet below to the file, replacing gradio.<YOUR_DOMAIN>.com with your own domain name (or your server’s IP address) and 127.0.0.1:8080 with the address where the Gradio service is running.
server {
    listen 80;
    server_name gradio.<YOUR_DOMAIN>.com;

    location / {
        proxy_pass http://127.0.0.1:8080/;
    }
}
This is how the configuration looks:
4. Check the file syntax.
nginx -t
Output:
If you see the above output, the configuration is valid and you can proceed.
5. Now, restart NGINX.
systemctl restart nginx
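One caveat: Gradio’s frontend keeps a websocket (or server-sent events) connection open for live interactions, and a bare proxy_pass can drop it. If the UI loads behind the proxy but interactions hang, a variant of the same server block that forwards the upgrade headers often helps (a sketch, not a guaranteed drop-in; adjust the names as before):

```
server {
    listen 80;
    server_name gradio.<YOUR_DOMAIN>.com;

    location / {
        proxy_pass http://127.0.0.1:8080/;
        # Forward websocket upgrade headers so live interactions survive the proxy
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_set_header Host $host;
    }
}
```

Remember to rerun nginx -t and systemctl restart nginx after any change.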
Step 12: Enable Ubuntu Firewall
To accept incoming HTTP connections, allow the following ports with ufw.
ufw allow 80/tcp
ufw allow 443/tcp
ufw allow ssh
Enable the firewall and check the status (make sure the SSH rule above is in place first, or you could lock yourself out of the VM).
ufw enable
ufw status
Step 13: Access the web application
Finally, you can access the Gradio web app in the browser at http://<YOUR_SERVER_IP>, or at https://gradio.your_domain.com if you launch the app with an SSL certificate.
As you can see, our Image Enhancer web app is successfully up and running using Gradio.
Conclusion
We explored how Gradio makes it incredibly easy to set up interactive machine-learning web apps in minutes, letting developers focus on showcasing their models rather than wrestling with complex web development. By combining Gradio’s simplicity with robust compute like that offered by NodeShift, you can streamline your development process even further, ensuring efficient deployments and scalable solutions. NodeShift supports developers in managing modern applications in production, making it well suited for deploying web apps seamlessly in cloud-native environments.