Browser-Use WebUI is a user-friendly interface built on Gradio, designed to simplify interactions with AI agents directly from your browser. It supports a wide range of functionalities, including custom browser integration for seamless authentication, persistent browser sessions to maintain task history, and even high-definition screen recording for enhanced usability.
With expanded support for multiple Large Language Models (LLMs) like Gemini, OpenAI, Azure OpenAI, Anthropic, DeepSeek, Ollama, and more, Browser-Use WebUI ensures flexibility and scalability for various applications.
Fully open-source under the MIT License, Browser-Use WebUI is an excellent opportunity for open-source contributors to make an impact. Dive in, contribute, and experience the power of running AI agents effortlessly in your browser!
Installation Options for Browser-Use WebUI
There are multiple ways to install and set up the Browser-Use WebUI, making it versatile and accessible for users with different preferences. You can opt for a local installation using Python 3.11 or higher, set up dependencies, and install Playwright manually. Alternatively, you can use Docker for a more streamlined experience. With Docker, you can quickly clone the repository, configure environment variables, and build the container to get started. Docker also offers flexible options like persistent browser sessions or default mode to cater to different usage needs.
Why Docker Installation?
While several installation methods are available, using Docker simplifies the process and ensures a consistent environment across systems. With Docker, you don’t need to worry about dependency conflicts or manual configuration, making it a reliable choice. Additionally, Docker’s persistent browser session feature enables seamless task management, retaining interaction history and state for a smoother experience. It’s an efficient, user-friendly way to deploy the Browser-Use WebUI.
Resource
GitHub
Link: https://github.com/browser-use/browser-use
Prerequisites
- A Virtual Machine (such as the ones provided by NodeShift) with at least:
  - 16 vCPUs
  - 64GB RAM
  - 250GB SSD
- Ubuntu 22.04 VM
- Access to your server via SSH
“We chose this configuration for smooth execution. You can also use a lower configuration for this tool, but the performance will be slower.”
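If you want to compare an existing Linux machine against the configuration above, a small report like the following can help. This is a minimal sketch that assumes a Linux system with `nproc`, `awk`, `df`, and `/proc/meminfo` available; it only prints values and does not enforce the minimums.

```shell
# Report this machine's resources next to the configuration used in this guide.
cpus="$(nproc)"

# MemTotal in /proc/meminfo is reported in kB; convert to whole GiB (rounded down).
mem_gib="$(awk '/^MemTotal:/ { printf "%d", $2 / 1024 / 1024 }' /proc/meminfo)"

# Size of the root filesystem in GiB (POSIX df output, 1 KiB blocks).
disk_gib="$(df -Pk / | awk 'NR == 2 { printf "%d", $2 / 1024 / 1024 }')"

echo "vCPUs:  ${cpus} (guide uses 16)"
echo "Memory: ${mem_gib} GiB (guide uses 64GB)"
echo "Disk:   ${disk_gib} GiB on / (guide uses 250GB)"
```

The reported memory and disk figures are usually slightly below the nominal size of the VM, because part of each is reserved by the kernel and the filesystem.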
Step-by-Step Process to Install Browser-Use Web UI
For the purpose of this tutorial, we will use a CPU-powered Virtual Machine offered by NodeShift; however, you can replicate the same steps with any other cloud provider of your choice. NodeShift provides the most affordable Virtual Machines at a scale that meets GDPR, SOC2, and ISO27001 requirements.
However, if you prefer to use a GPU-powered Virtual Machine, you can still follow this guide. Browser-Use Web UI also works on GPU-based VMs, where it performs better and faster than on a CPU VM. The installation process remains largely the same, allowing you to achieve similar functionality on a GPU-powered machine. NodeShift’s infrastructure is versatile, enabling you to choose between GPU or CPU configurations based on your specific needs and budget.
Let’s dive into the setup and installation steps to get Browser-Use Web UI running efficiently on your chosen virtual machine.
Step 1: Sign Up and Set Up a NodeShift Cloud Account
Visit the NodeShift Platform and create an account. Once you’ve signed up, log into your account.
Follow the account setup process and provide the necessary details and information.
Step 2: Create a Compute Node (CPU Virtual Machine)
NodeShift Compute Nodes offer flexible and scalable on-demand resources, such as NodeShift Virtual Machines, which are easy to deploy and come as general-purpose, CPU-powered, or storage-optimized nodes.
- Navigate to the menu on the left side.
- Select the Compute Nodes option.
- Click the Create Compute Nodes button in the Dashboard to create your first deployment.
Step 3: Select Virtual Machine Uptime Guarantee
- Choose the Virtual Machine Uptime Guarantee option based on your needs. NodeShift offers an uptime SLA of 99.99% for high reliability.
- Click “Show reliability info” to review detailed SLA and reliability options.
Step 4: Select a Region
In the “Compute Nodes” tab, select a geographical region where you want to launch the Virtual Machine (e.g., the United States).
Step 5: Choose VM Configuration
- NodeShift provides two options for VM configuration:
  - Manual Configuration: Adjust the CPU, RAM, and Storage to your specific requirements.
    - Select the number of CPUs (1–96).
    - Choose the amount of RAM (1 GB–768 GB).
    - Specify the storage size (20 GB–4 TB).
  - Predefined Configuration: Choose from predefined configurations optimized for General Purpose, CPU-Powered, or Storage-Optimized nodes.
- If you prefer custom specifications, manually configure the CPU, RAM, and Storage. Otherwise, select a predefined VM configuration suitable for your workload.
Step 6: Choose an Image
Next, you will need to choose an image for your Virtual Machine. We will deploy the VM on Ubuntu, but you can choose according to your preference. Other options, such as CentOS and Debian, are also available for installing Browser-Use Web UI.
Step 7: Choose the Billing Cycle & Authentication Method
- Select the billing cycle that best suits your needs. Two options are available: Hourly, ideal for short-term usage and pay-as-you-go flexibility, or Monthly, perfect for long-term projects with a consistent usage rate and potentially lower overall cost.
- Select the authentication method. There are two options: Password and SSH Key. SSH keys are a more secure option. To create them, refer to our official documentation.
Step 8: Additional Details & Complete Deployment
- The ‘Finalize Details’ section allows users to configure the final aspects of the Virtual Machine.
- After finalizing the details, click the ‘Create’ button, and your Virtual Machine will be deployed.
Step 9: Virtual Machine Successfully Deployed
You will get visual confirmation that your node is up and running.
Step 10: Connect via SSH
- Open your terminal
- Run the SSH command, replacing ip with your VM’s public IP address. For example, if your username is root, the command would be:
ssh root@ip
- If SSH keys are set up, the terminal will authenticate using them automatically.
- If prompted for a password, enter the password associated with the username on the VM.
- You should now be connected to your VM!
Step 11: Install Docker
Add Docker GPG Key and Repository
Run the following commands to add Docker’s official GPG key and set up the stable repository:
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg
echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
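The two commands above assume that curl, gpg, and lsb_release are already installed, which is usually but not always the case on a fresh Ubuntu image. As a minimal sketch (the `missing_cmds` helper is a hypothetical name introduced here, not part of Docker’s tooling), the following reports any of them that cannot be found:

```shell
# Print the name of every given command that is not available on PATH.
missing_cmds() {
  for cmd in "$@"; do
    command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
  done
}

# Tools used by the GPG key and repository commands in this step.
missing="$(missing_cmds curl gpg lsb_release)"

if [ -n "$missing" ]; then
  echo "Missing tools:"
  echo "$missing"
else
  echo "All required tools are present."
fi
```

On Ubuntu, these tools are typically provided by the curl, gnupg, and lsb-release packages.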
Update the Package Index
Update the local package database to include the Docker packages:
sudo apt update
Install Docker
Install Docker Engine and related components:
sudo apt install docker-ce docker-ce-cli containerd.io -y
Verify Docker Installation
Check Docker’s service status and version to ensure it’s installed successfully:
sudo systemctl status docker
docker --version
Start Docker (If Not Running)
If Docker is inactive, start the service:
sudo systemctl start docker
sudo systemctl enable docker
Step 12: Clone the Repository
Run the following command to clone the Browser-Use Web UI repository:
git clone https://github.com/browser-use/web-ui.git
cd web-ui
Step 13: Install Vim
So, what is Vim?
Vim is a terminal-based text editor, an improved version of vi. The last line of the editor window is used to enter commands and to display information.
Note: If running vi or vim fails with a “command not found” error, install Vim using the steps below.
Step 1: Update the package list
Before installing any software, we will update the package list using the following command in your terminal:
sudo apt update
Step 2: Install Vim
To install Vim, enter the following command:
sudo apt install -y vim
This command will retrieve and install Vim and its necessary components.
Step 14: Configure Environment Variables
Run the following command to create a .env file from the provided example, then open .env in an editor (for example, with vim .env) and add your API keys:
cp .env.example .env
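If you prefer not to edit the file by hand, a key can also be set from the shell. The sketch below defines a hypothetical helper, `set_env_var`, that replaces an existing `KEY=value` line or appends a new one. The variable name `OPENAI_API_KEY` in the usage comment is an assumption; check .env.example for the exact names the project expects.

```shell
# Set KEY=VALUE in an env file: replace the line if KEY exists, otherwise append it.
set_env_var() {
  file="$1"; key="$2"; value="$3"
  [ -f "$file" ] || { echo "No such file: $file" >&2; return 1; }

  if grep -q "^${key}=" "$file"; then
    tmp="$(mktemp)"
    # Pass key and value through the environment so awk does not
    # interpret backslashes or other special characters in them.
    KEY="$key" VALUE="$value" awk -F= '
      $1 == ENVIRON["KEY"] { print ENVIRON["KEY"] "=" ENVIRON["VALUE"]; next }
      { print }
    ' "$file" > "$tmp"
    # Write back in place so the original file keeps its permissions.
    cat "$tmp" > "$file"
    rm -f "$tmp"
  else
    printf '%s=%s\n' "$key" "$value" >> "$file"
  fi
}

# Example usage (assumed variable name, placeholder value):
#   set_env_var .env OPENAI_API_KEY "your-api-key-here"
```

The same helper could be used for other settings in the file, such as the VNC_PASSWORD variable mentioned later in this guide.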
Step 15: Create OpenAI API Key
To use the OpenAI API, you need to create an API key. This key will allow you to securely access OpenAI’s services. Follow these steps to generate your API key:
Visit the OpenAI platform and log in to your account. If you do not have an account, you will need to sign up.
Once logged in, navigate to the top right corner of the page where your profile icon is located. Click on it and select API from the dropdown menu. Alternatively, you can directly access the API section by clicking on API in the main dashboard.
In the API section, look for an option that says Create new secret key or View API Key. Click on this option.
After clicking on create, a new API key will be generated for you. Make sure to copy this key immediately as it will only be shown once.
Step 16: Run with Docker
Execute the following command from the web-ui directory to build and start the containers with Docker Compose:
docker compose up --build
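The command above relies on the Docker Compose v2 plugin, which Step 11 does not install explicitly. As a minimal sketch (`compose_ready` is a hypothetical helper name), the following checks whether the `docker compose` subcommand is usable before you start a build:

```shell
# Succeeds only if the docker CLI and its Compose v2 plugin are both usable.
compose_ready() {
  command -v docker >/dev/null 2>&1 && docker compose version >/dev/null 2>&1
}

if compose_ready; then
  echo "Docker Compose plugin found."
else
  echo "Docker Compose plugin not found; on Ubuntu it ships in the docker-compose-plugin package."
fi
```

Note that `docker compose up --build` keeps running in the foreground. If you would rather keep the terminal free, `docker compose up --build -d` starts the containers in the background, `docker compose logs -f` follows their output, and `docker compose down` stops them.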
Step 17: Access the Application
- WebUI:
http://localhost:7788
- VNC Viewer (to see browser interactions):
http://localhost:6080/vnc.html
Default VNC password is “vncpassword”. You can change it by setting the VNC_PASSWORD environment variable in your .env file.
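Because the containers run on the remote VM, localhost in the URLs above refers to the VM itself, not to your own computer. One option, assuming you connect over SSH as in Step 10, is to forward both ports to your local machine. The sketch below only builds and prints the command so you can review it first; the user name and IP address are placeholders.

```shell
# Hypothetical placeholders: replace with your VM's user and public IP address.
VM_USER="root"
VM_IP="203.0.113.10"   # documentation-only example address

# -L local_port:host:remote_port forwards a local port to the VM;
# -N opens the tunnel without starting a remote shell.
TUNNEL_CMD="ssh -N -L 7788:localhost:7788 -L 6080:localhost:6080 ${VM_USER}@${VM_IP}"

# Print the command for review; run it from your local machine, not from the VM.
echo "$TUNNEL_CMD"
```

While the tunnel is open, http://localhost:7788 and http://localhost:6080/vnc.html in your local browser reach the services running on the VM.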
Step 18: Choose the LLM provider
In the web UI, select an LLM provider from the available options, such as Anthropic, Ollama, and OpenAI.
- For example, if you select OpenAI, Browser-Use Web UI will enable integration with OpenAI’s API services.
- If you select Anthropic, it connects to models like Claude through their API.
- Ensure that you have the API key or credentials for the chosen provider.
We will use OpenAI for this demo; you can choose an LLM Provider of your choice.
Step 19: Browser Settings
Choose the browser settings.
Step 20: Run Agent
Run the agent and experiment with it.
Conclusion
Browser-Use WebUI offers a seamless, open-source solution for interacting with advanced agents and models directly from your browser. Its user-friendly interface, robust integration options, and support for multiple LLM providers make it a versatile tool for various applications. With easy installation methods, including Docker, it ensures a consistent and reliable setup while providing features like persistent sessions and custom browser settings. Released under the MIT License, it invites open-source contributors to enhance its functionality further. Whether for individual exploration or collaborative projects, Browser-Use WebUI empowers users to efficiently harness the potential of modern technologies.