The intersection of AI with web automation has given rise to Browser Use agents: AI agents that can independently navigate and manipulate web content, from gathering simple information to executing complex sequences of actions on websites autonomously. By leveraging DeepSeek’s AI model to understand and interact with web environments, you can build agents that handle everything from simple searches and data retrieval to more advanced interactions such as booking flight tickets or shopping across websites, all without extensive coding knowledge. The process involves setting up the environment, integrating DeepSeek with Browser Use’s user-friendly Web UI application, and tailoring the agent’s behavior to your specific automation needs.
In this article, we will explore how to build a Browser Use agent using DeepSeek. By the end, you’ll have a foundational understanding of how to create and deploy your own Browser Use agent, ready to intelligently tackle the automation of your daily web activities.
Prerequisites
- A virtual machine (GPU or CPU, such as the ones provided by NodeShift) with at least:
- 2 vCPUs
- 8 GB RAM
- 50 GB SSD
- Ubuntu 22.04 VM
Note: These prerequisites vary widely across use cases; a higher-end configuration may be required for a large-scale deployment.
Step-by-step process to build a browser use agent with DeepSeek
For this tutorial, we’ll use a CPU-powered Virtual Machine by NodeShift, which provides high-compute Virtual Machines at a very affordable cost on a scale that meets GDPR, SOC2, and ISO27001 requirements. It also offers an intuitive and user-friendly interface, making it easier for beginners to get started with Cloud deployments. However, feel free to use any cloud provider you choose and follow the same steps for the rest of the tutorial.
Step 1: Setting up a NodeShift Account
Visit app.nodeshift.com and create an account by filling in basic details, or continue signing up with your Google/GitHub account.
If you already have an account, log in straight to your dashboard.
Step 2: Create a Compute Node (CPU Virtual Machine)
After accessing your account, you should see a dashboard (see image), now:
- Navigate to the menu on the left side.
- Click on the Compute Nodes option.
- Click on Start to start creating your very first compute node.
These Compute Nodes are CPU-powered virtual machines from NodeShift. They are highly customizable and let you control configuration options such as vCPUs, RAM, and storage according to your needs.
Step 3: Select configuration for VM
- The first option you see is the Reliability dropdown. This option lets you choose the uptime guarantee level you seek for your VM (e.g., 99.9%).
- Next, select a geographical region from the Region dropdown where you want to launch your VM (e.g., United States).
- Most importantly, select the correct specifications for your VM according to your workload requirements by sliding the bars for each option.
Step 4: Choose VM Configuration and Image
- After selecting your required configuration options, you’ll see the available VMs in your region that match (or come very close to) your configuration. In our case, we’ll choose an ‘8vCPU/32GB/200GB SSD’ Compute Node.
- Next, you’ll need to choose an image for your Virtual Machine. For the scope of this tutorial, we’ll select Ubuntu.
Step 5: Choose the Billing cycle and Authentication Method
- Two billing cycle options are available: Hourly, ideal for short-term usage, offering pay-as-you-go flexibility, and Monthly for long-term projects with a consistent usage rate and potentially lower cost.
- Next, you’ll need to select an authentication method. Two methods are available: Password and SSH Key. We recommend using SSH keys, as they are a more secure option. To create one, head over to our official documentation.
Step 6: Finalize Details and Create Deployment
Finally, you can also add a VPC (Virtual Private Cloud), which provides an isolated section to launch your cloud resources (Virtual machine, storage, etc.) in a secure, private environment. We’re keeping this option as the default for now, but feel free to create a VPC according to your needs.
Also, you can deploy multiple nodes at once using the Quantity option.
That’s it! You are now ready to deploy the node. Finalize the configuration summary; if it looks good, go ahead and click Create to deploy the node.
Step 7: Connect to active Compute Node using SSH
As soon as you create the node, it will be deployed within a few seconds to a minute. Once deployed, you will see the status Running in green, meaning your Compute Node is ready to use!
Once your node shows this status, follow the below steps to connect to the running VM via SSH:
- Open your terminal and run the SSH command below (replace root with your username and ip with your VM’s IP address, copied from the dashboard):
ssh root@ip
2. In some cases, your terminal may ask for confirmation before connecting. Enter ‘yes’.
3. A prompt will request a password. Type the SSH password, and you should be connected.
Output:
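Optionally, to avoid retyping the IP address every time, you can add a host alias to the SSH config on your local machine. This is a minimal sketch; the alias nodeshift-vm is just a placeholder, and the IdentityFile line only applies if you chose SSH key authentication:
# ~/.ssh/config on your local machine; "nodeshift-vm" is a hypothetical alias
Host nodeshift-vm
    HostName <YOUR_SERVER_IP>
    User root
    IdentityFile ~/.ssh/id_ed25519
With this in place, running ssh nodeshift-vm connects directly.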
Step 8: Install Docker
First, we need Docker to run the Browser Use Web-UI container. Follow the steps below to quickly install Docker on Ubuntu.
- Use curl to add the GPG key for the Docker repository.
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg
2. Add Docker’s APT repository to the system’s source list
echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
3. Update the Ubuntu package index.
sudo apt update
Output:
4. Install Docker Engine and related dependencies.
sudo apt install docker-ce docker-ce-cli containerd.io -y
Output:
5. Confirm installation.
Check the Docker version to confirm the installation.
docker --version
Output:
6. Start and enable the Docker system service.
sudo systemctl start docker
sudo systemctl enable docker
Output:
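Optionally, you can confirm the Docker daemon is working by running Docker’s standard hello-world test image, which pulls a tiny image and prints a confirmation message if everything is set up correctly:
sudo docker run hello-world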
Step 9: Set up & run Browser Use Web UI
- Clone the Browser Use Web UI GitHub repository.
git clone https://github.com/browser-use/web-ui.git
Output:
2. Move inside the web-ui directory, then build and run the container with the following commands:
cd web-ui
docker compose up --build
Output:
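If you’d rather keep the terminal free, you can run the stack in the background using Docker Compose’s detached mode and follow the logs separately:
docker compose up --build -d
docker compose logs -f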
Step 10: Configure the Agent with DeepSeek
Once you start the Browser Use Web UI with Docker, you will be able to access the application in your browser at:
http://localhost:7788 #If you're on local server
http://<YOUR_SERVER_IP>:7788 #If you're on remote server
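If port 7788 is not publicly exposed on your remote server, one simple option is to tunnel it over SSH from your local machine and then open http://localhost:7788 locally (replace root and <YOUR_SERVER_IP> as before):
ssh -L 7788:localhost:7788 root@<YOUR_SERVER_IP>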
The application opens up like this:
- In the Agent Settings section, select the configuration for your agent such as:
- Agent Type: org to use the original agent, or custom to customize it (e.g., change its system prompt).
- Max Run Steps: Maximum number of steps you allow your browser agent to perform on the browser.
- Max Actions per Step: Maximum number of actions you allow your browser agent to perform in each step.
- Use Vision: Enable this to let your agent use visual processing capabilities while finding results.
2. LLM Configuration is the most important section for us, as this is where we’ll customize our agent with our choice of model.
There are two ways in which you can use the DeepSeek LLM here:
- Ollama Server: Install a DeepSeek-R1 model with Ollama on your local system. In the Web UI, select ollama as the LLM provider and choose the Model Name of the model you downloaded. Learn how you can install DeepSeek-R1 using Ollama.
- API Key: The second and more straightforward way to power your agent with DeepSeek is through DeepSeek’s official API. Select deepseek as the LLM provider and deepseek-reasoner as the model, then get your API key from the DeepSeek dashboard and paste it into the API Key field.
For the scope of this tutorial, we’ll go with the second option. However, feel free to use any option that fits your requirements.
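Before plugging the key into the Web UI, you can optionally sanity-check it from the terminal. This is a minimal sketch assuming DeepSeek’s OpenAI-compatible chat completions endpoint; replace <DEEPSEEK_API_KEY> with your actual key:
curl https://api.deepseek.com/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <DEEPSEEK_API_KEY>" \
  -d '{"model": "deepseek-reasoner", "messages": [{"role": "user", "content": "Hello"}]}'
A successful response confirms the key and model name are valid before you hand them to the agent.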
3. Review all the Browser Settings, and check the boxes if you want the agent to open and run the browser inside the Web UI itself instead of launching a separate browser window.
4. Finally, type the prompt in Task Description, e.g.:
“Open Eliza Agent article by NodeShift, and give me its url”.
And run it by clicking on Run Agent.
Inference – 1
Below are the snapshots of how the agent is working on the browser:
- Searches for the article name as per the prompt.
- Finds the list of relevant articles.
Inference – 2
We’ll test it with a different prompt. This time, we are asking the agent to find the best study cafes in Berlin and provide us with a list of the top ones.
Here are the snapshots of it working:
- In the same search bar, searches for the best cafes as per the second prompt.
- Scrolls through a list of relevant links to find the best one.
- Presents the final result, “Top 5 study cafes in Berlin”, in the Results section of the Web UI:
Below is the full view of the list that the agent presented in the results:
Top 5 study cafes in Berlin (based on Yelp):
1. 19grams Alex
2. Bonanza
3. Oslo Kaffebar
4. Cuccuma
5. Einstein Kaffee
Here’s the thought process flow the model followed while completing our task:
[CustomAgentBrain(prev_action_evaluation='Success - The page successfully loaded and displays search results for "best study cafe Berlin".', important_contents='Top 10 Best Study Cafe Near Berlin, Berlin · 1. 19grams Alex · 2. Bonanza · 3. Oslo Kaffebar · 4. Cuccuma · 5. Einstein Kaffee · 6. Maxway Coffee · 7.', task_progress='1. Searched Google for "best study cafe Berlin".', future_plans='1. Extract the names of the top 5 cafes from the search results.', thought='The Google search results page contains several links to articles and forums discussing study cafes in Berlin. The Yelp link provides a top 10 list, which is more than sufficient to fulfill the request for the top 5. I will extract the names from the Yelp listing.', summary='Extract the names of the top 5 cafes from the Yelp results on the current page.')]
Conclusion
This article explored the synergy between DeepSeek’s AI capabilities and Browser Use agents for automating web tasks. We provided a detailed walkthrough from setup to customization, demonstrating how this combination can transform the way we interact with web content. We also deployed our AI agent through NodeShift, which offers a secure, high-performance environment for running such agents, making this technology accessible and practical for everyday use.