We’re excited to announce Proxy Lite – a 3B parameter VLM that brings state of the art web automation capabilities to the open source community. Our WebVoyager results place Proxy Lite among the top performers in web automation tasks, while using just a fraction of the computational resources.
Proxy Lite also comes with a comprehensive framework for VLM-browser interaction giving enterprise grade browser control. You can try it out here!
Proxy Lite responses are done in 3 distinct steps allowing for better generalization than traditional prompt-prediction models.
Each response is separated into three parts:
<observation>The privacy consent banner has been successfully dismissed, allowing full access to the webpage. The search bar is visible, and the page is ready for interaction.</observation>
<thinking>The task of finding a vegetarian lasagna recipe has not yet been completed. I now have access to the search bar to begin searching for the recipe. I will type 'vegetarian lasagna' into the search bar and then click the search button to find relevant recipes.</thinking>
<tool_call>{"function": "click", "arguments": {"entries": [{"mark_id": 1, "content": "vegetarian lasagna"}]}}</tool_call>
Here you can see a more detailed breakdown of our task success rates. For even more detail check out our full trajectories on clusterfudge.
Website Name | Success Rate (%) | Finish Rate (%) | Avg. Messages |
---|---|---|---|
Allrecipes | 87.8 | 95.1 | 10.3 |
Amazon | 70.0 | 95.0 | 7.1 |
Apple | 82.1 | 89.7 | 10.7 |
ArXiv | 60.5 | 79.1 | 16.0 |
BBC News | 69.4 | 77.8 | 15.9 |
Booking | 70.0 | 85.0 | 24.8 |
Cambridge Dictionary | 86.0 | 97.7 | 5.7 |
Coursera | 82.5 | 97.5 | 4.7 |
ESPN | 53.8 | 97.5 | 14.9 |
GitHub | 85.0 | 92.5 | 10.0 |
Google Flights | 38.5 | 51.3 | 34.8 |
Google Map | 78.9 | 94.7 | 9.6 |
Google Search | 71.4 | 92.9 | 6.0 |
Huggingface | 68.6 | 74.3 | 18.4 |
Wolfram Alpha | 78.3 | 93.5 | 6.1 |
You can find the version of WebVoyager we used here.
We are excited to release this work to the open source community. While Proxy Lite is already the best open weights agent, we’re just getting started.
Proxy Lite shows that a smart training regime can compete with massive parameter counts. We invite developers and researchers to join us – whether by contributing code, building applications, or sharing use cases.
Stay tuned for updates, and don’t forget to star our repository!