agent-infra/browser

简体中文 | English

Agent Infra Browser

@agent-infra/browser aims to provide a comprehensive browser infrastructure SDK for AI agents.


What is this for?

This toolkit is specifically designed for:

  • GUI AI agents that need to interact with web browsers
  • Browser screen casting in non-VNC or headless scenarios
  • MCP service for browser automation control

Architecture

[Figure: architecture diagram]

Packages Overview

Core Browser Control Library. Abstracts and encapsulates the fundamental capabilities required to manipulate browsers.

Browser Screen Casting UI Components. Can connect to remote browsers via CDP and then display their screen casting content.
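As a rough illustration of what screen casting over CDP involves, the sketch below builds and handles the raw protocol messages. The method and event names (`Page.startScreencast`, `Page.screencastFrame`, `Page.screencastFrameAck`) come from the Chrome DevTools Protocol; the helper names `startScreencast` and `handleMessage` are hypothetical and are not this package's API. It assumes the messages travel over a CDP WebSocket that the caller has already opened.

```typescript
// Sketch only: hypothetical helpers, NOT the API of any package in this repo.
// They build and parse raw Chrome DevTools Protocol (CDP) JSON messages.

interface CdpCommand {
  id: number;
  method: string;
  params?: Record<string, unknown>;
}

interface ScreencastFrame {
  image: Buffer; // decoded frame bytes (JPEG in this sketch)
  ack: CdpCommand; // command to send back so the browser keeps streaming
}

// Build the command that asks a page to start streaming frames.
function startScreencast(id: number, maxWidth = 1280, maxHeight = 720): CdpCommand {
  return {
    id,
    method: "Page.startScreencast",
    params: { format: "jpeg", quality: 80, maxWidth, maxHeight, everyNthFrame: 1 },
  };
}

// Decode one incoming WebSocket message.
// Returns null for anything that is not a screencast frame event.
function handleMessage(raw: string, nextId: number): ScreencastFrame | null {
  const msg = JSON.parse(raw) as {
    method?: string;
    params?: { data: string; sessionId: number };
  };
  if (msg.method !== "Page.screencastFrame" || !msg.params) return null;
  return {
    image: Buffer.from(msg.params.data, "base64"),
    ack: {
      id: nextId,
      method: "Page.screencastFrameAck",
      params: { sessionId: msg.params.sessionId },
    },
  };
}
```

A caller would send `JSON.stringify(startScreencast(1))` over the socket, pass each incoming message to `handleMessage`, render `image`, and send `ack` back. Each frame needs an acknowledgement before the browser delivers further frames.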

Cross-Platform Browser Detection. Automatically locates installed browsers (Chrome, Edge, Firefox) on Windows, macOS, and Linux systems.
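One simple way to approach this kind of detection is to probe well-known install paths per platform. The sketch below does that with Node's `fs.existsSync`. The `findBrowser` function and the `CANDIDATES` table are hypothetical, the path list is illustrative rather than exhaustive, and the package's own detection logic may work differently.

```typescript
import { existsSync } from "node:fs";

// Sketch only: a hypothetical helper, NOT the API of the detection package.
type BrowserName = "chrome" | "edge" | "firefox";

// Common default install locations (illustrative, not exhaustive).
const CANDIDATES: Record<string, Record<BrowserName, string[]>> = {
  darwin: {
    chrome: ["/Applications/Google Chrome.app/Contents/MacOS/Google Chrome"],
    edge: ["/Applications/Microsoft Edge.app/Contents/MacOS/Microsoft Edge"],
    firefox: ["/Applications/Firefox.app/Contents/MacOS/firefox"],
  },
  linux: {
    chrome: [
      "/usr/bin/google-chrome",
      "/usr/bin/google-chrome-stable",
      "/usr/bin/chromium",
      "/usr/bin/chromium-browser",
    ],
    edge: ["/usr/bin/microsoft-edge", "/usr/bin/microsoft-edge-stable"],
    firefox: ["/usr/bin/firefox"],
  },
  win32: {
    chrome: [
      "C:\\Program Files\\Google\\Chrome\\Application\\chrome.exe",
      "C:\\Program Files (x86)\\Google\\Chrome\\Application\\chrome.exe",
    ],
    edge: ["C:\\Program Files (x86)\\Microsoft\\Edge\\Application\\msedge.exe"],
    firefox: ["C:\\Program Files\\Mozilla Firefox\\firefox.exe"],
  },
};

// Return the first candidate path that exists, or undefined if none does.
// `platform` and `exists` are injectable so the lookup can be tested
// without touching the real file system.
function findBrowser(
  name: BrowserName,
  platform: string = process.platform,
  exists: (path: string) => boolean = existsSync,
): string | undefined {
  return CANDIDATES[platform]?.[name]?.find(exists);
}
```

Calling `findBrowser("chrome")` checks the current platform's candidates in order. A fuller implementation would likely also consult environment variables, `PATH`, and the Windows registry.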

Smart Web Content Extraction. Extracts clean, readable content from web pages and converts it to Markdown, with support for browser automation.

Media Processing Utilities. Media tools for handling browser-related tasks, such as high-performance base64 image parsing and media resource processing.
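To make "base64 image parsing" concrete, here is a minimal sketch that decodes a base64 image data URL and reads the dimensions of a PNG from its header. The names `parseDataUrl` and `pngSize` are hypothetical and are not this package's API, and the sketch makes no attempt at the high-performance handling the package describes.

```typescript
// Sketch only: hypothetical helpers, NOT the API of the media utilities package.

interface ParsedImage {
  mime: string; // e.g. "image/png"
  bytes: Buffer; // decoded image bytes
}

// Split a `data:image/...;base64,...` URL into its MIME type and raw bytes.
function parseDataUrl(input: string): ParsedImage {
  const match = /^data:(image\/[a-z0-9.+-]+);base64,(.+)$/i.exec(input);
  if (!match) throw new Error("not a base64 image data URL");
  return { mime: match[1].toLowerCase(), bytes: Buffer.from(match[2], "base64") };
}

// Read width and height from a PNG header.
// Layout: 8-byte signature, 4-byte chunk length, 4-byte "IHDR" type,
// then big-endian width (offset 16) and height (offset 20).
function pngSize(bytes: Buffer): { width: number; height: number } | null {
  const signature = Buffer.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]);
  if (bytes.length < 24 || !bytes.subarray(0, 8).equals(signature)) return null;
  return { width: bytes.readUInt32BE(16), height: bytes.readUInt32BE(20) };
}
```

Reading only the first 24 bytes avoids decoding the whole image when just the dimensions are needed, for example when sizing a screenshot before sending it to a model.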


Development

This is a monorepo managed with pnpm. To get started:

# Install dependencies
pnpm install

# Build all packages
pnpm run build

# Run tests
pnpm run test

# Lint code
pnpm run lint

Requirements

  • Node.js >= 20.x
  • pnpm for package management
  • Chrome/Chromium browser for browser automation features

License

This project is licensed under the Apache License 2.0.


Credits

Special thanks to the open source projects that inspired this toolkit:
