
feat(browser): implement experimental browser agent #19284

Merged
gsquared94 merged 17 commits into google-gemini:main from gsquared94:browser_use
Feb 24, 2026

Conversation

@gsquared94
Contributor

@gsquared94 gsquared94 commented Feb 17, 2026

Summary

Implements an experimental browser_agent subagent that enables semantic and visual web browsing via chrome-devtools-mcp. The agent uses the Accessibility Tree for reliable element identification and supports optional vision-based interactions via screenshots.

🛡️ Safe & Localized Implementation

  • Disabled by default: No effect on standard CLI operation unless explicitly enabled in settings.json.
  • Strictly scoped: All browser logic is self-contained in packages/core/src/agents/browser/ (12 new files).
  • Minimal core changes: AgentRegistry registration (+8 lines), Config schema (+68 lines), subagent-tool-wrapper (+13 lines). No existing agents, tools, or utilities were modified.

Architecture

browserAgentFactory (entry point)
├── BrowserManager — MCP client lifecycle, connection, tool dispatch
├── mcpToolWrapper — Wraps MCP tools as DeclarativeTools with confirmation
│   └── type_text — Composite tool: types text via press_key + optional submitKey
├── analyzeScreenshot — Vision tool (gated on visualModel config)
├── browserAgentDefinition — System prompt, model config, output schema
├── browserAgentInvocation — Tool invocations dispatched to BrowserManager
└── modelAvailability — Auth-type gating for visual model access

Session Modes

Configured via agents.overrides.browser_agent.customConfig.sessionMode:

| Mode | Behavior |
| --- | --- |
| persistent (default) | Persistent Chrome profile at ~/.gemini/cli-browser-profile/. Preserves logins/cookies across sessions. |
| isolated | Temporary profile, deleted after session. Clean slate each run. |
| existing | Attaches to a running Chrome via --autoConnect. Requires remote debugging enabled at chrome://inspect/#remote-debugging. ⚠️ Known upstream issue (chrome-devtools-mcp #870) may cause new_page/list_pages timeouts. |
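
To make the three modes concrete, here is a minimal sketch of how a session mode might be translated into chrome-devtools-mcp command-line flags. The helper name `buildMcpArgs` and the `LaunchOptions` shape are hypothetical and not taken from this PR. `--autoConnect` and `--userDataDir` are the flags named in this PR's commit messages; `--isolated` and `--headless` are assumed to be the matching upstream flags and should be checked against the chrome-devtools-mcp documentation.

```typescript
// Hypothetical helper: maps a session mode to chrome-devtools-mcp CLI flags.
type SessionMode = 'persistent' | 'isolated' | 'existing';

interface LaunchOptions {
  sessionMode?: SessionMode; // defaults to 'persistent', as described above
  profilePath?: string; // only used in persistent mode
  headless?: boolean;
}

function buildMcpArgs(options: LaunchOptions = {}): string[] {
  const mode = options.sessionMode ?? 'persistent';
  const args: string[] = [];

  switch (mode) {
    case 'existing':
      // Attach to an already-running Chrome with remote debugging enabled.
      args.push('--autoConnect');
      break;
    case 'isolated':
      // Assumed upstream flag for a throwaway profile.
      args.push('--isolated');
      break;
    case 'persistent':
      // Without a custom path, the server's default profile location applies.
      if (options.profilePath) {
        args.push('--userDataDir', options.profilePath);
      }
      break;
  }

  // Headless only applies when this process launches Chrome itself.
  if (options.headless && mode !== 'existing') {
    args.push('--headless');
  }
  return args;
}
```

For example, `buildMcpArgs({ sessionMode: 'existing', headless: true })` yields only `--autoConnect`, because the headless setting cannot change a browser that is already running.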

Key Features

  • Autonomous task delegation: Goal-oriented description encourages the parent agent to delegate entire multi-step tasks in a single call, rather than micromanaging individual browser actions.
  • Semantic AX Tree navigation: Uses take_snapshot → uid-based click/fill for reliable element interactions.
  • Visual mode (optional): visualModel config enables analyze_screenshot for coordinate-based interactions via click_at. The system prompt conditionally includes visual instructions only when enabled.
  • type_text composite tool: Uses document.execCommand('insertText') for fast bulk text entry with optional submitKey (Enter, Tab, etc.).
  • Context-aware error handling: Connection errors provide session-mode-specific remediation (e.g., profile lock → kill processes, existing mode → enable remote debugging).
  • Confirmation flow: Destructive MCP tools (click, fill, navigate, evaluate_script) require user confirmation unless YOLO mode is enabled.
  • Connection timeout with clearTimeout: Prevents timer leaks via finally block.
  • Documentation: docs/core/subagents.md and docs/cli/settings.md updated with browser agent config reference.
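
The connection-timeout item above describes a common pattern: race the work against a timer, and clear the timer in a `finally` block so it never outlives the call. The following is a minimal sketch of that pattern, not the code from browserManager.ts; the name `withTimeout` and its parameters are assumptions made for illustration.

```typescript
// Rejects if `work` does not settle within `ms`; the timer is always cleared.
async function withTimeout<T>(
  work: Promise<T>,
  ms: number,
  label: string,
): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`${label} timed out after ${ms}ms`)),
      ms,
    );
  });
  try {
    return await Promise.race([work, timeout]);
  } finally {
    // Runs on success, failure, and timeout, so no timer is left pending.
    if (timer !== undefined) clearTimeout(timer);
  }
}
```

Without the `finally` block, a connection that succeeds quickly would still leave the timer scheduled, which can keep the process alive until the timer fires.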

Test Coverage

53 unit tests across 6 test files:

| Test File | Tests | Coverage |
| --- | --- | --- |
| browserManager.test.ts | 16 | Connection, tool dispatch, MCP result mapping, error handling |
| browserAgentFactory.test.ts | 13 | Tool assembly, vision gating, conditional prompts |
| mcpToolWrapper.test.ts | 9 | MCP → DeclarativeTool wrapping, description augmentation |
| analyzeScreenshot.test.ts | 7 | Screenshot tool creation, invocation, error handling |
| browserAgentInvocation.test.ts | 6 | Tool invocations, abort signal handling |
| mcpToolWrapperConfirmation.test.ts | 2 | Confirmation requirement classification |

How to Validate

1. Enable the agent

Add to .gemini/settings.json:

{
  "agents": {
    "overrides": {
      "browser_agent": { "enabled": true }
    }
  }
}

2. Example prompts to try (interactive mode)

Basic navigation & extraction:

Go to github.com/google-gemini/gemini-cli and tell me the latest release version

Multi-step autonomous task:

Go to the NYT Wordle site and solve today's puzzle

Authenticated workflow (use the existing session mode with a signed-in account):

Open a new Google Sheet, add a header row with 'Name, Score, Grade', and fill in 3 rows of sample data

3. What to verify

| Check | Expected |
| --- | --- |
| Tool confirmations | MCP tools like new_page, click, fill should show "Action Required" before executing (unless YOLO mode is active) |
| Subagent progress | The "Browser Agent" card should show thought chunks (e.g., 🌐💭 Navigating to...) as the agent works |
| Task delegation | The parent agent should delegate the entire task in a single browser_agent call, not call it repeatedly for each step |
| Error handling | If Chrome isn't available, the error message should include session-mode-specific remediation steps |
| Cleanup | After the agent finishes, Chrome should close (isolated/persistent modes) |

⚠️ Known rendering issue

The Browser Agent status card may briefly appear duplicated — a frozen card above and a live card below. This seems like an Ink rendering artifact that occurs when <Static> history items (like hint messages or update notifications) are added while a tool card is in the pending area.

Pre-Merge Checklist

  • All 53 browser agent tests passing
  • Full typecheck clean
  • Updated documentation (docs/core/subagents.md, docs/cli/settings.md)
  • No breaking changes — feature is opt-in / disabled by default
  • Validated on macOS

Fixes #15956

@gsquared94 gsquared94 requested review from a team as code owners February 17, 2026 20:45
@gemini-code-assist
Contributor

Summary of Changes

Hello @gsquared94, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request introduces a new, experimental browser agent designed to automate web tasks within the Gemini CLI. It provides robust web interaction capabilities, allowing the system to navigate, fill forms, click elements, and extract information from web pages. The agent is built with a focus on isolation and configurability, ensuring it integrates safely and can be tailored to specific use cases, including visual analysis when a vision model is enabled.

Highlights

  • Experimental Browser Agent Implementation: Introduced an experimental browser_agent subagent for semantic and visual web browsing, leveraging chrome-devtools-mcp for reliable element identification via the Accessibility Tree and optional vision-based interactions.
  • Safe and Localized Design: The browser agent is disabled by default, strictly scoped to its own directory (packages/core/src/agents/browser/), and minimizes changes to core components, ensuring it's opt-in and isolated.
  • Flexible Session Modes: Supports persistent (default, preserves logins/cookies), isolated (temporary profile), and existing (attaches to running Chrome) session modes for managing Chrome instances.
  • Advanced Interaction Capabilities: Features include semantic AX Tree navigation, optional visual mode with analyze_screenshot for coordinate-based interactions, a type_text composite tool for fast bulk text input, and context-aware error handling.
  • Comprehensive Documentation and Testing: Updated docs/core/subagents.md and docs/cli/settings.md with configuration references, and includes 53 unit tests across 6 files covering connection, tool dispatch, vision gating, and error handling.

Changelog

  • docs/cli/settings.md
    • Added a new 'Agents' section to the CLI settings documentation.
    • Documented configuration options for the experimental browser agent, including enabled, sessionMode, headless, chromeProfilePath, visualModel, and allowedDomains.
  • docs/core/subagents.md
    • Added a detailed section for the experimental Browser Agent.
    • Described the purpose, prerequisites, enabling instructions, session modes, configuration reference, security features, and visual agent capabilities.
  • docs/sidebar.json
    • Added a new entry for the 'Browser agent (experimental)' to the documentation sidebar, linking to its dedicated section.
  • docs/tools/index.md
    • Added a reference to the new 'Browser agent' in the list of available tools.
  • packages/cli/src/config/settingsSchema.ts
    • Extended the AgentOverride schema to include a customConfig property, allowing agent-specific configurations.
  • packages/core/src/agents/browser/analyzeScreenshot.test.ts
    • Added unit tests for the createAnalyzeScreenshotTool function and AnalyzeScreenshotInvocation class.
    • Verified screenshot capture, visual model invocation, and error handling scenarios for visual analysis.
  • packages/core/src/agents/browser/analyzeScreenshot.ts
    • Implemented the createAnalyzeScreenshotTool function to provide visual analysis capabilities.
    • Defined AnalyzeScreenshotInvocation to capture screenshots and send them to a visual model for interpretation, returning coordinates and descriptions.
  • packages/core/src/agents/browser/browserAgentDefinition.ts
    • Defined the BROWSER_AGENT_NAME constant and BrowserTaskResultSchema for agent output.
    • Implemented buildBrowserSystemPrompt to dynamically generate the system prompt, including visual instructions if vision is enabled.
    • Created BrowserAgentDefinition factory to configure the browser agent with appropriate model and run settings.
  • packages/core/src/agents/browser/browserAgentFactory.test.ts
    • Added unit tests for createBrowserAgentDefinition and cleanupBrowserAgent functions.
    • Verified browser connection, tool discovery, visual prompt inclusion, and proper cleanup of browser resources.
  • packages/core/src/agents/browser/browserAgentFactory.ts
    • Implemented createBrowserAgentDefinition to dynamically create the browser agent's definition, including connecting to the MCP client and wrapping tools.
    • Added cleanupBrowserAgent to ensure proper closure of browser resources after agent execution.
    • Included logic to conditionally enable visual tools based on configuration and MCP capabilities.
  • packages/core/src/agents/browser/browserAgentInvocation.test.ts
    • Added unit tests for the BrowserAgentInvocation class.
    • Verified constructor behavior, description generation, and tool location handling.
  • packages/core/src/agents/browser/browserAgentInvocation.ts
    • Implemented BrowserAgentInvocation to manage the lifecycle of the browser agent, including async setup of MCP tools and cleanup.
    • Provided methods for generating a human-readable description and executing the browser agent task.
  • packages/core/src/agents/browser/browserManager.test.ts
    • Added unit tests for the BrowserManager class.
    • Verified MCP client connection, tool discovery, tool calling, and error handling for various session modes and connection failures.
  • packages/core/src/agents/browser/browserManager.ts
    • Implemented BrowserManager to encapsulate the logic for connecting to and managing the chrome-devtools-mcp client.
    • Provided methods for ensuring connection, discovering tools, calling tools, and closing the browser, with robust error handling and session mode support.
  • packages/core/src/agents/browser/mcpToolWrapper.test.ts
    • Added unit tests for createMcpDeclarativeTools and the wrapped MCP tool invocations.
    • Verified the creation of declarative tools, correct parameter passing, and error handling during tool execution.
  • packages/core/src/agents/browser/mcpToolWrapper.ts
    • Implemented createMcpDeclarativeTools to wrap dynamically discovered MCP tools into DeclarativeTool instances.
    • Introduced TypeTextDeclarativeTool as a composite tool for efficient text input.
    • Added augmentToolDescription and postProcessToolResult for enhanced tool descriptions and error handling hints.
  • packages/core/src/agents/browser/mcpToolWrapperConfirmation.test.ts
    • Added unit tests to verify the confirmation details and policy update options for MCP tool invocations.
  • packages/core/src/agents/browser/modelAvailability.ts
    • Defined VISUAL_AGENT_MODEL and getVisualAgentModel to manage the model used for visual analysis in the browser agent.
  • packages/core/src/agents/registry.ts
    • Registered the BrowserAgentDefinition in the AgentRegistry if the browser agent is enabled in the settings.
  • packages/core/src/agents/subagent-tool-wrapper.ts
    • Added special handling in SubagentToolWrapper to use BrowserAgentInvocation for the browser agent, accommodating its async setup requirements.
  • packages/core/src/config/config.test.ts
    • Added unit tests for getBrowserAgentConfig to ensure correct retrieval and default application of browser agent settings.
  • packages/core/src/config/config.ts
    • Updated the AgentOverride interface to include an optional customConfig property for agent-specific settings.
    • Defined BrowserAgentCustomConfig interface to specify configuration options unique to the browser agent.
    • Implemented getBrowserAgentConfig to provide a consolidated view of the browser agent's configuration, including defaults.

Activity

  • The author has implemented the experimental browser agent and its supporting infrastructure.
  • Unit tests have been added for all new components, ensuring functionality and error handling.
  • Documentation has been updated to reflect the new agent and its configuration options.
  • The author has validated the feature on macOS and confirmed no breaking changes, with the feature being opt-in.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@gemini-cli gemini-cli bot added priority/p2 Important but can be addressed in a future release. area/agent Issues related to Core Agent, Tools, Memory, Sub-Agents, Hooks, Agent Quality labels Feb 17, 2026
Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This is an impressive and well-architected addition of an experimental browser agent. The feature is self-contained, disabled by default, and includes comprehensive documentation and tests. The design choices, such as the isolated BrowserManager, dynamic tool wrapping, and detailed error handling with user-friendly remediation steps, are excellent. The use of prompt engineering techniques like augmenting tool descriptions and post-processing results to improve agent reliability is particularly noteworthy.

I have one high-severity suggestion regarding the performance of the type_text tool to align it with the performance claims in the pull request description.

Add extensible browser agent configuration using the agents.overrides pattern:
- Extended AgentOverride interface with customConfig field for agent-specific settings
- Added BrowserAgentCustomConfig type for browser-specific configuration
- Added getAgentOverride() and getBrowserAgentConfig() methods to Config class
- Settings configured via agents.overrides.browser_agent.customConfig
- Updated settings schema with customConfig in AgentOverride definition

This establishes the foundational pattern for configuring the browser agent
through the standard agents.overrides infrastructure.
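
As a rough illustration of the override pattern this commit describes, the sketch below reads `agents.overrides.browser_agent` and applies defaults. The type shapes are simplified assumptions rather than the real Config class. The field names follow the settings listed in the changelog, the `persistent` and disabled-by-default values follow the PR description, and the `headless: false` default is an assumption.

```typescript
// Simplified, hypothetical shapes; the real types live in config.ts.
interface BrowserAgentCustomConfig {
  sessionMode?: 'persistent' | 'isolated' | 'existing';
  headless?: boolean;
  chromeProfilePath?: string;
  visualModel?: string;
  allowedDomains?: string[];
}

interface AgentOverride {
  enabled?: boolean;
  customConfig?: BrowserAgentCustomConfig;
}

interface Settings {
  agents?: { overrides?: Record<string, AgentOverride> };
}

function getBrowserAgentConfig(settings: Settings) {
  const override = settings.agents?.overrides?.['browser_agent'];
  return {
    // The agent stays off unless the user opts in.
    enabled: override?.enabled ?? false,
    customConfig: {
      sessionMode: 'persistent' as const, // documented default
      headless: false, // assumed default
      ...(override?.customConfig ?? {}),
    },
  };
}
```

Calling `getBrowserAgentConfig({})` returns a disabled agent with `sessionMode` set to `persistent`, so callers never have to handle a missing override.
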

Implement the browser agent using the LocalAgentDefinition pattern:
- BrowserAgentDefinition: Agent metadata and prompt configuration
- BrowserAgentInvocation: Handles individual browser agent invocations
- BrowserAgentFactory: Creates agent definitions with dynamic MCP tools
- BrowserManager: Manages chrome-devtools-mcp connection lifecycle

Uses getBrowserAgentConfig() to read settings from agents.overrides.browser_agent

Implement the visual agent using the LocalAgentDefinition pattern:
- VisualAgentDefinition: Agent metadata for coordinate-based visual tasks
- delegateToVisualAgent.ts: Tool for semantic agent to delegate visual tasks
- Uses gemini-2.5-computer-use-preview-10-2025 model for Computer Use capability

The visual agent handles tasks requiring visual identification or precise
coordinate-based actions that cannot be done via the accessibility tree.
…ation

Fix chrome-devtools-mcp CLI flags:
- --existing (invalid) → --autoConnect for existing session mode
- --profile-path (invalid) → --userDataDir for custom profile path
- Default session mode changed from 'isolated' to 'persistent'

Add 'persistent' session mode (new default) which uses a persistent
Chrome profile at ~/.cache/chrome-devtools-mcp/chrome-profile.

Add connection timeout and actionable error for 'existing' mode when
Chrome remote debugging is not enabled.

- Add custom type_text tool that types a full string by internally
  calling press_key for each character, turning N model round-trips
  into 1. Dramatically speeds up text input in complex web apps.

- Move tool-specific usage rules from system prompt to individual
  tool descriptions via augmentToolDescription() for better
  organization and token efficiency.

- Add terminal failure handling instructions to system prompt
  (Chrome connection errors, browser crashes, repeated errors)
  with specific remediation steps.

- Add complex web app guidance (spreadsheets, rich editors) to
  system prompt, recommending type_text + keyboard navigation.

- Fix augmentToolDescription key ordering so more-specific keys
  (fill_form, click_at) match before shorter keys (fill, click).

- Remove non-existent tool references (scroll, type_text as MCP tool)
  and add click_at hint for vision tool.
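
The key-ordering fix in this commit can be illustrated with a small sketch. It assumes hints are looked up by substring match on the tool name, which may differ from the real augmentToolDescription; the hint texts and the `TOOL_HINTS` table are invented for illustration.

```typescript
// Hypothetical hint table; the real hints live in mcpToolWrapper.ts.
const TOOL_HINTS: Record<string, string> = {
  fill: 'Use the uid from the latest snapshot.',
  fill_form: 'Fill several fields in one call.',
  click: 'Use the uid from the latest snapshot.',
  click_at: 'Takes page coordinates, not a uid.',
};

// Longest keys first, so 'fill_form' is tested before 'fill'.
const ORDERED_KEYS = Object.keys(TOOL_HINTS).sort(
  (a, b) => b.length - a.length,
);

function augmentToolDescription(toolName: string, description: string): string {
  const key = ORDERED_KEYS.find((k) => toolName.includes(k));
  return key ? `${description} ${TOOL_HINTS[key]}` : description;
}
```

If the keys were tested in insertion order instead, `fill_form` would match the shorter `fill` key first and receive the wrong hint.
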

Vision (screenshot analysis + coordinate-based interactions) is now
disabled by default. Set visualModel in browser_agent customConfig
to enable it, e.g. visualModel: 'gemini-2.5-computer-use-preview-10-2025'.
…n errors

- Add submitKey parameter to type_text tool for pressing Enter/Tab/etc
  after typing, eliminating a separate model round-trip per value entry
- Update system prompt and tool hints to guide model toward type_text
  with submitKey instead of per-character press_key calls
- Refactor connection error handling into createConnectionError() with
  session-mode-aware remediation messages for profile locks, timeouts,
  and generic failures
- Update terminal failure prompts to pass through error remediation
  verbatim instead of hardcoding instructions
- Add tests for profile-lock, timeout, and generic connection errors
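
A session-mode-aware error builder along the lines described in this commit might look like the sketch below. Only the function name `createConnectionError` comes from the commit message; the signature, the string-matching heuristics, and the remediation wording are assumptions.

```typescript
type SessionMode = 'persistent' | 'isolated' | 'existing';

// Hypothetical signature: wraps a low-level failure with mode-specific advice.
function createConnectionError(mode: SessionMode, cause: Error): Error {
  const text = cause.message.toLowerCase();
  let remediation: string;

  if (text.includes('timed out') || text.includes('timeout')) {
    remediation =
      mode === 'existing'
        ? 'Enable remote debugging at chrome://inspect/#remote-debugging, then retry.'
        : 'Chrome took too long to start. Close other Chrome instances and retry.';
  } else if (mode === 'persistent' && text.includes('lock')) {
    remediation =
      'The profile appears to be locked. Close Chrome processes using it, or switch to isolated mode.';
  } else {
    remediation =
      'Check that Chrome is installed and that npx can run chrome-devtools-mcp.';
  }

  return new Error(
    `Browser connection failed (${mode} mode): ${cause.message}\n${remediation}`,
  );
}
```

Keeping the remediation text inside the error message lets the system prompt pass it through verbatim instead of hardcoding instructions.
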

The system prompt always included the VISUAL IDENTIFICATION section
telling the model about analyze_screenshot, even when visualModel was
not configured. This caused the model to attempt calling the tool
despite it not being registered.

- Convert BROWSER_SYSTEM_PROMPT to buildBrowserSystemPrompt(visionEnabled)
- Pass vision state from factory to definition builder
- Remove analyze_screenshot reference from click_at tool description
- Add tests for conditional prompt inclusion/exclusion
- Fix misleading test comment about tool count
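
The change from a constant prompt to a builder can be sketched as follows. The prompt text here is placeholder wording, not the agent's real system prompt; only the function name and the `visionEnabled` parameter come from the commit message.

```typescript
// Placeholder prompt text, for illustration only.
const BASE_PROMPT = [
  'You are a browser agent.',
  'Call take_snapshot, then act on elements by uid.',
].join('\n');

const VISUAL_SECTION = [
  'VISUAL IDENTIFICATION',
  'If an element is missing from the snapshot, call analyze_screenshot',
  'and use click_at with the returned coordinates.',
].join('\n');

function buildBrowserSystemPrompt(visionEnabled: boolean): string {
  // Mention analyze_screenshot only when the tool is actually registered.
  return visionEnabled ? `${BASE_PROMPT}\n\n${VISUAL_SECTION}` : BASE_PROMPT;
}
```

With `visionEnabled` set to `false`, the prompt never mentions `analyze_screenshot`, so the model has no reason to call an unregistered tool.
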
…n mode, headless, profile path, and visual model settings.

Updated the browser_agent description from a primitive-focused listing
(navigating, filling, clicking) to a goal-oriented description that
emphasizes autonomy, multi-step reasoning, and dynamic feedback
interpretation. This encourages the parent agent to delegate entire
tasks in a single call rather than micromanaging individual browser
actions.
@gsquared94
Contributor Author

/gemini review

Contributor

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces a new experimental browser agent, a significant feature addition. The implementation is well-structured across multiple new files, including comprehensive documentation and tests. The architecture for managing the browser lifecycle and wrapping MCP tools is robust, with good error handling and attention to user experience.

I've identified a few high-severity issues related to the propagation of AbortSignal for cancellation, which should be addressed to ensure the agent can be reliably cancelled, preventing potential resource leaks or unintended continued execution. Additionally, there's a suggestion for simplifying redundant cancellation logic.
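
As an illustration of what propagating an AbortSignal through a composite tool can look like, here is a minimal sketch of a character-by-character typing loop. The `typeText` function and the `PressKey` callback type are hypothetical stand-ins, not the code under review.

```typescript
// Hypothetical callback standing in for a press_key call to the MCP server.
type PressKey = (key: string, signal: AbortSignal) => Promise<void>;

async function typeText(
  text: string,
  pressKey: PressKey,
  signal: AbortSignal,
  submitKey?: string,
): Promise<void> {
  for (const char of text) {
    // Check between keystrokes so cancellation does not wait for the whole string.
    if (signal.aborted) throw new Error('Typing aborted');
    // Forward the same signal so an in-flight call can also be cancelled.
    await pressKey(char, signal);
  }
  if (submitKey) {
    if (signal.aborted) throw new Error('Typing aborted');
    await pressKey(submitKey, signal);
  }
}
```

The important detail is that the signal reaches every `pressKey` call, including the optional submit key, rather than only being checked once at the start.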

@jacob314
Contributor

Great work implementing this experimental feature! The architecture and test coverage look solid.

I just had one minor piece of feedback regarding the strict development rules:

In packages/core/src/agents/browser/mcpToolWrapperConfirmation.test.ts, there are a couple of instances where any is used (one to suppress a type check for mocking an interface, and another to access a protected method):

// We accept any cast here because we are mocking the interface
mockMessageBus = { ... } as unknown as MessageBus;
// ...
// Use "any" to access protected method
// eslint-disable-next-line @typescript-eslint/no-explicit-any
const details = await (invocation as any).getConfirmationDetails( ... );

Our strict development guidelines forbid any in the CLI and Core packages (including tests). To test protected methods, consider exercising the behavior through the public API instead or, if that is not possible, find a type-safe workaround that doesn't rely on any.

Otherwise, the PR looks great!
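
One possible type-safe workaround for the protected-method case is a test-only interface that names the member with its real signature. The sketch below uses a simplified, synchronous stand-in class; the interface name `TestableConfirmation` matches the one mentioned in the follow-up commit, but everything else is an assumption.

```typescript
// Simplified stand-in for the real invocation class; only the shape matters.
class McpToolInvocation {
  private readonly toolName: string;

  constructor(toolName: string) {
    this.toolName = toolName;
  }

  protected getConfirmationDetails(): { title: string } | false {
    return this.toolName === 'take_snapshot'
      ? false
      : { title: `Confirm ${this.toolName}` };
  }
}

// Test-only view that exposes the protected member with its real signature.
interface TestableConfirmation {
  getConfirmationDetails(): { title: string } | false;
}

function asTestable(invocation: McpToolInvocation): TestableConfirmation {
  // Still a cast, but a typed one: the return type is checked at call sites.
  return invocation as unknown as TestableConfirmation;
}

const details = asTestable(new McpToolInvocation('click')).getConfirmationDetails();
```

Unlike an `any` cast, a typo in the method name or a wrong assumption about the return type is caught by the compiler.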

gsquared94 added a commit to gsquared94/gemini-cli that referenced this pull request Feb 24, 2026
- Remove redundant Promise.race in McpToolInvocation.execute (event listener leak)
- Propagate AbortSignal to all press_key calls (submitKey + typeCharByChar)
- Call this.close() on connectMcp failure (zombie process leak)
- Set showInDialog: false for all browser settings
- Remove debug log truncation in analyzeScreenshot
- Fix misleading --experimental-vision error message
- Replace any casts with typed TestableConfirmation interface in tests
- Update license year to 2026 in all browser agent files
- Merge duplicate imports in mcpToolWrapper
- Add sync comment to BrowserAgentCustomConfig
- Update subagents.md Chrome requirement wording
- Regenerate settings docs
@gsquared94
Contributor Author

The way we are launching the MCP server causes critical problems in alternate buffer mode and likely elsewhere. Code from the MCP server is leaking into the UI and corrupting it, as seen in this snapshot. Notice the text from the MCP corrupting the Gemini CLI UI, resulting in duplicate rendering.

Fixed this. It should no longer corrupt the UI in alternate buffer mode.

@gsquared94
Contributor Author

High-level design question: it seems like the MCP server gets started every single time I ask the agent a question about rendering. Even when I tell the agent to leave the webpage open so I can view it, the webpage gets closed on me with the default configuration. This results in the same text spam in the UI each time I launch the agent and appears to make simple operations with the agent very slow (although that could be due to general Gemini availability issues on my test machine).

Will revise the browser connection management, along with the devtools MCP dependency management, in an immediate follow-up PR.

Pipe stderr from npx chrome-devtools-mcp instead of inheriting it.
The server's banner warnings were leaking into the terminal and
corrupting the Ink-based UI in alternate buffer mode. Piped output
is forwarded to debugLogger so it remains visible with --debug.
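
The forwarding half of this fix can be sketched as a small line buffer that receives piped stderr chunks and hands complete lines to a logger. The name `createLineForwarder` is hypothetical, and the `log` callback stands in for debugLogger, whose API is not shown in this PR.

```typescript
// Hypothetical helper: buffers piped stderr and forwards complete lines.
function createLineForwarder(log: (line: string) => void) {
  let pending = '';
  return {
    // Call this from the stream's 'data' handler with each decoded chunk.
    write(chunk: string): void {
      pending += chunk;
      const lines = pending.split('\n');
      // The last element is an incomplete line; keep it for the next chunk.
      pending = lines.pop() ?? '';
      for (const line of lines) {
        if (line.trim() !== '') log(line);
      }
    },
    // Call this when the stream ends to emit any trailing partial line.
    flush(): void {
      if (pending.trim() !== '') log(pending);
      pending = '';
    },
  };
}
```

With Node's `child_process.spawn`, this would be wired up by setting the child's stderr to `'pipe'` in the `stdio` option and passing each `data` chunk to `write`, so nothing from the server is written directly to the terminal.
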
@gsquared94
Contributor Author

@jacob314 addressed all your comments. PTAL

@gsquared94 gsquared94 requested a review from jacob314 February 24, 2026 10:23
Contributor

@jacob314 jacob314 left a comment


lgtm

@gsquared94 gsquared94 added this pull request to the merge queue Feb 24, 2026
Merged via the queue into google-gemini:main with commit 9e95b8b Feb 24, 2026
26 of 27 checks passed
@gsquared94 gsquared94 deleted the browser_use branch February 24, 2026 17:34
BryanBradfo pushed a commit to BryanBradfo/gemini-cli that referenced this pull request Mar 5, 2026
liamhelmer pushed a commit to badal-io/gemini-cli that referenced this pull request Mar 12, 2026

Labels

area/agent Issues related to Core Agent, Tools, Memory, Sub-Agents, Hooks, Agent Quality priority/p2 Important but can be addressed in a future release.

2 participants