LLM Agent Lab

One BrowserPod branch, one failing security test, one AI-generated patch proposal, human-approved before execution.

BrowserPod verified · Server-side LLM keys · Human approval gate

Provider control

Patch planner

Model names come from server env. API keys never leave the server route.

Selected model: gemini -> groq
Fallback mode: Not used
AI-generated code is untrusted until BrowserPod verifies it.

Run control

Human-approved patch loop

First run: not-run
Patch approval: waiting
Second run: not-run

Why this bug matters

Multi-tenant access risk

Tenant isolation bugs let privileged users cross account boundaries. In billing or invoice systems, checking role before tenant identity can expose another customer's data even when the user is otherwise valid.
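To make the ordering problem concrete, here is a minimal sketch of a tenant-first check. It assumes both the user and the invoice carry a `tenantId` field, which does not appear in the lab's files, and the name `canViewInvoiceTenantFirst` is hypothetical. It is one possible shape for a fix, not the patch the lab produces.

```javascript
// Sketch only: `tenantId` on user and invoice is an assumed field,
// and this function name is hypothetical.
function canViewInvoiceTenantFirst(user, invoice) {
  // Fail closed: a missing or mismatched tenant denies access for every role.
  if (!user.tenantId || user.tenantId !== invoice.tenantId) return false;
  // Inside the tenant, admins may view every invoice.
  if (user.role === "admin") return true;
  // Everyone else may view only the invoices they own.
  return user.id === invoice.ownerId;
}

const invoice = { id: "inv-1", tenantId: "tenant-a", ownerId: "u1" };
const foreignAdmin = { id: "u9", role: "admin", tenantId: "tenant-b" };
const localAdmin = { id: "u2", role: "admin", tenantId: "tenant-a" };

console.log(canViewInvoiceTenantFirst(foreignAdmin, invoice)); // false
console.log(canViewInvoiceTenantFirst(localAdmin, invoice)); // true
```

Because the tenant comparison runs before the role check, an admin role is only ever honored inside the admin's own tenant.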

/agent-lab · tenant-access-control
BrowserPod terminal
Ready. Boot BrowserPod to run the access-control security test.

File viewer

BrowserPod files

function canViewInvoice(user, invoice) {
  if (user.role === "admin") return true;
  return user.id === invoice.ownerId;
}

module.exports = { canViewInvoice };
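The lab's actual security test is not shown on this page. As a rough illustration of what such a test might assert, the sketch below copies the function above and runs one cross-tenant case against it. The `tenantId` fields and the helper name `crossTenantAdminIsDenied` are assumptions made for the example.

```javascript
// Copied from the file shown above, unchanged.
function canViewInvoice(user, invoice) {
  if (user.role === "admin") return true;
  return user.id === invoice.ownerId;
}

// Hypothetical check: returns true only when an admin from another
// tenant is refused. The `tenantId` fields are assumed for illustration.
function crossTenantAdminIsDenied(check) {
  const invoice = { id: "inv-1", tenantId: "tenant-a", ownerId: "u1" };
  const foreignAdmin = { id: "u9", role: "admin", tenantId: "tenant-b" };
  return check(foreignAdmin, invoice) === false;
}

// The role check returns early, so the foreign admin is allowed in
// and the security expectation is not met.
console.log(crossTenantAdminIsDenied(canViewInvoice)); // false
```

The function never looks at tenant identity, so any user with the admin role passes regardless of which account the invoice belongs to.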

LLM patch proposal

Waiting for proposal

Run the failing test, then ask the selected provider for a patch.
Proof report: pending
Failing test observed: Waiting
LLM patch proposed: Waiting
Human approved: Waiting
Patch written into BrowserPod: Waiting
Passing test observed: Waiting
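The five proof steps above read as an ordered gate: nothing is written into BrowserPod until a human has approved the proposal. The sketch below shows one way such ordering could be enforced. The step names come from the report, while `createProofReport` and its methods are hypothetical and not taken from the lab's code.

```javascript
// Step names mirror the proof report; the gate logic is an assumption.
const STEPS = [
  "Failing test observed",
  "LLM patch proposed",
  "Human approved",
  "Patch written into BrowserPod",
  "Passing test observed",
];

function createProofReport() {
  const done = new Set();
  return {
    // Marks a step as done, but only if every earlier step is done.
    complete(step) {
      const index = STEPS.indexOf(step);
      if (index === -1) throw new Error(`Unknown step: ${step}`);
      for (const earlier of STEPS.slice(0, index)) {
        if (!done.has(earlier)) {
          throw new Error(`Cannot complete "${step}" before "${earlier}"`);
        }
      }
      done.add(step);
    },
    // "pending" until all five steps are recorded.
    status() {
      return done.size === STEPS.length ? "complete" : "pending";
    },
  };
}

const report = createProofReport();
report.complete("Failing test observed");
report.complete("LLM patch proposed");

// Writing the patch before human approval is rejected.
let blocked = false;
try {
  report.complete("Patch written into BrowserPod");
} catch (err) {
  blocked = true;
}
console.log(blocked, report.status()); // true pending
```

Keeping the order in a single list means the approval step cannot be skipped by calling a later step directly.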

Next scenario

Frontend Sidebar Route Change Bug

Prepared / not live yet. Planned files: /forklab-frontend/src/sidebarState.js and /forklab-frontend/tests/test-sidebarState.js. This scenario is not verified in v1.