Gautam Tata

Browser Agents Are Clutch for EMR Automation

I founded Northlight to solve a problem that's been frustrating clinicians for decades: documentation.

Post-acute healthcare is brutal. The clinicians here are the nurses and therapists who visit patients at home - wound care, physical therapy, hospice. They drive from house to house all day, seeing patients, taking notes on their phones or scraps of paper. Then they get home, exhausted, and face the real work: filling out regulatory forms.

OASIS assessments. SOAP notes. Care plans. These aren't simple forms - they're 50+ page documents required by CMS for reimbursement. Miss a field, and you don't get paid. Fill it out wrong, and you're facing compliance issues. Clinicians spend hours every night doing paperwork instead of resting or spending time with their families.

The solution seemed obvious: build an AI agent that listens to clinical notes, converts them into the appropriate format, and fills out the EMR automatically. Straightforward, right?

Not quite.

The EMR Problem

Here's what we discovered about post-acute EMRs: they have no open APIs.

These systems were built in the early 2000s, before APIs were standard practice. They're web applications with labyrinthine interfaces, designed when "integration" meant faxing documents. The vendors have no incentive to modernize - they have captive customers who can't easily switch, and any integration request gets met with demands for revenue sharing.

We could build the best documentation AI in the world, but if we couldn't get the data into the EMR, clinicians would still be copy-pasting. That's not a solution - that's a demo.

We thought we were cooked. Nobody wants a product that gets you 80% of the way there. The last mile - actually filling out the forms - was the whole point.

Then we started exploring browser agents.

What Are Browser Agents?

Browser agents are AI systems that interact with web applications the way a human would: they see the page, interpret the interface, click buttons, fill forms, and navigate between screens. In principle, if a human can do it in a browser, a browser agent can too.

This completely sidesteps the API problem. We don't need the EMR vendor's permission or cooperation. We just need to teach the agent how to use their software.

There are two main architectural approaches, each with tradeoffs.

Option 1: Background Agents with Browserbase

Browserbase provides cloud-hosted browsers that AI agents can control. Combined with their Stagehand SDK, you can write agents that navigate web applications programmatically.

The workflow looks like this:

  1. Clinician records patient visit using our mobile app
  2. AI transcribes and structures the documentation
  3. Clinician reviews and approves the structured data
  4. Agent spins up a browser session in the background
  5. Agent logs into the EMR, navigates to the correct patient and form
  6. Agent fills out the documentation fields
  7. Clinician gets notified when complete

The clinician never touches the EMR. They review the AI-generated documentation in our clean interface, click approve, and the agent handles the tedious data entry.
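To make the approval gate in that workflow concrete, here is a minimal Python sketch. It is not our production code: the names (VisitNote, Stage, run_background_fill) and the fill_emr callback are hypothetical, and the callback stands in for whatever browser automation actually drives the EMR. The only point it makes is that the fill step refuses to run until the note has passed through clinician approval.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class Stage(Enum):
    RECORDED = auto()
    STRUCTURED = auto()
    APPROVED = auto()
    SUBMITTED = auto()


@dataclass
class VisitNote:
    patient_id: str
    fields: dict[str, str]
    stage: Stage = Stage.RECORDED
    history: list[str] = field(default_factory=list)

    def advance(self, target: Stage) -> None:
        # Stages must be visited strictly in order, so approval cannot be skipped.
        if target.value != self.stage.value + 1:
            raise ValueError(f"cannot go from {self.stage.name} to {target.name}")
        self.stage = target
        self.history.append(target.name)


def run_background_fill(note: VisitNote, fill_emr) -> None:
    """Run the EMR fill step, but only for a note the clinician has approved."""
    if note.stage is not Stage.APPROVED:
        raise PermissionError("clinician approval is required before EMR entry")
    # fill_emr is a hypothetical callback wrapping the browser session.
    fill_emr(note.patient_id, note.fields)
    note.advance(Stage.SUBMITTED)
```

Keeping the gate in one function means every path into the EMR goes through the same check, whichever automation layer sits behind fill_emr.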

Advantages:

  • Runs in the background - clinician can do other things
  • Handles the entire workflow end-to-end
  • Works with any web-based EMR regardless of how outdated

Challenges:

  • Authentication is tricky (more on this below)
  • Session management across long-running tasks
  • Handling EMR quirks and edge cases gracefully

Option 2: Chrome Extension with DOM Streaming

The alternative is a Chrome extension that augments the EMR interface directly. Instead of a background agent, this approach keeps the clinician in the driver's seat.

The workflow:

  1. Clinician records visit and generates structured documentation
  2. Clinician opens the EMR and navigates to the patient's form
  3. Extension detects the page and streams DOM elements to our backend
  4. Agent identifies form fields and maps them to the structured documentation
  5. Pre-filled values stream back and populate the form in real-time
  6. Clinician reviews and submits

This is more of a "co-pilot" model - the agent assists but doesn't take full control.
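The mapping step is easier to picture with a small example. The Python sketch below assumes the extension streams each form field as a dict with an "id" and a visible "label" (a hypothetical payload shape), and it matches labels to documentation keys by simple normalization. A real agent would use a model for fuzzier matches; the exact-match logic here is a stand-in to show the data flow, including the fields that stay blank for the clinician.

```python
import re


def normalize(label: str) -> str:
    """Lowercase a label and collapse punctuation and whitespace to underscores."""
    return re.sub(r"[^a-z0-9]+", "_", label.lower()).strip("_")


def map_fields(dom_fields, structured):
    """Return (fills, unmapped): values keyed by DOM element id, and leftover ids."""
    by_key = {normalize(k): v for k, v in structured.items()}
    fills, unmapped = {}, []
    for f in dom_fields:
        key = normalize(f["label"])
        if key in by_key:
            fills[f["id"]] = by_key[key]
        else:
            unmapped.append(f["id"])  # left blank for the clinician to complete
    return fills, unmapped
```

For example, a field labeled "Pain Level:" with id "txtPain01" would pick up the value stored under "pain_level", while a field with no matching key lands in the unmapped list instead of being guessed.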

Advantages:

  • No authentication complexity - uses the clinician's existing session
  • Clinician maintains full visibility and control
  • Lower compliance risk since a human is always in the loop
  • Easier to handle edge cases (clinician can intervene)

Challenges:

  • Requires clinician to navigate the EMR manually
  • Extension needs to handle the chaos of legacy DOM structures
  • Less "magical" user experience

The Authentication Problem

Background agents face a fundamental challenge: how do you securely handle credentials?

The agent needs to log into the EMR as the clinician. That means storing or accessing their credentials somehow. In healthcare, where we're dealing with PHI and HIPAA requirements, this gets complicated fast.

Browserbase recently partnered with 1Password to address exactly this problem. The idea is that credentials stay in the password manager, and the agent can request access through secure APIs without ever seeing the raw password. The clinician approves the authentication request, and the session proceeds.

This is a meaningful step toward making background agents viable in high-compliance environments. But it's not a complete solution - you still need BAAs with every service in the chain, audit logs for credential access, and careful handling of session tokens.
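As one illustration of the audit-log piece, here is a minimal Python sketch of a tamper-evident log for credential access requests. It does not use or represent the Browserbase or 1Password APIs; the class name, the field names, and the in-memory list are all assumptions for illustration, and a real system would write to durable, access-controlled storage. Each entry stores the hash of the previous one, so editing an old entry breaks the chain.

```python
import hashlib
import json
from datetime import datetime, timezone


def _digest(body: dict) -> str:
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()


class CredentialAuditLog:
    """Hypothetical append-only log; never stores the credential itself."""

    def __init__(self):
        self.entries = []

    def record(self, clinician_id: str, emr_host: str, approved: bool) -> dict:
        body = {
            "clinician_id": clinician_id,
            "emr_host": emr_host,
            "approved": approved,
            "at": datetime.now(timezone.utc).isoformat(),
            "prev": self.entries[-1]["hash"] if self.entries else "",
        }
        entry = {**body, "hash": _digest(body)}
        self.entries.append(entry)
        return entry

    def verify(self) -> bool:
        """Recompute every hash and check that each entry links to the one before."""
        prev = ""
        for e in self.entries:
            body = {k: v for k, v in e.items() if k != "hash"}
            if e["prev"] != prev or e["hash"] != _digest(body):
                return False
            prev = e["hash"]
        return True
```

Note that the log records who asked, for which system, and whether the clinician approved, but nothing that could be used to log in.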

Compliance in Healthcare AI

HIPAA isn't optional. Every piece of the architecture that touches patient data needs a Business Associate Agreement. That includes:

  • The transcription service
  • The LLM provider (or whoever hosts your self-hosted models)
  • The browser automation platform
  • Any cloud infrastructure

This is why the human-in-the-loop model matters so much in healthcare. Even if the agent is 99% accurate, the remaining 1% can have serious consequences in a clinical context. Clinicians need to review before anything gets submitted.
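A quick back-of-the-envelope calculation shows why 99% is less reassuring than it sounds. The Python sketch below rests on assumptions that are mine, not measured: it reads "99% accurate" as per-field accuracy, treats field errors as independent, and uses a hypothetical form with 100 fields.

```python
def p_form_has_error(per_field_accuracy: float, n_fields: int) -> float:
    """Probability that at least one of n independent fields is wrong."""
    return 1.0 - per_field_accuracy ** n_fields


# Hypothetical 100-field form at 99% per-field accuracy.
print(f"{p_form_has_error(0.99, 100):.0%}")  # prints 63%
```

Under those assumptions, most long forms would contain at least one wrong field, which is exactly what the review step is there to catch.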

The good news is that browser agents don't fundamentally change the compliance picture - they're just another piece of infrastructure that needs to be locked down. The patterns exist; they just need to be applied carefully.

Why Post-Acute Is Ripe for This

Post-acute care is where healthcare goes to be forgotten by technology vendors. The EMRs are ancient. The workflows are painful. The clinicians are overworked and underserved by their tools.

This creates massive opportunity. Documentation automation is just the start:

Quality Assurance: Agents can review completed documentation for missing fields, inconsistencies, or compliance issues before submission.
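A rule-based version of that check is simple to sketch in Python. The field names below (pain_level, visit_start, visit_end) are hypothetical placeholders rather than real OASIS items, and the single consistency rule is just an example of the kind of check an agent could run before submission.

```python
def qa_check(doc: dict, required: list[str]) -> list[str]:
    """Return a list of human-readable issues; an empty list means no findings."""
    issues = []
    for name in required:
        value = doc.get(name)
        if value is None or not str(value).strip():
            issues.append(f"missing: {name}")
    # Example consistency rule: times are "HH:MM" strings, so string
    # comparison orders them correctly within a single day.
    start, end = doc.get("visit_start"), doc.get("visit_end")
    if start and end and end < start:
        issues.append("inconsistent: visit_end is before visit_start")
    return issues
```

Returning a list of issues, rather than failing on the first one, lets the clinician see everything that needs attention in a single pass.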

Billing Optimization: Proper documentation directly affects reimbursement. Agents can ensure documentation supports the appropriate billing codes.

Prior Authorization: The nightmare of getting payer approval for services could be automated - agents navigating payer portals, submitting requests, tracking status.

Care Coordination: Faxes (yes, healthcare still runs on faxes) could be parsed, routed, and actioned by agents.

The common thread is that these are all tasks where humans are currently acting as the "integration layer" between systems that don't talk to each other. Browser agents can take over that role.

Where We Landed

For Northlight, we ended up building both approaches: the Chrome extension for clinicians who want to stay in control, and the background agent for those who want to approve once and let the agent handle the rest.

The technical architecture is similar either way - structured documentation, intelligent form mapping, robust error handling. The difference is where the browser runs and who's watching.
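For the error-handling piece, one common pattern is to retry recoverable failures with exponential backoff and hand anything else to a human. The Python sketch below is a generic illustration under assumed names: TransientEmrError is a hypothetical exception for things like an expired session, and step is whatever callable performs one unit of EMR work.

```python
import time


class TransientEmrError(Exception):
    """Hypothetical error for recoverable failures such as an expired session."""


def with_retries(step, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call step(), retrying transient failures with exponential backoff."""
    for attempt in range(1, attempts + 1):
        try:
            return step()
        except TransientEmrError:
            if attempt == attempts:
                raise  # out of retries: surface the failure to a human
            sleep(base_delay * 2 ** (attempt - 1))
```

Only the transient error type is retried, so an unexpected failure stops immediately instead of repeating a possibly harmful action against a patient record.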

What surprised us was how capable these agents have become. A year ago, browser automation meant brittle Selenium scripts that broke whenever a selector or layout changed. Today's agents can handle variation, recover from errors, and navigate interfaces they've never seen before.

The EMR vendors probably won't build APIs anytime soon. They don't need to. But that's fine - we'll just teach the agents to use their software the same way their customers do.

Conclusion

Browser agents aren't a hack or a workaround. They're a legitimate architectural pattern for integrating with systems that refuse to be integrated with.

In healthcare, where legacy software is everywhere and API access is gatekept, this matters enormously. Clinicians shouldn't spend their evenings doing data entry because a software vendor from 2003 never built an API.

The technology is finally good enough to automate the tedious parts of clinical documentation. The compliance patterns exist. The infrastructure is available. What's left is execution - building agents that are reliable enough to trust with real clinical workflows.

Post-acute clinicians deserve better tools. Browser agents are how we build them.