System audio capture
System audio capture is the technique of reading the audio a computer is already playing — such as the voices coming through a video call — directly and locally, without joining the meeting as a bot or installing a virtual audio device.
Last updated May 24, 2026
How it works
When you’re on a call, the other participants’ voices are already being played by your operating system. System audio capture taps that output stream on-device. Canary uses this to listen to a meeting without a bot, a plugin, or a virtual audio device — the audio it processes is simply what your speakers are already playing.
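As a rough illustration of what happens once the output stream has been tapped, the sketch below regroups raw audio into fixed-size frames that a later stage (such as a transcriber) could consume. It is not Canary's implementation. The real capture call is platform-specific (for example, WASAPI loopback on Windows or ScreenCaptureKit on macOS), so a hypothetical `fake_loopback_stream` generator stands in for it here. The sample rate, frame length, and every function name are assumptions made for the example.

```python
import math
import struct
from typing import Iterator, List

SAMPLE_RATE = 16_000   # assumed sample rate in Hz
FRAME_MS = 20          # assumed frame length in milliseconds
FRAME_SAMPLES = SAMPLE_RATE * FRAME_MS // 1000  # samples per frame


def fake_loopback_stream(seconds: float, chunk_bytes: int = 1000) -> Iterator[bytes]:
    """Hypothetical stand-in for an OS loopback source.

    Yields 16-bit little-endian mono PCM (a 440 Hz tone) in chunks whose
    size does not line up with the frame size, as real capture buffers
    often do not.
    """
    total = int(SAMPLE_RATE * seconds)
    pcm = b"".join(
        struct.pack("<h", int(12000 * math.sin(2 * math.pi * 440 * n / SAMPLE_RATE)))
        for n in range(total)
    )
    for i in range(0, len(pcm), chunk_bytes):
        yield pcm[i:i + chunk_bytes]


def frames(stream: Iterator[bytes]) -> Iterator[List[int]]:
    """Regroup arbitrary byte chunks into fixed-size frames of samples."""
    frame_bytes = FRAME_SAMPLES * 2  # 2 bytes per 16-bit sample
    buf = b""
    for chunk in stream:
        buf += chunk
        while len(buf) >= frame_bytes:
            raw, buf = buf[:frame_bytes], buf[frame_bytes:]
            yield list(struct.unpack(f"<{FRAME_SAMPLES}h", raw))


def rms(frame: List[int]) -> float:
    """Root-mean-square level of one frame, a simple loudness measure."""
    return math.sqrt(sum(s * s for s in frame) / len(frame))


if __name__ == "__main__":
    for i, frame in enumerate(frames(fake_loopback_stream(0.1))):
        print(f"frame {i}: rms={rms(frame):.0f}")
```

In a real application, the generator would be replaced by the operating system's loopback capture callback, and each frame would go to whatever processes the audio next. The framing step itself stays the same regardless of which call platform produced the sound.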
Why it matters
Because capture happens on your own device, nothing joins the meeting for anyone to admit or notice, and the audio does not have to pass through a third-party meeting bot. It works the same way on Zoom, Google Meet, Teams, and any other call, because it does not depend on the meeting platform. That foundation powers streaming transcription and lets Canary act as an ambient meeting assistant.
A note on consent
Capturing audio should always be transparent, not secret. Recording and consent laws vary by region — some places require one party’s consent, others require everyone’s. Tell participants when you’re capturing a call. See how to take meeting notes without a bot for the responsible approach.