Core Audio tap
A Core Audio tap is a built-in macOS API (AudioHardwareCreateProcessTap, added in macOS 14.2 Sonoma) that lets an app record the audio a Mac is already playing — such as the voices in a video call — straight from the operating system's output, without a microphone, a meeting bot, or a virtual audio device.
Last updated June 11, 2026
How a Core Audio tap works
Core Audio is the low-level audio framework that macOS has shipped for years. Normally an app opens an input device (a microphone) to record, or an output device (your speakers or headphones) to play sound.
A process tap is the twist: the app asks Core Audio to insert a tap on the audio that one or more processes are rendering to an output device, and hands the app a copy of that stream. In code this is the AudioHardwareCreateProcessTap call paired with a CATapDescription — which says which processes to tap, whether to mix down to mono or stereo, and whether the original audio should keep playing. The tapped audio is then read like any other Core Audio stream, often through an aggregate device. Apple introduced this process-tap API in macOS 14.2 (Sonoma); before it, capturing system audio on a Mac generally meant installing a virtual audio device or a kernel extension.
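As a rough illustration of that call sequence, the Swift sketch below looks up the Core Audio process object for a Unix PID and creates a stereo tap on it. It is a minimal sketch under stated assumptions, not a complete recorder: the helper names `processObjectID(for:)` and `createTap(onPID:)` and the tap name are invented for this example, and the initializer is assumed to accept an array of `AudioObjectID` values.

```swift
import CoreAudio
import Foundation

/// Translates a Unix PID into the Core Audio process object that represents it.
/// Returns nil when Core Audio reports no process object for that PID.
/// (Hypothetical helper, named for this example only.)
func processObjectID(for pid: pid_t) -> AudioObjectID? {
    var address = AudioObjectPropertyAddress(
        mSelector: kAudioHardwarePropertyTranslatePIDToProcessObject,
        mScope: kAudioObjectPropertyScopeGlobal,
        mElement: kAudioObjectPropertyElementMain
    )
    var qualifier = pid
    var objectID = AudioObjectID(kAudioObjectUnknown)
    var size = UInt32(MemoryLayout<AudioObjectID>.size)

    // The PID is passed as the property qualifier; the process object ID comes back as data.
    let status = AudioObjectGetPropertyData(
        AudioObjectID(kAudioObjectSystemObject),
        &address,
        UInt32(MemoryLayout<pid_t>.size),
        &qualifier,
        &size,
        &objectID
    )
    guard status == noErr, objectID != AudioObjectID(kAudioObjectUnknown) else { return nil }
    return objectID
}

/// Creates a stereo tap on a single process and returns the tap's object ID.
/// (Hypothetical helper, named for this example only.)
@available(macOS 14.2, *)
func createTap(onPID pid: pid_t) -> AudioObjectID? {
    guard let process = processObjectID(for: pid) else { return nil }

    // Describe the tap: which processes, stereo mixdown, original audio left audible.
    let description = CATapDescription(stereoMixdownOfProcesses: [process])
    description.name = "Example tap"      // illustrative label
    description.muteBehavior = .unmuted   // playback continues while tapped

    var tapID = AudioObjectID(kAudioObjectUnknown)
    let status = AudioHardwareCreateProcessTap(description, &tapID)
    return status == noErr ? tapID : nil
}
```

The returned ID only identifies the tap. Reading samples still means attaching the tap to an aggregate device and running an I/O callback, which this sketch leaves out, and the tap should be released with `AudioHardwareDestroyProcessTap` when capture ends.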
Unless the tap is configured to mute it, your audio keeps playing normally through your real speakers, because the app only reads a copy of the same stream. macOS also gates the capability behind a user permission prompt, so nothing happens silently.
A tap vs. a microphone vs. a virtual audio device
These three approaches all let you “record a call,” but they are not equivalent:
- Microphone: records the room, meaning your own voice, background noise, and the speaker output after it has travelled through the air, which arrives echoey and degraded. It captures your side cleanly but the other participants’ voices badly.
- Virtual audio device: a third-party driver (BlackHole, Loopback, Soundflower) you install and route by hand. It works, but it has to be configured, it can hijack or mute your normal playback, and it is one more thing to break.
- Core Audio tap: uses Apple’s own API, so there is nothing to install and nothing to route. Playback continues normally while the app reads a copy of the stream.
This is the macOS half of system audio capture: the general technique of reading what the computer is already playing, locally, instead of joining the meeting as a bot. For the practical walkthrough, see how to record system audio on a Mac without a virtual device.
Capturing one app vs. the whole system
Because a CATapDescription takes a list of process objects, a tap can be scoped to just the meeting app — capturing only Zoom, Google Meet, or Teams and leaving your music untouched — or it can capture the full system mix. (macOS also exposes system-audio capture through ScreenCaptureKit, but the Core Audio process tap is the dedicated, audio-only route.) Either way it is the same native mechanism: no virtual device, no bot in the call.
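The two scopes might be expressed along these lines. This is a hedged sketch rather than a reference implementation: `makeTapDescription(scopedTo:)` is a hypothetical helper, treating an empty list as "capture everything" is a convention chosen for this example, and the IDs passed in are assumed to be Core Audio process object IDs, not Unix PIDs.

```swift
import CoreAudio
import Foundation

/// Builds a tap description scoped to `processes`, or a system-wide one when the list is empty.
/// (Hypothetical helper; the empty-list convention is an assumption of this sketch.)
@available(macOS 14.2, *)
func makeTapDescription(scopedTo processes: [AudioObjectID]) -> CATapDescription {
    let description: CATapDescription
    if processes.isEmpty {
        // Global tap: the output of every process, with nothing excluded.
        description = CATapDescription(stereoGlobalTapButExcludeProcesses: [])
    } else {
        // Scoped tap: only the listed processes, mixed down to stereo.
        description = CATapDescription(stereoMixdownOfProcesses: processes)
    }
    description.muteBehavior = .unmuted   // leave the original playback audible
    return description
}
```

With the global form, the list is an exclusion list, so a recorder that plays sound itself could pass its own process object there to stay out of the capture.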
Why it matters for bot-free meeting notes
Because a Core Audio tap reads audio that is already playing, a meeting tool can listen to a Mac call without sending a bot into it or installing a browser plugin. That is the engine behind bot-free meeting notes on macOS — there is no extra attendee on the participant list, and the audio is processed on your own machine. Canary feeds that captured stream into its streaming transcription and a live, rolling, multi-resolution summary, so you can catch up the instant your name is called. For the hands-on version, see how to take meeting notes without a bot.
The Windows and Linux equivalents
The Core Audio tap is the macOS-specific name for a cross-platform idea. Windows exposes the same capability through WASAPI loopback, a built-in mode of the Windows Audio Session API. Linux surfaces “monitor” sources through PulseAudio and PipeWire. All three are forms of loopback audio capture; the Core Audio tap is simply how macOS does it.
A note on consent
Capturing a call should always be transparent, never secret. Recording-consent laws vary by region — some require only one party’s consent, others require everyone’s — so tell participants when you are capturing a meeting, regardless of which audio API does the work.