Glossary

Core Audio tap

Short answer

A Core Audio tap is a built-in macOS API (AudioHardwareCreateProcessTap, added in macOS 14.2 Sonoma) that lets an app record the audio a Mac is already playing — such as the voices in a video call — straight from the operating system's output, without a microphone, a meeting bot, or a virtual audio device.

Last updated June 11, 2026


How a Core Audio tap works

Core Audio is the low-level audio framework that macOS has shipped for years. Normally an app opens an input device (a microphone) to record, or an output device (your speakers or headphones) to play sound.

A process tap is the twist: the app asks Core Audio to insert a tap on the audio that one or more processes are rendering to an output device, and Core Audio hands the app a copy of that stream. In code this is the AudioHardwareCreateProcessTap call paired with a CATapDescription, which specifies which processes to tap, whether to mix down to mono or stereo, and whether the original audio should keep playing. The tapped audio is then read like any other Core Audio stream, often through an aggregate device. Apple introduced this process-tap API in macOS 14.2 (Sonoma); before it, capturing system audio on a Mac generally meant installing a virtual audio device or a kernel extension.

Crucially, by default your audio keeps playing normally through your real speakers, because the app only reads a copy of the same stream. macOS also gates the capability behind a user permission prompt, so nothing happens silently.
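The steps above can be sketched in Swift. This is a minimal illustration under stated assumptions, not a complete recorder: the names TapError, makeTapDescription, createTap, and destroyTap are hypothetical helpers invented for this example, and the sketch stops once the tap exists (reading the audio still requires setting up a device and an I/O callback).

```swift
import CoreAudio
import Foundation

/// Hypothetical error wrapper for this sketch; carries the raw Core Audio status code.
struct TapError: Error {
    let status: OSStatus
}

/// Builds a tap description for the given Core Audio process objects.
/// - mono: mix the tapped audio down to one channel instead of two.
/// - keepPlaying: leave the original audio audible, or mute it while the tap is being read.
@available(macOS 14.2, *)
func makeTapDescription(processes: [AudioObjectID],
                        mono: Bool,
                        keepPlaying: Bool) -> CATapDescription {
    let description = mono
        ? CATapDescription(monoMixdownOfProcesses: processes)
        : CATapDescription(stereoMixdownOfProcesses: processes)
    description.uuid = UUID()
    description.muteBehavior = keepPlaying ? .unmuted : .mutedWhenTapped
    return description
}

/// Asks Core Audio to create the tap. The caller owns the returned ID
/// and should pass it to destroyTap(_:) when finished.
@available(macOS 14.2, *)
func createTap(_ description: CATapDescription) throws -> AudioObjectID {
    var tapID = AudioObjectID(kAudioObjectUnknown)
    let status = AudioHardwareCreateProcessTap(description, &tapID)
    guard status == noErr else { throw TapError(status: status) }
    return tapID
}

/// Releases a tap created with createTap(_:).
@available(macOS 14.2, *)
func destroyTap(_ tapID: AudioObjectID) {
    _ = AudioHardwareDestroyProcessTap(tapID)
}
```

Creating the tap is the point at which the user permission comes into play, so a real app should expect createTap to fail with an error status when access has not been granted.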

A tap vs. a microphone vs. a virtual audio device

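These three approaches all let you “record a call,” but they are not equivalent. A microphone picks up sound from the room rather than the call audio itself. A virtual audio device has to be installed and selected as an output before it can capture anything. A tap reads a copy of the system's output directly, with nothing extra to install.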

This is the macOS half of system audio capture: the general technique of reading what the computer is already playing, locally, instead of joining the meeting as a bot. For the practical walkthrough, see how to record system audio on a Mac without a virtual device.

Capturing one app vs. the whole system

Because a CATapDescription takes a list of process objects, a tap can be scoped to just the meeting app, capturing only Zoom, Google Meet, or Teams and leaving your music untouched, or it can capture the full system mix. For a browser-based call such as Google Meet, the audio belongs to the browser's processes rather than to a separate Meet app. (macOS also exposes system-audio capture through ScreenCaptureKit, but the Core Audio process tap is the dedicated, audio-only route.) Either way it is the same native mechanism: no virtual device, no bot in the call.
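The two scopes can be sketched as two different tap descriptions. This is a hedged illustration: audioProcessObject, appScopedTap, and systemWideTap are hypothetical helper names, and the sketch assumes the caller already knows the Unix process IDs of the apps it wants to capture.

```swift
import CoreAudio
import Foundation

/// Looks up the Core Audio process object for a Unix PID.
/// Returns nil when the lookup fails or Core Audio reports no object for that PID.
@available(macOS 14.2, *)
func audioProcessObject(forPID pid: pid_t) -> AudioObjectID? {
    var address = AudioObjectPropertyAddress(
        mSelector: kAudioHardwarePropertyTranslatePIDToProcessObject,
        mScope: kAudioObjectPropertyScopeGlobal,
        mElement: kAudioObjectPropertyElementMain
    )
    var qualifier = pid
    var processObject = AudioObjectID(kAudioObjectUnknown)
    var dataSize = UInt32(MemoryLayout<AudioObjectID>.size)
    let status = AudioObjectGetPropertyData(
        AudioObjectID(kAudioObjectSystemObject),
        &address,
        UInt32(MemoryLayout<pid_t>.size),
        &qualifier,
        &dataSize,
        &processObject
    )
    guard status == noErr,
          processObject != AudioObjectID(kAudioObjectUnknown) else { return nil }
    return processObject
}

/// One-app scope: tap only the listed processes (for example, the meeting client).
/// Returns nil when none of the PIDs resolves to a Core Audio process object.
@available(macOS 14.2, *)
func appScopedTap(pids: [pid_t]) -> CATapDescription? {
    let processObjects = pids.compactMap { audioProcessObject(forPID: $0) }
    guard !processObjects.isEmpty else { return nil }
    return CATapDescription(stereoMixdownOfProcesses: processObjects)
}

/// Whole-system scope: a global tap with an empty exclusion list.
@available(macOS 14.2, *)
func systemWideTap() -> CATapDescription {
    CATapDescription(stereoGlobalTapButExcludeProcesses: [])
}
```

The exclusion list on the global initializer also works in the other direction: passing the app's own process object there is one way to keep the recorder from capturing any sound it plays itself.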

Why it matters for bot-free meeting notes

Because a Core Audio tap reads audio that is already playing, a meeting tool can listen to a Mac call without sending a bot into it or installing a browser plugin. That is the engine behind bot-free meeting notes on macOS — there is no extra attendee on the participant list, and the audio is processed on your own machine. Canary feeds that captured stream into its streaming transcription and a live, rolling, multi-resolution summary, so you can catch up the instant your name is called. For the hands-on version, see how to take meeting notes without a bot.

The Windows and Linux equivalents

The Core Audio tap is the macOS-specific name for a cross-platform idea. Windows exposes the same capability through WASAPI loopback, a built-in mode of the Windows Audio Session API. Linux surfaces “monitor” sources through PulseAudio and PipeWire. All three are forms of loopback audio capture; the Core Audio tap is simply how macOS does it.

Capturing a call should always be transparent, never secret. Recording-consent laws vary by region — some require only one party’s consent, others require everyone’s — so tell participants when you are capturing a meeting, regardless of which audio API does the work.