Glossary

Interim transcription results

Short answer

Interim transcription results are the provisional, low-latency text hypotheses a streaming speech-to-text engine emits while a person is still speaking, refined and replaced by final results once the phrase settles.

Last updated June 12, 2026

Interim results are also called partial results, interim hypotheses, or non-final results.

Interim vs final results

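A streaming transcriber returns two kinds of output for the same stretch of audio: interim results, which are provisional and may be revised while the person is still speaking, and final results, which replace the interim text once the phrase settles.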

The interim results are what make text appear to type itself out word by word during a live caption or transcript. The trade-off is accuracy for speed: text appears almost immediately, but the wording may still shift until the final result arrives.
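The replace-then-commit behavior described above can be sketched in a few lines of Python. This is a minimal illustration, not any particular engine's API: the `TranscriptBuffer` class, the `on_result` method, and the `is_final` flag are hypothetical names chosen for the example, and the event list is made-up sample data.

```python
from dataclasses import dataclass, field


@dataclass
class TranscriptBuffer:
    """Hypothetical buffer: committed final text plus the current interim text."""
    finals: list[str] = field(default_factory=list)
    interim: str = ""

    def on_result(self, text: str, is_final: bool) -> None:
        if is_final:
            # A final result replaces whatever interim text was being shown.
            self.finals.append(text)
            self.interim = ""
        else:
            # Each interim result overwrites the previous hypothesis.
            self.interim = text

    def display(self) -> str:
        # What a live caption would show: settled text, then the provisional tail.
        parts = self.finals + ([self.interim] if self.interim else [])
        return " ".join(parts)


# Made-up sample events: (text, is_final)
events = [
    ("can you", False),
    ("can you here", False),      # provisional wording, later corrected
    ("Can you hear me?", True),   # final result replaces the interim text
    ("yes I", False),
]

buf = TranscriptBuffer()
snapshots = []
for text, is_final in events:
    buf.on_result(text, is_final)
    snapshots.append(buf.display())
```

Only the interim slot is ever overwritten; text that arrived as final is never edited again, which is why the settled part of a live transcript stays stable while the tail keeps changing.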

Why interim results matter for live summaries

A summary can only be as fresh as the text feeding it. If a system waits for final results before doing anything, the live view always lags real speech by a sentence or more. Consuming interim results instead lets the “now” view track the conversation almost as it happens, which is exactly what you need when your name is suddenly called and you want to know what you just missed.

How it fits together

Interim results sit on top of streaming transcription, which itself relies on voice activity detection to know when speech is happening. The stream of interim and final text is then condensed by real-time meeting summarization into a rolling summary.
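The layering can be pictured as three chained stages. The sketch below is a toy under loud assumptions: every function name is hypothetical, each "audio frame" is faked as a (word, energy) pair, the voice activity detector is a bare energy threshold, the transcriber is a stand-in that simply echoes words, and the "summary" is only a placeholder that keeps the most recent phrases. It shows how data flows between the stages, not how any real system implements them.

```python
from collections import deque
from typing import Iterable, Iterator, NamedTuple, Optional


class Result(NamedTuple):
    text: str
    is_final: bool


def vad(frames: Iterable[tuple[str, float]], threshold: float = 0.1) -> Iterator[Optional[str]]:
    """Toy energy gate: yield the frame's word if it is speech, None if silence."""
    for word, energy in frames:
        yield word if energy >= threshold else None


def transcribe(gated: Iterable[Optional[str]]) -> Iterator[Result]:
    """Stand-in transcriber: interim after each word, final when silence ends a phrase."""
    words: list[str] = []
    for word in gated:
        if word is not None:
            words.append(word)
            yield Result(" ".join(words), is_final=False)
        elif words:
            yield Result(" ".join(words), is_final=True)
            words = []
    if words:  # flush a phrase that ran to the end of the stream
        yield Result(" ".join(words), is_final=True)


def rolling_view(results: Iterable[Result], keep: int = 2) -> Iterator[str]:
    """Placeholder for summarization: last `keep` final phrases plus the live interim."""
    recent: deque[str] = deque(maxlen=keep)
    for r in results:
        if r.is_final:
            recent.append(r.text)
            yield " / ".join(recent)
        else:
            yield " / ".join([*recent, r.text])


# Fake frames: (word standing in for audio, energy level)
frames = [("hello", 0.8), ("team", 0.7), ("", 0.01), ("next", 0.9), ("", 0.0)]
views = list(rolling_view(transcribe(vad(frames))))
```

Because the last stage consumes interim results as well as finals, its view updates on every word rather than once per phrase.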

Canary leans on interim results so its “now” pane reflects what’s being said almost as soon as it’s said. That is part of why you can get a live summary during a Zoom call without a bot in the meeting.