| Metric | v2.0 (2024) | v2.1.6 (2025) | Improvement |
| :--- | :--- | :--- | :--- |
| (M3 Max) | 14 minutes | 8 minutes | 42% faster |
| Word Accuracy (Clean Audio) | 94% | 98.5% | +4.5% |
| Word Accuracy (Noisy Audio) | 78% | 89% | +11% |
| Speaker Diarization (2 hosts) | 80% correct | 94% correct | +14% |
| Manual Corrections needed | ~65 edits | ~18 edits | 72% reduction |

Once the audio is transcribed, turning the transcript into captions is near-instant. v2.1.6 includes updated caption style presets that adhere to broadcast standards. Captions can be burned into the video or exported as sidecar files (SRT, VTT, XML) with frame-accurate timing.
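To make "frame-accurate timing" concrete, here is a minimal Python sketch of how a frame index might be converted into an SRT or VTT timestamp. It is an illustration only, not the application's exporter: the function names `frame_to_ms`, `format_timestamp`, and `srt_cue` are hypothetical, and it assumes a constant frame rate. The only format facts relied on are that SRT separates milliseconds with a comma and VTT with a period.

```python
from fractions import Fraction


def frame_to_ms(frame: int, fps: Fraction) -> int:
    """Convert a frame index to milliseconds, rounded to the nearest ms.

    Using Fraction avoids floating-point drift for rates such as
    30000/1001 (29.97 fps). Assumes a constant frame rate.
    """
    return round(Fraction(frame * 1000) / fps)


def format_timestamp(ms: int, sep: str = ",") -> str:
    """Format milliseconds as HH:MM:SS<sep>mmm.

    Use sep="," for SRT and sep="." for VTT.
    """
    hours, rem = divmod(ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}{sep}{millis:03d}"


def srt_cue(index: int, start_frame: int, end_frame: int,
            text: str, fps: Fraction) -> str:
    """Build one SRT cue from frame-based start and end positions."""
    start = format_timestamp(frame_to_ms(start_frame, fps))
    end = format_timestamp(frame_to_ms(end_frame, fps))
    return f"{index}\n{start} --> {end}\n{text}\n"


if __name__ == "__main__":
    # Hypothetical cue: frames 0 to 45 of a 30 fps clip.
    print(srt_cue(1, 0, 45, "Welcome back to the show.", Fraction(30)))
```

Working in whole frames and converting to milliseconds only at export time keeps each cue boundary aligned to a frame edge, which is one plausible reading of "frame-accurate".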

You can edit your video by deleting or moving text in the transcript; the timeline updates automatically to match.
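The idea behind text-based editing can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not the application's actual logic: the `Word` type and `keep_ranges` function are hypothetical, each transcript word is assumed to carry start and end times in seconds, and only deletion is handled (not reordering).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    text: str
    start: float  # seconds into the source media
    end: float


def keep_ranges(words: list[Word], deleted: set[int]) -> list[tuple[float, float]]:
    """Return the source time ranges that remain after deleting words.

    Consecutive kept words are merged into one range, so each gap
    between ranges corresponds to a cut on the timeline.
    """
    ranges: list[tuple[float, float]] = []
    for i, word in enumerate(words):
        if i in deleted:
            continue
        if ranges and (i - 1) not in deleted:
            # Previous word was kept: extend the current range.
            ranges[-1] = (ranges[-1][0], word.end)
        else:
            # First kept word, or first word after a deletion: new range.
            ranges.append((word.start, word.end))
    return ranges


if __name__ == "__main__":
    transcript = [
        Word("So", 0.0, 0.3),
        Word("um", 0.3, 0.6),
        Word("welcome", 0.6, 1.1),
        Word("back", 1.1, 1.4),
    ]
    # Deleting the filler word "um" (index 1) leaves two clips.
    print(keep_ranges(transcript, deleted={1}))
```

Deleting a word in the transcript removes its time span from the kept ranges, and the editor can then rebuild the timeline from those ranges.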