Know when they finish talking.
A lightweight ML library that detects turn completion, thinking pauses, and interrupts. It runs entirely in the browser: no servers, no API keys, and no network latency.
npm install @utterance/core
Why Utterance?
No cloud dependency
Everything runs in the browser. No servers, no API keys, no network requests for audio processing.
Zero latency
On-device inference means instant results. No round trip to a server. Decisions happen in milliseconds.
Privacy first
Audio never leaves the user’s device. No recording, no uploading, no third-party processing.
Lightweight model
Small ONNX model that loads fast and runs efficiently. It is designed for real-time performance on any device.
Framework agnostic
Works with any JavaScript framework. Use it with React, Vue, vanilla JS, or any voice SDK.
Simple event API
Just listen for turnEnd, pause, and interrupt events. Get building in minutes, not hours.
Trained to understand conversations.
A hybrid conv + attention model trained on real conversational data. Quantized to int8 and optimized for WASM, so it runs on any device without breaking a sweat.
Install and start detecting in seconds.
import { Utterance } from "@utterance/core";

const detector = new Utterance();

// Fired when the model decides the user has finished their turn.
detector.on("turnEnd", (result) => {
  console.log("User is done speaking", result.confidence);
});

// Fired during a mid-thought pause, so your agent doesn't jump in early.
detector.on("pause", (result) => {
  console.log("User is thinking...", result.duration);
});

// Fired when the user starts talking over the AI's response.
detector.on("interrupt", () => {
  console.log("User wants to speak. Stop AI response");
});

await detector.start();
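The sample above suggests payload shapes for each event. The following is a hypothetical sketch of those shapes and of a barge-in handler, not the library's published typings: the interface names, the millisecond unit for `duration`, and the `stopPlayback` hook are all illustrative assumptions.

```typescript
// Hypothetical payload shapes inferred from the sample above —
// not the library's published type definitions.
interface TurnEndResult {
  confidence: number; // how sure the model is that the turn is over (0–1, assumed)
}

interface PauseResult {
  duration: number; // length of the thinking pause, in ms (assumed unit)
}

// One way to wire "interrupt" to barge-in handling: stop whatever audio
// the agent is currently playing. `stopPlayback` is an illustrative hook
// you would supply from your own TTS/audio layer.
function makeInterruptHandler(stopPlayback: () => void): () => void {
  return () => {
    stopPlayback(); // cut the AI's response so the user can speak
  };
}
```

A handler built this way can be passed straight to `detector.on("interrupt", ...)`, keeping the audio-stopping logic decoupled from the detection library.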