It is 2:47 in the morning. A call comes in. A voice says three words:
“Please send help.”
Then the line goes quiet.
The Dispatcher’s Ear
An experienced dispatcher does not hear only the words. They hear the breathing. They notice the hesitation. They recognize that the voice has become almost a whisper. A faint sound in the background, perhaps a door closing, perhaps something else, becomes part of the overall assessment.
The understanding of the event is built from all of it.
Later, the transcript may contain only three words:
“Please send help”
Much of the surrounding acoustic context has disappeared.
When an author writes a novel, they do more than record dialogue. Through descriptions of emotion, silence, and the surrounding environment, they help readers reconstruct the scene.
A simple sentence such as “Please hurry” can carry a very different meaning when accompanied by:
[voice trembling]
[long silence]
[door slams]
These details are not the story itself, but they provide the context that gives meaning to the words.
In many ways, dispatchers perform a similar task. They do not simply listen to what callers say. They build a mental picture from tone of voice, hesitation, silence, and the sounds of the surrounding environment. A whisper, heavy breathing, a crying child, or a smoke alarm may all contribute to understanding what is happening.
The operational need for this information is already well understood. Experienced dispatchers routinely supplement transcripts with observations such as:
Caller crying
Heavy breathing
Background arguing
Child heard
Fire alarm audible
These annotations become part of the incident record and support shift handover, investigations, quality assurance, and training. They exist because experienced operators know that words alone may not fully describe an event.
The fact that dispatchers document these observations manually is perhaps the strongest evidence that this information has operational value. This is not simply a workaround. It is a professional practice, a recognition by the people closest to the problem that the spoken words are only one part of the overall picture.
Yet when speech becomes text, much of this context is often lost.
The Limits of the Transcript
Current Automatic Speech Recognition (ASR) systems are primarily optimized for Word Error Rate (WER), a metric that measures how accurately words are recognized. By design, WER does not evaluate whether a system preserved environmental sounds, speaker state, hesitation, silence, or other non-verbal acoustic events. In mission-critical communications, some of this information may be operationally as important as the words themselves.
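To make this concrete, here is a minimal Python sketch of how WER is typically computed: the word-level edit distance between a reference transcript and the recognizer output, divided by the number of reference words. The function name and the example strings are illustrative assumptions and do not come from any particular ASR toolkit.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word missing (deletion)
                curr[j - 1] + 1,     # extra hypothesis word (insertion)
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical reference: the words the caller actually spoke.
reference = "please send help"

# Hypothetical recognizer outputs.
perfect_words = "please send help"
one_substitution = "please send kelp"

print(word_error_rate(reference, perfect_words))     # 0.0
print(word_error_rate(reference, one_substitution))  # one error in three words
```

Both inputs are plain word sequences, so a whisper, a pause, or a door closing never enters the calculation. A transcript can score a perfect 0.0 while every non-verbal cue from the call is missing.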
This is not a limitation of ASR alone. Speech recognition and sound event detection have largely evolved as separate research areas, each with its own datasets, benchmarks, and evaluation methods. Mission-critical communications may require these capabilities to work together.
The engineering challenge is significant. Emergency communications often involve a mix of legacy telephony, radio networks, mobile devices, and VoIP systems. The resulting audio can vary widely in quality and may include compression artifacts, background noise, overlapping speech, and channel distortions that are rarely represented in the datasets used to train many modern AI models.
The challenge becomes particularly important in edge cases, such as silent or partially silent calls, where environmental sounds may provide some of the few available clues about the situation.
Perhaps this suggests that we should rethink what a transcript is expected to do.
Rethinking What a Transcript Is For
Traditional ASR answers one question:
What was said?
Mission-critical transcription may also need to answer another:
What was happening?
An emergency call contains several layers of information:
the spoken words,
the speaker’s condition,
the interaction dynamics,
the surrounding environment.
Current transcription systems focus on the first layer and generally preserve it well. The other layers are often not retained, despite their potential operational value.
The objective is not to replace the judgment and experience of a trained dispatcher. Human interpretation will always remain central to public safety operations. Instead, AI may help preserve acoustic context that might otherwise disappear when speech becomes text. This could ensure that important situational cues remain part of the record even after the conversation has ended.
A future operational transcript might include:
[heavy breathing]
“Please send help…”
[child crying]
[long pause]
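One way to picture such a transcript as data is a single timeline that holds both spoken words and non-verbal acoustic events. The Python sketch below is only an illustration under stated assumptions: the Segment class, its field names, and the timestamps are hypothetical and do not describe an existing product or standard.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    """One entry on the call timeline (hypothetical structure)."""
    start_s: float  # seconds from the start of the call (illustrative values)
    kind: str       # "speech" for spoken words, "event" for non-verbal audio
    text: str


def render(segments: List[Segment]) -> str:
    """Render speech in quotes and acoustic events in square brackets."""
    lines = []
    for seg in sorted(segments, key=lambda s: s.start_s):
        if seg.kind == "speech":
            lines.append(f'"{seg.text}"')
        else:
            lines.append(f"[{seg.text}]")
    return "\n".join(lines)


def words_only(segments: List[Segment]) -> str:
    """What a conventional transcript keeps: the speech segments alone."""
    ordered = sorted(segments, key=lambda s: s.start_s)
    return " ".join(s.text for s in ordered if s.kind == "speech")


# Hypothetical call; the timestamps are invented for the example.
call = [
    Segment(0.0, "event", "heavy breathing"),
    Segment(2.1, "speech", "Please send help..."),
    Segment(4.0, "event", "child crying"),
    Segment(5.5, "event", "long pause"),
]

print(render(call))
print(words_only(call))
```

Keeping events and speech on the same timeline means the conventional word-only transcript can still be derived from the richer record, while the acoustic context remains available for handover, review, and training.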
The goal is not to tell operators what to think. It is to preserve information that may assist trained professionals in understanding the event.
As speech AI continues to evolve, we may need to ask not only how accurately we transcribe words, but how much of the original event we preserve.
Because in mission-critical communications, words are only part of the event.
Dr. Salma Ait Farès
Technical Research Chair