Beyond Words: Why Plain Text Is Not the Whole Story in Public Safety

It is 2:47 in the morning. A call comes in. A voice says three words:

“Please send help.”

Then the line goes quiet.

The Dispatcher’s Ear

An experienced dispatcher does not hear only the words. They hear the breathing. They notice the hesitation. They recognize that the voice has become almost a whisper. A faint sound in the background, perhaps a door closing, perhaps something else, becomes part of the overall assessment.

The understanding of the event is built from all of it.

Later, the transcript may contain only three words:

“Please send help”

Much of the surrounding acoustic context has disappeared.

When an author writes a novel, they do more than record dialogue. Through descriptions of emotion, silence, and the surrounding environment, they help readers reconstruct the scene.

A simple sentence such as “Please hurry” can carry a very different meaning when accompanied by:

[voice trembling]
[long silence]
[door slams]

These details are not the story itself, but they provide the context that gives meaning to the words.

In many ways, dispatchers perform a similar task. They do not simply listen to what callers say. They build a mental picture from tone of voice, hesitation, silence, and the sounds of the surrounding environment. A whisper, heavy breathing, a crying child, or a smoke alarm may all contribute to understanding what is happening.

The operational need for this information is already well understood. Experienced dispatchers routinely supplement transcripts with observations such as:

Caller crying
Heavy breathing
Background arguing
Child heard
Fire alarm audible

These annotations become part of the incident record and support shift handover, investigations, quality assurance, and training. They exist because experienced operators know that words alone may not fully describe an event.

The fact that dispatchers document these observations manually is perhaps the strongest evidence that this information has operational value. This is not simply a workaround. It is a professional practice, a recognition by the people closest to the problem that the spoken words are only one part of the overall picture.

Yet when speech becomes text, much of this context is often lost.

The Limits of the Transcript

Current Automatic Speech Recognition (ASR) systems are primarily optimized for Word Error Rate (WER), a metric that counts the word substitutions, deletions, and insertions in a system's output relative to a reference transcript. By design, WER does not evaluate whether a system preserved environmental sounds, speaker state, hesitation, silence, or other non-verbal acoustic events. In mission-critical communications, some of this information may be as operationally important as the words themselves.

This is not a limitation of ASR alone. Speech recognition and sound event detection have largely evolved as separate research areas, each with its own datasets, benchmarks, and evaluation methods. Mission-critical communications may require these capabilities to work together.
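To make the point about WER concrete, here is a minimal sketch of the standard calculation: word-level edit distance divided by the number of reference words. The helper names (words_only, word_error_rate) are illustrative, and the step that strips bracketed tags before scoring is an assumption about how a scoring pipeline might normalize text, not a description of any particular toolkit.

```python
import re


def words_only(text: str) -> list[str]:
    """Normalize a transcript to lowercase words.

    Assumption: bracketed tags such as [child crying] are stripped
    before scoring, so they cannot influence the result.
    """
    text = re.sub(r"\[[^\]]*\]", " ", text)
    return re.findall(r"[a-z']+", text.lower())


def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference words."""
    ref = words_only(reference)
    hyp = words_only(hypothesis)
    if not ref:
        raise ValueError("reference must contain at least one word")

    # Word-level Levenshtein distance, computed one row at a time.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


reference = "Please send help"
plain = "Please send help"
annotated = "[heavy breathing] Please send help [child crying]"

# Both outputs receive the same score: the metric only sees the words.
print(word_error_rate(reference, plain))
print(word_error_rate(reference, annotated))
```

Under these assumptions, a system that keeps the acoustic context and one that discards it are indistinguishable to the metric, which is the gap described above.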

The engineering challenge is significant. Emergency communications often involve a mix of legacy telephony, radio networks, mobile devices, and VoIP systems. The resulting audio can vary widely in quality and may include compression artifacts, background noise, overlapping speech, and channel distortions that are rarely represented in the datasets used to train many modern AI models.

The challenge becomes particularly important in edge cases, such as silent or partially silent calls, where environmental sounds may provide some of the few available clues about the situation.

Perhaps this suggests that we should rethink what a transcript is expected to do.

Rethinking What a Transcript Is For

Traditional ASR answers one question:

What was said?

Mission-critical transcription may also need to answer another:

What was happening?

An emergency call contains several layers of information:

the spoken words,
the speaker’s condition,
the interaction dynamics,
the surrounding environment.

Current transcription systems are built to preserve the first layer. The others are often not retained, despite their potential operational value.

The objective is not to replace the judgment and experience of a trained dispatcher. Human interpretation will always remain central to public safety operations. Instead, AI may help preserve acoustic context that might otherwise disappear when speech becomes text. This could ensure that important situational cues remain part of the record even after the conversation has ended.

A future operational transcript might include:

[heavy breathing]
“Please send help…”
[child crying]
[long pause]

The goal is not to tell operators what to think. It is to preserve information that may assist trained professionals in understanding the event.
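As a rough sketch of how such a record might be held in software, the example below stores the call as a time-ordered list of segments, each tagged with one of the four layers listed earlier. Everything here is hypothetical: the Segment type, the layer names, and the timestamps are invented for illustration and do not describe an existing product or standard.

```python
from dataclasses import dataclass

# Hypothetical layer names, mirroring the four layers described in the text.
LAYERS = ("speech", "speaker_state", "interaction", "environment")


@dataclass(frozen=True)
class Segment:
    start_s: float  # offset from call start in seconds (invented values below)
    layer: str      # one of LAYERS
    text: str       # spoken words, or a short description of the event

    def __post_init__(self) -> None:
        if self.layer not in LAYERS:
            raise ValueError(f"unknown layer: {self.layer}")


def render(segments: list[Segment]) -> str:
    """Render speech in quotes and every other layer as a bracketed note."""
    lines = []
    for seg in sorted(segments, key=lambda s: s.start_s):
        if seg.layer == "speech":
            lines.append(f'"{seg.text}"')
        else:
            lines.append(f"[{seg.text}]")
    return "\n".join(lines)


def plain_text(segments: list[Segment]) -> str:
    """What a words-only transcript keeps: the speech layer alone."""
    ordered = sorted(segments, key=lambda s: s.start_s)
    return " ".join(seg.text for seg in ordered if seg.layer == "speech")


call = [
    Segment(0.0, "speaker_state", "heavy breathing"),
    Segment(1.8, "speech", "Please send help..."),
    Segment(3.1, "environment", "child crying"),
    Segment(4.5, "interaction", "long pause"),
]

print(render(call))
print(plain_text(call))
```

Keeping the layers separate means the words-only view remains available for existing workflows, while the annotated view is simply a different rendering of the same record.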

As speech AI continues to evolve, we may need to ask not only how accurately we transcribe words, but how much of the original event we preserve.

Because in mission-critical communications, words are only part of the event.

Dr. Salma Ait Farès

Technical Research Chair

