
Speech Tag Training Reference Guide

Overview

Speech Recognition provides tools for quickly identifying key elements that may or may not be present in the conversations agents have with customers. With tools such as full-text transcription, contact centers can quickly search for specific words or phrases, or have the system identify these on an ongoing basis as tags.
Emotion detection can identify conversations where customers are unhappy or, more specifically, where a customer’s mood was improved or made worse by interacting with an agent.

This guide is meant as a reference while a consultant works with the contact center to establish the most effective speech tags for the business.

Audience

This document is intended for those who have purchased the Speech Recognition module and are responsible for setting up speech tags within the Eleveo Workforce Optimization solution.

Key Terms

Speech Recognition - Elevēo's Speech Recognition solution includes Transcription Services, Emotion Detection, and Phrase Detection.

Auto Quality Management (AQM) - A tool that automatically evaluates call-based conversations against created rules and provides a score for all transcribed conversations.

Transcription Services - Speech-to-text for recorded calls, with punctuation and number formatting.

Emotion Detection - Ability to identify positive, neutral, and negative emotions within a conversation. Additionally, the system may note when moods improve or worsen.

Phrase Detection - A process by which a collection of words and phrases is labeled to help businesses meet important objectives.

Redaction of Sensitive Data - Ability to redact sensitive numbers and display ##### in their place.

Full-Text Search - Ability to search for conversations that include or exclude specific terms, names, or text.

Speech Recognition

The Speech Recognition engine generates transcriptions of conversations that may be viewed in Conversation Explorer.  

If a user or administrator has set up pre-defined Speech Tags, the engine also checks for them during the transcription process and marks the moment in the conversation where the word or phrase was detected with an icon.

Redaction

Eleveo takes personal data seriously and supports the redaction of sensitive data from transcriptions. The redaction of credit card data and other sensitive numbers is enabled by default when the solution is first configured.
Any redacted information is shown as #### (hash symbols) in the transcription so that viewers can clearly see that something was hidden.

Emotion Detection

The average emotion at the end of the conversation is extracted from the transcription and then displayed within the Conversation Explorer Details Pane. 

The speech engine's emotion detection feature uses a synthesis of acoustic features and word sentiment scores to determine whether a given utterance is Positive, Negative, Neutral, Improving, or Worsening.

Phrase Detection – Speech Tags

How Does it Work (in Brief)  

To find speech phrases that have been pre-defined by the user, the system performs several steps. 

The list of speech phrases is compared to the transcription of audio recordings for each channel. To do this the system: 

  1. Splits each speech phrase into a list of words.

  2. Compares the transcription of the recording with the first word of each speech phrase (from Step 1) until a match is found. When the first word matches, the system then attempts to match the whole speech phrase within the transcription of the same channel. When the whole phrase matches, the system saves the speech phrase occurrence to the database.

  3. Displays the found phrases within the Conversation Explorer.
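The steps above can be sketched in Python. This is only an illustration of the described logic, not Eleveo's implementation: the function names, the whitespace-based word splitting, and the in-memory result list (standing in for the database) are all assumptions made for the example. Lowercasing reflects the FAQ statement that matching is not case-sensitive.

```python
from typing import Dict, List, Tuple


def split_words(text: str) -> List[str]:
    """Lowercase the text and split it on whitespace (an assumed tokenization)."""
    return text.lower().split()


def find_phrase(transcript_words: List[str], phrase: str) -> List[int]:
    """Return the word positions at which the whole phrase occurs exactly."""
    phrase_words = split_words(phrase)  # Step 1: split the phrase into words
    if not phrase_words:
        return []
    hits = []
    length = len(phrase_words)
    for i, word in enumerate(transcript_words):
        if word != phrase_words[0]:  # Step 2: scan for the first word
            continue
        # First word matched: now try to match the whole phrase from here.
        if transcript_words[i:i + length] == phrase_words:
            hits.append(i)
    return hits


def detect(channels: Dict[str, str], phrases: List[str]) -> List[Tuple[str, str, int]]:
    """Check every phrase against every channel; collect (channel, phrase, position)."""
    occurrences = []
    for channel, transcript in channels.items():
        words = split_words(transcript)
        for phrase in phrases:
            for position in find_phrase(words, phrase):
                occurrences.append((channel, phrase, position))
    return occurrences


channels = {
    "agent": "Thank you for calling ABC how may I help you",
    "customer": "this is ridiculous I want to speak to a manager",
}
phrases = ["thank you for calling", "Speak to a manager", "thank you so much"]
for occurrence in detect(channels, phrases):
    print(occurrence)
```

In this sketch "thank you so much" is not reported: its first word is found in the agent channel, but the remaining words do not match exactly, which is why shorter phrases tend to be detected more reliably.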

How to Prepare Speech Tags and Phrases 

Speech Recognition searches for targeted phrases, which are user-defined and grouped into categories referred to as Speech Tags. Ideally, the groupings (Tags) should be created with the following in mind:

  • As the system searches for an exact match, phrases should be short. This will ensure that the search terms are sufficiently unique to be found.

  • If the system is having trouble catching all or most phrases of importance, consider shortening the phrase further. For example, is there an “anchor word” that could be searched for instead? (e.g. instead of searching for “may I have your social security number”, search for “social security number” alone.)

  • Consider using different ways of saying the same thing (e.g. “Thank you for calling” and “Thanks for calling”).

  • As a phrase will only be captured by one Tag, do not use the same phrase in multiple Speech Tags.
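These guidelines can be checked before tags are entered into the system. The following Python sketch is a hypothetical helper, not part of the product: the dictionary layout, the function names, and the three-word threshold (taken from the FAQ suggestion of one to three words) are assumptions for illustration. It flags phrases that may be too long and phrases that appear in more than one Speech Tag, comparing case-insensitively.

```python
from collections import defaultdict
from typing import Dict, List

MAX_WORDS = 3  # assumed threshold, based on the "one to three words" suggestion


def normalize(phrase: str) -> str:
    """Lowercase and collapse whitespace so equivalent phrases compare equal."""
    return " ".join(phrase.lower().split())


def check_tags(tags: Dict[str, List[str]]) -> List[str]:
    """Return human-readable warnings for long or duplicated phrases."""
    warnings = []
    owners = defaultdict(set)  # normalized phrase -> names of tags that use it
    for tag, phrases in tags.items():
        for phrase in phrases:
            key = normalize(phrase)
            owners[key].add(tag)
            if len(key.split()) > MAX_WORDS:
                warnings.append(
                    f"'{phrase}' in '{tag}' has more than {MAX_WORDS} words; "
                    "consider a shorter anchor phrase"
                )
    for key, tag_names in owners.items():
        if len(tag_names) > 1:
            warnings.append(
                f"'{key}' appears in multiple tags: {', '.join(sorted(tag_names))}"
            )
    return warnings


tags = {
    "Customer Greeting": ["Thank you for calling", "Good morning"],
    "Closing statement": ["thank you for calling", "Happy to help"],
}
for warning in check_tags(tags):
    print(warning)
```

The length warning is advisory only; longer phrases are allowed, they are simply less likely to match exactly.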

Example Speech Tags

| Negative Sentiment | Positive Sentiment | Customer Verification | Customer Greeting |
|---|---|---|---|
| I can't believe this | That's incredible | Social Security Number | Thank you for calling |
| This is ridiculous | I really appreciate that | Date of Birth | Thanks for calling |
| I'm unhappy about | That's so amazing | Address | Hello, my name |
| File a complaint | Thank you so much | Phone Number | Hello, this is |
| So frustrating | You are the best | Your Social | Good morning |
| Very frustrating | This is excellent | SSN | Good afternoon |
| Pissing me off | You just made my day | Last four digits | Good Afternoon |
| Aggravation | That's such a relief | Account number | This is Customer Service |
| Frustration | You're the best | Reference number | Welcome to ABC |
| So aggravating | That's a relief | Contract number | How can I help you |
| Very aggravating | You're a life saver | Last payment made | How may I help you |
| Speak to a manager | That made my day | Email address | How may I assist |
| Speak to a supervisor | You're amazing | | How are you today |
| Speak to your supervisor | You are amazing | | Hi, How are |
| Talk to a manager | So grateful | | Help you |
| Talk to a supervisor | | | |
| Transfer me to a manager | | | |
| Transfer me to a supervisor | | | |

Speech Tag Examples cont.

| Upsell or Close | Fraud | Product Knowledge Gap Phrases | Closing statement |
|---|---|---|---|
| Buy today | Write off | Need to check | Anything else |
| Can I activate | Failed investment | Need to verify | Appreciate your |
| Discount off | Off the books | Let me check | Enjoyed Talking |
| Agree to | Nobody will find out | I don’t know | Once Again My Name |
| Have you considered | Grey area | I can't help you | Glad I could help |
| Set a time | Illegal | I'm not sure | Any further questions |
| We can deliver | Do not volunteer info | Let me call you back | Brief survey |
| Actually offer | Not ethical | Let me look into that | Happy to help |
| Handle that | They owe it to me | I guess | |
| Like to go with | Backdate | I'll get back to you | |
| | No inspection | Find that out for you | |
| | Pull earning forward | I'm new | |
| | Special fees | Need to ask someone | |
| | Friendly payments | Put you on hold | |
| | Do not share | | |
| | Expected to announce | | |
| | Deserve to get paid | | |
| | Breach | | |
| | Top secret | | |
| | Cook me up | | |
| | Cover up | | |

FAQs

How many syllables should my word or phrase be to be most successful? 

Length is best judged in words rather than syllables. We suggest that each Speech Phrase be short (one to three words). This will ensure that the search terms are sufficiently unique to be found.

Only exact matches will be found. 

Is there a maximum duration for a word or phrase to be detected? 

No, but remember that the search is looking for exact matches on the transcribed text. The longer the text, the less likely an exact match will be located.  

Is there a maximum call length that can be scanned by the Speech engine? 

No, as the search engine is looking at the transcribed text, there is no max call length.  

When I add a new Word or Phrase to my Speech Tag, does that apply to interactions already recorded and saved to my system? 

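No. Phrase detection runs once, when a transcription has been processed and saved, so a newly added word or phrase is applied only to conversations transcribed after it was added.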

How quickly is a call transcribed?  

At a minimum, transcription should be set to occur 1 hour after call completion.  

Can we re-scan calls or scan calls in the past?  

Re-scan is currently not supported, but we are actively working to enable this feature.

Can I use special characters (like commas, periods, and question marks) when creating Speech Phrases?  

Yes, the following special characters are supported: ! - + . ^ : , ?

Can numbers be found by speech detection?  

Yes. Speech tags support alphanumeric characters, so a phrase may contain numbers, words, or both (e.g. 20 min / twenty minutes).

Remember that sensitive numbers are redacted and replaced with ###. In that case, neither the search function nor speech detection will locate the number.
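The effect of redaction on search can be illustrated with a small Python sketch. The redaction rule used here (any run of four or more digits is treated as sensitive) is purely an assumption for the example; the actual rules the product applies are not described in this guide. The function names are also hypothetical.

```python
import re

# Hypothetical rule: treat any run of four or more digits as a sensitive number.
SENSITIVE_NUMBER = re.compile(r"\d{4,}")


def redact(transcript: str) -> str:
    """Replace each sensitive number with hash symbols."""
    return SENSITIVE_NUMBER.sub("###", transcript)


def contains_phrase(transcript: str, phrase: str) -> bool:
    """Case-insensitive exact-text check, standing in for search or tagging."""
    return phrase.lower() in transcript.lower()


original = "My card number is 4111111111111111 and I waited 20 min"
stored = redact(original)

print(stored)
print(contains_phrase(stored, "20 min"))            # short number is still searchable
print(contains_phrase(stored, "4111111111111111"))  # redacted number is not found
```

Because detection works on the stored transcription, a phrase that depends on a redacted number can never match, while ordinary numbers such as "20 min" remain searchable.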

Are phrases case sensitive?  

No, the algorithm is not case-sensitive.  

How is emotion determined?  

The speech engine's emotion detection feature uses a synthesis of acoustic features and word sentiment scores to determine the emotion, resulting in a Positive, Improving, Neutral, Worsening, or Negative emotion tag.

How does the algorithm deal with the use of contraction words, such as it's vs it is?  

At the moment, the algorithm supports exact matches only, so "it's" is NOT treated as equal to "it is". We recommend adding multiple phrases to cover all variations that may be used.
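Writing out every variation by hand is error-prone, so the list can be generated before it is entered. The Python sketch below is a hypothetical helper, not a product feature: the contraction table is a small assumed sample and would need to be extended for real use. It returns the phrase together with its contracted and expanded forms.

```python
import re
from typing import Dict, List

# Assumed sample of contractions; extend as needed for your own phrases.
CONTRACTIONS: Dict[str, str] = {
    "it's": "it is",
    "i'm": "i am",
    "don't": "do not",
    "that's": "that is",
    "you're": "you are",
    "i'll": "i will",
}


def variants(phrase: str) -> List[str]:
    """Return the phrase plus every contracted/expanded variation, sorted."""
    # Lowercase (matching is not case-sensitive) and normalize curly apostrophes.
    base = " ".join(phrase.lower().replace("\u2019", "'").split())
    seen = {base}
    queue = [base]
    while queue:
        current = queue.pop()
        for short, long_form in CONTRACTIONS.items():
            # Try both directions: contract -> expand and expand -> contract.
            for source, target in ((short, long_form), (long_form, short)):
                pattern = r"\b" + re.escape(source) + r"\b"
                candidate = re.sub(pattern, target, current)
                if candidate not in seen:
                    seen.add(candidate)
                    queue.append(candidate)
    return sorted(seen)


for phrase in variants("You're the best"):
    print(phrase)
```

Each returned variation would then be added to the same Speech Tag as a separate phrase.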

How much of the call does the system use to recognize which language to transcribe the call into? 

The system uses the first 20 seconds of the call to identify the language. If multiple languages are used within the first twenty seconds (e.g., English and Spanish), the system will default to English. 

 
