Topic building can give you an edge over the competition by revealing what matters most to your customers.
Topic building is a machine learning technique that categorizes large collections of unstructured text and speech data by assigning “tags” or categories according to keywords or themes.
These tags give the data structure so you can gain insights from it.
Other names for topic building include topic detection, topic analysis, topic modeling, and topic extraction.
The data comes from sources like customer support phone calls, surveys, emails, chatbots, online reviews, social media, and messaging platforms.
Even the Biggest Companies Tap into Topic Building
For example, a company like McDonald’s receives global customer feedback data all the time. They want to know as soon as possible what customers think of their restaurants, the food, the service, and so on.
Text analytics provides them with a way to drill down to specific topics, themes, and even individual stores. This way they can deal with any issues or get an overview of how certain regions, countries, promotions, and products are performing.
McDonald’s can then act in real time, improve their customer experiences, and refine their business strategy.
Let’s look more closely at topic building and what this data-driven approach to customer feedback could mean for your business.
What is topic building?
Topic building is the categorization of text and speech data. It’s accomplished by tagging specific keywords to help you filter large amounts of data and pinpoint the most frequent topics mentioned in customer feedback.
You can accomplish this through manual data processing, but that’s time-consuming and tedious. Topic building speeds things up by combining human and machine intelligence to discover and recommend high-value actions. This focuses employee attention where it matters most.
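To make the keyword tagging idea concrete, here is a minimal sketch in Python. The topic names, keyword sets, and function names are all hypothetical and chosen only for illustration. A real project would use far richer rules than single-word matches.

```python
from collections import Counter

# Hypothetical topic rules: each topic maps to keywords that trigger the tag.
TOPIC_KEYWORDS = {
    "price": {"price", "expensive", "cheap", "cost"},
    "service": {"service", "staff", "rude", "friendly"},
    "brakes": {"brake", "brakes"},
}

def tag_comment(comment):
    """Return the sorted list of topics whose keywords appear in the comment."""
    words = {w.strip(".,!?").lower() for w in comment.split()}
    return sorted(topic for topic, kws in TOPIC_KEYWORDS.items() if words & kws)

def topic_counts(comments):
    """Count how many comments mention each topic."""
    counts = Counter()
    for comment in comments:
        counts.update(tag_comment(comment))
    return counts

feedback = ["Brakes fixed fast!", "Fair price, friendly staff."]
print(topic_counts(feedback))
```

Counting tags across all comments is what lets you pinpoint the most frequently mentioned topics.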
After the data is structured using text analytics, it’s fed into a reporting tool to give you real time access to customer feedback so you can act immediately.
The reporting also provides insights into possible or current trends, shows how a new product or service is performing, and highlights any unexpected issues.
You’ve likely seen a simple example of topic building in your casual Google searches. Look up “mechanic near me” and customer reviews will pop up with “people often mention” categories pertaining to individual businesses.
You can see what people are saying about this mechanic’s customer service, their ability to fix brakes and engines, and whether the prices are fair.
Topic Building and Sentiment Analysis
Before we move on to the process, we should insert a quick word on sentiment analysis.
Sentiment analysis (or opinion mining) is a natural language processing technique used to determine whether the data is positive, negative, or neutral.
An offshoot of topic building is a model that recognizes whether a comment related to a specific keyword is negative or positive.
Consider website first impression tests. You can crowdsource feedback on the look and functionality of your e-commerce website, for example. You ask basic questions, like: How’s the design? Is it easy to navigate? What would you change?
Sentiment analysis trains the models to recognize whether a comment about a specific keyword, such as the design or the navigation, is negative or positive.
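As a rough illustration, the sketch below scores a comment by counting words from two small word lists. This lexicon approach is a simplified stand-in for a trained sentiment model, and the word lists and function name are assumptions made up for this example.

```python
# Hypothetical word lists; a trained model would learn these signals from labeled data.
POSITIVE = {"great", "easy", "love", "clean", "fast"}
NEGATIVE = {"slow", "confusing", "ugly", "broken", "hard"}

def sentiment(comment):
    """Label a comment positive, negative, or neutral by counting lexicon hits."""
    words = [w.strip(".,!?").lower() for w in comment.split()]
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

for comment in ["The design is clean and easy to navigate.",
                "Checkout is slow and confusing!"]:
    print(comment, "->", sentiment(comment))
```

A word-count approach like this misses negation and sarcasm, which is one reason annotated training data matters.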
Outlining the Topic Building Process
Fast, scalable, and cost-efficient, topic building helps you improve the customer experience and ultimately helps your business grow.
Here’s how we do it.
Who does the work?
As with other data collection projects, it’s a combination of freelancers, third-party transcription vendors, and crowd workers, depending on the needs of the project.
For example, annotators with experience in the medical field are required to dive into data about medical devices.
Our article on why language service providers are pivoting to data collection further details how Summa Linguae Technologies goes about fulfilling our clients’ needs.
What’s the process?
It begins with the transcription and text annotation of customer feedback to train the AI models.
The team builds the topics individually by searching for common comments or specific phrases within the data.
Examples of data include:
- Speech – AI transcripts of customer calls
- Social Listening – Customer feedback on social media
- Email – Customer service emails received from customers
- Chat – Customer interactions on online chat support
- Messaging – Customer interactions on various messaging platforms
- Call Notes – Transcriptions of calls written by customer support agents
There are two different approaches you can take once you have collected the data.
How to Know What You Need
If you have a bunch of data and want to figure out what they cover, you need topic modeling.
This is used when you have a set of text documents, like emails, survey responses, support tickets, or product reviews, and you want to find out the different topics they cover. You feed in the data and set the training parameters, and unsupervised algorithms group the documents by the topics they uncover.
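The sketch below shows the unsupervised idea in miniature: documents are grouped without any labels, purely by shared vocabulary. It uses a greedy word-overlap (Jaccard) grouping, which is a deliberately simple stand-in and not one of the statistical algorithms typically used for topic modeling. The stopword list, threshold, and function names are assumptions for illustration.

```python
# Hypothetical stopword list; real pipelines use much larger ones.
STOPWORDS = {"the", "a", "is", "was", "and", "to", "my", "it", "of", "on", "in", "for"}

def tokens(text):
    """Lowercased word set with punctuation and stopwords removed."""
    return {w.strip(".,!?").lower() for w in text.split()} - STOPWORDS - {""}

def jaccard(a, b):
    """Share of words two sets have in common (0.0 when both are empty)."""
    return len(a & b) / len(a | b) if a | b else 0.0

def group_documents(docs, threshold=0.2):
    """Greedy grouping: join the first cluster with enough overlap, else start a new one."""
    clusters = []  # each cluster holds its vocabulary and its documents
    for doc in docs:
        toks = tokens(doc)
        for cluster in clusters:
            if jaccard(toks, cluster["vocab"]) >= threshold:
                cluster["docs"].append(doc)
                cluster["vocab"] |= toks
                break
        else:
            clusters.append({"vocab": set(toks), "docs": [doc]})
    return [cluster["docs"] for cluster in clusters]

reviews = ["The screen flickers constantly", "Delivery arrived late",
           "Screen flickers after an hour", "Late delivery again"]
for group in group_documents(reviews):
    print(group)
```

Nobody told the code that "screen" and "delivery" were topics. The groups emerge from the data, and a person then names them.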
If you know the keywords or topics and want them tagged, you need topic classification.
For example, let’s say you already know people are talking about “X product issue” or “Y service breakdown.” We feed the model examples of data labeled according to these topics, and it eventually learns how to tag future data on its own.
Topic classification’s supervised algorithms require the legwork of training the machine. It needs to be taught what it is that you want via the tagged examples that you feed it.
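Here is a minimal sketch of that supervised workflow, assuming a small naive Bayes classifier written from scratch. The class name, training comments, and topic labels are hypothetical. It is meant to show the shape of "tagged examples in, tags out," not a production model.

```python
import math
from collections import Counter, defaultdict

def tokenize(text):
    """Lowercase words with surrounding punctuation stripped."""
    return [w.strip(".,!?").lower() for w in text.split() if w.strip(".,!?")]

class NaiveBayesTopicClassifier:
    """Tiny multinomial naive Bayes with add-one smoothing."""

    def fit(self, texts, labels):
        self.word_counts = defaultdict(Counter)  # per-topic word frequencies
        self.label_counts = Counter(labels)      # how many examples per topic
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total_docs = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.label_counts.items():
            counts = self.word_counts[label]
            total_words = sum(counts.values())
            score = math.log(n_docs / total_docs)
            for word in tokenize(text):
                if word in self.vocab:  # words never seen in training are ignored
                    score += math.log((counts[word] + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Hypothetical tagged examples supplied by human annotators.
texts = ["The screen is cracked", "Screen has dead pixels",
         "No sound from the speakers", "Sound cuts out randomly"]
labels = ["screen quality", "screen quality", "sound issues", "sound issues"]

model = NaiveBayesTopicClassifier().fit(texts, labels)
print(model.predict("The sound keeps cutting out"))
```

The model can only be as good as the tagged examples it learns from, so the labeling step deserves the most care.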
The topic definition and tagging process are important human steps that should not be taken lightly, since they make or break the real-life performance of the model.
Here’s an example to bring it all together. There’s been an uptick in returns of a specific product, let’s say a smart TV. The company will gather as much information as possible about why people are returning the TV.
With the help of human annotators and machine algorithms, they’ll be able to spot specific recurring reasons, such as screen quality or sound issues. This pinpoints the problem and gets the team started on improving the user experience.
The team gathers to examine and verify the results of the human or machine output.
Questions asked include:
- Are any of the rules too broad? Do any rules contain only one keyword?
- Are any of the rules too narrow? Do any rules contain 4 or more keywords?
- Should anything be added or removed?
- Is there a topic that’s been missed?
- Is something positive that should have been marked neutral or negative?
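The first two checks in the list above can be partly automated. The sketch below assumes a hypothetical structure in which each topic rule is a list of keywords, and flags rules by keyword count using the thresholds from the questions. The remaining questions still need human judgment.

```python
def review_rules(rules):
    """Flag rules that may be too broad (one keyword) or too narrow (four or more)."""
    flags = {}
    for name, keywords in rules.items():
        if len(keywords) == 1:
            flags[name] = "possibly too broad"
        elif len(keywords) >= 4:
            flags[name] = "possibly too narrow"
    return flags

# Hypothetical rules for a smart TV feedback project.
rules = {
    "screen": ["screen"],
    "sound": ["sound", "speaker"],
    "late delivery": ["delivery", "late", "courier", "tracking"],
}
print(review_rules(rules))
```

A flag is only a prompt for discussion. A one-keyword rule can be perfectly valid if the keyword is specific enough.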
All of this and more is discussed before the results are passed on to the client. It’s a full team effort, and there’s a lot of collaboration among the topic builders and with the project manager along the way.
Our Unique Approach
Summa Linguae’s process was built over time and is constantly being refined.
The process is in place to give you what you want, even if you don’t know exactly what that looks like just yet.
We become “experts” on certain industries to get a sense of the topics people care about within them, especially in relation to common services and products.
This gives you the freedom to work on other things, and it has opened the door to base topic sets that can be mined for important information right out of the gate.
For example, there can be an insurance topic set designed to capture feedback and collect comments about people’s experience in a general sense. It can then be applied to all clients in the insurance industry.
Find Out What People Are Saying
Annotation is required for machine learning and training your AI.
Topic building is an offshoot that allows you to keep track of what people are saying about your smart device, for example.
We provide end-to-end data solutions to get your project where you want it to be.
Contact us today to learn more.