Supervised vs Unsupervised models


Pedro Daire

COO

Supervised and unsupervised deep learning models are two different ways of analyzing text data.

At Deep Talk, we analyze data that sometimes has annotations and sometimes does not. What does this mean? Take a database of conversations between companies and customers. If it has a column called “subject of the conversation”, that column is an annotation: it tells us whether each conversation was a complaint, congratulations, a sales opportunity, etc. That is “supervised” data, because someone wrote down what type of conversation it was. Our deep learning models can then use that annotation to learn to classify conversations.

Other times we have unsupervised data. In the previous example, the same conversations without the column that records the subject would be unsupervised data. In that case, we use other models to analyze the data.

What is the big difference between unsupervised and supervised models?

Unsupervised models:

In Deep Talk, products like the “Sunburst Chart” use unsupervised models that take the data and group it into clusters. They organize the data into up to 3 levels without the data having any labels. The algorithms look for similarities between paragraphs or sentences and group them.

In short, unsupervised data has no labels, so unsupervised models must be able to analyze it without any annotations.
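To make the idea of grouping similar sentences concrete, here is a minimal sketch in plain Python. It is not Deep Talk's actual method: it assumes a simple bag-of-words representation, cosine similarity, and a greedy single-pass clustering with a hypothetical `threshold` of 0.3, and it produces a single level of clusters rather than three. The function names and sample sentences are made up for illustration.

```python
import math
import re
from collections import Counter


def vectorize(text):
    """Turn a sentence into a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster(sentences, threshold=0.3):
    """Greedy clustering: join the most similar cluster or start a new one."""
    clusters = []  # each cluster keeps a summed word vector and its members
    for sentence in sentences:
        vec = vectorize(sentence)
        best, best_sim = None, threshold
        for c in clusters:
            sim = cosine(vec, c["centroid"])
            if sim >= best_sim:
                best, best_sim = c, sim
        if best is None:
            clusters.append({"centroid": Counter(vec), "members": [sentence]})
        else:
            best["centroid"].update(vec)  # add this sentence's word counts
            best["members"].append(sentence)
    return [c["members"] for c in clusters]


# Hypothetical unlabeled customer sentences.
sentences = [
    "my order arrived late",
    "the order arrived two days late",
    "I want to upgrade my plan",
    "how can I upgrade the plan",
]
groups = cluster(sentences)
for members in groups:
    print(members)
```

No labels are given anywhere in this sketch: the two groups (late orders and plan upgrades) emerge only from word overlap between sentences.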

Unsupervised data analysis (Sunburst chart)

Supervised Models:

Supervised models process data that has an annotation.

Deep Talk’s products like the “Deepers” ask the user to “annotate” the data. For example, suppose you are building a “complaints deeper”. In that case, you must annotate phrases that are complaints and also phrases, such as congratulations, that do not correspond to what you want to identify. The phrases that fit are saved with a “complaint” label, and the model then searches the data for similar text to detect all the complaints that have been made.
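The annotation workflow described above can be sketched as a tiny classifier. This is only an illustration under stated assumptions, not how the Deepers are implemented: it assumes bag-of-words vectors and a 1-nearest-neighbor rule based on cosine similarity, and the label names, example phrases, and `classify` function are hypothetical.

```python
import math
import re
from collections import Counter


def vectorize(text):
    """Turn a phrase into a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical annotations: phrases that fit the target and phrases that do not.
annotated = [
    ("this is the worst service I have had", "complaint"),
    ("my package arrived broken", "complaint"),
    ("thank you for the great support", "not_complaint"),
    ("congratulations to the whole team", "not_complaint"),
]


def classify(text, examples):
    """Return the label of the most similar annotated example."""
    vec = vectorize(text)
    best_label, best_sim = None, -1.0
    for phrase, label in examples:
        sim = cosine(vec, vectorize(phrase))
        if sim > best_sim:
            best_label, best_sim = label, sim
    return best_label


label_a = classify("the package I received was broken", annotated)
label_b = classify("great support, thank you", annotated)
print(label_a, label_b)
```

The negative examples matter here: without phrases labeled as not being complaints, every new sentence would be matched to a complaint by default.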

Supervised data models on Deep Talk

What type of model do I need for my company?

Well, that depends a lot on the data you have. For example, suppose you have an extensive database of chats or WhatsApp conversations with customers, and that data has no tags and is not classified. If you need to classify it, use unsupervised models to find out what the customers talked about, which topics are most frequent, when the different topics appeared over time, etc.

What would unsupervised analysis be useful for?

To create clusters of your conversations with customers and get associated metrics: for example, how those conversations are distributed and which topics are most frequent.

To discover problems that had not been identified and to understand the flow of conversations between customers and the company.

You can also use this analysis to create bots, train them with the phrases of each topic, re-train them, etc.
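As a rough illustration of the cluster metrics mentioned above (distribution of conversations, most frequent topics, topics over time), the following sketch counts topics from the output of a clustering step. The `assignments` data, topic names, and months are invented for the example, and the sketch assumes each conversation has already been assigned to exactly one topic.

```python
from collections import Counter, defaultdict

# Hypothetical output of a clustering step: (month, topic) per conversation.
assignments = [
    ("2023-01", "delivery"),
    ("2023-01", "billing"),
    ("2023-01", "delivery"),
    ("2023-02", "delivery"),
    ("2023-02", "upgrade"),
]

# How many conversations fall into each topic.
topic_counts = Counter(topic for _, topic in assignments)

# Share of all conversations per topic (the distribution).
total = sum(topic_counts.values())
shares = {topic: count / total for topic, count in topic_counts.items()}

# Topic counts per month, to see when each topic appeared.
by_month = defaultdict(Counter)
for month, topic in assignments:
    by_month[month][topic] += 1

most_common_topic = topic_counts.most_common(1)[0][0]
print(most_common_topic, shares, dict(by_month))
```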

What would supervised analysis be useful for? (Deepers)

Supervised analysis lets you train models that look for the topics you want to detect in conversations. For example, you can train a model to detect sales opportunities on social networks or complaints in customer emails. To do this, you annotate example phrases: ones that match what you want to detect and ones that do not.

The model will then detect all the sentences or paragraphs similar to your examples in the data, and you can even do it in real-time using the API.

Why Deep Talk?

Deep Talk is a no-code text analysis platform to extract valuable data from any piece of text. Transform text inside conversations, surveys, emails, social networks, etc. into actionable data to sell more, have a better customer experience or improve your products.

