DI: Machine Learning Tool for Sentiment Analysis
Sentiment analysis is very important for opinion surveys and election campaigns. For example, we send out many survey questionnaires every year asking for opinions and comments on various topics, and they normally include some open-ended questions that let respondents write whatever they want. Yet these free-text answers are extremely difficult to analyze, especially when you have to classify them into a few simple categories, such as supportive, opposing, or neutral. For example, when a response says, “It is not bad, but it can be better”, should we classify it as positive or neutral? Responses can also be tricky when they use a double negative. For example, interpreting “the product is not without defects” correctly may require an intelligent machine.
In an election campaign, it is also very important to know the voters’ sentiments before polling day, so as to estimate the chance of winning and decide on last-minute strategies. Even though voters can now voice their concerns on social media, these posts are very difficult to analyze, let alone to categorize as For or Against.
Nowadays, there are machine learning (ML) tools for sentiment analysis, but consultants charge a very high fee for doing the analysis. Can we do it ourselves?
Doing so requires an ML program that trains the computer to analyze the responses. Fortunately, some readily available programs now offer a free trial. For example, MonkeyLearn.com (quoted just as an example; I have no interest in the company) provides a text analysis program that does sentiment analysis free of charge. Of course, the free version is highly limited, but it is still good for exploring the fun of training a machine to analyze the sentiment of responses.
However, unlike an expert system, ML requires someone to train the machine before it becomes experienced in sentiment analysis. In other words, if the trainer interprets the responses wrongly, the machine will follow the trainer’s mistakes. For an introduction to the training requirements of ML, please refer to my previous article (Yiu, 2019).
Figure 1 shows a training example from the ML tool. You have to tell the machine whether this response should be classified as positive, negative or neutral. The more examples you use to train the machine, the more accurate the analysis should be, because, I suppose, it is based on keyword search, as shown in Figure 2.
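To make the idea of training by examples concrete, here is a minimal sketch in Python. It is not how MonkeyLearn works internally (I do not know its algorithm); it simply assumes a keyword-counting approach, implemented as a tiny Naive Bayes classifier. The class name, the training sentences and their labels are all made up for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class TinySentimentClassifier:
    """Hypothetical toy classifier: Naive Bayes over word counts."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.label_counts = Counter()            # label -> number of examples
        self.vocab = set()

    def train(self, text, label):
        """The human trainer tells the machine the label of one response."""
        words = tokenize(text)
        self.word_counts[label].update(words)
        self.label_counts[label] += 1
        self.vocab.update(words)

    def predict(self, text):
        """Return (best_label, confidence), with confidence between 0 and 1."""
        words = tokenize(text)
        total_examples = sum(self.label_counts.values())
        log_scores = {}
        for label, n in self.label_counts.items():
            score = math.log(n / total_examples)
            total_words = sum(self.word_counts[label].values())
            for w in words:
                # Add-one smoothing so unseen words do not zero out a label.
                count = self.word_counts[label][w] + 1
                score += math.log(count / (total_words + len(self.vocab)))
            log_scores[label] = score
        # Normalize the scores into probabilities.
        top = max(log_scores.values())
        exp_scores = {l: math.exp(s - top) for l, s in log_scores.items()}
        z = sum(exp_scores.values())
        probs = {l: v / z for l, v in exp_scores.items()}
        best = max(probs, key=probs.get)
        return best, probs[best]


# Made-up training examples, labeled by a human trainer.
clf = TinySentimentClassifier()
clf.train("I love this blog, it is great", "positive")
clf.train("Very helpful and clear article", "positive")
clf.train("This is terrible and boring", "negative")
clf.train("I hate the confusing writing", "negative")
clf.train("It is an article about data", "neutral")

label, confidence = clf.predict("great and helpful")
print(label, round(confidence, 2))
```

With only five examples the machine knows very few keywords, which is why more training examples should improve the accuracy. It also shows why a wrong label from the trainer is copied straight into the machine's word counts.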
After the training session, you can test the accuracy of the analyzer by typing in or providing different responses. It also reports the confidence of each guess. I got some very funny results when I input very complicated responses. It is fun!
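If you want to test an analyzer more systematically, you can compare its guesses with your own labels on a small test set. The sketch below is only an illustration: the keyword lists, the confidence formula, the test sentences and my hand labels are all assumptions, and the toy predictor stands in for whatever tool you are testing.

```python
# Hypothetical keyword lists for a toy predictor.
POSITIVE = {"good", "great", "love", "helpful", "better"}
NEGATIVE = {"bad", "terrible", "hate", "boring", "defects"}


def keyword_predict(text):
    """Guess a label by counting keywords; return (label, confidence)."""
    words = text.lower().replace(",", " ").replace(".", " ").split()
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    hits = pos + neg
    if hits == 0 or pos == neg:
        # No evidence, or evidence that cancels out: an arbitrary 0.5.
        return "neutral", 0.5
    label = "positive" if pos > neg else "negative"
    # Confidence here is just the share of keywords that agree with the label.
    return label, max(pos, neg) / hits


def accuracy(predict, test_set):
    """Fraction of test responses where the guess matches the human label."""
    correct = sum(predict(text)[0] == label for text, label in test_set)
    return correct / len(test_set)


# Hand-labeled test responses (the labels are my own judgment).
test_set = [
    ("I love this great blog", "positive"),
    ("This is terrible and boring", "negative"),
    ("It is not bad, but it can be better", "positive"),
    ("The product is not without defects", "negative"),
    ("It is not bad at all", "positive"),
]

for text, human_label in test_set:
    guess, confidence = keyword_predict(text)
    print(f"{text!r}: guessed {guess} ({confidence:.2f}), human says {human_label}")

print("accuracy:", accuracy(keyword_predict, test_set))
```

The last test response is the interesting one: a pure keyword counter sees "bad", ignores "not", and guesses negative with full confidence. A high confidence score is therefore not the same as a correct answer.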
Let’s try a sentiment analysis of your opinions on my blog. Here is a survey form with just 4 questions, which does not ask for any personal information: https://www.surveymonkey.com/r/SFH6X28
The first two questions help to classify your comments, so the machine can be trained without relying too much on the human trainer. If the first two answers have high scores, then the comment is more likely to be positive. So remember to enter your written comments, so that they can be used to train the machine to analyze readers’ sentiments in the future. I will share the analysis results if we receive more than 50 responses. Have fun!
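The idea of letting the scores label the comments can be sketched as follows. This is only an illustration under assumptions: I assume both scored questions use a 1 to 5 scale, the cut-off values are arbitrary, and the field names and sample responses are made up.

```python
def label_from_scores(q1, q2, scale_max=5):
    """Turn the two scores into a provisional sentiment label.

    The thresholds (80% and 40% of the scale) are arbitrary assumptions.
    """
    avg = (q1 + q2) / 2
    if avg >= 0.8 * scale_max:
        return "positive"
    if avg <= 0.4 * scale_max:
        return "negative"
    return "neutral"


def build_training_set(responses):
    """Pair each non-empty written comment with its score-based label."""
    pairs = []
    for r in responses:
        comment = r["comment"].strip()
        if comment:  # responses without a written comment cannot train the machine
            pairs.append((comment, label_from_scores(r["q1"], r["q2"])))
    return pairs


# Made-up survey responses for illustration only.
responses = [
    {"q1": 5, "q2": 4, "comment": "Clear and useful, thanks!"},
    {"q1": 1, "q2": 2, "comment": "Too long and hard to follow."},
    {"q1": 3, "q2": 3, "comment": "It is okay."},
    {"q1": 5, "q2": 5, "comment": "   "},
]

training_set = build_training_set(responses)
for comment, label in training_set:
    print(label, "|", comment)
```

The fourth response is dropped because it has scores but no written comment, which is why the written comments matter for training.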
(This article is for the Data Intelligence Series)
References
Yiu, C.Y. (2019) From Automation to Machine Learning, Medium Jan 3. https://medium.com/@ecyY/from-automation-to-machine-learning-c61fefe483f5