Coding Emotional Reactions in Python
Sentiment Analysis • 2019-2020 • Independent Project
Background
Sentiment analysis is a popular tool that researchers and data analysts use to better understand their customer and client base. It has a wide variety of applications, from brand monitoring to market research, and often focuses on the opinions and subjective information in customer feedback. Its most common use is sorting large amounts of subjective text data into three categories: positive, negative, and neutral. This lets researchers efficiently organize text by the attitude it expresses toward a topic.
Project Design
I completed this project as the final examination for my independent study in Python programming. I was tasked with finding a complete project that would showcase the skills I had learned over the duration of the course. I built a sentiment analysis that analyzes customer comments on the 2008 Honda Accord. I chose to code the sentiments into different emotions to see the overall consensus among customers and to compare feedback on the Accord's comfort, performance, and interior.
Data
The dataset came from an open-source collection published in 2010 that contains opinions on topics such as cars, hotels, and electronics (Ganesan et al., 2010).
Method
The coding process used Python and Tableau to create the finished project. In Python, I worked from a database created by Attreya Bhatt. Check out my GitHub for the full code and emotion spreadsheet.
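As a rough illustration of how a word-bank approach like this can count emotions in a comment, here is a minimal sketch. It is not the code from the project: the function name `count_emotions`, the stop word list, and the tiny emotion word bank are all hypothetical stand-ins, and the sample review is made up.

```python
import string
from collections import Counter

# Hypothetical stand-ins: a real emotion word bank would be much larger.
STOP_WORDS = {"the", "a", "an", "and", "is", "was", "i", "it",
              "this", "my", "with", "very", "of", "in", "to"}
EMOTION_WORDS = {
    "loved": "attached",
    "comfortable": "happy",
    "satisfied": "happy",
    "disappointed": "sad",
}


def count_emotions(text):
    """Return a Counter of emotion labels found in the text."""
    # Lowercase and strip punctuation so "Loved!" matches "loved".
    cleaned = text.lower().translate(str.maketrans("", "", string.punctuation))
    # Split into words and filter out stop words.
    words = [w for w in cleaned.split() if w not in STOP_WORDS]
    # Look each remaining word up in the emotion word bank.
    return Counter(EMOTION_WORDS[w] for w in words if w in EMOTION_WORDS)


# Made-up review text, for illustration only.
review = "I loved the seats. Very comfortable, and I was satisfied with the ride!"
print(count_emotions(review))
```

The resulting counts per emotion are the kind of table that can then be exported and charted in a tool such as Tableau.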
Results
The result was a working baseline sentiment analysis that can analyze text and count how many times each emotion appears in it.
Example
Below is a bubble chart for the Comfort reviews of the 2008 Honda Accord. Many of the sentiments appeared to be positive, and most of the processed words fell into the “Happy” category. Another recurring sentiment in the Comfort comments was “attached”, which covers words expressing sentimental attachment, such as “loved”.
Because the emotions were already defined and organized, I grouped them into the categories “happy” and “unhappy” with the product instead of “positive” and “negative”. The process follows the traditional coding scheme for sentiment analysis, with only the labels changed.
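One way to collapse individual emotions into these two categories is a simple lookup table. The sketch below is illustrative only: the mapping in `EMOTION_GROUPS`, the function name `group_emotions`, and the sample counts are assumptions, not the groupings or numbers from the project.

```python
from collections import Counter

# Hypothetical grouping of emotion labels into the two product categories.
EMOTION_GROUPS = {
    "happy": "happy",
    "attached": "happy",
    "sad": "unhappy",
    "angry": "unhappy",
}


def group_emotions(emotion_counts):
    """Collapse per-emotion counts into "happy" and "unhappy" totals."""
    totals = Counter()
    for emotion, count in emotion_counts.items():
        group = EMOTION_GROUPS.get(emotion)
        if group is not None:  # skip emotions that have no assigned group
            totals[group] += count
    return totals


# Made-up counts, for illustration only.
sample_counts = {"happy": 5, "attached": 2, "sad": 1}
print(group_emotions(sample_counts))
```

Keeping the grouping in a dictionary means the categories can be renamed or rearranged without touching the counting logic.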
For the 2008 Honda Accord, the comments indicated that customers were happy with the performance and comfort of the vehicle. The comments included emotional sentiments such as “satisfied” and “comfortable”.
Future Directions
The purpose of this project was to build a complete project using the coding I had learned during my independent course in Python. In total, it tested my understanding of natural language processing, filtering and looping, and stop words. As with all sentiment analyses, getting more accurate feedback requires training the algorithm to recognize more phrases and emotions. My first step will be to expand the emotional word bank to encompass different forms of words (e.g. comfort, comforted, comfortable) and to add code that can handle negation, such as “not comfortable” or “not happy”.
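A minimal sketch of what those two extensions could look like is shown here. It is one possible approach, not a planned implementation: the word bank lists each word form explicitly, and negation is handled only by checking the word immediately before an emotion word. The names `tag_emotions`, `NEGATIONS`, and `OPPOSITES` are hypothetical.

```python
# Hypothetical word bank mapping several forms of a word to one emotion.
EMOTION_WORDS = {
    "comfort": "happy",
    "comforted": "happy",
    "comfortable": "happy",
    "happy": "happy",
}
NEGATIONS = {"not", "never", "no"}
OPPOSITES = {"happy": "unhappy"}  # assumed flip for a negated emotion


def tag_emotions(words):
    """Return emotion labels, flipping a label when the previous word negates it."""
    labels = []
    for i, word in enumerate(words):
        emotion = EMOTION_WORDS.get(word)
        if emotion is None:
            continue  # not an emotion word
        if i > 0 and words[i - 1] in NEGATIONS:
            # Use the opposite label if one is defined, else keep the original.
            emotion = OPPOSITES.get(emotion, emotion)
        labels.append(emotion)
    return labels


print(tag_emotions("the seats were not comfortable".split()))
print(tag_emotions("i was comforted by the ride".split()))
```

Checking only the preceding word misses phrases such as “not very comfortable”, so a wider look-back window would be a natural next refinement.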
Citations
Ganesan, K., Zhai, C. X., and Han, J. (2010). Opinosis: A Graph Based Approach to Abstractive Summarization of Highly Redundant Opinions. Proceedings of the 23rd International Conference on Computational Linguistics (COLING 2010), Beijing, China.