    • INTERNATIONAL RESEARCH CONFERENCE ARTICLES (KDU IRC)
    • 2021 IRC Articles
    • Computing

    Comparison of Machine Learning Classifiers for Sentiment Analysis in Hotel Reviews

    Date
    2021
    Author
    Kaushalya, PLU
    Wickramaarachchi, WU
    Abstract
    Sentiment analysis, or opinion mining, is the process of identifying the sentiments, opinions, attitudes and emotions expressed in written text. In recent years, sentiment analysis has become an active research area within natural language processing, and understanding the opinion behind user-generated text has many applications. In the hotel sector and in travel planning, user reviews and comments are particularly useful, and guest reviews have become a prominent factor influencing people's booking decisions. These comments also matter to hotel management for quality control, since tracking them over time shows how guest opinion is changing. The fundamental objective of this research is to compare several machine learning classifiers and identify the best ones for building a sentiment analysis model for hotel reviews that captures customers' sentiment. A comparative analysis was carried out among the Multinomial Naïve Bayes (MNB), Bernoulli Naïve Bayes (BNB), Logistic Regression (LR), Stochastic Gradient Descent (SGD), Linear Support Vector (SVC), Random Forest and Multi-layer Perceptron (MLP) classifiers. In addition, two feature extraction techniques, Count Vectorizer and Term Frequency-Inverse Document Frequency (TF-IDF), were compared to find the better approach to feature extraction. The results show that the highest accuracies were obtained by Logistic Regression with TF-IDF (87.39%) and by the SGD classifier with TF-IDF (87.71%), while the lowest accuracy was obtained by the Bernoulli NB classifier with Count Vectorizer (64.67%). In every case, accuracy was lower with Count Vectorizer than with TF-IDF as the feature extraction method.
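The difference between the two feature extraction techniques named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the three reviews are invented for illustration, the function names `count_features` and `tfidf_features` are hypothetical, and the textbook formula idf = log(N / df) is assumed (library implementations typically add smoothing and normalization).

```python
import math
from collections import Counter

# Hypothetical toy reviews, invented for illustration (not the paper's dataset).
REVIEWS = [
    "the room was clean and the staff was friendly",
    "the room was dirty and the staff was rude",
    "great location and friendly staff",
]


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def count_features(docs):
    """Raw term counts per document, the idea behind a count vectorizer."""
    return [Counter(tokenize(doc)) for doc in docs]


def tfidf_features(docs):
    """Term counts scaled by inverse document frequency.

    Assumes the unsmoothed textbook form idf = log(N / df), where N is the
    number of documents and df is the number of documents containing the term.
    """
    counts = count_features(docs)
    n_docs = len(docs)
    doc_freq = Counter()
    for doc_counts in counts:
        doc_freq.update(doc_counts.keys())  # each document counts once per term
    return [
        {term: tf * math.log(n_docs / doc_freq[term]) for term, tf in doc_counts.items()}
        for doc_counts in counts
    ]


counts = count_features(REVIEWS)
tfidf = tfidf_features(REVIEWS)

# Compare how the first review weights a frequent word and a distinctive word.
for term in ("the", "staff", "clean"):
    print(term, counts[0][term], round(tfidf[0][term], 3))
```

With raw counts, a common word such as "the" outweighs the distinctive word "clean" in the first review. With TF-IDF, "staff" (present in every review) receives zero weight and "clean" outweighs "the", which is one plausible reason a TF-IDF representation can help a sentiment classifier.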
    URI
    http://ir.kdu.ac.lk/handle/345/5224

    Library copyright © 2017  General Sir John Kotelawala Defence University, Sri Lanka