    • INTERNATIONAL RESEARCH CONFERENCE ARTICLES (KDU IRC)
    • 2021 IRC Articles
    • Computing

    Binary and Multi-Class Classification Using Supervised Machine Learning Algorithms and Ensemble Model

    Date
    2021
    Author
    Asela, H
    Abstract
    Classification is a vital task in data mining, in which large quantities of data are assigned to discrete classes. Models based on various statistical and machine learning approaches are used for this task. However, classification performance depends on multiple factors, such as the selected algorithm, the domain and the features of the dataset. The objective of this study is to evaluate the classification performance of widely used supervised machine learning algorithms: Decision Tree (DT), Naïve Bayes (NB), Support Vector Classifier (SVC), K-Nearest Neighbour (KNN) and an Ensemble Model (EM) based on the soft voting technique. These algorithms are tested on 6 datasets from different domains; the datasets include both multi-class and binary-class data, as well as balanced and imbalanced data. Accuracy, Precision and Recall are used as evaluation metrics for the balanced datasets, whereas the F1-measure is used for the imbalanced dataset. The evaluation results indicate that the EM outperformed the single algorithms in most instances. Among the single algorithms, KNN performed best in multi-class classification, whereas SVC performed best in binary classification on balanced datasets. KNN also showed the best classification performance on the imbalanced dataset. All the algorithms performed well when the dataset was balanced. However, the classification performance of all models, including the EM, was below expectation when the data distribution was highly imbalanced.
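The soft voting technique and the F1-measure named in the abstract can be illustrated with a minimal sketch. This is not the author's implementation: the three probability tables, the toy labels and the function names `soft_vote` and `precision_recall_f1` are all hypothetical, chosen only to show how averaging class probabilities produces an ensemble prediction and why F1 is more informative than accuracy when one class is rare.

```python
from typing import List, Sequence, Tuple


def soft_vote(probas: Sequence[Sequence[Sequence[float]]]) -> List[int]:
    """Soft voting: average each class probability over all models,
    then predict the class with the highest average.

    probas[m][i][c] is the probability model m assigns to class c for sample i.
    """
    n_models = len(probas)
    n_samples = len(probas[0])
    n_classes = len(probas[0][0])
    predictions = []
    for i in range(n_samples):
        avg = [
            sum(probas[m][i][c] for m in range(n_models)) / n_models
            for c in range(n_classes)
        ]
        predictions.append(max(range(n_classes), key=avg.__getitem__))
    return predictions


def precision_recall_f1(
    y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1
) -> Tuple[float, float, float]:
    """Precision, recall and F1 for a single positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return precision, recall, f1


# Hypothetical class probabilities from three base models (4 samples, 2 classes).
model_a = [[0.9, 0.1], [0.6, 0.4], [0.4, 0.6], [0.2, 0.8]]
model_b = [[0.8, 0.2], [0.3, 0.7], [0.6, 0.4], [0.1, 0.9]]
model_c = [[0.7, 0.3], [0.4, 0.6], [0.7, 0.3], [0.3, 0.7]]

# Toy imbalanced ground truth: only one positive sample.
y_true = [0, 0, 0, 1]

predictions = soft_vote([model_a, model_b, model_c])
accuracy = sum(t == p for t, p in zip(y_true, predictions)) / len(y_true)
precision, recall, f1 = precision_recall_f1(y_true, predictions)

print(predictions, accuracy, precision, recall, f1)
```

In this toy case the ensemble misclassifies one negative sample as positive, so accuracy is 0.75 while precision on the rare class is only 0.5. That gap is the reason a measure such as F1 is preferred for imbalanced data.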
    URI
    http://ir.kdu.ac.lk/handle/345/5210
    Collections
    • Computing [62]

    Library copyright © 2017  General Sir John Kotelawala Defence University, Sri Lanka
    Contact Us | Send Feedback
     

     
