    •   KDU-Repository Home
    • ACADEMIC JOURNALS
    • International Journal of Research in Computing
    • Volume 04 , Issue 01 , 2025
    • View Item

    Deep Learning Approaches for Classifying Informal and Formal English Texts Using Linguistic Features

    View/Open
    IJRC V 4 I (pages 9-22).pdf (495.4Kb)
    Date
    2025-01
    Author
    Karunarathna, KMGS
    Rupasingha, RAHM
    Kumara, BTGS
    Abstract
    Effective techniques for automatically classifying texts are becoming increasingly necessary because of the exponential growth of digital material. Differentiating between formal and informal documents can help students identify appropriate resources for their assignments and can improve the effectiveness of information retrieval systems. Although machine learning is extensively used in text classification, little research has focused on differentiating formal from informal writing through linguistic features. This gap highlights the need for advanced methodologies that improve classification accuracy and enhance the value of digital content in academic and retrieval systems. Our research addresses the problem by applying deep learning methodologies to a set of 13 linguistic attributes. Artificial Neural Networks (ANN), Convolutional Neural Networks (CNN), and Long Short-Term Memory networks (LSTM) were considered. A dataset including both formal texts (news articles, formal documents) and informal texts (personal letters, personal blogs) was gathered from several web sources. We considered linguistic markers such as colloquialisms, contractions, modal verbs, slang, acronyms, pronouns, phrasal verbs, grammar complexity, vocabulary complexity, voice, and language type to generate the feature vector. The feature vectors were used to train and assess the classification models under k-fold cross-validation with 3, 5, 7, and 10 folds. The models were evaluated using four performance indicators: F-measure, accuracy, precision, and recall. With the highest accuracy of 99.8% and robustness in differentiating between formal and informal texts, the LSTM model outperformed the others. Future research will examine larger datasets, additional linguistic characteristics, more sophisticated deep learning models, and real-time and multilingual classification systems.
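    The pipeline described in the abstract (linguistic markers turned into a feature vector, k-fold splits, and precision/recall/F-measure scoring) can be sketched roughly as follows. This is a minimal illustration only, not the authors' implementation: the word lists, the three features chosen, and the function names `feature_vector`, `k_fold_indices`, and `binary_metrics` are all assumptions, since the abstract does not give the lexicons or the exact feature definitions.

```python
import re

# Hypothetical marker lists; the paper's actual lexicons are not given in the abstract.
# Note: this simple pattern also counts possessives such as "John's".
CONTRACTION_RE = re.compile(r"\b\w+'(?:t|s|re|ve|ll|d|m)\b", re.IGNORECASE)
MODAL_VERBS = {"can", "could", "may", "might", "must", "shall", "should", "will", "would"}
PRONOUNS = {"i", "you", "he", "she", "it", "we", "they", "me", "him", "her", "us", "them"}


def feature_vector(text):
    """Return [contraction rate, modal-verb rate, pronoun rate], each per token."""
    tokens = re.findall(r"[A-Za-z']+", text.lower())
    n = max(len(tokens), 1)  # avoid division by zero on empty input
    contractions = len(CONTRACTION_RE.findall(text))
    modals = sum(t in MODAL_VERBS for t in tokens)
    pronouns = sum(t in PRONOUNS for t in tokens)
    return [contractions / n, modals / n, pronouns / n]


def k_fold_indices(n_samples, k):
    """Yield (train, test) index lists for k contiguous, near-equal folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        stop = start + size
        test = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        yield train, test
        start = stop


def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F-measure for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    correct = sum(t == p for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    return {
        "accuracy": correct / len(pairs) if pairs else 0.0,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
    }


if __name__ == "__main__":
    print(feature_vector("I can't go, but you should."))
    for k in (3, 5, 7, 10):  # fold counts named in the abstract
        print(k, [len(test) for _, test in k_fold_indices(20, k)])
    print(binary_metrics([1, 1, 0, 0], [1, 0, 0, 1]))
```

    In a fuller version, each fold's training indices would feed one of the ANN, CNN, or LSTM models and the held-out fold would be scored with `binary_metrics`; the model training itself is omitted here because the abstract gives no architectural details.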
    URI
    https://ir.kdu.ac.lk/handle/345/8917
    Collections
    • Volume 04 , Issue 01 , 2025 [6]

    Library copyright © 2017  General Sir John Kotelawala Defence University, Sri Lanka
    Contact Us | Send Feedback
     

     
