dc.description.abstract | For efficient communication, comprehension,
and writing in a variety of contexts, including academic
papers, legal documents, and media documents, proper use
of tense is essential. Notwithstanding its significance, tense
usage is a common source of errors for both experts and
students. With access to a built-in tense detection tool,
students can improve their sentence construction skills
through consistent self-study from early education onward.
To address this need, this study proposes the automatic
classification of English sentences by tense (past, present,
or future) using machine learning algorithms. We used a
dataset of 1,500 sentences that were
split equally across the three tense groups. Tokenization
and lowercasing were used in the preprocessing phase, and
the Term Frequency-Inverse Document Frequency (TF-IDF)
technique was then used to extract features. Six
classification approaches were then tested: Naive Bayes,
Random Forest, Decision Tree (J48), Support Vector Machine
(SVM), Logistic Regression, and an ensemble that combines
these five algorithms. Metrics
including accuracy, precision, recall, F-measure, and error
values were used in the evaluation. The ensemble learning
strategy outperformed the individual models in all
evaluations, achieving the best accuracy of 95.56%. Within
the ensemble, the majority voting combination rule worked
best when 70% of the data was used for training. This
work shows how machine learning may improve tense
classification, providing a useful tool for both academic and
professional contexts. | en_US |
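
The pipeline described in the abstract (tokenization, lowercasing, TF-IDF feature extraction, and a majority voting combination rule) can be illustrated with a minimal standard-library sketch. This is not the study's implementation: the function names `tokenize`, `tfidf`, and `majority_vote`, the regular expression used for tokenization, and the unsmoothed `tf * log(N / df)` weighting are all assumptions chosen for illustration, and the classifiers themselves are omitted.

```python
import math
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase a sentence and split it into word tokens (assumed rule)."""
    return re.findall(r"[a-z']+", sentence.lower())


def tfidf(corpus):
    """Return one {term: weight} dict per sentence.

    Assumed weighting: relative term frequency times log(N / df),
    with no smoothing. Other TF-IDF variants exist.
    """
    docs = [tokenize(s) for s in corpus]
    n_docs = len(docs)
    # Document frequency: number of sentences containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return vectors


def majority_vote(predictions):
    """Return the label chosen by the most classifiers.

    Ties go to the label that appears first in `predictions`.
    """
    return Counter(predictions).most_common(1)[0][0]


# Hypothetical example: tense labels from five base classifiers.
votes = ["past", "past", "future", "past", "present"]
ensemble_label = majority_vote(votes)
```

In this sketch a term that occurs in every sentence receives a weight of zero, so words shared across all three tense groups contribute nothing, while tense markers such as "will" or "-ed" forms keep a positive weight.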