dc.contributor.author: Rilfi, MRM
dc.contributor.author: Gunawansha, UGYM
dc.contributor.author: Prasandika, KAC
dc.contributor.author: Chandrani, KGA
dc.date.accessioned: 2021-12-27T06:07:57Z
dc.date.available: 2021-12-27T06:07:57Z
dc.date.issued: 2021
dc.identifier.uri: http://ir.kdu.ac.lk/handle/345/5248
dc.description.abstract: Training a neural machine translation (NMT) system between two natural languages requires a parallel corpus, so developing an effective method for gathering one is the most crucial step in building a machine translation (MT) system. Constructing a parallel corpus, however, demands substantial knowledge of both languages and is time-consuming. These constraints make digitizing documents extremely challenging, which lowers the quality of machine translation systems. This research presents a method for producing an English to Sinhala parallel corpus that is faster and more efficient and requires less human intervention. The system generates a parallel corpus for the language pair in four steps: scanning exam question papers with a special type of scanner, optimizing the images for Optical Character Recognition (OCR), extracting text from the images, and converting the unstructured text into structured form as a parallel corpus. [en_US]
dc.language.iso: en [en_US]
dc.subject: parallel corpus [en_US]
dc.subject: image optimization [en_US]
dc.subject: text extraction [en_US]
dc.subject: neural machine translation [en_US]
dc.title: Building a Sinhala-English Parallel Corpus for Neural Machine Translation Based on Exam Questions [en_US]
dc.type: Article Full Text [en_US]
dc.identifier.journal: KDU IRC, 2021 [en_US]
dc.identifier.issue: Faculty of Computing [en_US]
dc.identifier.pgnos: 349-356 [en_US]

