Music Training Interface for Visually Impaired with a Universally Applicable OMR Engine
Abstract
Assistive technologies built specifically for the visually impaired have not been able to cater to all of their requirements. The visually impaired need third-party assistance to convert visual Eastern music notation scripts into formats readable to them. The translation error rate remains high even with human assistance, since the music Braille form becomes difficult to follow, especially for complex notations. This research focuses on recognizing Eastern music notation scripts and translating them into an auditory output for the users. The main goal is to help the visually impaired independently visualize and train with music notations. The Optical Music Recognition (OMR) engine brought forward in this research consists of a pre-processor, a regions-detector, a recognizer, and a post-processor. The pre-processor captures images of notation scripts and applies de-skewing and binarization. The regions-detector identifies tabulated segments, rows, columns, notation groups, and atomic notation symbols. The recognizer derives a shape definition for each symbol and finds its best match for the given language. The OMR engine also provides an adaptable API which developers can use to train a new set of symbols: it generates descriptions, called shape definitions, for new symbols, and shape definitions can be mapped to the corresponding music notation symbols through a configuration file. The intermediary file produced by the recognizer is further analysed by the post-processor, which refines the notation sequence using a matrix of probabilities of one note following another; it also suggests the most appropriate substitutions for missing or noisy symbols. The recognized notation sequence is then converted into an auditory form. The OMR engine achieves a 94.2% accuracy rate for Sinhala Eastern music notations, while adapting successfully to Hindi and English language symbols. 81% of the users agreed that this type of interface is more convenient for them than the existing method.
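The post-processing step described above can be sketched with a simple bigram model: given a matrix of probabilities of one note following another, a missing or noisy symbol is replaced by the most probable successor of the preceding note. The notes, probabilities, and function names below are illustrative assumptions, not taken from the paper, whose actual matrix would be estimated from Eastern music notation corpora:

```python
# Illustrative sketch: substituting a noisy or missing symbol using a
# matrix of bigram probabilities P(next note | previous note).
# All notes and probability values here are invented for demonstration.

# Transition probabilities: trans[prev][next] = P(next | prev)
trans = {
    "sa": {"ri": 0.5, "ga": 0.3, "sa": 0.2},
    "ri": {"ga": 0.6, "sa": 0.3, "ri": 0.1},
    "ga": {"ma": 0.7, "ri": 0.2, "ga": 0.1},
}

def substitute(prev_note, candidates=None):
    """Pick the most probable note to follow prev_note.

    If the recognizer produced a short list of candidate symbols,
    restrict the choice to those; otherwise consider all notes.
    """
    probs = trans.get(prev_note, {})
    if candidates:
        probs = {n: p for n, p in probs.items() if n in candidates}
    if not probs:
        return None  # no information to refine with
    return max(probs, key=probs.get)

# A noisy symbol after "sa" is most likely "ri".
print(substitute("sa"))                # → ri
print(substitute("ga", ["ri", "ga"]))  # → ri
```

In practice such a matrix would be estimated from a corpus of correctly transcribed notation sequences, and the same probabilities can also re-rank low-confidence recognizer outputs rather than only fill gaps.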