Microarray based Multi Filter Fusion Gene Selection and Ensemble Classification of Leukemia Sub Types
Date
2016
Author
Edirimanna, C.L
Jinasena, T.M.K.K.
Edirisuriya, E.A.T.A
Abstract
Leukemia is a blood cancer that occurs in the bone marrow. The two major acute leukemia types, Acute Lymphocytic Leukemia (ALL) and Acute Myelogenous Leukemia (AML), need immediate treatment. Conventional laboratory methods take a long time to differentiate these two types, putting the patient's life at risk. The invention of microarray technology has been recognized as a major advancement in cancer diagnosis and prognosis. However, gene expression data have a very high number of dimensions. This curse of dimensionality makes it difficult to find associations and patterns across multiple dimensions. The benchmark microarray data set, consisting of 72 patients with 7000 attributes, has been used. Extracting only the genes related to the disease and classifying them across the multiple dimensions are the research challenges. A Multi Filter Fusion based gene selection and Ensemble based Classifier (MFF-EC) is proposed to improve on the accuracy of individual filters. The three main steps are (1) feature selection, (2) multi filter fusion, and (3) ensemble classification of ALL/AML. Further, both parallel and sequential approaches are used for steps 1 and 2 separately. ReliefF, Correlation, Gain Ratio, and Weight-Support Vector Machine are used as distinct feature selection methods. In the MFF-EC parallel approach, a consensus score is introduced to select a subset of genes from each individual filter method, and the maximum scored genes are combined. In the MFF-EC sequential approach, the selected filters are applied one after the other to gradually reduce the number of dimensions. Finally, an ensemble classifier is used to combine the results of the multiple filters. The performance of the classification models has been evaluated, and MFF-EC parallel performs better than the other four methods in terms of accuracy, sensitivity and specificity, with average values of 98.56, 98.87 and 99.1 respectively.
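The parallel and sequential fusion steps described in the abstract can be illustrated with a minimal Python sketch. The abstract does not define the consensus score, so this sketch assumes one plausible reading: a gene's score is the number of filters that rank it in their top k. The function names, the `k`, `min_votes` and `keep_fraction` parameters, the majority-vote combiner and the toy scores are all hypothetical and are not taken from the paper.

```python
from collections import Counter


def top_k(scores, k):
    """Return the indices of the k highest-scoring genes for one filter."""
    ranked = sorted(range(len(scores)), key=lambda g: scores[g], reverse=True)
    return ranked[:k]


def parallel_fusion(filter_scores, k, min_votes):
    """Assumed consensus score: how many filters place a gene in their top k.

    Genes whose vote count reaches min_votes are kept.
    """
    votes = Counter()
    for scores in filter_scores.values():
        votes.update(top_k(scores, k))
    return sorted(g for g, v in votes.items() if v >= min_votes)


def sequential_fusion(filter_scores, order, keep_fraction):
    """Apply the filters one after another, shrinking the gene set each time."""
    n_genes = len(next(iter(filter_scores.values())))
    surviving = list(range(n_genes))
    for name in order:
        scores = filter_scores[name]
        keep = max(1, int(len(surviving) * keep_fraction))
        surviving = sorted(surviving, key=lambda g: scores[g], reverse=True)[:keep]
    return sorted(surviving)


def majority_vote(predictions):
    """Assumed ensemble rule: the most common label among the classifiers."""
    return Counter(predictions).most_common(1)[0][0]


# Toy relevance scores for six genes (illustrative values, not real data).
filter_scores = {
    "relieff":     [0.9, 0.1, 0.8, 0.2, 0.7, 0.3],
    "correlation": [0.8, 0.2, 0.9, 0.1, 0.3, 0.7],
    "gain_ratio":  [0.7, 0.3, 0.6, 0.9, 0.8, 0.1],
    "svm_weight":  [0.9, 0.8, 0.7, 0.1, 0.2, 0.3],
}

parallel_genes = parallel_fusion(filter_scores, k=3, min_votes=3)
sequential_genes = sequential_fusion(
    filter_scores,
    order=["relieff", "correlation", "gain_ratio", "svm_weight"],
    keep_fraction=0.5,
)
label = majority_vote(["ALL", "AML", "ALL"])

print(parallel_genes, sequential_genes, label)
```

In this toy setting the parallel route keeps genes that several filters agree on, while the sequential route depends on the order in which the filters are applied, which is one reason the two approaches can select different gene subsets.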
Collections
- Computing [28]