Feature evaluation of the support vector machine for micro-RNA target prediction in Arabidopsis thaliana based on antisense transcription and small RNA abundance
Abstract
MicroRNAs (miRNAs) are small non-coding RNAs that contribute to post-transcriptional regulation. They are 21-23 nucleotide long sequences that affect development by binding to a target gene through Watson-Crick pairing and antagonizing various pathways of expression. This thesis explores miRNA binding within the Arabidopsis thaliana genome as it relates to antisense transcription of target genes. Presented is a prediction mechanism based on two related features, antisense transcription and small RNA abundance, which are hypothesized to be markers of the miRNA binding site in the target gene. A newly discovered phenomenon in the antisense strand of the target genes was implemented as a novel feature for target gene prediction. This feature, along with small RNAs and a commonly used indicator of binding sites, was used in a Support Vector Machine to build a prediction model. The three features were incorporated and analyzed using the output of the Support Vector Machine. Predicted classifications were compared with validated classifications to evaluate the importance of the features. Based on the accuracy, specificity, sensitivity, and precision of the SVM results, the newly discovered feature may be able to identify new miRNA target sites in Arabidopsis and other species with deep genomic resources.
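The four evaluation measures named in the abstract can be illustrated with a minimal sketch. It assumes binary labels (1 = target site, 0 = non-target) and uses the standard confusion-matrix definitions. The function name `classification_metrics` and the toy label lists are hypothetical and are not taken from the thesis; they show only how predicted and validated classifications might be compared, not the author's actual pipeline or results.

```python
from typing import Dict, Sequence


def classification_metrics(
    validated: Sequence[int], predicted: Sequence[int]
) -> Dict[str, float]:
    """Compare predicted labels with validated labels (1 = target, 0 = non-target)."""
    if len(validated) != len(predicted):
        raise ValueError("validated and predicted must have the same length")

    # Count the four cells of the confusion matrix.
    pairs = list(zip(validated, predicted))
    tp = sum(1 for v, p in pairs if v == 1 and p == 1)  # true positives
    tn = sum(1 for v, p in pairs if v == 0 and p == 0)  # true negatives
    fp = sum(1 for v, p in pairs if v == 0 and p == 1)  # false positives
    fn = sum(1 for v, p in pairs if v == 1 and p == 0)  # false negatives

    def ratio(numerator: int, denominator: int) -> float:
        # Return NaN instead of raising when a class is absent.
        return numerator / denominator if denominator else float("nan")

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": ratio(tp, tp + fn),  # true positive rate
        "specificity": ratio(tn, tn + fp),  # true negative rate
        "precision": ratio(tp, tp + fp),    # positive predictive value
    }


# Toy labels invented for illustration only (not data from the thesis).
validated = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]

metrics = classification_metrics(validated, predicted)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Reporting all four measures together is useful when validated targets are much rarer than non-targets, since accuracy alone can look high for a classifier that rarely predicts a target at all.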