Ferdous, Moshina M and Bao, Yanchun and Vinciotti, Veronica and Liu, Xiaohui and Wilson, Paul (2018) Predicting gene expression from genome wide protein binding profiles. Neurocomputing, 275. pp. 1490-1499. DOI https://doi.org/10.1016/j.neucom.2017.09.094
Abstract
High-throughput technologies such as chromatin immunoprecipitation (IP) followed by next-generation sequencing (ChIP-seq), in combination with gene expression studies, have enabled researchers to investigate relationships between the distribution of chromosome-associated proteins and the regulation of gene transcription on a genome-wide scale. Several attempts at integrative analyses have identified direct relationships between the two processes. However, a comprehensive understanding of the regulatory events remains elusive. This is in part due to the scarcity of robust analytical methods for the detection of binding regions from ChIP-seq data. In this paper, we have applied a recently proposed Markov random field model for the detection of enriched binding regions under different biological conditions and time points. The method accounts for spatial dependencies and IP efficiencies, which can vary significantly between different experiments. We further classified the enriched chromosomal binding regions into distinct genomic features, such as promoter, exon, intron, and distal intergenic, and then investigated how predictive each of these features is of gene expression activity using machine learning techniques, including neural networks, decision trees and random forests. The analysis of a ChIP-seq time-series dataset comprising six protein markers and associated microarray data, obtained from the same biological samples, shows promising results and identifies biologically plausible relationships between the protein profiles and gene regulation.
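The feature-annotation step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state how regions are assigned to features, so the 2000 bp promoter window, the use of region midpoints, and the names `Gene`, `annotate` and `feature_counts` are all assumptions made for illustration. The sketch labels each binding-region midpoint relative to a single gene and tallies the labels into a count vector of the kind that could be passed to a classifier.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple

# Assumed promoter size in bp upstream of the transcription start site (TSS).
# This value is illustrative and is not taken from the paper.
PROMOTER_WINDOW = 2000

FEATURES = ("promoter", "exon", "intron", "distal intergenic")


@dataclass
class Gene:
    """Hypothetical minimal gene model used only for this sketch."""
    name: str
    start: int                      # leftmost coordinate of the gene body
    end: int                        # rightmost coordinate of the gene body
    strand: str                     # "+" or "-"
    exons: List[Tuple[int, int]]    # inclusive (start, end) intervals


def annotate(position: int, gene: Gene) -> str:
    """Label one binding-region midpoint relative to one gene."""
    # The promoter lies upstream of the TSS, so its side depends on the strand.
    if gene.strand == "+":
        in_promoter = gene.start - PROMOTER_WINDOW <= position < gene.start
    else:
        in_promoter = gene.end < position <= gene.end + PROMOTER_WINDOW
    if in_promoter:
        return "promoter"
    # Inside the gene body: exon if it falls in any exon interval, else intron.
    if gene.start <= position <= gene.end:
        if any(s <= position <= e for s, e in gene.exons):
            return "exon"
        return "intron"
    # Everything else is treated as distal intergenic in this sketch.
    return "distal intergenic"


def feature_counts(midpoints: Iterable[int], gene: Gene) -> Dict[str, int]:
    """Count how many enriched regions fall into each feature class."""
    counts = {feature: 0 for feature in FEATURES}
    for position in midpoints:
        counts[annotate(position, gene)] += 1
    return counts


# Toy data, invented for the example.
toy_gene = Gene("toy", 10000, 20000, "+",
                [(10000, 10500), (15000, 15500), (19000, 20000)])
toy_counts = feature_counts([9500, 10200, 12000, 50000], toy_gene)
```

In a full analysis, one such count (or enrichment) vector per gene, protein marker and time point would form the input matrix, with measured expression as the target for the neural network, decision tree or random forest models.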
Item Type: | Article |
---|---|
Uncontrolled Keywords: | ChIP-seq; Epigenetics; Gene Expression; Markov Random Field; Machine Learning |
Subjects: | Q Science > QH Natural history > QH426 Genetics |
Divisions: | Faculty of Science and Health; Faculty of Science and Health > Mathematics, Statistics and Actuarial Science, School of |
SWORD Depositor: | Unnamed user with email elements@essex.ac.uk |
Depositing User: | Unnamed user with email elements@essex.ac.uk |
Date Deposited: | 06 Oct 2017 14:26 |
Last Modified: | 30 Oct 2024 17:21 |
URI: | http://repository.essex.ac.uk/id/eprint/20467 |
Available files
Filename: 1-s2.0-S0925231217316235-main.pdf
Licence: Creative Commons: Attribution 3.0