Shanahan, Hugh P and Memon, Farhat N and Upton, Graham JG and Harrison, Andrew P (2012) Normalized Affymetrix expression data are biased by G-quadruplex formation. Nucleic Acids Research, 40 (8). pp. 3307-3315. DOI https://doi.org/10.1093/nar/gkr1230
Abstract
Probes with runs of four or more guanines (G-stacks) in their sequences can exhibit a level of hybridization that is unrelated to the expression levels of the mRNA that they are intended to measure. This is most likely caused by the formation of G-quadruplexes, where inter-probe guanines form Hoogsteen hydrogen bonds, which probes with G-stacks are capable of forming. We demonstrate that for a specific microarray data set using the Human HG-U133A Affymetrix GeneChip and RMA normalization there is significant bias in the expression levels, the fold change and the correlations between expression levels. These effects grow more pronounced as the number of G-stack probes in a probe set increases. Approximately 14% of the probe sets are directly affected. The analysis was repeated for a number of other normalization pipelines and two, FARMS and PLIER, minimized the bias to some extent. We estimate that ∼15% of the data sets deposited in the GEO database are susceptible to the effect. The inclusion of G-stack probes in the affected data sets can bias key parameters used in the selection and clustering of genes. The elimination of these probes from any analysis in such affected data sets outweighs the increase of noise in the signal. © 2011 The Author(s).
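The abstract's definition of a G-stack probe (a run of four or more guanines) and its recommendation to exclude such probes can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the `(probe_set_id, sequence)` input layout, and the example sequences and probe-set IDs are all hypothetical, chosen only to show the idea.

```python
import re
from collections import Counter

# A G-stack, as defined in the abstract: four or more consecutive guanines.
G_STACK = re.compile(r"G{4,}")


def has_g_stack(sequence):
    """Return True if the probe sequence contains a run of >= 4 guanines."""
    return bool(G_STACK.search(sequence.upper()))


def g_stack_counts(probes):
    """Count G-stack probes per probe set.

    `probes` is an iterable of (probe_set_id, sequence) pairs; this layout
    is an assumption made for illustration.
    """
    counts = Counter()
    for probe_set_id, sequence in probes:
        # Adding a bool keeps probe sets with zero G-stack probes in the result.
        counts[probe_set_id] += has_g_stack(sequence)
    return counts


def drop_g_stack_probes(probes):
    """Return only the probes without a G-stack, prior to summarization."""
    return [(ps, seq) for ps, seq in probes if not has_g_stack(seq)]


# Made-up sequences and probe-set IDs, for illustration only.
example_probes = [
    ("set_A", "ATCGGGGTACCTAGCATCGATCGAT"),  # contains GGGG
    ("set_A", "ATCGGGTACCTAGCATCGATCGATC"),  # only GGG, not a G-stack
    ("set_B", "TTACGATCGATTACGCTAGCTAGCA"),  # no G run
]

counts = g_stack_counts(example_probes)
filtered = drop_g_stack_probes(example_probes)
```

The filtered probe list would then be passed to whichever normalization pipeline is in use; how probes are masked in practice depends on the tooling and is not specified in this record.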
Item Type: | Article |
---|---|
Uncontrolled Keywords: | Humans; DNA Probes; Oligonucleotide Array Sequence Analysis; Gene Expression Profiling; G-Quadruplexes |
Subjects: | Q Science > Q Science (General) |
Divisions: | Faculty of Science and Health; Faculty of Science and Health > Mathematics, Statistics and Actuarial Science, School of |
SWORD Depositor: | Unnamed user with email elements@essex.ac.uk |
Depositing User: | Unnamed user with email elements@essex.ac.uk |
Date Deposited: | 02 Feb 2013 18:18 |
Last Modified: | 16 May 2024 17:09 |
URI: | http://repository.essex.ac.uk/id/eprint/5411 |
Available files
Filename: shanahan_nar_2012.pdf
Licence: Creative Commons: Attribution-Noncommercial 3.0