Chromatin accessibility data sets show bias due to sequence specificity of the DNase I enzyme.

Koohy H.; Down TA.; Hubbard TJ.

BACKGROUND: DNase I is an enzyme that cuts duplex DNA at a rate that depends strongly on its chromatin environment. In combination with high-throughput sequencing (HTS) technology, it can be used to infer genome-wide landscapes of open chromatin regions. This approach has made it possible to systematically identify hundreds of thousands of DNase I hypersensitive sites (DHS) per cell type, which in turn has helped to precisely delineate genomic regulatory compartments. However, to date there has been relatively little investigation into possible biases affecting these data.

RESULTS: We report a significant degree of sequence preference spanning sites cut by DNase I in a number of published data sets. The two major protocols in current use each show a different pattern, but for a given protocol the pattern of sequence specificity seems to be quite consistent. The patterns are substantially different from biases seen in other types of HTS data sets. In some cases the most constrained position lies outside the sequenced fragment, implying that this constraint must relate to the digestion process rather than to events occurring during library preparation or sequencing.

CONCLUSIONS: DNase I is a sequence-specific enzyme, with a specificity that may depend on experimental conditions. This sequence specificity is not taken into account by existing pipelines for identifying open chromatin regions. Care must be taken when interpreting DNase I results, especially when looking at the precise locations of the reads. Future studies may be able to improve the sensitivity and precision of chromatin state measurement by compensating for sequence bias.
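One common way to look for sequence preference of the kind described above is to align fixed-width windows of genomic sequence centred on each cut site, tabulate base frequencies at each position, and score how far each position departs from a background composition. The Python sketch below illustrates that general idea only; it is not the authors' pipeline. The function names, the uniform background, and the toy windows are all assumptions made for illustration, and real data would use windows extracted from a reference genome at mapped read starts.

```python
import math
from collections import Counter

BASES = "ACGT"


def position_frequencies(windows):
    """Per-position base frequencies for equal-length windows around cut sites.

    Assumes every position has at least one A, C, G or T; other symbols
    (for example N) are ignored when computing frequencies.
    """
    length = len(windows[0])
    if any(len(w) != length for w in windows):
        raise ValueError("all windows must have the same length")
    freqs = []
    for i in range(length):
        counts = Counter(w[i].upper() for w in windows)
        total = sum(counts[b] for b in BASES)
        if total == 0:
            raise ValueError(f"no A/C/G/T observed at position {i}")
        freqs.append({b: counts[b] / total for b in BASES})
    return freqs


def information_content(freqs, background=None):
    """Relative entropy (bits) of each position against a background.

    The uniform background is an assumption for illustration; a genome-wide
    or region-matched base composition would be more realistic.
    """
    if background is None:
        background = {b: 0.25 for b in BASES}
    return [
        sum(p * math.log2(p / background[b]) for b, p in col.items() if p > 0)
        for col in freqs
    ]


# Hypothetical toy windows, not real DNase I data.
toy_windows = ["ACGT", "ACGA", "ACTT", "AGGC"]
toy_freqs = position_frequencies(toy_windows)
toy_ic = information_content(toy_freqs)
most_constrained = max(range(len(toy_ic)), key=toy_ic.__getitem__)
print("information content per position:", toy_ic)
print("most constrained position (0-based):", most_constrained)
```

Because the windows can extend upstream of the read start, a profile like this can reveal constrained positions that lie outside the sequenced fragment, which is the kind of observation the abstract uses to attribute the bias to digestion rather than to library preparation or sequencing.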

Original publication

DOI

10.1371/journal.pone.0069853

Type

Journal

PLoS One

Publication Date

2013

Keywords

Base Sequence, Bias, Cell Line, Chromatin, Databases, Nucleic Acid, Deoxyribonuclease I, Humans, Molecular Sequence Data, Nucleotide Motifs, Substrate Specificity