Abstract

Motivation

Tracking and understanding data quality, analysis and reproducibility are critical concerns in the biological sciences. This is especially true in genomics, where Next Generation Sequencing (NGS) based technologies such as ChIP-seq, RNA-seq and ATAC-seq are generating a flood of genome-scale data. These data types are extremely high-level and complex, with single experiments capable of mapping tens to hundreds of thousands of biologically meaningful events across the genome. However, such data are usually processed with automated tools and pipelines that generate tabular outputs and static visualisations. These are difficult to interact with and require substantial bioinformatics skills to manipulate and query. Similarly, interpretation is normally made at a high level without the ability to visualise the underlying data in detail, so the complexity and quality of the real underlying biological signal are lost. Genomics datasets also need to be integrated with other genomics datasets to be properly interpreted; integrating multiple tracks again requires substantial bioinformatics skills, and the result is difficult to visualise across multiple pertinent datasets. Conventional genome browsers do allow for the detailed visualisation of multiple tracks but are limited to browsing single locations and do not allow for interactions with the dataset as a whole. MLV has been developed to allow users to fluidly interact with genomics datasets at multiple scales, from complete metadata-labelled and clustered populations to detailed representations of individual elements. It has inbuilt tools to integrate signals across multiple datasets and to perform dimensionality reduction and clustering analysis based on the extracted signal, allowing for the high-level analysis of complex datasets while maintaining visualisation of the fine-grained structure of the data. MLV's ability to visualise clustering within the data, combined with efficient tools for large-scale tagging of individual elements, makes it a unique tool for the generation of annotated datasets for modern machine learning approaches.

Results

Multi Locus View (MLV) is a web-based tool for the visualisation, analysis and annotation of Next Generation Sequencing datasets. The user is able to browse the raw data, cluster it, and combine it with other analyses. Intuitive filtering and visualisation then enable the user to quickly locate and annotate regions of interest. User datasets can then be shared with other users or made public for quick assessment by the academic community. MLV is publicly available at https://mlv.molbiol.ox.ac.uk and the source code is available at https://github.com/Hughes-Genome-Group/mlv.

Original publication

DOI

10.1101/2020.06.15.151837

Type

Publisher

Cold Spring Harbor Laboratory

Publication Date

16/06/2020