The zebrafish reference genome sequence and its relationship to the human genome

Kerstin Howe, Matthew D. Clark, Carlos F. Torroja, James Torrance, Camille Berthelot, Matthieu Muffato, John E. Collins, Sean Humphray, Karen McLaren, Lucy Matthews, Stuart McLaren, Ian Sealy, Mario Caccamo, Carol Churcher, Carol Scott, Jeffrey C. Barrett, Romke Koch, Gerd Jörg Rauch, Simon White, William Chow, Britt Kilian, Leonor T. Quintais, José A. Guerra-Assunção, Yi Zhou, Yong Gu, Jennifer Yen, Jan Hinnerk Vogel, Tina Eyre, Seth Redmond, Ruby Banerjee, Jianxiang Chi, Beiyuan Fu, Elizabeth Langley, Sean F. Maguire, Gavin K. Laird, David Lloyd, Emma Kenyon, Sarah Donaldson, Harminder Sehra, Jeff Almeida-King, Jane Loveland, Stephen Trevanion, Matt Jones, Mike Quail, Dave Willey, Adrienne Hunt, John Burton, Sarah Sims, Kirsten McLay, Bob Plumb, Joy Davis, Chris Clee, Karen Oliver, Richard Clark, Clare Riddle, David Eliott, Glen Threadgold, Glenn Harden, Darren Ware, Beverly Mortimer, Giselle Kerry, Paul Heath, Benjamin Phillimore, Alan Tracey, Nicole Corby, Matthew Dunn, Christopher Johnson, Jonathan Wood, Susan Clark, Sarah Pelan, Guy Griffiths, Michelle Smith, Rebecca Glithero, Philip Howden, Nicholas Barker, Christopher Stevens, Joanna Harley, Karen Holt, Georgios Panagiotidis, Jamieson Lovell, Helen Beasley, Carl Henderson, Daria Gordon, Katherine Auger, Deborah Wright, Joanna Collins, Claire Raisen, Lauren Dyer, Kenric Leung, Lauren Robertson, Kirsty Ambridge, Daniel Leongamornlert, Sarah McGuire, Ruth Gilderthorp, Coline Griffiths, Deepa Manthravadi, Sarah Nichol, Gary Barker, Siobhan Whitehead, Michael Kay, Jacqueline Brown, Clare Murnane, Emma Gray, Matthew Humphries, Neil Sycamore, Darren Barker, David Saunders, Justene Wallis, Anne Babbage, Sian Hammond, Maryam Mashreghi-Mohammadi, Lucy Barr, Sancha Martin, Paul Wray, Andrew Ellington, Nicholas Matthews, Matthew Ellwood, Rebecca Woodmansey, Graham Clark, James Cooper, Anthony Tromans, Darren Grafham, Carl Skuce, Richard Pandian, Robert Andrews, Elliot Harrison, Andrew Kimberley, Jane Garnett, Nigel Fosker, Rebekah Hall, Patrick Garner, Daniel Kelly, Christine Bird, Sophie Palmer, Ines Gehring, Andrea Berger, Christopher M. Dooley, Zübeyde Ersan-Ürün, Cigdem Eser, Horst Geiger, Maria Geisler, Lena Karotki, Anette Kirn, Judith Konantz, Martina Konantz, Martina Oberländer, Silke Rudolph-Geiger, Mathias Teucke, Kazutoyo Osoegawa, Baoli Zhu, Amanda Rapp, Sara Widaa, Cordelia Langford, Fengtang Yang, Nigel P. Carter, Jennifer Harrow, Zemin Ning, Javier Herrero, Steve M.J. Searle, Anton Enright, Robert Geisler, Ronald H.A. Plasterk, Charles Lee, Monte Westerfield, Pieter J. De Jong, Leonard I. Zon, John H. Postlethwait, Christiane Nüsslein-Volhard, Tim J.P. Hubbard, Hugues Roest Crollius, Jane Rogers, Derek L. Stemple*

*Corresponding author for this work

Abstract

Zebrafish have become a popular organism for the study of vertebrate gene function. The virtually transparent embryos of this species, and the ability to accelerate genetic studies by gene knockdown or overexpression, have led to the widespread use of zebrafish in the detailed investigation of vertebrate gene function and increasingly, the study of human genetic disease. However, for effective modelling of human genetic disease it is important to understand the extent to which zebrafish genes and gene structures are related to orthologous human genes. To examine this, we generated a high-quality sequence assembly of the zebrafish genome, made up of an overlapping set of completely sequenced large-insert clones that were ordered and oriented using a high-resolution high-density meiotic map. Detailed automatic and manual annotation provides evidence of more than 26,000 protein-coding genes, the largest gene set of any vertebrate so far sequenced. Comparison to the human reference genome shows that approximately 70% of human genes have at least one obvious zebrafish orthologue. In addition, the high quality of this genome assembly provides a clearer understanding of key genomic features such as a unique repeat content, a scarcity of pseudogenes, an enrichment of zebrafish-specific genes on chromosome 4 and chromosomal regions that influence sex determination.

Original language: English
Journal: Nature
Volume: 496
Issue number: 7446
Pages (from-to): 498-503
Number of pages: 6
ISSN: 0028-0836
Publication status: Published - 25.04.2013

Funding

Acknowledgements. We wish to thank R. Durbin, E. Birney, A. Scally, C. P. Ponting, E. Busch-Nentwich and R. Kettleborough for helpful discussions, as well as F. L. Marlow and P. Aanstad for critical reading and helpful comments on manuscripts. We thank the Zebrafish Information Network (ZFIN) for funding part of the manual annotation of the zebrafish genome and the ZFIN staff for support with gene nomenclature and other genome issues. We also thank the Genome Reference Consortium for the maintenance and improvement of the zebrafish genome assembly. We are indebted to the Ensembl team for providing a browser and database that greatly facilitated the use and the analyses of the zebrafish genome. We thank A. Pirani at Affymetrix for genotyping advice support, and the Zebrafish International Resource Center (ZIRC) for distributing the SAT strain. J.H.P. was supported by the National Institutes of Health (NIH) grant R01 GM085318 (to J.H.P.), NIH grant P01 HD22486 (to J.H.P.) and R01 OD011116 (later changed to R01 RR020833) (to J.H.P.). We would like to acknowledge the support of the European Commission’s Sixth Framework Programme (contract no. LSHG-CT-2003-503496, ZF-MODELS) and Seventh Framework Programme (grant no. HEALTH-F4-2010-242048, ZF-HEALTH). R.G. was supported by the German Human Genome Project (DHGP Grant 01 KW 9627 and 01 KW 9919). C.N.-V., G.-J.R. and R.G. were supported by the NIH (NIH grant 1 R01DK55377-01A1). S.C.S. was supported by the German Research Foundation (DFG Grant NU 22/5). The Zebrafish Genome Project at the Wellcome Trust Sanger Institute was funded by Wellcome Trust grant number 098051.
S.M., C.S., J.C., B.F., E.L., S.F.M., M.J., M.Q., D.W., A.H., J.B., S.S., K.M., B.P., J.D., C.C., K.O., B.M., G.K., B.P., A.T., N.C., C.J., S.C., M.S., R.G., P.H., N.B., C.Lanz, C.S., J.H., K.H., G.P., J.L., H.B., C.H., D.G., D.W., C.R., L.D., K.L., L.R., K.A., D.L., S.M., R.G., C.G., D.M., S.N., G.B., S.W., M.K., J.B., C.M., E.G., M.H., N.S., D.B., D.S., J.W., A.B., S.H., K.O., M.M.-M., L.B., S.M., P.W., A.E., N.M., M.E., R.W., G.C., J.C., A.T., D.G., C.S., R.P., R.A., E.H., A.K., J.G., N.F., R.H., P.G., D.K., C.B. and S.P. The generation of maps used in the initial assemblies and the production of clone tiling paths were carried out by R.K., S.H., G.-J.R., Y.Z., C.R., R.C., D.E., D.W., S.B., L.M., M.D., I.G., A.B., C.M.D., Z.E.-Ü., C.E., H.G., M.G., L.K., A.K., J.K., M.K., M.O., S.R.-G., M.T., C.Lanz, G.R., S.C.S., R.B., F.Y., N.P.C., R.G., R.H.A.P. and C.Lee. K.O., B.Z. and P.J.d.J. generated and provided clone libraries. The Zebrafish Genome Project was coordinated by L.I.Z., J.H.P., C.N.-V., T.J.P.H., J.R. and D.L.S.

In this Letter, five authors were inadvertently omitted: Sharmin Begum and Christine Lloyd from the Wellcome Trust Sanger Institute, and Christa Lanz, Günter Raddatz and Stephan C. Schuster from the Max Planck Institute for Developmental Biology. David Elliot was incorrectly listed as David Eliot, Beverley Mortimore was incorrectly listed as Beverly Mortimer, and James D. Cooper was incorrectly listed as James Cooper. In addition, the acknowledgements section should state that author S.C.S. was supported by the German Research Foundation (DFG Grant NU 22/5). These errors, along with corresponding minor changes to the Author Contributions section, have been corrected in the HTML and PDF versions of the original manuscript.

UN SDGs

This output contributes to the following UN Sustainable Development Goals (SDGs)

  1. SDG 3 - Good Health and Well-being
