dc.contributor.author |
Otto TD, Böhme U, Sanders M, Reid A, Bruske EI, Duffy CW, Bull PC, Pearson RD, Abdi A, Dimonte S, Stewart LB, Campino S, Kekre M, Hamilton WL, Claessens A, Volkman SK, Ndiaye D, Amambua-Ngwa A, Diakite M, Fairhurst RM, Conway DJ, Franck M, Newbold CI, Berriman M. |
|
dc.description.abstract |
Background: Although thousands of clinical isolates of Plasmodium falciparum are being
sequenced and analysed by short-read technology, the data do not resolve the highly
variable subtelomeric regions of the genomes that contain polymorphic gene families
involved in immune evasion and pathogenesis. There is also no current standard
definition of the boundaries of these variable subtelomeric regions. Methods: Using
long-read sequence data (Pacific Biosciences SMRT technology), we assembled and annotated
the genomes of 15 P. falciparum isolates, ten of which are newly cultured clinical
isolates. We performed comparative analysis of the entire genome with particular
emphasis on the subtelomeric regions and the internal var gene clusters. Results: The
nearly complete sequence of these 15 isolates has enabled us to define a highly conserved
core genome, to delineate the boundaries of the subtelomeric regions, and to compare
these across isolates. We found highly structured variable regions in the genome. Some
exported gene families purportedly involved in release of merozoites show copy number
variation. As an example of ongoing genome evolution, we found a novel CLAG gene in
six isolates. We also found a novel gene that was relatively enriched in the South East
Asian isolates compared to those from Africa. Conclusions: These 15 manually curated
new reference genome sequences with their nearly complete subtelomeric regions and
fully assembled genes are an important new resource for the malaria research
community. We report the overall conserved structure and pattern of important gene
families and the more clearly defined subtelomeric regions. |
en_US |