Pangenomes

Sequence graphs are an intuitive way to represent the variation in a collection of DNA sequences. Such collections range from homologous copies of multi-copy chromosomes, to sets of related bacterial strains, to collections of plant cultivars. We develop techniques to create and analyze these graph representations.
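
To make the idea concrete, here is a minimal sketch of a sequence graph, assuming a simple model in which nodes carry DNA segments and each genome is a path through the nodes. The class name `SequenceGraph`, its methods, and the strain names are hypothetical and chosen for illustration only; they do not describe any particular tool.

```python
from collections import defaultdict


class SequenceGraph:
    """Toy sequence graph: nodes carry DNA segments, paths spell out genomes."""

    def __init__(self):
        self.segments = {}             # node id -> DNA string
        self.edges = defaultdict(set)  # node id -> set of successor node ids
        self.paths = {}                # genome name -> list of node ids

    def add_segment(self, node_id, sequence):
        self.segments[node_id] = sequence

    def add_path(self, name, node_ids):
        # Store the walk and add an edge for every consecutive pair of nodes.
        self.paths[name] = list(node_ids)
        for a, b in zip(node_ids, node_ids[1:]):
            self.edges[a].add(b)

    def spell(self, name):
        # Concatenate the segments along a path to recover the genome.
        return "".join(self.segments[n] for n in self.paths[name])


# Two hypothetical strains that differ by a single base (A vs. G).
graph = SequenceGraph()
graph.add_segment(1, "ACGT")
graph.add_segment(2, "A")
graph.add_segment(3, "G")
graph.add_segment(4, "TTCA")
graph.add_path("strain_A", [1, 2, 4])
graph.add_path("strain_B", [1, 3, 4])

print(graph.spell("strain_A"))  # ACGTATTCA
print(graph.spell("strain_B"))  # ACGTGTTCA
```

Shared sequence is stored once (nodes 1 and 4), and the variant site appears as a "bubble" where the two paths diverge and rejoin.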

Topics we address:

  • Sequence and graph alignment of multiple genomes using maximal unique matches
  • Pangenome graph construction from assemblies and/or reference-based variant calls
  • Structural-variant-aware genome graph construction
  • Multi-level graph construction based on syntenic gene anchoring
  • Aligning reads to a graph representation of genomes using graph decomposition indexing
  • Scalable, interactive visualization of sequence graphs for data exploration