GeneGrouper

GeneGrouper is a shiny new tool that quantifies variation in clusters of contiguous genes across multiple genomes. This is useful for inferring function from DNA sequencing data and for deducing the evolutionary history of a gene cluster. You can access it on GitHub or PyPI. Check out the publication in Bioinformatics. Feel free to contact lead author Alex McFarland if you have questions or feedback.

Figure 1: GeneGrouper overview. A. GeneGrouper use case. B. Required user inputs for GeneGrouper. C. Grouping algorithm used by GeneGrouper. The genomic region surrounding the seed gene is extracted, and all genes within that region are subjected to orthology clustering. The orthology results are mapped back to all genes and used to generate a Jaccard dissimilarity matrix, on which the DBSCAN algorithm is run to identify groups of similar gene clusters. D. A search for the LT2 Pdu gene cluster in 1130 Salmonella enterica genomes, using PduA as the query gene, returns four groups of gene clusters (groups 0-3). The main GeneGrouper output consists of three panels. The left panel shows a boxplot with overlaid points for the seed gene coverage and identity of each group member relative to the translated query gene used for the search. The middle panel shows the representative architecture of each GeneGrouper group. The x-axis is scaled so that the 0 bp position is the start of the seed gene. ‘Group Unb.’ contains all gene clusters that DBSCAN could not bin into a discrete group. Genes are labeled with the RefSeq gene name, or with the product annotation if no gene name is available. Numbers above genes are internal orthology identifiers. The right panel shows a boxplot with overlaid points for the Jaccard dissimilarity of each group member. Two dissimilarities are shown for each member: its dissimilarity to the group representative in the middle panel, and its average pairwise dissimilarity to all other members of the group. The counts of members in each group are listed on the right-hand side of the right panel.
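The grouping step in panel C can be illustrated with a small Python sketch. This is not GeneGrouper's own code. It assumes each genomic region has already been reduced to a set of ortholog identifiers (the IDs below are made up), and it uses a minimal hand-written DBSCAN in place of a library implementation. The `eps` and `min_samples` values are arbitrary choices for the toy data, not GeneGrouper defaults.

```python
from itertools import combinations


def jaccard_dissimilarity(a, b):
    """Return 1 - |A & B| / |A | B| for two sets of ortholog identifiers."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def dissimilarity_matrix(regions):
    """Build a symmetric pairwise Jaccard dissimilarity matrix."""
    n = len(regions)
    d = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        d[i][j] = d[j][i] = jaccard_dissimilarity(regions[i], regions[j])
    return d


def dbscan(d, eps, min_samples):
    """Minimal DBSCAN on a precomputed matrix; label -1 marks unbinned points."""
    n = len(d)
    # A point's neighborhood includes the point itself (distance 0).
    neighbors = [[j for j in range(n) if d[i][j] <= eps] for i in range(n)]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) < min_samples:
            labels[i] = -1  # provisionally unbinned; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbors[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbors[j]) >= min_samples:
                queue.extend(neighbors[j])  # j is a core point: keep expanding
    return labels


# Hypothetical regions, each a set of ortholog IDs found around a seed gene.
regions = [
    {1, 2, 3, 4},
    {1, 2, 3, 4},
    {1, 2, 3, 5},
    {7, 8, 9},
    {7, 8, 9, 10},
    {20, 21},
]

d = dissimilarity_matrix(regions)
labels = dbscan(d, eps=0.5, min_samples=2)
print(labels)
```

In this toy example the first three regions share most of their orthologs and fall into one group, the next two form a second group, and the last region shares nothing with the others and is left unbinned, which is analogous to the ‘Group Unb.’ row in the figure.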

Logo credit goes to Dr. Carrie Mills.