News
Phylogenetics collaboration takes researchers back to basics with new training
A new online tutorial is taking researchers back to basics to uncover the principles of phylogenetics and how tree-building methods work thanks to a longstanding collaboration between Professor Michael Charleston from the University of Tasmania and Australian BioCommons.
Charles Darwin's first sketch of an evolutionary tree. Source: Wikimedia commons
A new online tutorial has been created to take researchers back to basics to uncover the principles of phylogenetics and how tree-building methods work. A longstanding collaboration between Professor Michael Charleston from the University of Tasmania and Australian BioCommons has delivered this self-guided tutorial featuring videos and hands-on exercises. To maximise its impact, the resource was tailored specifically to be shared globally via the Galaxy Training Network, and will form the basis of an upcoming live training workshop.
Using real-life data, and tools available in Galaxy and SplitsTree, the tutorial demonstrates the principles behind a variety of methods used to estimate phylogenetic trees from aligned sequence data or distance data. With a conversational style Michael discusses why phylogenetics is important, unpicks phylogenetics terminology from the roots to the tips and explains concepts such as multiple sequence alignment, how alignments are used to build trees, and phylogenetic networks.
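To make the idea of distance data concrete, here is a minimal Python sketch (not taken from the tutorial) of one simple way to turn aligned sequences into pairwise distances: the proportion of sites at which two sequences differ, often called the p-distance. The toy alignment, the taxon names and the choice to skip gapped sites are all assumptions made for illustration.

```python
from itertools import combinations
from typing import Dict, Tuple


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap ('-') are skipped
    (an illustrative choice, not the only possible convention).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    differing = 0
    for x, y in zip(seq_a, seq_b):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x != y:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def distance_matrix(alignment: Dict[str, str]) -> Dict[Tuple[str, str], float]:
    """Pairwise p-distances for every pair of taxa, keyed by sorted name pair."""
    return {
        (a, b): p_distance(alignment[a], alignment[b])
        for a, b in combinations(sorted(alignment), 2)
    }


# Hypothetical toy alignment, for illustration only.
toy_alignment = {
    "taxon_A": "ACGTACGTAC",
    "taxon_B": "ACGTACGTTC",
    "taxon_C": "ACGAAC-TTC",
}

for pair, dist in distance_matrix(toy_alignment).items():
    print(pair, round(dist, 3))
```

A matrix like this is the kind of input that distance-based tree-building methods start from. Real analyses usually apply a substitution model to correct the raw proportions, which this sketch does not attempt.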
Having the materials readily available online is already bringing benefits to Michael’s teaching at the University of Tasmania.
“Having the materials online with exercises in Galaxy is just fantastic for my university teaching. It means that students don’t have to take notes and frees them up to engage more deeply in class. Once they understand the concepts they can easily try out basic phylogenetic analyses and see how the tools work without needing to know how to code.” - Professor Michael Charleston, University of Tasmania
Michael first created the concept of a workshop that explained the principles behind building phylogenetic trees in 2019 when experts from around Australia came together to consult on his materials ahead of a national workshop. Rather than providing an introduction to the topic, mathematician Michael’s deeper explanations of the underlying theories for people already creating phylogenetic trees found a unique niche. After the workshop amassed 24,000 views on the BioCommons YouTube channel, the need for an updated standalone tutorial was obvious.
Michael worked closely with the BioCommons training team over the last year and a half to develop this tutorial by tailoring and refreshing the materials and activities for the self-contained and easy-to-use Galaxy platform. This activity is part of the BioCommons’ commitment to making our training materials FAIR.
Try out the Phylogenetics: back to basics tutorial in the Galaxy Training Network.
Or if you prefer live training, join us for a workshop based on the tutorial in July.
A new best-practice workflow for easy and efficient genome assembly
An off-the-shelf bioinformatics workflow for genome assembly from HiFi read data is now available and has been specifically tailored for Australian researchers through a collaboration between BioCommons and the Australian Genome Research Facility.
An off-the-shelf bioinformatics workflow for genome assembly from HiFi read data is now available and has been specifically tailored for Australian researchers. The new custom-built genome assembly workflow:
Allows researchers to easily and efficiently assemble genomes from their HiFi read data
Has a full suite of supporting guides and technical documents for users to follow thanks to the experts at the AGRF and the BioCommons team
Is easily findable via WorkflowHub.
Assembling genomes from HiFi reads is a common roadblock for researchers. Now, researchers can access a customised solution following a successful collaboration between two Bioplatforms Australia facilities, the Australian Genome Research Facility (AGRF) and the Australian BioCommons. Dr Kenneth Chan, Bioinformatics Manager at the AGRF, said that:
“This custom-built genome assembly workflow provides a standardised approach that follows best practice in terms of workflow design, documentation and user support. Now when AGRF generates HiFi long read sequencing data for researchers we can direct them to this workflow solution with confidence that it will suit their needs.”
The workflow is written in Nextflow and employs assembly software specific to HiFi sequencing reads. It features pre-assembly quality control for the raw sequence data, a primary assembly stage using the Improved Phased Assembler from PacBio, and a post-assembly quality control stage.
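As an illustration of the kind of check a post-assembly quality control stage might include, the Python sketch below computes N50, a widely used contiguity statistic. This is an assumption for illustration only: the article does not say which metrics the workflow reports, and the function name and example contig lengths are hypothetical.

```python
from typing import Iterable


def n50(contig_lengths: Iterable[int]) -> int:
    """Return the N50 of an assembly.

    N50 is the length of the shortest contig in the smallest set of
    longest contigs that together cover at least half of the total
    assembly length.
    """
    lengths = sorted(contig_lengths, reverse=True)
    if not lengths:
        raise ValueError("at least one contig length is required")
    total = sum(lengths)
    running = 0
    for length in lengths:
        running += length
        # Integer comparison avoids floating point rounding at the halfway mark.
        if running * 2 >= total:
            return length
    raise AssertionError("unreachable: the running total always reaches half")


# Hypothetical contig lengths, for illustration only.
example_contigs = [100, 80, 50, 30, 20, 10, 10]
print(n50(example_contigs))
```

A higher N50 generally indicates a more contiguous assembly, although it says nothing about correctness, so it is normally read alongside other quality measures.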
Outline of the tools and processes within the HiFi genome assembly workflow
Community-scale research requires reproducible, best-practice bioinformatics workflows that can be run on a multitude of computational systems. The new custom-built workflow has been optimised across several national research consortiums, and can run on the Gadi supercomputer at NCI Australia, the Setonix supercomputer at Pawsey, Amazon Web Services, and the in-house computational systems at the AGRF. Looking to the future, the workflow has been prepared for use on Nextflow Tower as the BioCommons and our infrastructure partners are in the process of setting up a national Nextflow Tower service.
Researchers can find the new workflow easily on WorkflowHub. If you are interested in contributing to future efforts in the workflows space, the Australian BioCommons coordinates a community for computational workflows in bioinformatics. Anyone is welcome to join the conversation and contribute!
The international genetics and genomics community comes to Melbourne
Meet us at booth 27 of the 23rd International Congress of Genetics, which is being held in Melbourne, Australia from 16-21 July 2023. You can find out about our activities in building bioinformatics infrastructure to support human health, agricultural and environmental science research, and discuss the challenges researchers face. We’ll showcase the services that the BioCommons and our partners offer and we would love to hear your feedback!
Members of the Australian BioCommons team are attending the 23rd International Congress of Genetics (ICG) from 16-21 July in Melbourne. Meet us at booth 27 to find out about our activities in building bioinformatics infrastructure to support human health, agricultural and environmental science research, and discuss the challenges researchers face. We’ll showcase the services that the BioCommons and our partners offer and we would love to hear your feedback!
As well as visiting our booth, keep an eye out for the BioCommons team in the symposia, and the poster and speciality sessions:
Associate Director - Human Genome Informatics, Bernie Pope: Ultra-sensitive detection of circulating tumour DNA enriches for patients with higher risk disease in clinically localised prostate cancer (talk on 20 July, 11 am)
Human Genomics Data Specialist, Marion Shadbolt (poster): Advancing human genomics data sharing in Australia: Highlights from the Australian BioCommons
Community Engagement Officer, Tiff Nelson (poster): Robust public computational services supporting Genome Assembly and Annotation for Australian researchers
Bernie Pope (poster): Somatic mutation landscape in a cohort of meningiomas that have undergone grade progression
Deputy Director, Jeff Christiansen will give an overview of BioCommons services supporting genomics research at approximately 1 pm during the Genetics Society of AustralAsia (GSA) annual general meeting on 19 July
While at the congress, you can also visit our partners the Atlas of Living Australia (ALA), who are providing an early look at the new Australian Reference Genome Atlas interface, plus other Bioplatforms Australia supported facilities including the Australian Genome Research Facility, Ramaciotti Centre for Genomics and the Australian National University Biomolecular Research Facility.
Registration is still open, so come along and chat with the BioCommons team. We can’t wait to see you there!
Genome Lab: The new online workbench for easier data analysis
The new Galaxy Australia Genome Lab is now available for use by the Australian genomics community. This customised, user-friendly workbench provides rapid access to a range of sophisticated resources needed for genome assembly and annotation.
The new Galaxy Australia Genome Lab interface.
The new Galaxy Australia Genome Lab is now available for use by the Australian genomics community. This customised, user-friendly view of Galaxy Australia provides rapid access to a range of sophisticated genome assembly and annotation resources while retaining the full power of Galaxy Australia.
Genome Lab offers a curated collection of bioinformatics tools, workflows and tutorials tailored to data preparation, genome assembly and genome annotation. The new user-friendly, one-stop-shop within the Galaxy platform is the perfect place for newcomers to data analysis. All relevant tools and workflows come with descriptions and examples of required inputs to help researchers get started. For more advanced users, the full functionality of the Galaxy Australia platform is accessible through the surrounding tools panel and navigation bar. User history, jobs and data quota are shared with the main service, making it easy to switch between Genome Lab and the main Galaxy Australia interface.
Galaxy Australia provides Australian researchers with fully subsidised access to a high-performance computing network through a simple web interface. Researchers are able to undertake reproducible and transparent computational research in an accessible format, without needing any prior command line programming experience. There are over 1,500 pre-installed tools and over 350 workflows with extensive documentation, tutorials and training available. Wanting to simplify access and use by genomics researchers, the Galaxy Australia team developed the Genome Lab interface.
The development of Galaxy Australia’s Genome Lab represents an important step forward in Australian BioCommons activities to support Australian genomics researchers. The Genome Annotation and Genome Assembly Infrastructure Roadmaps identified a need to implement easily accessible platforms that contain all the tools for genome annotation and assembly in one location. The Galaxy Australia team has worked hard to deliver this for the genomics community, and will now continue to gather user feedback to enhance the Genome Lab. Stay tuned for updates, plus releases of other Labs supporting different research domains in the future.
If you are an Australian researcher with an interest in genomics, be sure to try out the new Galaxy Australia Genome Lab now!
Participating in a National Approach to Genomic Information Management
Australian BioCommons took part in the development of prototype components of the proposed National Approach to Genomic Information Management research ecosystem. The collaborative submission made by our Human Genomes Platform Project and the University of Melbourne Centre for Cancer Research received very favourable reviews in the Preliminary Implementation Recommendations to the Australian Government (April 2022) recently released by Australian Genomics.
Australian Genomics recently released Preliminary Implementation Recommendations to the Australian Government (April 2022) to progress the development and establishment of a National Approach to Genomic Information Management (NAGIM) for Australia.
Australian BioCommons took part in the development of prototype components of the proposed NAGIM research ecosystem in 2021. Genomic data infrastructure stakeholders nationally were invited to participate in prototype development, in an open call leveraging existing capabilities/funding. BioCommons and others were tasked with addressing the identified priority areas, with the goal of identifying the best combination of components that can serve as the basis for long-term national research infrastructure.
An international expert panel was convened to advise on and evaluate the NAGIM Blueprint Implementation prototypes, and the collaborative submission made by Australian BioCommons’ Human Genomes Platform Project and the University of Melbourne Centre for Cancer Research was very favourably reviewed. BioCommons activities are listed as demonstrated NAGIM exemplars for several proposed workstreams and deliverables to support the NAGIM Implementation recommendations.
The latest report outlines the preliminary implementation recommendations for NAGIM, informed by the international panel’s NAGIM prototype evaluations, and parallel stakeholder consultation. Feedback is currently being sought on these recommendations from all stakeholders including research and clinical data communities, patient advocacy groups, Aboriginal and Torres Strait Islander genomic experts, industry, government representatives and agencies. Feedback will inform the development of a full report of the NAGIM Implementation Recommendations and proposed strategy for progressing NAGIM, which will be delivered to Government mid-year.
Stakeholders are encouraged to read the Preliminary Implementation Recommendations and to provide feedback by responding to the targeted questions in the online form by 5 pm AEST 17 June 2022.
Virus research tips Galaxy Australia over 3 million jobs
The Galaxy Australia service is being chosen by large numbers of researchers from around Australia to complete their bioinformatics analyses. Rapid uptake of the service has seen millions of jobs submitted across a broad spectrum of critical research questions with hard-hitting outcomes for the real world.
Here we highlight the work of Dr Rhys Parry, who recently submitted the three millionth job to Galaxy Australia. Rhys has used Galaxy Australia extensively - first for his PhD work in virus discovery and transcriptome assembly, and now for RNA-Seq analysis and assembly of SARS-CoV-2 genomes.
Here we highlight the work of Dr Rhys Parry, who recently submitted the three millionth job to Galaxy Australia. Rhys uses Galaxy Australia extensively in his current role as a Postdoctoral Research Fellow in Professor Alexander Khromykh’s RNA Virology Lab in the School of Chemistry and Molecular Biosciences, University of Queensland.
Currently utilising Galaxy Australia for RNA-Seq analysis and assembly of SARS-CoV-2 genomes, Rhys has become a power user since he was first encouraged by his PhD supervisor, Professor Sassan Asgari, School of Biological Sciences, University of Queensland, to make use of Galaxy and the Galaxy training resources.
The Aedes aegypti (top) and Aedes albopictus mosquitoes (below) vector many pathogenic viruses to humans, but non-human viruses remain elusive. Bioinformatics tools from Galaxy Australia helped explore the virome of these mosquitoes. (Picture of mosquitoes by Ana L. Ramírez.)
For my PhD project I assembled close to 4000 RNA-Seq datasets from samples from all over the world - a task that would have been impossible without Galaxy Australia
— Rhys Parry
On the hunt for mosquito-borne viruses, Rhys undertook ‘Trinity’ de novo assembly of the transcriptomes of two medically important mosquito species, Aedes aegypti, the yellow fever mosquito and Aedes albopictus, the Asian tiger mosquito. These two mosquitoes vector significant viruses including Dengue, Zika and Yellow fever. The research not only improved our understanding of the microbiome and virome of these mosquito species, but discovered many novel viruses including one that was pathogenic to humans.
Recognising the value of Galaxy Australia beyond virus discovery and transcriptome assembly, Rhys has also used Galaxy for bacterial de novo assembly, RNA-Seq pipelines and annotation, and small RNA mapping and analysis.
For the past few years, my bioinformatics analyses have used Galaxy Australia extensively to avoid the expense of proprietary software and to allow for reproducible and modular pipelines
— Rhys Parry
Six publications resulting from this work have acknowledged the Galaxy Australia team for not only the maintenance and provision of essential computational resources, but also for the technical assistance and scientific advice that individual team members Dr Gareth Price and Dr Igor Makunin provide users of the Galaxy Australia service.
Parry, R., James, M. E., & Asgari, S. (2021). Uncovering the Worldwide Diversity and Evolution of the Virome of the Mosquitoes Aedes aegypti and Aedes albopictus. Microorganisms, 9(8), 1653. https://doi.org/10.3390/microorganisms9081653
Madhav, M., Parry, R., Morgan, J. A., James, P., & Asgari, S. (2020). Wolbachia endosymbiont of the horn fly (Haematobia irritans irritans): a Supergroup A strain with multiple horizontally acquired cytoplasmic incompatibility genes. Applied and environmental microbiology, 86(6), e02589-19. https://doi.org/10.1128/aem.02589-19
Parry, R., Wille, M., Turnbull, O. M., Geoghegan, J. L., & Holmes, E. C. (2020). Divergent influenza-like viruses of amphibians and fish support an ancient evolutionary association. Viruses, 12(9), 1042. https://doi.org/10.3390/v12091042
Bishop, C., Parry, R., & Asgari, S. (2020). Effect of Wolbachia wAlbB on a positive-sense RNA negev-like virus: A novel virus persistently infecting Aedes albopictus mosquitoes and cells. Journal of General Virology, 101(2), 216-225. https://doi.org/10.1099/jgv.0.001361
Parry, R., Naccache, F., Ndiaye, E. H., Fall, G., Castelli, I., Lühken, R., ... & Becker, S. C. (2020). Identification and RNAi profile of a novel iflavirus infecting Senegalese Aedes vexans arabiensis mosquitoes. Viruses, 12(4), 440. https://doi.org/10.3390/v12040440
Parry, R., & Asgari, S. (2019). Discovery of novel crustacean and cephalopod flaviviruses: insights into the evolution and circulation of flaviviruses between marine invertebrate and vertebrate hosts. Journal of virology, 93(14), e00432-19. https://doi.org/10.1128/JVI.00432-19
Some research presented here received funding through ARC grants DP190102048 and DP150101782 and a University of Queensland PhD scholarship.
BioCommons supports the creation of specialist training with Genomics for Australian Plants
Australian BioCommons partners broadly in our efforts to drive coordinated solutions to life science researchers’ problems. Genomics for Australian Plants (GAP) is developing genomics resources to enhance our understanding of the evolution and conservation of the unique Australian flora. GAP’s phylogenomics bioinformatics working group has combined newly developed and existing scripts into an integrated workflow for the assembly of target capture data.
Keen to share these resources with researchers who can use them, the group has been working with BioCommons to offer a series of events to train others in using these novel pipelines. Theoretical webinars and hands-on training workshops will be delivered virtually in conjunction with the upcoming Australasian Systematic Botany Society Conference.
Australian BioCommons partners broadly in our efforts to drive coordinated solutions to life science researchers’ problems. We connect the appropriate large organisations, small facilities, collaborative initiatives and individual experts to solve community-scale challenges. This means we’re well placed to hear about new resources and help to bring training opportunities to fruition.
Genomics for Australian Plants (GAP) is developing genomics resources to enhance our understanding of the evolution and conservation of the unique Australian flora. Established in 2018, the Australian State and National Herbaria and Botanic Gardens came together with Bioplatforms Australia to form the GAP initiative that has flourished as an active national consortium of researchers and expert working groups.
One of GAP’s goals is to build capacity in the management and application of genomic data and to provide tools to enable genomic data to be used to identify and classify biodiversity at a range of scales. Bringing together staff from Royal Botanic Gardens Victoria, CSIRO’s Centre for Australian National Biodiversity Research (Canberra) and Australian Tropical Herbarium (Cairns), Bioplatforms Australia and Australian BioCommons, GAP’s phylogenomics-bioinformatics working group has combined newly developed and existing scripts into an integrated workflow for the assembly of target capture data.
BioCommons offered advice on how to train others in what we’ve built: what was possible to run live, which computational environment would work best and the practicalities of leading a group through hands-on learning exercises. With their support we’ve been able to focus on our strengths like making sure the pipelines, data and technical information are spot on.
- Lalita Simpson, Research Community Project Manager, Genomics for Australian Plants - Phylogenomics Project.
Introductory overviews of the challenges of conflict within target capture datasets and strategies to employ during analysis will be delivered as short public webinars:
Conflict in multi-gene datasets: why it happens and what to do about it - deep coalescence, paralogy and reticulation (20 May)
Detection and phasing of hybrid accessions in a target capture dataset (10 June)
For participants who would like to take the next step, a series of three interactive in-depth workshops will be delivered as part of the ASBS conference. The workshops are suited to researchers analysing target capture datasets and will provide hands-on training in workflows covering the processing of raw sequence reads, as well as strategies for resolving paralogy and hybridisation.
GAP phylogenomics bioinformatic pipeline – Part 1: Assembly of raw reads using HybPiper (6 July)
GAP phylogenomics bioinformatic pipeline – Part 2: Yang and Smith paralogy resolution (7 July)
HybPhaser – Detection and phasing of hybrid accessions in a target capture dataset (8 July)
Further information and registration details are available on the ASBS2021 conference website.
Building better genome browsers
We were pleased to see one of our close partners recently awarded a grant to extend on a highly collaborative BioCommons project. Dominique Gorse from QCIF, along with Sandie Degnan and Bernie Degnan from the School of Biological Sciences at UQ, received support for Developing a scalable genome browser and interactive repository for large and complex multi-omic datasets from non-model organisms of environmental and economic importance.
Read on to hear how their 2021 UQ Genome Informatics Hub (GIH) collaborative project will deliver an interactive repository for diverse transcriptomic, chromatin-state and proteomic data, to be immediately populated with existing genomes of two Great Barrier Reef animals: the notorious destroyer of coral reefs, the crown-of-thorns starfish, and a model for animal evolution, the sponge Amphimedon queenslandica.
Great Barrier Reef, Queensland (Image: Daniel Pelaez Duque)
Announced as a 2021 UQ Genome Informatics Hub (GIH) collaborative project, it will deliver an interactive repository for diverse transcriptomic, chromatin-state and proteomic data and will be immediately populated with existing genomes of two Great Barrier Reef animals: the notorious destroyer of coral reefs, the crown-of-thorns starfish, and a model for animal evolution, the sponge Amphimedon queenslandica.
QCIF are a key contributor to BioCommons activities around developing systems for non-model organism de novo genome assembly and annotation, and will launch a new national hosted Apollo service as part of that initiative in coming months. In the GIH project the Apollo browser and service will be extended to provide an interactive repository facilitating the viewing and interrogation of a wide range of omics data used for the curation and annotation of non-model organisms.
The GIH is an initiative designed to develop and advance innovative genomic capabilities at the University of Queensland. To find out more about their activities, visit their recently launched website and subscribe to their newsletter.
Rapid genome assembly on Gadi at NCI
The Australian BioCommons is identifying community-supported bioinformatics tools used for assembly of non-model organism reference genomes, and subsequently coordinating the install, optimisation and documentation of these tools across Australian computing facilities, including the national (tier 1) high performance computing centres. A major aim is to provide reusable and reproducible methods that can be applied across these and other infrastructures available to the genome assembly community.
The first tool considered by this activity was Canu, a long read assembly package for Nanopore and PacBio sequencing data. Collaboration between researchers from the Genomics for Australian Plants (GAP) consortium and specialists at the National Computational Infrastructure (NCI) reduced the assembly time for the Golden Wattle (Acacia pycnantha Benth.) from more than 2 weeks on institutional resources to 3 days on the Gadi supercomputer. This was achieved using a wrapper script that makes distributed jobs from Canu compatible with the scheduler on Gadi, allowing the tool to make use of multiple nodes. The Gadi-optimised implementation of Canu is described in detail on the BioCommons GitHub Canu repository.
The success of this work has led to multiple additional activities:
Completion of the Waratah (Telopea speciosissima) genome assembly for GAP during a user test of the optimised Canu installation
Sharing of the optimised Canu with BioCommons stakeholder researchers
Additional optimisation and troubleshooting on Gadi for larger mammalian genomes (> 3 Gb) to support Oz Mammals Genomics (OMG)
Benchmarking activities for Canu to support merit applications by the bioinformatics community.
Australian BioCommons regularly engages with Australian bioscience research communities to document challenges and define requirements for shared bioinformatics resources. Please join the discussion with the Genome Assembly community to develop a vision for shared national infrastructure that will support your research. For further information: contact@biocommons.org.au