News

Patrick Capon 27/6/24

Conservation managers welcome readymade bioinformatics workflows that support decision making

The Kowari, a small carnivorous marsupial, is one of the threatened species that the Galaxy Australia workflows are being applied to.

Conservation decisions are becoming better informed by important genetic information thanks to a new set of bioinformatics workflows that assemble and annotate reference genomes for threatened vertebrate species. Bioinformaticians and conservation managers alike are now effortlessly customising and executing these tailored workflows to conduct sophisticated analyses without extensive computational knowledge.

The Threatened Species Initiative (TSI) has worked intensively with Australian BioCommons to overcome the barriers that limit the uptake of genetic data into conservation management programs. As part of TSI’s mission to improve conservation practices, a new suite of resources facilitates the easy generation of high quality genetic resources for downstream analyses in situations when a lack of bioinformatics knowledge or access to large-scale compute would have previously made it impossible.

Making use of the user-friendly ‘point and click’ Galaxy Australia interface, new workflows for genome assembly, data quality control, transcriptome assembly and annotation are easy to use and come with detailed instructions. By incorporating other BioCommons infrastructure like the Australian Fgenesh++ Service, the workflows are readily available at no cost to Australian researchers and conservation managers. The Galaxy Australia team tailored each step with the tools and compute resources that TSI bioinformaticians and conservation managers would need, and have made it all accessible via the Genome Lab.

Detailed “how-to” guides are already enabling collaborators at Museums Victoria to assemble the genome of the Victorian grassland earless dragon without requiring additional assistance. The detailed instructions are supporting insights into a range of interesting native species, as Patra Petrohilos, PhD candidate in the Australasian Wildlife Genomics Group at The University of Sydney, attests:

“I recently used the Galaxy Australia how-to-guide to help me through the process of reference assembling for the Kowari (a small carnivorous marsupial). I found the instructions incredibly clear and easy to follow and the entire workflow was completed in a few days.”

Development of the workflows and “how-to” guides was led by Dr Luke Silver, Postdoctoral Bioinformatician in the Australasian Wildlife Genomics Group at The University of Sydney, and Dr Anna Syme, Bioinformatician at Australian BioCommons. Luke will present this work at the Genetics Society of Australasia 2024 Conference.

The workflows are available for anyone to use in WorkflowHub as assembly and annotation collections. The suite of complementary “how-to” guides is easily accessible in BioCommons’ How-To Hub. Prof Carolyn Hogg, TSI Science Lead & co-lead of the Australasian Wildlife Genomics Group at the University of Sydney, expects the resources to be widely applicable:

“The workflows have been built and designed to make genome assembly and annotation simple and easy, enabling the whole Australian life science community to use them.”

The TSI is now calling for new partnerships to extend their range of targeted threatened species, including to insects and invertebrates. If your team’s project focuses on a target species where genetic information could support conservation efforts, visit the TSI website to apply for partnership before 19 July.

Christina Hall 27/6/24

Join the Australian Outpost of BioHackathon Europe 2024!

Are you interested in joining this year’s BioHackathon Europe, but can’t face the long-haul flights? Join the Australian Outpost team, who will gather to work locally while checking in regularly with our international colleagues.

This is a unique opportunity to participate in a significant global event and network with your international peers while working intensively on practical bioinformatics challenges. We will cover your costs when you come to Brisbane for the duration of ELIXIR’s BioHackathon: 4 Nov to 8 Nov 2024. Participation will require an afternoon session plus some work in the early evening to facilitate linking up live with the team in Barcelona. We always make it fun, and you’ll get to know others from around Australia while you learn new skills.

We’ve narrowed down the projects we’re interested in, and want to hear what you want the Australian Outpost of the BioHackathon to work on:

BioHackathon aims to:

Advance the development of an open source infrastructure for data integration to accelerate scientific innovation
Engage technical people in the bioinformatics community to work together on topics of common interest
Strengthen interactions, establish and reinforce collaborations through hands-on programming activities.

Please contact us if you are interested in joining the Australian Outpost of the BioHackathon Europe, and tell us which project/s you would like to participate in, and why. For inspiration, you can read last year’s wrap-up and a blog post on why attending is so valuable. Once we get a feel for who is interested, we will select a team of people and organise our meetup. There’s no need to register for a place on the BioHackathon Europe website: they have reserved places for the Australian Outpost.

Please submit your expression of interest to comms@biocommons.org.au by 9 Aug 2024.

Patrick Capon 27/6/24

Improving access to bioinformatics tools and software in Australia

Finding the right computational tools or software for your research can be frustrating. Which tools do what, where can you find them, and which high end computers can you access?

Since 2021, Australian BioCommons has worked with our infrastructure partners to pull together a convenient list of the tools and software that are available to Australian life science researchers. As part of our ongoing efforts to facilitate simple access to and reuse of existing digital research infrastructure, the Bioinformatics ToolFinder offers a landscape view of what’s out there already.

ToolFinder has undergone significant ‘under the hood’ changes to make your life easier. ToolFinder 2.0 allows you to filter by research topic according to EDAM ontology, plus free text search across the whole table for tool names, descriptions, licence types, and compute providers. You can even customise your personal view of ToolFinder by adding or removing columns from the table.

Only looking for tools that have a container? Be sure to use the BioContainers column in your view. Already know the particular tool you need for your analysis? You can search for the tool’s availability and find which versions are installed at various computational facilities. Is your favourite tool not installed where you need it? Not to worry, ToolFinder 2.0 has links for you to request tool installation across Galaxy Australia, NCI, Pawsey and QRIS-Cloud.
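The kind of filtering described above (by topic, by free text, and by container availability, with a customisable set of columns) can be sketched in a few lines of Python. This is only an illustration of the idea, not ToolFinder's implementation: the `Tool` record, its field names, the example rows, and the `filter_tools` and `select_columns` helpers are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Tool:
    """One row of a hypothetical tool table (field names are assumptions)."""
    name: str
    description: str
    topics: tuple
    licence: str
    has_container: bool
    providers: tuple


# Made-up rows for illustration only; these are not real ToolFinder entries.
TOOLS = [
    Tool("assembler-x", "Assembles long reads into contigs",
         ("Sequence assembly",), "MIT", True, ("Galaxy Australia", "NCI")),
    Tool("aligner-y", "Aligns short reads to a reference",
         ("Mapping",), "GPL-3.0", False, ("Pawsey",)),
    Tool("annotator-z", "Annotates genes in assembled genomes",
         ("Genome annotation",), "Commercial", True, ("Galaxy Australia",)),
]


def filter_tools(tools, topic=None, text=None, container_only=False):
    """Return the tools that match every filter that was supplied."""
    results = []
    for tool in tools:
        # Topic filter: exact match against the tool's topic labels.
        if topic is not None and topic not in tool.topics:
            continue
        # Container filter: keep only tools that have a container.
        if container_only and not tool.has_container:
            continue
        # Free-text filter: case-insensitive substring search across
        # name, description, licence and compute providers.
        if text is not None:
            haystack = " ".join(
                [tool.name, tool.description, tool.licence, *tool.providers]
            ).lower()
            if text.lower() not in haystack:
                continue
        results.append(tool)
    return results


def select_columns(tools, columns):
    """Build a custom view that keeps only the requested columns."""
    return [{column: getattr(tool, column) for column in columns} for tool in tools]


if __name__ == "__main__":
    matches = filter_tools(TOOLS, text="galaxy", container_only=True)
    for row in select_columns(matches, ["name", "licence"]):
        print(row)
```

Each filter is optional, so the same function covers both a broad browse by topic and a narrow search for one named tool at one facility.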

With all these search and sorting improvements made, you’ll quickly narrow down to the tool or software you need among the 600+ entries.
Visit the refreshed ToolFinder now, and let us know how it goes!

Patrick Capon 27/6/24

BioCLI: Improving command-line infrastructure for life scientists

Learn more about how the newly established BioCLI Project is empowering Australian life scientists to access command-line infrastructure.

Australian life scientists are set to be empowered with the resources, skills, and knowledge required to access command-line infrastructure for bioinformatics research through the newly established BioCLI Project.

Data analysis in the life sciences is constantly evolving, as new instrument types are rolled out and larger amounts of data are generated. The flexibility, scalability, and control uniquely afforded by the command-line interface (CLI) give users powerful capabilities to interrogate their data, meaning that coding skills can be essential for particularly complex data analyses. However, the sheer number and diversity of bioinformatics data, tools, and working scales present a significant barrier to entry for life scientists who want to use the CLI.

Australian BioCommons has established the BioCLI Project to uplift life scientists and help tackle the challenges of working at the CLI, offering environments and services that will reduce friction for processing and analysis of molecular data at scale. Working with our partners at Sydney Informatics Hub, the National Computational Infrastructure (NCI), and the Pawsey Supercomputing Research Centre, BioCLI will:

Develop key CLI infrastructure such as public virtual machine images that come preconfigured with all the essentials for life sciences research (e.g. the BioImage)
Accelerate command-line job throughput by configuring key tools and workflows to run efficiently on specialised hardware or queuing systems (e.g. configuration of Parabricks for NCI’s Gadi supercomputer)
Provide clear documentation for accessing and configuring all BioCLI outputs
Have a strong focus on empowering researchers through a dedicated training program

Keep up to date with the latest BioCLI project developments on our website, and be sure to register for our upcoming entry-level webinar “What exactly is bioinformatics?” delivered by Dr Georgie Samaha, Product Owner of BioCLI and Bioinformatics Group Lead at the Sydney Informatics Hub, The University of Sydney.

Patrick Capon 30/5/24

Galaxy Australia leads 2024 Galaxy Project publication

The latest developments in the Galaxy platform have been captured in a definitive publication describing the popular international data analysis platform. Lead author and Project Lead of Galaxy Australia, Dr Gareth Price, coordinated the paper that documents the key features supporting accessible, reproducible, and transparent user-driven research.

Gareth was proud that The Galaxy platform for accessible, reproducible, and collaborative data analyses: 2024 update highlighted several key features that will particularly benefit Australian researchers:

“One exciting new offering is Galaxy Australia’s Genome Lab, which presents a customised, user-friendly view with rapid access to a range of sophisticated resources tailored for genome researchers. Our locally developed features are now available for researchers around the world to use.”

The Galaxy Project spans the globe, with over 500,000 individuals registering an account across its 19 years of operation. It is a collaborative community of dedicated contributors, working to constantly improve the service while following core values of supporting accessible, reproducible, and transparent user-driven research.

Leading the biennial Galaxy Project publication in the 2024 edition of Nucleic Acids Research’s annual Web Server Issue was a rewarding experience for Gareth:

“It was a privilege to lead such an impactful publication that now becomes the citable entity for any researchers using Galaxy, particularly after the 2022 publication had 490 citations and over 16,000 views in only two years. And what a wonderful opportunity to collate Galaxy’s progress through the contributions of over 120 authors globally and their efforts in feature development for Galaxy users.”

In all, 19 Australians contributed to the paper, representing the important input of Galaxy Australia to the global research community. The Galaxy Australia team has been instrumental in delivering service optimisations like the Total Perspective Vortex, and feedback from Australian users has driven user experience improvements internationally.

The publication also highlights the addition of licensed bioinformatics tools Fgenesh++ and Cell Ranger to Galaxy. These powerful software packages are fully subsidised for Australian researchers to use, and through their enthusiastic use, the Galaxy team hope to build a body of evidence that encourages software developers to consider providing both open and commercial licensing options to meet researchers’ needs. Further improvements across user accessibility, self-guided training via the Galaxy Training Network, and back-end technical aspects to ensure jobs continue to run smoothly are also detailed in the publication.
Galaxy continues to evolve based on the needs of the open research community. With the release of this publication, and many new features to explore, there’s never been a better time to get started using Galaxy!

Patrick Capon 30/5/24

ZERO Childhood Cancer Data Portal launched

Researchers can now access a large bank of paediatric genomic data collected as part of the ZERO Childhood Cancer program, including molecular, phenotypic, multi-omics, and clinical data, plus physical biospecimens across 38 diseases. Housed within the newly launched ZERO Childhood Cancer Data Portal, the datasets represent tumour samples from 1,019 participants in the ZERO2 clinical trial, which includes all children with cancer in Australia and Aotearoa/New Zealand.

The Human Genome Informatics team at the Australian BioCommons was thrilled to recently partner with the Children’s Cancer Institute (CCI) and the ZERO program to support their efforts in establishing the data portal, which formed part of a national, multi-institutional project called the Human Genomes Platform Project (HGPP). The HGPP was designed to enhance capability for securely and responsibly sharing human genome research data nationally and internationally, ensuring maximum value can be derived from these assets. Along with CCI and ZERO, HGPP was a successful partnership between Australian Access Federation, Australian BioCommons, Australian Genomics, Garvan Institute of Medical Research, National Computational Infrastructure, QIMR Berghofer Medical Research Institute, and University of Melbourne Centre for Cancer Research.

The HGPP team adapted GA4GH’s Beacon network for implementation in Australia, with Beacon v2 now supporting the ZERO Childhood Cancer Data Portal. To learn more about key outcomes and outputs of the HGPP, watch the final showcase on the BioCommons YouTube channel.

The development of the ZERO Childhood Cancer Data Portal was supported by the Australian BioCommons' Human Genomes Platform Project and funded through NCRIS investments from Bioplatforms Australia and the Australian Research Data Commons, and investment by the ZERO Childhood Cancer Program and Children's Cancer Institute.

Christina Hall 30/5/24

Reducing the frustrations of research data movement

All BioCommons activities are informed by researchers’ challenges. Read more about how we make the right connections between people, facilities and services to ease data movement.

A stick figure struggles to move a briefcase labelled 'data'

Researchers' problems inform everything BioCommons does. We seek out researchers, institutions and research consortia willing to share their roadblocks, annoyances and limitations. Designing solutions to these challenges that enable researchers to do their best work without impediments requires connecting the right people across Australia’s digital life science research landscape with the best international efforts.

Collaborating locally

As the bioinformatics capability of Bioplatforms Australia, we regularly collaborate with other BPA-funded facilities. When the national Biomolecular Resource Facility (BRF) expressed an urgent need for faster data transfer to their research customers and collaborators, we listened very carefully to their sources of frustration. The sheer size of their data movements had seen their team even resorting to delivering physical hard drives in the past!

After understanding the requirements, we identified AARNet, the operator of Australia’s national research and education network dedicated to moving research data, as being well placed to solve this problem. AARNet’s Digital Research Product Manager Greg D’Arcy collaborated with BioCommons to create a solution.

Making the right connections between people, facilities and services can lead to wonderful efficiencies in data movement, and AARNet recently described how BRF is now efficiently and safely transferring vast amounts of data overseas, and without any risk of data loss: Data Without Borders: the role of Globus in international genome research. As Globus’ partner for universities and research institutes in Australia, AARNet was key to implementing this data management tool into an elegant workflow. Data that is transferred directly from BRF instruments to the National Computational Infrastructure (NCI) for processing and storage can now be seamlessly transferred to recipients with their own Globus endpoints.

Collaborating internationally

Another example of seeking out researchers’ opinions is our ongoing call out for individuals with an interest in data submission to international repositories. Our community consultations have already teased out some of the challenges faced by Australian-based researchers in the data submission process and we have published a set of recommendations to address them.

As part of a range of activities around interfacing with international omics data repositories, BioCommons is investigating the value of bringing European Bioinformatics Institute (EMBL-EBI) / European Nucleotide Archive (ENA) team members to Australia. To facilitate improvements by connecting the right people, we are exploring various opportunities, with training workshops for data submission to ENA, sessions to provide feedback on existing ENA documentation, and collaborations around documentation all up for consideration.

We’d love to know what you think of these opportunities to interact with the ENA team and if you’d like to participate! Please complete our survey by 14 June 2024.

To contribute to the conversations in your area of interest, join one of our research domain focused mailing groups to hear about future consultations and events.

Patrick Capon 29/5/24

Latest tech unlocks deep sea mysteries hidden in museum collections

Researchers from the Museums Victoria Research Institute have constructed one of the largest distributional and evolutionary DNA datasets seen in marine science and now seek to answer questions such as: Where and when did deep sea life begin? How did it spread throughout the oceans that cover over 70% of the Earth’s surface?

A rare brittle-star from deep-water off New Caledonia. Photo by Caroline Harding/Museums Victoria.

One of the largest distributional and evolutionary DNA datasets seen in marine science has been constructed by researchers from the Museums Victoria Research Institute. They have spent 15 years sequencing deep-sea fauna from museum collections across the globe. Shedding new light on our understanding of deep sea life, the research program seeks to answer questions such as: Where and when did deep sea life begin? How did it spread throughout the oceans that cover over 70% of the Earth’s surface?

The pursuit of big ambitions like mapping patterns of deep-sea biodiversity across the globe creates massive datasets. But these large datasets come with substantial computational requirements, which are not always available. Dr Tim O’Hara, Senior Curator at Museums Victoria, is well aware of this challenge:

“We didn’t have in-house access to the computing infrastructure required to process such large amounts of raw genetic data. We had to find the computing power we needed to process our raw genetic data, create phylogenies (trees of life), and run models that explore evolutionary and biogeographic relationships.”

Tim uses museum collections to answer large-scale questions about the distribution of seafloor animals around the globe, and leads the museum’s brittle-star (ophiuroid) research program. Ophiuroids are widespread on seafloors across the globe, making them an ideal model species to understand distribution patterns across the last 100 million years.

“We extract nuclear and mitochondrial DNA to construct an enormous tree of life, which now contains 2700 samples. This enables us to determine where species originated and spread across the oceans. Since no one has really achieved this before, we are expecting to make a series of novel and interesting discoveries.”

Tim’s team at Melbourne Museum requested support through the Australian BioCommons Leadership Share, or ABLeS. The program was specifically designed to support researchers like Tim, who don’t have local access to the digital infrastructures they need and aren’t regular users of high performance computing facilities. By providing access to appropriate and scalable bioinformatics resources, ABLeS empowers researchers who lack a background in computational research, and who are not currently supported by merit-based allocation schemes, to conduct their research.

Read more about the technical details of the support ABLeS provides to The Ophiuroid Project, or read more about Tim’s research.

Melissa Burke 29/5/24

Phylogenetics collaboration takes researchers back to basics with new training

A new online tutorial is taking researchers back to basics to uncover the principles of phylogenetics and how tree-building methods work thanks to a longstanding collaboration between Professor Michael Charleston from the University of Tasmania and Australian BioCommons.

Charles Darwin's first sketch of an evolutionary tree. Source: Wikimedia commons

A new online tutorial has been created to take researchers back to basics to uncover the principles of phylogenetics and how tree-building methods work. A longstanding collaboration between Professor Michael Charleston from the University of Tasmania and Australian BioCommons has delivered this self-guided tutorial featuring videos and hands-on exercises. To maximise its impact, the resource was tailored specifically to be shared globally via the Galaxy Training Network, and will form the basis of an upcoming live training workshop.

Using real-life data, and tools available in Galaxy and SplitsTree, the tutorial demonstrates the principles behind a variety of methods used to estimate phylogenetic trees from aligned sequence data or distance data. With a conversational style Michael discusses why phylogenetics is important, unpicks phylogenetics terminology from the roots to the tips and explains concepts such as multiple sequence alignment, how alignments are used to build trees, and phylogenetic networks.
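As a small illustration of how aligned sequence data can be turned into distance data, the sketch below computes the proportion of differing sites (often called the p-distance) for every pair of sequences in an alignment. It is not taken from the tutorial: the sequences are made up, the function names are hypothetical, and the p-distance is only one of the simplest of many possible distance measures.

```python
from itertools import combinations

# Made-up aligned sequences: all the same length, '-' marks a gap.
ALIGNMENT = {
    "taxon_A": "ACGTACGT",
    "taxon_B": "ACGTACGA",
    "taxon_C": "ACGAAC-A",
}


def p_distance(seq1, seq2):
    """Proportion of compared sites at which two aligned sequences differ.

    Sites where either sequence has a gap are skipped.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq1, seq2):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites between the two sequences")
    return differences / compared


def distance_matrix(alignment):
    """Pairwise p-distances, keyed by alphabetically ordered name pairs."""
    return {
        (x, y): p_distance(alignment[x], alignment[y])
        for x, y in combinations(sorted(alignment), 2)
    }


if __name__ == "__main__":
    for pair, distance in distance_matrix(ALIGNMENT).items():
        print(pair, round(distance, 3))
```

A table of pairwise distances like this is the kind of input that distance-based tree-building methods start from.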

Having the materials readily available online is already bringing benefits to Michael’s teaching at the University of Tasmania.

“Having the materials online with exercises in Galaxy is just fantastic for my university teaching. It means that students don’t have to take notes and frees them up to engage more deeply in class. Once they understand the concepts they can easily try out basic phylogenetic analyses and see how the tools work without needing to know how to code.” - Professor Michael Charleston, University of Tasmania

Michael first created the concept of a workshop that explained the principles behind building phylogenetic trees in 2019, when experts from around Australia came together to consult on his materials ahead of a national workshop. Rather than providing an introduction to the topic, Michael, a mathematician, offered deeper explanations of the underlying theories for people already creating phylogenetic trees, and these found a unique niche. After the workshop amassed 24,000 views on the BioCommons YouTube channel, the need for an updated standalone tutorial was obvious.

Michael worked closely with the BioCommons training team over the last year and a half to develop this tutorial by tailoring and refreshing the materials and activities for the self-contained and easy to use Galaxy platform. This activity is part of the BioCommons’ commitment to making our training materials FAIR.

Try out the Phylogenetics: back to basics tutorial in the Galaxy Training Network.

Or if you prefer live training, join us for a workshop based on the tutorial in July.

Christina Hall 30/4/24

Training with real-world genetic data to advance Australia’s urgent conservation goals

Katarina (right) working closely with Mikaeylah (left).

A diverse group of researchers converged at The University of Melbourne this month to work through their unique bioinformatics challenges under the guidance of The University of Auckland’s Dr Katarina Stuart. The genetic outlier analysis workshop invited participants to bring along their own datasets to analyse over two days. Working with real-world data offered attendees the opportunity to apply new techniques in their field, learn how methods may need to be tweaked, and, importantly, progress their research.

Mikaeylah Davidson, a PhD Candidate in the Faculty of Science’s One Health Research Group at The University of Melbourne, relished the opportunity to bring her own data:

“Being able to engage with the data I'm actively working on was incredibly beneficial, as it provided me with the chance to seek assistance in troubleshooting issues I am currently encountering, as well as gaining insight into recurring challenges and how to address them effectively. This hands-on approach was incredibly helpful.”

Mikaeylah’s research explores the potential of selective breeding as a tool to combat wildlife disease. While selective breeding has been used extensively for genetic improvement in domesticated animals, its application in conservation remains largely unexplored. Mikaeylah’s PhD is focused on the Southern Corroboree frog, which is critically endangered due to the introduction and spread of the deadly amphibian chytrid fungus.

Aiming to leverage the existing conservation breeding program based within the zoos, she hopes to identify phenotypic and genetic traits associated with resistance to the chytrid fungus. If successful, this could pave the way for the implementation of a selective breeding program aimed at reducing detrimental alleles and amplifying beneficial ones within the population. Ultimately, the goal is to breed Southern Corroboree frogs that have a heightened tolerance to the fungus, enabling their survival in the wild despite the presence of chytrid.

The highly practical workshop stepped through the use of command line programs, while providing the opportunity to ask questions about their use, functions and applicability. Multiple genetic outlier analysis methods were explored, along with how and when each should be used. The pros and cons of different methods helped explain which are best suited to different data types. This type of information can be very difficult to find without experience, or without many hours of working through various software and protocols, according to Elliott Schmidt, a PhD Candidate from James Cook University:

“I believe that this workshop has saved me many hours of troubleshooting my genetic outlier analysis. Coming away from the workshop with example scripts composed using my own data has given me confidence that my approach to analysing my data is appropriate and efficient, and can now be completed independently.”

Elliott flew down from Townsville to progress his research into how evolutionary perspectives can be incorporated in conservation of a coral reef fish. His PhD project explores how different populations of a coral reef damselfish, Acanthochromis polyacanthus, distributed across the Great Barrier Reef may respond differently to warming ocean temperatures. He’s tackling this question by investigating local adaptation, differences in developmental plasticity, and population genetics. By incorporating physiological experiments with population genetics Elliott hopes to identify potential differences in vulnerability to warming temperatures between different populations as well as provide explanations for these differences via genetic analysis.

Hearing about the challenges faced by researchers working with different datasets was a highlight, and many of the 11 participants reported valuing the opportunity to engage in discussions with peers. Working intensively with their expert trainer and each other’s data, they honed their ability to query and interpret varied datasets. Mikaeylah particularly enjoyed the highly interactive elements:

“Working through our real data enabled me to further my understanding of my own, while also offering insights into the challenges others face with their datasets. I found the exchanges on results interpretation very helpful, and also interesting, as they allowed me to learn how to interpret diverse datasets and troubleshoot different issues which may arise.”

This workshop was part of a series of events made available through a collaboration between the Genetics Society of AustralAsia (GSA) and Australian BioCommons. There was also an online genetic outlier analysis workshop held in February, and another in person workshop will be held this July in conjunction with the GSA2024 Conference in Sydney. The workshops were supported by GSA’s Workshop Support Program that aims to help share knowledge and/or exchange ideas across genetics.

More information on the next hands-on workshop: Genetic outlier analysis (Sydney).