News

Guest User 29/4/24

Introducing the Australian Tree of Life Informatics Capability

This program will equip researchers and decision-makers with the tools to leverage cutting-edge genomics technologies to more effectively manage and safeguard Australia’s precious biodiversity and agricultural resources.

Powering research and decision-making for Australian biodiversity and agriculture

Reposted from Bioplatforms Australia

The Australian Tree of Life Informatics Capability is establishing a new digital framework to bridge the gap between the generation of genomics data and its application in on-the-ground actions. This program will equip researchers and decision-makers with the tools to leverage cutting-edge genomics technologies, enabling them to more effectively manage and safeguard Australia’s precious biodiversity and agricultural resources.

Australia is one of a few megadiverse regions in the world. It is home to around 10 percent of the world’s species, with around 80 percent of Australia’s native species not occurring naturally anywhere else. National strategic plans for biodiversity and biosecurity emphasise the importance of making informed, data-driven decisions to support this unique environment and the primary industries that flourish in it.

Various endeavours, including the Bioplatforms National Initiatives, are currently producing essential genetic and genomic data, such as reference genomes. Whole genome sequencing acts as a cornerstone resource, facilitating discoveries such as identifying previously unknown species, uncovering novel genes for innovative applications, understanding organism functions in nature and agriculture, and exploring their variability and interactions.

Generating genomic data for all relevant Australian species, and making it relevant to real-world application, is an immense undertaking that requires us to intensify our efforts. Our challenge is to develop a system that continues to foster enhanced collaboration, expedites data generation, assembly and analysis, and provides specialised platforms tailored to deciphering this data effectively for real-world use.

The Australian Tree of Life Informatics Capability is addressing this challenge by establishing two new infrastructures:

1. The Australian Tree of Life – Genome Engine

The Genome Engine will accelerate the assembly and annotation of reference genomic data for species relevant to Australia. Building on existing Bioplatforms investments in data generation (via National Initiatives) and data analytics services (via Australian BioCommons), it will allow Australian species to be studied from molecular to population scales. Researchers will be given access to automatically produced genome assemblies, annotations and published Genome Notes soon after the raw sequencing data has been created.

The infrastructure will leverage approaches developed by the UK-based Wellcome Sanger Institute’s Darwin Tree of Life project and Galaxy infrastructure supporting the Vertebrate Genomes Project, bringing their workflows and methodologies to Australia.

2. The Australian Tree of Life – Applied Data Laboratories

Applied Data Laboratories will generate meaningful and actionable information for decision-makers based on genomic resources, such as those made available by the Genome Engine.

Applied Data Laboratories will be developed in consultation with end-user communities such as those involved with the Plant Pathogen, Pest Management, and Functional Fungi Bioplatforms National Initiatives. Our objective is to allow more researchers, industry and government professionals, and policy makers to harness the power of genomics to inform on-the-ground actions that secure Australia’s primary industries, nature, and biodiversity.

This work will build on a concept from the Threatened Species Initiative (TSI), where Bioplatforms, in partnership with the University of Sydney and RONIN, has invested in the development of the TSI Biodiversity Portal. Scheduled for release in mid-2024, the Portal will empower threatened species managers to sequence and interpret population genetics data, generating reports tailored to inform species recovery actions.

Together, this digital capability will help to bridge the gap between generating and applying genomic data, significantly improving Australia’s capacity to leverage recent advances in next-generation sequencing. This will play a crucial role in preserving our unique biodiversity and safeguarding Australia’s primary industries, food systems and environments.

This digital research infrastructure initiative is enabled by the National Collaborative Research Infrastructure Strategy (NCRIS).

For further information and updates, please contact:

Dr Nigel Ward – A/Director – Platforms, Australian BioCommons

nigel@biocommons.org.au

Sarah Richmond – General Manager Science Program, Bioplatforms Australia

srichmond@bioplatforms.com

Patrick Capon 29/4/24

Empowering life science researchers: New software on Galaxy Australia breaks down barriers

Galaxy Australia has been updated with two powerful software packages installed and fully subsidised for Australian researchers to use.

Specialised bioinformatics tools, like the newly available Fgenesh++ and Cell Ranger, are frequently required for life sciences data analyses. Individual researchers are often unable, or lack the confidence, to accept licensing conditions and associated charges, or cannot upload a licence file. Now, Galaxy Australia has opened the door to these proprietary bioinformatics tools by negotiating licences that come at no additional cost to the user. Researchers using Galaxy Australia also receive fully subsidised access to a national high-performance computing network, enabling complex data analyses to be performed in the user-friendly web interface.

Fgenesh++ is a bioinformatics pipeline for automatic prediction of genes in eukaryotic genomes, with extensive guidance available. It produces fully automated genome annotations of a quality similar to manual annotation, and is extremely fast compared to some other automated genome annotation pipelines. In response to requests from the genomics community, BioCommons licensed Fgenesh++ from Softberry and provides fully subsidised access for Australian-based research groups and research consortia via our Fgenesh++ Service. Now, individuals who would like access can also apply to use it through Galaxy Australia.

The second software package, Cell Ranger, was added to Galaxy Australia following requests from the single cell omics community. Cell Ranger is a set of analysis pipelines that count how many times a feature, such as a gene transcript, is detected in each cell. It processes 10x Genomics Chromium single cell data to align reads, generate feature-barcode matrices, perform clustering and other secondary analyses, and much more (see the list of example workflows and supported libraries). Cell Ranger contains in-built reference genome data, and can be integrated with the interactive CellXgene environment in Galaxy Australia for data visualisation.

If you are an Australian researcher interested in using either of these powerful software packages via Galaxy Australia, apply for access today.
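For readers unfamiliar with the term, a feature-barcode matrix is essentially a table of counts with one row per feature (for example, a gene) and one column per cell barcode. The toy Python sketch below illustrates that idea only. The function name, gene names and cell labels are hypothetical, and it does not reflect how Cell Ranger itself aligns reads, corrects barcodes or handles molecular identifiers.

```python
from collections import Counter


def build_feature_barcode_matrix(observations):
    """Count (feature, cell barcode) pairs into a nested dictionary.

    `observations` is an iterable of (feature, barcode) tuples, one per
    detected molecule. This is an illustrative simplification.
    """
    counts = Counter(observations)
    matrix = {}
    for (feature, barcode), n in counts.items():
        # Rows are features; columns are cell barcodes.
        matrix.setdefault(feature, {})[barcode] = n
    return matrix


# Hypothetical observations: GeneA seen twice in cell1 and once in cell2.
observations = [
    ("GeneA", "cell1"),
    ("GeneA", "cell1"),
    ("GeneB", "cell1"),
    ("GeneA", "cell2"),
]
matrix = build_feature_barcode_matrix(observations)
```

Downstream steps such as clustering operate on a table of this shape, which is why it sits at the centre of single cell analyses.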

Christina Hall 23/4/24

Have your say on new activities to improve data submission to global repositories

We’re collaborating with the European Bioinformatics Institute (EMBL-EBI) to improve submission to data repositories. With a range of activities in the works, we want to hear what you’d like us to support and how you might participate.

Submission of 'omics data and associated contextual metadata to global data repositories is considered best practice for the long-term preservation, findability and reusability of these data.

Our community consultations have teased out some of the challenges faced by Australian-based researchers in the submission process. A summary of the challenges in data and metadata submission, with a set of recommendations for addressing them, has been published as Omics Data Publishing to International Repositories from Australia.

Australian BioCommons is collaborating with the European Bioinformatics Institute (EMBL-EBI) to address these challenges, and we are currently investigating the value of bringing EMBL-EBI / European Nucleotide Archive (ENA) team members to Australia for a variety of events (in early 2025) including:

Training workshops for Data Submission:
- How to submit data to ENA, including:
  - Genome assemblies and annotations
  - Metagenome-assembled genomes (MAGs)
  - OTU/eDNA-derived data
Sessions to provide feedback on existing documentation:
- An opportunity to provide feedback directly to the ENA team on existing documentation for data submission, retrieval, and analysis. Your insights and suggestions will help improve the user experience and address common pain points.
Hands-on sessions to trial community-driven documentation design:
- An opportunity to work alongside the ENA team and fellow researchers in a collaborative effort to trial community-driven documentation design for the data submission process. Your direct involvement will contribute to improving the usability and accessibility of EMBL-EBI resources.

We’d love to know what you think of these ideas and whether you’d like to participate. Let us know by completing the Expression of Interest form by 14 June 2024.

Melissa Burke 22/4/24

Are you ready for AI?

There’s never been a better time to up-skill on machine learning and AI. We’ve summarised our upcoming training events and other national opportunities to get you started.

There’s never been a better time to get hands-on with machine learning and AI, with two upcoming BioCommons events on this topic.

On 8 May, Dr Michael Kuiper from the CSIRO presents a webinar on how AI tools can be used to accelerate research. Michael will explore how AI is reshaping scientific exploration and innovation, and how it can accelerate research processes, from data analysis and code writing to hypothesis development.

Then on 11 June, Dr Ben Goudey from The Florey leads a hands-on workshop on Machine Learning in the Life Sciences. Using R-based machine learning workflows, this workshop demonstrates what machine learning is, its advantages and disadvantages and the types of scenarios where it may be the right tool for the job.

Nationally, there are many more opportunities to learn about and shape the future of machine learning and AI for digital research applications:

CSIRO’s National Artificial Intelligence Centre provides additional resources, workshops and initiatives to guide, nurture and inform the use of AI.

The ML4AU Community of Practice, co-facilitated by ARDC and Monash Data Science and AI Platform, enables collaborative initiatives and provides training on emerging needs for ML capabilities and expertise in research.

The AI and Comms Community of Practice provides a forum to discuss and trial AI tools for science communication in research.

Patrick Capon 27/3/24

Community forum helps to shape research infrastructure for computational proteomics

The “Connections in Computational Proteomics” forum recently brought together local and international experts to network and discuss the latest trends, challenges and advances in computational proteomics.

The “Connections in Computational Proteomics” forum recently brought together local and international experts to network, discuss and learn. Capitalising on the convergence in Melbourne for the annual Lorne Proteomics conference, 27 researchers and service providers discussed the latest trends, challenges and advances in computational proteomics.

The forum was open to all and also hosted the first in-person gathering of the Proteomics Bioinformatics community. BioCommons engages with this coordinated interest group to collaboratively address community-identified challenges, including resolving gaps in the digital infrastructure available for proteomics research. The open invitation was taken up by international visitors, students and plenty of new voices who joined the ongoing conversations.

Participants explored how they could tackle challenges inherent in computational proteomics, and shared how they are addressing current problems. Dr Nikeisha Caruana, Research Fellow in Bioinformatics at the Bio21 Institute, said that:

The forum provided an environment where those in computational proteomics and the surrounding fields could come together and brainstorm solutions to current technological obstacles. The field is relatively new professionally and many of us are scattered around Australia, and it created a fantastic environment for networking and building potential collaborations.

Paula Burton, CEO and Co-Founder of Mass Dynamics, enjoyed “talking all things computational proteomics” at the forum, especially the “great talks by [international experts] Stefan Tenzer and Mathias Wilhelm, and the vibrant discussions had on the challenges we’re facing as a field.”

The group collaboratively identified several key challenges related to proteomics experiments, particularly in terms of ensuring experiments are well-designed and reproducible. Output data must be clear, organised, and include metadata that provides sufficient context for others to reuse or repurpose the dataset.

Participants were offered a preview of the Galaxy Australia Proteomics Lab, a customised view of Galaxy Australia that provides rapid access to a range of sophisticated proteomics resources while retaining the full power of Galaxy Australia. The preview was “my favourite aspect of the forum,” according to Dr Rohan Lowe, Facility Manager of the La Trobe Proteomics and Metabolomics Research Platform. Rohan particularly enjoyed “the chance to suggest improvements before it is fully launched for all Australian researchers to use.” Stay tuned for more on Proteomics Lab in the coming months!

The forum closed with a discussion on the next steps for the Proteomics Bioinformatics community. The group have prepared a forum report, and are excited to get to work addressing the challenges they identified!

If your work is in computational proteomics or a related field, you are invited to join the conversation and start collaborating! Head to the Proteomics Bioinformatics webpage to learn more and get involved.

Support from the Australasian Proteomics Society (APS) for this community event is warmly acknowledged. BioCommons thanks the APS for inviting international guests, and for sharing the event details on the Lorne Proteomics conference registration page to assist in getting the word out.

Patrick Capon 27/3/24

Turning raw data into publication-quality graphics in a flash

Visualising complex statistical data is rarely a simple task. Learn how Lauren Carpenter leveraged Galaxy Australia to rapidly produce publication-ready heatplots of her survey data.

Visualising complex statistical data is rarely a simple task. Researchers frequently need to learn complex visualisation software packages to produce high-quality graphics, but there is an alternative. Recently, Lauren Carpenter, PhD candidate at the University of Queensland, leveraged Galaxy Australia’s web interface to rapidly produce publication-ready heatplots of her survey data. Galaxy’s promise of ease of use held true for Lauren.

I was able to quickly and simply visualise my data, and easily controlled the stylistic elements that influenced the readability of the heatmaps. Galaxy saved me a lot of time, it was user-friendly, and allowed me to produce visually consistent heatmaps for different data sets in the same context.

Lauren’s research on the employability of first-year science undergraduates is heavily text based, which can be challenging to translate into a digestible figure. After analysing her data in NVivo, she used coding tools such as RStudio to visualise her data. Despite already having experience in R, Lauren found herself losing large chunks of her valuable time watching online tutorials and reading guides, and struggled to prevent data point clustering. Lauren needed a customisable tool to ensure that her smaller data points remained visible in the final output.

Rather than continuing down this time-consuming path in R, Lauren asked her PhD colleagues at UQ’s School of Chemistry and Molecular Biosciences for advice. After their recommendation of Galaxy Australia as ‘a very user-friendly program’ which they frequently use to analyse their transcriptomics and genomics data, Lauren decided to use Galaxy herself for the first time with great success.

I found the variables that worked for my data through a short trial-and-error process. I would definitely recommend Galaxy Australia to other researchers, as I found it intuitive, and I was able to quickly achieve the heatmap visualisation that I had originally envisioned.

Galaxy Australia includes popular data visualisation tools such as Krona and Circos plots, along with a framework for further visualisation options. All these tools, plus powerful computing resources, are fully subsidised for Australian researchers to use without requiring prior programming experience.

Open up Galaxy Australia and get started visualising your data today!

Patrick Capon 27/3/24

From corals to the classroom: an interview with Dr Ashley Dungan

We chat with Ashley to find out more about her research and why she chose to work with the National Bioinformatics Training Cooperative to uplift the skills of fellow researchers.

It can be a struggle to keep up with the latest bioinformatics tools. Researchers have diverse needs and limited time to shop around for new techniques to analyse their data, let alone troubleshoot a new platform. What better way is there to learn a new approach than from researchers actively using useful tools in their own work? BioCommons collaborates with experts to deliver training workshops that help researchers glean insights and practical tips directly from their peers.

Dr Ashley Dungan, Research Fellow at The University of Melbourne, worked with BioCommons to train researchers from 22 different Australian institutes and organisations to use the bioinformatics platform Quantitative Insights Into Microbial Ecology 2 (QIIME 2). Ashley originally developed the workshop for a local audience, alongside Melbourne Bioinformatics staff Dr Gayle Philip and Dr Vicky Perreau. Given the success of this all-women team, BioCommons was keen to help them bring the training to a national audience. You can read about the impact of that training in our related story.

We interviewed Ashley to find out more about her research and why she chose to work with the National Bioinformatics Training Cooperative to uplift the skills of fellow researchers.

Ashley, can you tell us a little bit about your research?

I’m a Research Fellow in conservation microbial ecology. Broadly, I’m interested in the functions of bacteria in a range of systems with the end goal of manipulating those communities to achieve a better outcome for the host/system. To put this in the context of conservation, I’m interested in protecting our biodiversity and preventing the loss of species and ecosystems by providing animals with beneficial bacteria, or probiotics. So far, most of my research has been in coral-associated bacteria.

What motivates you to provide training to other researchers?

I didn’t start off my scientific career as a microbiologist. When I joined my PhD program, the focus was on coral probiotics and I was daunted by the prospect of having to do any bioinformatics. I found that most training resources were written by experts in a language that was really only available to other experts. Where was the dummies guide? Other resources were hidden behind a paywall or required attendance somewhere in Europe or North America. The training I was able to attend wasn’t immediately useful to me, or I’d have to fully rewrite the code (which I wasn’t skilled enough to do).

I was lucky that a fellow PhD student at the time (Dr Leon Hartman, who now works at the Walter and Eliza Hall Institute of Medical Research) walked me through everything and gave me lots of code. But the reality is that most students/scientists won’t have this type of resource. So I wanted to come up with a training solution that:

- Was written for biologists by biologists, avoiding computer science jargon wherever possible
- Was free
- Could be attended in-person, virtual, or do it yourself
- Where attendees could immediately run the code for their own use.

What’s the best part of training other researchers?

Data analysis isn’t easy but when you can see a community of like-minded people together, that’s powerful. There is nothing more satisfying than giving people the confidence to incorporate new techniques and ask new questions in their research. I’m proud of the team that put this together – it really wouldn’t have happened without Gayle and Vicky. And how cool is it that we are women doing microbiology and bioinformatics! The odds of that happening, even in 2024, in a bioinformatics workshop are still exceedingly low but we are committed to breaking down barriers for women in science.

How did you scale your training to a national audience?

Gayle and Vicky helped me create the QIIME 2 workshop that we ran locally for University of Melbourne participants. But we wanted to bring this training to all Australian researchers, and ensure that anyone could attend virtually. I worked closely with Dr Melissa Burke (Training and Communications Officer at BioCommons) to adapt the workshop to ensure it was fully accessible online. Melissa then coordinated a half-day workshop where we had 45 participants from 22 different institutes/organisations around the country join us, including 60% who identified as female or non-binary. Working with Melissa and the National Bioinformatics Training Cooperative was a fantastic experience and I highly recommend that others interested in providing bioinformatics training get involved.

What’s next for you?

I’ve now trained Laura Geissler (my PhD student) to run the QIIME 2 workshops and she will take over the sessions hosted by Melbourne Bioinformatics. Looking ahead, all my fellowship applications now include creating a new workshop alongside doing primary research. First up, I’d like to create workshops focused on whole genome assemblies and metagenomics.

You can learn more about the BioCommons national training program on our website, or read how Ashley’s QIIME 2 training led to an Australian first in respiratory disease research.

Patrick Capon 27/3/24

Microbial insights into respiratory disease enabled by national training

Learn how training on the next-generation microbiome bioinformatics platform, QIIME 2, helped progress Australian research on infectious respiratory diseases.

Considerable microbial diversity was seen on this MacConkey agar plate collected from a study patient.
Image credit: Olusola S Olagoke

Many research programs are driven by the goal of improving the lives of people impacted by chronic diseases. Genetic insights are key and researchers need to apply an arsenal of new and sophisticated tools to progress our understanding of human health. Participants in the recent workshop on the next-generation microbiome bioinformatics platform, QIIME 2, included a microbiologist working on infectious respiratory diseases, and the resulting analysis has just been published.

Researchers from 22 different Australian institutes and organisations gathered online for the Introduction to Metabarcoding using QIIME 2 workshop hosted by BioCommons. A/Prof Erin Price, from the University of the Sunshine Coast, joined the workshop wanting to learn how to use QIIME 2 to investigate pleural infections. The recent publication, Performance of next-generation molecular methods in the diagnosis of pleural infections and their aetiology, acknowledged the workshop, and Erin is positive about the uplift the training provided.

The workshop provided a fantastic, well-paced, clear, thorough, hands-on introduction to QIIME 2. I was impressed with how much ground they covered, and I left feeling very confident that I could use QIIME 2 for my own work.

Erin’s ultimate research goal is to improve outcomes for people impacted by respiratory diseases such as chronic obstructive pulmonary disease, bronchiectasis, cystic fibrosis, lung cancer, and pleural infections. Her team applies omics methods to better understand microbial prevalence, origin, transmission, evolution, ecology, diversity, and antimicrobial resistance. To date, only a handful of studies have examined the pleural infection microbiome, and none have been in Australian cohorts. But as Erin explains, her team has changed that:

We used QIIME 2 bacterial (16S rRNA) metataxonomics to compare microbiome profiles with shotgun metagenomics, which allowed us to compare our Australian pleural infection cohort with international studies that predominantly used 16S rRNA metataxonomics.

QIIME 2 enables researchers to start an analysis with raw DNA sequence data and finish with visualised data and statistical outputs. Expert user, Dr Ashley Dungan, generously offered to lead the online workshop to share her skills nationally with peers. As a Research Fellow at The University of Melbourne, Ashley knew others could benefit from using this free, open source, community-developed and extensible tool:

QIIME 2 helps to quantify the changes in microbial communities when an experimental system is exposed to a stressor like disease. Hundreds of samples are sequenced at a time and each is given a unique barcode. This barcode allows the sequencing reads to be binned by sample and go through data quality control and taxonomic assignment in QIIME 2. QIIME 2 can also then provide immediate visualisations and output files that can be directly used for further statistical analyses.
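To make the barcode idea concrete, the short Python sketch below shows one simplified way reads could be binned by sample. It is an illustration only, not how QIIME 2 is implemented. The barcode sequences, sample names, fixed barcode length and function name are all hypothetical assumptions.

```python
from collections import defaultdict

# Hypothetical barcode-to-sample map; real studies record this in a metadata sheet.
BARCODES = {"ACGT": "sample_A", "TTAG": "sample_B"}
BARCODE_LENGTH = 4  # assumption: a fixed-length barcode at the start of each read


def bin_reads_by_sample(reads, barcodes=BARCODES, barcode_length=BARCODE_LENGTH):
    """Group reads by sample using the barcode at the start of each read."""
    bins = defaultdict(list)
    for read in reads:
        prefix = read[:barcode_length]
        if prefix in barcodes:
            # Known barcode: store the read with the barcode trimmed off.
            bins[barcodes[prefix]].append(read[barcode_length:])
        else:
            # Unknown barcode: keep the full read aside for inspection.
            bins["unassigned"].append(read)
    return dict(bins)


reads = ["ACGTGGGCCC", "TTAGAAATTT", "ACGTCCCGGG", "NNNNACGTAC"]
binned = bin_reads_by_sample(reads)
```

Once reads are grouped this way, each sample's reads can go through quality control and taxonomic assignment separately.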

Read our related story where trainer Ashley explains what motivated her to run this training for a national audience, and what the experience was like.

Are you fluent in the use of a tool that you know others would benefit from using? Reach out if you’re interested in delivering training, or sign up to our Cooperative if you’d like to participate in other ways.

Melissa Burke 25/3/24

New handbook to improve FAIRness and sustainability of bioinformatics training materials

Featuring contributions from the Australian research infrastructure community, the ELIXIR FAIR Training Handbook will help trainers make their training materials easier to find and reuse.

Trainers can now make their training material easier to find and reuse, thanks to a new FAIR training handbook released by the ELIXIR FAIR training focus group. Making training materials FAIR is globally recognised as essential for ensuring bioinformatics and data science training is effective and inclusive.

The comprehensive handbook gives trainers and training coordinators simple, step-by-step guidance and practical examples on improving the Findability, Accessibility, Interoperability and Reusability (FAIRness) of training materials. It serves as a reference for learning about general best practices or quickly looking up specific how-tos.

The handbook is the culmination of a large collaborative effort across international data science and bioinformatics training communities. Dr Melissa Burke, BioCommons’ Training and Communications Officer, is a member of the focus group and co-lead of several sections of the handbook. She is also a co-author of the publication on which this guide is based. BioCommons is proud to have enabled additional contributions from the Australian community including Kathryn Unsworth (Manager, Skilled Workforce Development, ARDC), Dr Anastasios Papaioannou (Data Science Manager, Intersect), and Steven Morgan (Academic Specialist - Bioinformatics, Melbourne Bioinformatics) through their participation in the 2022 Biohackathon Europe.

Read the handbook to find out how to make your training materials FAIR and enable their reuse.

Find out more about how BioCommons makes training materials FAIR.

Patrick Capon 29/2/24

Human Genomes Platform Project delivers collaborative vision for a national human omics research data ecosystem

Discover the toolbox of services designed to enhance Australian capabilities for secure and responsible sharing of human omics research data.

The Human Genomes Platform Project (HGPP) wrapped up in November 2023, having investigated and prototyped a toolbox of services designed to enhance Australian capabilities for secure and responsible sharing of human omics research data.

Extensive investigations since January 2021 into global best practice technologies for human omics in Australia focused on:

- A customised user interface for discovering virtual cohorts, using a GA4GH Beacon (version 2) network
- An online management system that removes the burden associated with data access committee approvals
- Finely controlled identity and access management enabled by CILogon and COmanage
- A comprehensive report and proposal for the development of a national repository for human omics data aligned with international efforts, such as the Federated European Phenome Genome Archive (FEGA).

To learn more about the toolbox contents and the project more generally, watch the final HGPP showcase.

Despite a desire to share data for research use, there are many siloed collections of human omics data in Australia that are often difficult for outside users to access. With this challenge in mind, the HGPP assembled a network of experts across biomedical research and digital infrastructure domains. The group explored and tested a selection of foundational infrastructure to pave the way for human omics data in Australia to be findable, searchable, shareable, and linkable to analytical capabilities, all while ensuring the privacy of individuals is protected and data processing is performed ethically, securely and safely.

Looking to the future and building on the HGPP, the Australian BioCommons Human Genome Informatics initiative has ambitious plans to continue exploring and establishing national infrastructure to propel human omics research in Australia. New HGI pursuits include building the Australian Cardiovascular disease Data Commons and the recently announced GUARDIANS project.

Key Outputs from HGPP

- Flyer describing the HGPP
- Webinar - establishing Gen3 to enable better human genome data sharing in Australia
- Webinar - Protection of genomic data and the Australian Privacy Act: is genomic data ‘personal information’?
- Webinar - Improving discovery and access management of genomic data
- Subproject reports including:
  - Virtual Cohort Assembly discovery phase and pilot phase reports
  - Data Access Committee Management Systems discovery phase and candidate solutions evaluation reports
  - Federated Identity and Access Management discovery phase and candidate solutions evaluation reports
  - Data and Metadata Archiving feasibility report

Documentation for infrastructure managers including:

The HGPP formed part of the Human Genome Informatics initiative and was funded by NCRIS via the Australian Research Data Commons (https://doi.org/10.47486/PL032) and Bioplatforms Australia. Contributions were also made by partner organisations: Australian Access Federation, Garvan Institute of Medical Research, National Computational Infrastructure, QIMR Berghofer Medical Research Institute, The University of Melbourne Centre for Cancer Research, Children’s Cancer Institute, and ZERO Childhood Cancer.