News


Patrick Capon

Advancing the Nextflow conversation: Connect with Seqera’s Lead Developer Advocate in Melbourne

Dr Geraldine Van der Auwera is visiting Melbourne in September to support Australia’s activities around Nextflow and Seqera Platform and connect with users.


Dr Geraldine Van der Auwera, Lead Developer Advocate at Seqera, is visiting Melbourne in September to strengthen ties and support the growth of bioinformatics activities in Australia. She will meet with key stakeholders and deliver a public webinar to share the latest technical innovations and opportunities to engage with Nextflow and Seqera Platform (formerly Nextflow Tower).

The ongoing relationship between Australian BioCommons and Seqera is uplifting Australian researchers to access and deploy Seqera’s products, including Nextflow and Seqera Platform. Geraldine is visiting Melbourne to discuss future Nextflow-related activities with BioCommons and the Australian Nextflow Ambassadors, Dr Georgie Samaha and Dr Ziad Al Bkhetan. They want to know if an informal Australian Nextflow network would benefit life scientists and bioinformaticians. Share your thoughts by filling out a brief survey: Assessing interest in an Australian Nextflow network.

You can hear more from Geraldine when she delivers a BioCommons webinar, Building the future of bioinformatics with Nextflow: Technical innovation, community engagement, and career development opportunities, on 19 Sep 2024. You can also meet with Geraldine in spare moments around her GA4GH Plenary attendance. Please email comms@biocommons.org.au if you would like to be connected.

P.S. If you’re looking to get hands-on with Nextflow, apply to join the Hello Nextflow! workshop by 10 September. This workshop is being offered by BioCommons, Seqera and the Sydney Informatics Hub.

Patrick Capon

New resources power long-running workflows at Pawsey Supercomputing Research Centre

In response to community requests, new resources supporting cutting-edge bioinformatics workflows are available on Pawsey’s Setonix supercomputer.


Pawsey’s Setonix supercomputer (supplied by Karina Nunez).

Specialised nodes are now available at the Pawsey Supercomputing Research Centre that are designed to power long-running scientific workflows. Responding to researcher demand, new Workflow Nodes have been custom built on Setonix to optimise and support workflows managed by tools like Nextflow and Snakemake that surpass the regular 96 hour wall-time constraint.
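To make the wall-time constraint concrete, here is a minimal Python sketch. It assumes a workflow whose steps run one after another under a single orchestrating job. The step names and durations are hypothetical and invented purely for illustration; only the 96 hour limit comes from the text above. It is not a description of how Pawsey's scheduler works.

```python
from datetime import timedelta

# Regular wall-time limit quoted in the article (96 hours).
WALL_TIME_LIMIT = timedelta(hours=96)


def exceeds_wall_time(step_durations, limit=WALL_TIME_LIMIT):
    """Return True if the steps, run back to back, would outlast the limit."""
    total = sum(step_durations.values(), timedelta())
    return total > limit


# Hypothetical step durations, invented for illustration only.
example_steps = {
    "quality_control": timedelta(hours=6),
    "assembly": timedelta(hours=80),
    "post_assembly_checks": timedelta(hours=14),
}

if __name__ == "__main__":
    total = sum(example_steps.values(), timedelta())
    print(f"Estimated total: {total}, limit: {WALL_TIME_LIMIT}")
    print("Exceeds limit:", exceeds_wall_time(example_steps))
```

In this made-up case the steps add up to 100 hours, so an orchestrating job bound by the regular limit would be stopped before the workflow finished.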

Researchers voiced their challenges in running long workflows, including numerous reports from the BioCommons computational workflows community that they were running out of wall-time - the clock time it takes for a computation to run from start to finish. One of these researchers was Lauren Huet, Bioinformatics Research Officer at the Minderoo OceanOmics Centre at UWA:

Our Ocean Genomes project is addressing a key gap where over 95% of marine vertebrates lack sequenced genomes. Building such a comprehensive reference genome library requires intensive compute power, and the workflows can be quite long. This project would not be possible without the capacity to scale up to process tens or hundreds of genomes in parallel.

Dr Sarah Beecroft, Life Sciences Supercomputing Specialist at Pawsey, led the team effort to build dedicated Workflow Nodes on Pawsey’s Setonix - the most powerful research computer in the Southern Hemisphere. 

Setonix’s Workflow Nodes provide a stable and robust environment for workflow orchestration. Users can launch their master jobs interactively and keep their sessions alive for extended time periods, enhancing both productivity and performance. I’m really excited to see the new research that is enabled!

Lauren and the OceanOmics team are already benefiting greatly from the Workflow Nodes:

It’s been a game-changer for our research! The nodes enable us to run Nextflow pipelines directly in the terminal, offering unparalleled flexibility for developing and testing our workflows. The capability to execute long-running pipelines without interruptions has significantly increased our throughput, allowing us to produce results faster and more efficiently.

As a member of the BioCommons BioCLI project, Sarah is passionate about making command-line infrastructure accessible and well documented. Together with other supercomputing experts, the team has produced a new comprehensive technical user guide for users looking to run their workflows on the Setonix Workflow Nodes.

Learn how to run workflows on the Workflow Nodes in Pawsey’s user support documentation, or join the next meeting of the BioCommons computational workflows interest group to influence future research infrastructure developments. 

Patrick Capon

WorkflowHub and nf-core collaboration enhances workflow accessibility

WorkflowHub now automatically registers workflows developed by the nf-core community

This media release was prepared by ELIXIR-UK with assistance from BioCommons and is re-published here without modification.

WorkflowHub, a prominent registry for computational workflows, now automatically registers workflows developed by nf-core, a community-driven initiative focused on developing best-practice workflows. 

The marriage of these two resources will not only allow new workflows to be included automatically, but also updated versions of existing ones.

Additionally, during the import process, workflows are automatically annotated based on the metadata provided by the nf-core community in their native development repositories. This ensures that the workflows are visible, well-documented and easily understandable, following the principles of FAIR (Findable, Accessible, Interoperable, Reusable) data practices.

Contributions from the nf-core community are now streamlined and automated, freeing up valuable time and resources for both WorkflowHub and nf-core teams to focus on creating high-quality workflows without the added burden of managing findability and accessibility.

Developments in WorkflowHub are steered by the joint Product Owners Prof Carole Goble, ELIXIR-UK, Dr Frederick Coppens, ELIXIR-Belgium, and Dr Johan Gustafsson, Australian BioCommons. Recognising the increasing importance of Nextflow as a standard for defining bioinformatics workflows, the team engaged with nf-core to automatically ingest nf-core workflows to WorkflowHub.

About WorkflowHub

WorkflowHub is a leading registry for computational workflows, providing a platform for discovering, sharing, and using workflows across various scientific domains. It aims to enhance the accessibility and usability of computational workflows, promoting collaboration and innovation in the research community.

About nf-core

Nf-core is a community-driven initiative focused on developing best practice workflows using the Nextflow workflow management system. It provides a platform for collaborative development and sharing of high-quality workflows, fostering reproducibility and efficiency in computational research.

About ELIXIR-UK

ELIXIR-UK is part of the European ELIXIR infrastructure, which supports life science research and its translation to medicine, the environment, and society. By integrating national bioinformatics resources, ELIXIR-UK aims to provide a sustainable infrastructure for biological information, ensuring that data is effectively managed, analysed and shared across the scientific community.

The UK Node, as well as the Belgium Node of ELIXIR, endorses WorkflowHub as a Node Service.

For further details, reach out to us at contact@elixiruknode.org

About BioCommons

Australian BioCommons is a digital infrastructure capability enhancing Australian life sciences research. BioCommons links Australian researchers with the tools, platforms, and expertise they need to undertake world-class research into the molecular basis of life.

BioCommons facilitates links between the Australian research landscape and international platforms and communities, and contributes directly to WorkflowHub’s development and governance. This contribution forms part of an ongoing collaboration strategy between BioCommons and ELIXIR to share technical experience and maintain a more global perspective. 

For further details, reach out to us at comms@biocommons.org.au

Patrick Capon

Streamlining metagenomics and single cell transcriptomics data processing with new workflows

Discover three new resources on the fully subsidised Galaxy Australia service for anyone to transform raw metagenomics and single cell transcriptomics data into an analysis-ready state.

Three new computational workflows are transforming raw metagenomics and single cell transcriptomics data into an analysis-ready state. Perfect for fresh experimental data or analysis of publicly available datasets, the workflows come with best practice tools installed and well documented how-to guides. While originally developed for use by a core facility generating huge quantities of new data, the new resources are available on the fully subsidised Galaxy Australia service so that anyone can put them to work. 

The workflows were developed through a collaboration between Griffith University’s Central Facility for Genomics (GU-CFG) and Australian BioCommons. With an expected influx of spatial omics and single cell data from new instrument installations, these common workflows were the ideal solution to do the heavy lifting of data preprocessing before delivery to end users.

Dr Sarah Williams, Senior Bioinformatician at QCIF, said that the team built their workflows within the Galaxy Australia platform as:

“The web accessible interface is intuitive to use. Histories can be quickly shared along with custom reports for GU-CFG to provide to their end users, and researchers can then choose to continue with further analyses without leaving the Galaxy interface.” 

Valentine Murigneux, Bioinformatician at QCIF added:

“Researchers can just bring their data into Galaxy and use the workflows straight away, as they come with the best practice tools pre-selected and installed ready to go on Galaxy, saving time often spent researching the best tools for the job and installing the databases required.”

One example of this benefit is the inclusion of CellRanger within the single cell RNA (scRNA) sequencing workflow. CellRanger is a proprietary tool for 10X single cell data, and access is now fully subsidised for Australian-based researchers through an application process within Galaxy Australia.

All three workflows have extensive user guides available, so if you are looking to efficiently process raw metagenomics or single cell transcriptomics data, be sure to check out the new workflows via WorkflowHub:

You’ll find all of the background information you need about the workflows in their how-to guides:

Looking ahead, BioCommons are establishing two new activities - the ‘Methods Commons’ and ‘BioCLI’ to continue where BYOD has left off. Stay tuned for more in this space!

Read a full summary of the BYOD Expansion project


The Australian BioCommons BYOD Expansion Project was funded through NCRIS investments from Bioplatforms Australia and the Australian Research Data Commons (https://doi.org/10.47486/PL105) that were matched with co-investments from AARNet, Melbourne Bioinformatics, NCI, Pawsey, QCIF via the Queensland Government RICF fund, The University of Sydney, AGRF, Griffith University and Monash University.

Patrick Capon

Wrapping up the ‘Bring Your Own Data’ project and a look to the future

Read how BYOD enabled highly accessible, available, and scalable data analysis and sharing capabilities for the benefit of Australian life science researchers. 

Important Outputs

Four new national services, major expansions to Galaxy Australia, 15 training workshops and webinars, many specialised workflows, and even more stories of impact, all thanks to collaborative efforts of 12 organisations. It’s fair to say that the Australian BioCommons ‘Bring Your Own Data’ (BYOD) project met its aim to enable highly accessible, available, and scalable data analysis and sharing capabilities for the benefit of Australian life science researchers.

Although the BYOD Expansion project is winding down at the end of 2023, its legacy will continue through the delivery and constant improvement of our national services.

The project began in June 2019 thanks to investment from Bioplatforms Australia and the Australian Research Data Commons (ARDC), and brought together a large group of collaborators and co-investors including AAF, AARNet, Melbourne Bioinformatics, NCI, Pawsey, QCIF via the Queensland Government RICF fund, The University of Sydney, AGRF, Griffith University and Monash University. There were three focus areas: web-based bioinformatics workbenches for life sciences researchers, a complementary command line interface (CLI)-focused platform, and creation of data infrastructure connecting ‘omics instruments and reference datasets to the analysis infrastructure. Work in these areas has had a wide-ranging and extremely positive impact on the life sciences research landscape, as showcased in the words of infrastructure end users.

TESTIMONIALS

Tool Finder will be a really useful resource for researchers, particularly those who are just getting started and want to understand what software is available for their analysis and what computing platform would be most suitable. It’s awesome to have all of that information on hand in the one place! 

Dr Parice Brandies, The University of Sydney

Galaxy Australia is intuitive to use, it’s easy because students don’t have to install software, it has lots of really good documentation and visualisation, and all of this helps the students to understand what they are doing and more importantly why they are doing it.

Dr Kylie Munyard, Curtin Medical School

The Fgenesh++ service has helped us easily and efficiently annotate multiple diverse genomes to a high standard.

Dr Kate Farquharson, The University of Sydney

For my PhD project I assembled close to 4000 RNA-Seq datasets from samples from all over the world - a task that would have been impossible without Galaxy Australia.

Dr Rhys Parry, University of Queensland

So much software gets left without regular updates and from year to year you realise that it isn’t maintained or updated. So we look for things that are stable - this is the reason we call on the Australian Apollo Service.

Assoc Prof Charles Robin, University of Melbourne

We are looking at how a particular genus of plant viruses evolved to only infect plants. We make virus-like particles in order to determine the structure of viruses and also for drug discovery and biomedical use. AlphaFold was used to check for evidence of a core structural domain of a putative coat protein and the fact that it was there gave us the confidence to go on and make virus-like particles.

Dr Frank Sainsbury, Griffith Institute for Drug Discovery

TIaaS helps keep workshops on track. Trainers have live insight into how participants’ jobs are running and can identify sticking points almost before they happen. The special training queue means that everyone has a consistent experience. Even large jobs submitted simultaneously from all around Australia run fast.

Dr Melissa Burke, Australian BioCommons

The Bioimage was a great place to enter the world of bioinformatics and really helped me to upskill on the command-line. I was able to jump right in and make use of Nextflow pipelines, Singularity containers and interactive RStudio sessions.

Alexandra Boyling, ANZAC Research Institute

Looking ahead, BioCommons are establishing two new activities - the ‘Workflow Commons’ and ‘BioCLI’ to continue where BYOD has left off. Stay tuned for more in this space!

Read a full summary of the BYOD Expansion project


The Australian BioCommons BYOD Expansion Project is funded through NCRIS investments from Bioplatforms Australia and the Australian Research Data Commons (http://doi.org/10.47486/PL105) that are matched with co-investments from AARNet, Melbourne Bioinformatics, NCI, Pawsey, QCIF via the Queensland Government RICF fund, The University of Sydney, AGRF, Griffith University and Monash University.

Patrick Capon

A new best-practice workflow for easy and efficient genome assembly

An off-the-shelf bioinformatics workflow for genome assembly from HiFi read data is now available and has been specifically tailored for Australian researchers through a collaboration between BioCommons and the Australian Genome Research Facility.

An off-the-shelf bioinformatics workflow for genome assembly from HiFi read data is now available and has been specifically tailored for Australian researchers. The new custom-built genome assembly workflow:

Assembling genomes from HiFi reads is a common roadblock for researchers. Now, researchers can access a customised solution following a successful collaboration between two Bioplatforms Australia facilities, the Australian Genome Research Facility (AGRF) and the Australian BioCommons. Dr Kenneth Chan, Bioinformatics Manager at the AGRF, said that:

This custom-built genome assembly workflow provides a standardised approach that follows best practice in terms of workflow design, documentation and user support. Now when AGRF generates HiFi long read sequencing data for researchers we can direct them to this workflow solution with confidence that it will suit their needs.

The workflow is written in Nextflow and employs assembly software specific for HiFi sequencing reads. It features pre-assembly quality control for the raw sequence data, a primary assembly stage using the Improved Phased Assembler from PacBio, and a post-assembly quality control stage.

Outline of the tools and processes within the HiFi genome assembly workflow

Community scale research requires reproducible, best-practice bioinformatics workflows that can be run on a multitude of computational systems. The new custom-built workflow has been optimised across several national research consortiums, and can run on the Gadi supercomputer at NCI Australia, the Setonix supercomputer at Pawsey, Amazon Web Services, and the in-house computational systems at the AGRF. Looking to the future, the workflow has been prepared for use on Nextflow Tower as the BioCommons and our infrastructure partners are in the process of setting up a national Nextflow Tower service.

Researchers can find the new workflow easily on WorkflowHub. If you are interested in contributing to future efforts in the workflows space, the Australian BioCommons coordinates a community for computational workflows in bioinformatics. Anyone is welcome to join the conversation and contribute!

Christina Hall

Enabling reproducible and portable workflows: Janis

As part of the Australian BioCommons ‘Bring Your Own Data’ Expansion Project, a specialised framework is under development that creates simple workflow definitions that enable researchers and clinicians to work with different workflow languages.

Being able to repeat analytical workflows consistently and accurately is critical in transferring and scaling methodologies, whether between research groups or for large-scale clinical use.

As part of the Australian BioCommons ‘Bring Your Own Data’ Expansion Project, a specialised framework is under development that creates simple workflow definitions that enable researchers and clinicians to work with different workflow languages.

‘Janis’ provides both a consistent language for describing workflows and a framework that can translate workflows between existing languages such as CWL and WDL. Originating from the Portable Pipelines Project between Melbourne Bioinformatics, the Peter MacCallum Cancer Centre, and the Walter and Eliza Hall Institute of Medical Research, BioCommons quickly saw the potential of supporting such a useful tool.

Janis will help make the work that biomedical researchers do more portable and less dependent on any particular technology.

A/Prof Bernie Pope
Australian BioCommons A/Director: Human Genome Informatics
Melbourne Bioinformatics Human Genomics Lead
Victorian Health and Medical Research Fellow

Janis is already being used by a number of groups, such as the Molecular Pathology Laboratory at the Peter MacCallum Cancer Centre, where it helps manage workflows for the analysis of clinical cancer data on a significant scale. Melbourne Bioinformatics software engineer, Grace Hall, is currently working to extend Janis's ability to convert workflows from the popular Galaxy platform. Grace points out that research groups currently invest a lot of time and effort in creating workflows; translating them with Janis will make them easier to share and maintain with minimal effort into the future. Richard Lupat, a bioinformatics software engineer from the Peter MacCallum Cancer Centre and one of the original Janis developers, says that translating to and from Nextflow, another widely used workflow language, will be the next goal.

More information about Janis can be found here.

This story was based on a news item first published in the Melbourne Bioinformatics newsletter.

The Australian BioCommons BYOD Expansion Project is funded through NCRIS investments from Bioplatforms Australia and the Australian Research Data Commons (http://doi.org/10.47486/PL105) that are matched with co-investments from AARNet, Melbourne Bioinformatics, NCI, Pawsey, QCIF via the Queensland Government RICF fund, The University of Sydney, AGRF, Griffith University and Monash University.

Christina Hall

Finding and reusing workflows made simple

In an effort to support the discovery and reuse of workflows, Australian BioCommons has established a presence on WorkflowHub. BioCommons and its partners have so far registered 31 workflows which have already accumulated more than 16,000 views and 250 downloads.

Bioinformatics workflows bring together software packages into complex multi-step processes that standardise analysis.

Reuse of workflows can benefit life science researcher communities by accelerating their research, reducing replication of effort and supporting the application of best practice bioinformatics. Discoverability and reuse of workflows can also increase the visibility and recognition for bioinformaticians who invest large amounts of time, effort and intellectual property in workflow development.

The discovery and re-use of workflows is supported by the WorkflowHub registry that facilitates the description, sharing and publishing of scientific computational workflows, with applications ranging from eukaryotic and bacterial genome assembly to shotgun metagenomics.

After consultations with local researcher communities and participation in international forums discussing the shared challenge of workflow discoverability, the Australian BioCommons responded by establishing a presence on WorkflowHub. Converging on this resource together with our infrastructure partners and by working with the ELIXIR development team, the BioCommons team has worked to make the platform more fit-for-purpose for Australian bioscientists. The BioCommons and its partners have so far registered 31 workflows, which have already accumulated more than 16,000 views and 250 downloads.

The BioCommons has also established WorkflowFinder, which draws metadata from WorkflowHub and combines this with useful local information. In time this service will allow researchers to click and deploy workflows to local platform services. Right now it offers an interactive table that allows you to search for workflows registered by Australian BioCommons partners on WorkflowHub and details useful information such as where workflows have been successfully run in Australia.
