News


Rahul Ratwatte

Key elements to unlocking deep learning for structural biology identified by the Australian research community

The roadmap outlines key deliverables that will expedite the availability and accessibility of structural biology approaches to researchers nationwide.

In an inspired demonstration of collaboration, the Australian Structural Biology Computing Community has come together to publish the Australian Structural Biology Deep-Learning Infrastructure Roadmap. Taking a holistic view that covers existing challenges, critical research bottlenecks and a potential national strategy, this new research infrastructure roadmap has been developed by the community, for the community.

Enabled by advances in deep learning methods for protein structure prediction and de novo protein design, computational structural biology has rapidly emerged as a powerful technology driving innovation in both fundamental and translational science. The technology underpins breakthroughs in drug design, diagnostics, personalised medicine, and synthetic biology, though a limitation has been that effective use requires concentrated interdisciplinary expertise and access to specialised hardware.

To understand these challenges, the Australian Structural Biology Computing (ASBC) Community was formed and has come together to lead a national, collaborative approach. This community-driven initiative, partnered with Australian BioCommons, brings together a diverse group of experts from leading institutions around the country. Authors of the roadmap represent the Structural Biology Facility, Mark Wainwright Analytical Centre at the University of New South Wales, the Pawsey Supercomputing Research Centre, the University of Queensland, the Walter and Eliza Hall Institute of Medical Research (WEHI), the National Computational Infrastructure, the Sydney Informatics Hub and the School of Medical Sciences at the University of Sydney, the School of Biomedical Sciences at the University of Melbourne, and the Monash Biomedicine Discovery Institute at Monash University.

The roadmap outlines key deliverables that will expedite the availability and accessibility of structural biology approaches to researchers nationwide: 

  • A dedicated community space to foster collaboration and share best-practice recommendations for software deployments, benchmarking, validations and insights developed within the community. 

  • Community training resources to on-board diverse stakeholders within the context of computational structural biology and strengthen the national impact of community expertise, for example the webinar series Leveraging deep learning to design custom protein-binding proteins.

  • National computational infrastructure built on increased hardware investment and a user platform to facilitate efficient, high-throughput utilisation of national computing resources and drive translational outcomes enabled by curated and validated computational structural biology technologies. 

  • Alignment, integration and engagement with global best-practice efforts for computational structural biology infrastructure and research. 

A robust, sovereign capability in computational structural biology and protein design will position Australian universities, research institutes, and industry at the forefront of global innovation.

Read the Australian Structural Biology Deep-Learning Infrastructure Roadmap 

Join the Australian Structural Biology Computing Community 

Watch the Community’s webinar series Leveraging deep learning to design custom protein-binding proteins

Melissa Burke

Multi-model 3D visualisation enhances Nextflow pipeline for protein structure prediction

Community-driven enhancements to the nf-core proteinfold Nextflow pipeline have simplified the parallel execution, visualisation and comparison of multiple models for protein structure prediction, including AlphaFold2, ColabFold and ESMFold.

Three 3D protein structures are overlaid on one another on a black background. The structures feature many helices.

Predicted protein structures for LmrP visualised using the proteinfold pipeline

Advances in AI are taking protein structure predictions to a whole new level, accelerating research and enabling deeper analysis of protein structure and function. The nf-core community is embracing these developments by building the Nextflow proteinfold pipeline, which integrates models such as AlphaFold2, ColabFold and ESMFold and simplifies their use on a variety of computing infrastructures.

BioCommons’ Dr Ziad Al Bkhetan, Product Manager - Bioinformatics Platforms and Australian Nextflow Ambassador, identified an opportunity to optimise the existing nf-core proteinfold pipeline for Australian researchers using the Australian Nextflow Seqera Service. Ziad initiated this effort by reaching out to the original developers from the Centre for Genomic Regulation (CRG) in Spain with an offer to reconfigure the pipeline and add new features. This sparked an international collaborative effort that connected researchers and experts from Australian BioCommons, the CRG, the Sydney Informatics Hub (SIH) at the University of Sydney and the Structural Biology Facility (SBF) at UNSW at several hackathons and summits to enhance the pipeline. The enhanced, community-driven pipeline is now available to all through nf-core’s curated set of open-source analysis pipelines.

The pipeline borrows a useful reporting and visualisation feature already implemented in Galaxy Australia. Front-end developer for BioCommons, Minh Vu, augmented the pipeline to implement this feature, which allows the parallel execution of multiple models and the generation of reports that visualise the resulting structures, simplifying comparison and benchmarking of the outputs. Several state-of-the-art tools such as AlphaFold2, ColabFold and ESMFold are included in the pipeline, with additional models including RoseTTAFold-All-Atom, HelixFold3, Boltz, RoseTTAFold2NA and AlphaFold3 to be added soon.

The ability to run different models through the pipeline without writing new code removes the barriers posed by the command line and by complicated compute infrastructure. Reflecting on the project in the Nextflow Podcast, Phil Ewels, Product Manager for Open Source at Seqera, said:

 “With almost no setup and no real prior experience, you can run these state of the art models and compare them all in a dynamic visual report. That’s pretty amazing.”

While it is designed to integrate with Seqera Platform, there’s no requirement to use it that way. Running the Nextflow pipeline on the command line gives the exact same reports. The code is freely available for others to use or improve via the nf-core repository of pipelines.
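For readers curious about what a command-line run might look like, here is a minimal sketch that only assembles and prints the command rather than executing it. The `-profile`, `--input` and `--outdir` options follow general Nextflow and nf-core conventions; the `--mode` parameter, its `alphafold2` value, and the file names `samplesheet.csv` and `results` are assumptions for illustration and should be checked against the pipeline's own usage documentation.

```python
import shlex


def proteinfold_command(samplesheet, outdir, mode="alphafold2", profile="docker"):
    """Assemble (but do not run) a Nextflow command for nf-core/proteinfold.

    The parameter names are assumptions based on nf-core conventions;
    confirm them against the pipeline documentation before use.
    """
    return [
        "nextflow", "run", "nf-core/proteinfold",
        "-profile", profile,      # container or cluster profile (assumed: docker)
        "--input", samplesheet,   # hypothetical samplesheet listing the sequences
        "--outdir", outdir,       # hypothetical output directory
        "--mode", mode,           # assumed selector for the prediction model
    ]


cmd = proteinfold_command("samplesheet.csv", "results")
# Print a shell-ready version of the command for copy and paste.
print(shlex.join(cmd))
```

Building the command as a list keeps each argument separate, so it could later be handed to a job launcher without any shell quoting surprises.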

Ziad’s presentation about the collaboration and these new features was singled out as a highlight of the recent Nextflow Summit on Seqera’s Nextflow Podcast. Bioinformatics Engineer at Seqera, Dr Florian Wünnemann, acknowledges there is great value in improving shared resources:

 “I think it really represents the best of the Nextflow community: they are developing tools and not just keeping it for themselves, but directly giving them back to the larger community.”

Rob Syme, Scientific Support Lead at Seqera Labs, believes the work speaks to the Nextflow and Seqera ethos of giving scientists and researchers the tools they need to build other tools.

“I love this project: it was an amazing outcome that required no input from Seqera or Nextflow. Yes, Seqera Platform could absolutely build an alignment viewer into the platform, but it wouldn’t be as good as if researchers themselves develop it. It wouldn’t be as good as the one that Ziad and the team have developed because research moves so incredibly quickly.” 

The collaboration within the international nf-core community has been a rewarding experience for all parties involved, and CRG has forged a new working relationship with BioCommons to continue development and maintenance of the pipeline. CRG’s Dr Cedric Notredame said of the experience:

“The collaboration with BioCommons has been so valuable. It has showcased the effectiveness of nf-core as a collaborative tool. Thanks to this framework, all of our teams were able to simultaneously contribute to the pipeline with minimal technical coordination. The pipeline is now one of the most complete go-to resources, covering the needs of a wide community of biologists interested in structural aspects of genomics.”

The improvements made to the visualisation code during the project will also be fed back into the Galaxy codebase. BioCommons’ close connections with research communities mean that the national Structural Biology Computing community is now testing and finessing the pipeline, and supporting the creation of user documentation.

A publication sharing what has been learnt about the nf-core/proteinfold pipeline is on the horizon, and a pilot Australian ProteinFold Service is under development.

Christina Hall

An Australian community for computational structural biology

A passionate group of structural biologists has formed the Australian Structural Biology Computing Community, to share computational knowledge, methods, and resources. 

The active new community is receiving support from a range of partners and advocates, including (L-R) Johan Gustafsson (BioCommons), Steven Manos (BioCommons), Kate Michie (UNSW) and Andrew Gilbert (Bioplatforms Australia).

The explosion of possibilities presented by deep learning approaches in structural biology research has created many new opportunities and challenges. A passionate group of structural biologists has formed the Australian Structural Biology Computing Community, to approach this new era as part of a community that shares computational knowledge, methods, and resources. 

This community-driven approach brings together a diverse group of people, with initial contributions forming around leads from the Structural Biology Facility at UNSW, and an academic panel of experts from Monash University, Walter and Eliza Hall Institute of Medical Research (WEHI), University of Western Australia (UWA), Australian National University (ANU), Bio21 Institute of Molecular Science and Biotechnology (Bio21), University of Melbourne, La Trobe University, University of Queensland (UQ) - IMB, University of Sydney, Griffith University, Swinburne University of Technology, CSIRO, and the University of Adelaide. Anyone involved in structural biology in Australia is invited to join and there are lots of different ways to get involved.

The Community for Structural Biology Computing in Australia webpage is a useful new resource for all users of computing for structural biology research in Australia. The page is constantly evolving and expanding, and it currently focuses on the use of deep learning methods in structural biology. It includes practical guides on topics like “Best practices for presenting and sharing AlphaFold models in a paper” as well as news items and announcements for relevant courses and meetings.

Australian BioCommons supports the community by hosting quarterly online meetings that aim to tease out how computational structural biologists’ challenges might be addressed with community-scale responses and national research infrastructure solutions. If you join the mailing list via the community webpage, you will receive updates and invitations to community meetings and the discussions in Slack.

BioCommons began providing broad, fully subsidised access to structural prediction in 2022 by making AlphaFold2 available within its Galaxy Australia service. The Australian AlphaFold2 Service provides both an easy-to-use interface and dedicated GPUs to Australian researchers. When BioCommons hosted the international 2023 Galaxy Community Conference, the keynote speech by Chief Scientist of the Structural Biology Facility at UNSW, Kate Michie, generated much excitement around forming an Australian community of practice for computational structural biology as an avenue for collectively addressing the challenges presented by deep learning in structural biology.

BioCommons has supported key research stakeholders to refine the new community’s purpose, begun running quarterly community meetings, and helped to establish shared community spaces like the Australian Structural Biology Computing website and GitHub. BioCommons has also facilitated consultations with infrastructure partners and the broader computational infrastructure community, and a national panel of experts has been identified.

This community collaborates with their peers to:

  • Collectively create and maintain community forums and centralised collaboration platforms to support collaboration and knowledge sharing (i.e. methods and documentation);

  • Foster collaboration between structural biologists, computer scientists, and data scientists, thereby creating interdisciplinary teams to help tackle complex challenges, validate results and ensure robust applications of deep learning methods;

  • Lead the review, prioritisation, testing, optimisation, and sharing of deep learning codes, software and approaches that are of broad relevance and interest to the Australian research community;

  • Develop quality assessment tools to help evaluate the quality of calculated structures, and help guide researchers towards reliable predictions; and,

  • Address the ethical implications of AI-driven structural predictions, as well as discuss transparency, bias and interpretability to ensure responsible use of these technologies.

The Australian Structural Biology Computing Community is poised to tackle a set of pilot activities aimed at fast-tracking a national response to the challenges facing computational approaches in structural biology. A much-anticipated future output is an infrastructure roadmap document that will formalise and describe the high-level requirements of the community. This collaborative effort between the new Australian Structural Biology Computing Community, Australian BioCommons, and BioCommons infrastructure partners will support the Australian structural biology community as new needs arise relating to bioinformatics tools, software, infrastructure or training.

Keep in touch by subscribing for updates at the Community for Structural Biology Computing in Australia webpage

Christina Hall

Repurposed hardware boosts national capacity and powers innovation

QCIF Ltd has made high-performance hardware available to the Australian BioCommons, giving the hardware a second life and uplifting national capacity for running AlphaFold 2 jobs in Galaxy Australia while supporting innovation through other GPU-enabled tools.

This story is co-published with QCIF Ltd

After successfully completing a previous project, QCIF Ltd made available high-performance hardware to the Australian BioCommons, giving the hardware a second life in enabling research and uplifting national capacity for the benefit of the scientific community. 

Well suited for running AlphaFold 2 jobs, the five General-Purpose Graphics Processing Units (GPGPUs) are now being used to enhance the national compute network behind the Galaxy Australia service.

The impact of this repurposing goes beyond infrastructure improvements. It has significantly expanded Galaxy Australia's capacity to support research and innovation by enabling the use of other GPU-enabled tools that offer major benefits to the scientific community. GPU processing can provide massive improvements in computational efficiency, decreasing processing times to less than 5% of conventional equivalents.

Dr Cameron Hyde, a bioinformatician at QCIF who supports the development of national software platforms like Australian BioCommons' Galaxy and Apollo services, co-authored the original AlphaFold 2 wrapper that enabled the tool to run within Galaxy Australia, ensuring both a friendly user interface and instant access to the GPU clusters required to power the tool. He shared his enthusiasm for the new possibilities unlocked by the repurposed hardware, which was originally part of an investment made in 2021 by the Australian Research Data Commons (ARDC) to support national platform projects and now directly enhances the bioinformatics services he helps deliver to Australian researchers. “Now that we have five GPU nodes of our own, we have room to experiment and explore new GPU-enabled tools. This gives us room to innovate beyond AlphaFold and accelerate scientific discovery in other research domains.”

For example, Galaxy Australia’s lead Bioinformatician Michael Thang has been using the hardware to explore running Oxford Nanopore’s “Dorado” on Galaxy Australia. Dorado is a high-performance basecaller for Oxford Nanopore Technologies sequencing data. This innovation would enable researchers to conduct their entire analysis, from raw sequencing data through to assembled genome, all within the Galaxy Australia service.

Collaboration driving innovation

Developed by Google DeepMind, AlphaFold is an AI system that predicts a protein’s 3D structure from its amino acid sequence with accuracy that is often comparable to experimental methods. In 2020, Australian BioCommons identified an opportunity to democratise access to this powerful tool by making AlphaFold 2 available through Galaxy Australia. This gave Australian researchers much greater access to AlphaFold 2, allowing life scientists to easily visualise proteins in a manner previously available only to dedicated structural biology researchers. This advance has supported research into protein-protein interactions, activation and inhibition mechanisms, and drug design.

By 2025, use of AlphaFold 2 had surged, evolving from an analytical tool for individual proteins into a routine screening tool for studying protein-protein interactions. To support this shift, Dr Hyde collaborated closely with the Australian Structural Biology Computing Community to develop extensions to the AlphaFold Galaxy tool, including new output formats, input parameters, and an option to re-use intermediate files for improved efficiency.

Supported by the Australian BioCommons, AARNet, QCIF Ltd, and The University of Melbourne, the optimised system now provides fully subsidised access for all Australian researchers via the Australian AlphaFold Service. We extend our sincere thanks to the Australian Research Data Commons (ARDC) for providing the hardware to QCIF Ltd and enabling its reuse by Australian BioCommons.

Christina Hall

Helping the bioinformatics community harness the computing resources they need

A recent workshop explored strategies for the adoption, usage, and optimisation of GPUs and addressed the challenges researchers face in accessing high performance computing.

Exploring GPUs with Sarah at the ABACBS Conference workshop

The challenges researchers face when accessing high performance computing were addressed at a recent workshop led by a national group of experts. More than 40 researchers and bioinformaticians explored strategies for the adoption, usage, and optimisation of Graphics Processing Units (GPUs) at ABACBS 2024 in December.

The workshop was co-ordinated by Dr Andrew Lonsdale (Peter MacCallum Cancer Centre), Dr Sarah Beecroft (Pawsey Supercomputing Research Centre) and Dr Johan Gustafsson (Australian BioCommons). It featured a wide range of speakers covering practical insights on resource availability, portable code that runs efficiently on multiple GPU platforms (NVIDIA, AMD, and Intel), and real-world use cases.

The availability of GPUs in the Australian context was introduced by Andrew Lonsdale (Peter Mac), and Georgie Samaha (BioCommons) offered practical guidance on accessing GPUs via national resources and access schemes. Sarah Beecroft (Pawsey) provided a concise introduction to high performance computing principles, including GPU access, to ensure everyone shared similar foundational knowledge. George Bouras (University of Adelaide) then demonstrated how to integrate machine learning frameworks like PyTorch with the Slurm scheduler, while Edward Yang (WEHI) presented best practices for writing interoperable, maintainable GPU code.
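As a flavour of what integrating a framework like PyTorch with Slurm can involve, here is a minimal sketch, not taken from the workshop material, that reads the per-task environment variables Slurm sets when launching tasks with `srun` and turns them into the settings a distributed training job typically needs. The helper name `slurm_process_layout` and the example values are hypothetical; the sketch deliberately avoids importing PyTorch so that it runs anywhere.

```python
import os


def slurm_process_layout(env=None):
    """Derive distributed-training settings from Slurm's task environment.

    Falls back to a single-process layout when the variables are absent,
    for example when running on a laptop outside of Slurm.
    """
    env = os.environ if env is None else env
    return {
        "rank": int(env.get("SLURM_PROCID", 0)),        # global index of this task
        "world_size": int(env.get("SLURM_NTASKS", 1)),  # total tasks in the job step
        "local_rank": int(env.get("SLURM_LOCALID", 0)), # index of this task on its node
    }


# Hypothetical values, as one task of an 8-task job might see them.
example_env = {"SLURM_PROCID": "3", "SLURM_NTASKS": "8", "SLURM_LOCALID": "1"}
layout = slurm_process_layout(example_env)
print(layout)
```

In a real job, values like these would usually be passed on to the framework, for instance as the `rank` and `world_size` arguments of `torch.distributed.init_process_group`, with the local rank used to pick a GPU on the node.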

Highlighting real-world applications, Keiran Rowell, Nathan Glades, and Josh Caley (UNSW Structural Biology Facility) showcased how GPUs are empowering researchers to handle large, complex structural biology datasets, and Wytamma Wirth (University of Melbourne) illustrated the power of online GPU resources to accelerate Bayesian phylogenetic analyses.

Feedback showed that attendees appreciated the variety of speakers and the balance of technical depth with practical applicability. They particularly highlighted the workshop’s interactive approach, which included live polling and ample Q&A sessions.

Half of the attendees reported they were already using GPUs, and the other half planned to adopt them soon. Two-thirds were keen to integrate existing GPU-enabled bioinformatics tools, while one-third aimed to develop new GPU-accelerated algorithms, underscoring the community’s readiness to embrace GPUs both as a means of immediately accelerating current workflows, and as a basis for innovative tool development.

This workshop aimed to support researchers with the knowledge, skills, and resources they need to unlock the full potential of GPU computing, and participants reported that they were keen to go on to investigate GPU applications beyond AI/ML, gain deeper insights into GPU architectures, and participate in more hands-on training sessions.

Attending the ABACBS Conference is a great way to connect and strengthen the GPU-enabled bioinformatics community in Australia, and Australian BioCommons sponsors and attends the conference each year to hear from researchers and bioinformaticians and to share responses to their research infrastructure priorities. Stay tuned for more information about ABACBS 2025 in Adelaide!

Patrick Capon

An international approach to harnessing AI opportunities in the life sciences

We recently hosted Prof Ewan Birney, Deputy Director General of the European Molecular Biology Laboratory (EMBL) and Director of EMBL’s European Bioinformatics Institute (EMBL-EBI) in Melbourne to share his views in a workshop: Exploring opportunities in Life Sciences AI.

From left to right: Prof Ewan Birney, Andrew Gilbert, Dr Jeff Christiansen, Prof Andrew Lonie

Many of us are exploring the opportunities that AI brings to life sciences. Australian BioCommons and Bioplatforms Australia recently hosted Prof Ewan Birney, Deputy Director General of the European Molecular Biology Laboratory (EMBL) and Director of EMBL’s European Bioinformatics Institute (EMBL-EBI) in Melbourne to share his views in a workshop: Exploring opportunities in Life Sciences AI. 

The workshop focused on existing activities in Australia and global AI developments, with a facilitated discussion on how to build effective collaborations between infrastructures and researchers in the AI-life sciences space, and what opportunities there are for Australia to collaborate more closely with the EU across life sciences infrastructure, data, informatics, training and research programs.

Prof Birney’s visit included a keynote address at the iconic Shine Dome in Canberra. Co-hosted by EMBL Australia and Bioplatforms Australia, the event included talks from international and Australian experts, and drew scientists and senior leadership from a range of organisations, including CSIRO, Monash University, UNSW, the Australian National University, the University of Canberra, the Australian Government Department of Education, NCRIS projects, Snow Medical and Research Australia, as well as postdoctoral researchers and students. Read more about Prof Birney’s presentation in EMBL Australia’s news item.

As well as helping BioCommons to coalesce its approach to facilitating AI application in life sciences, the visit has strengthened the connection between BioCommons and EMBL and has led to ongoing discussions about future cooperation. Meeting in person was an excellent opportunity to clarify our intersecting needs, and establish a foundation to work together more closely into the future.
