News
Subscribe to the Australian BioCommons monthly newsletter or read previous editions
Pawsey boosts Galaxy Australia’s capabilities with COVID-19 grant
Australian researchers can now rapidly analyse their SARS-CoV-2 data with published tools and workflows on a new dedicated Galaxy COVID-19 compute node hosted at Pawsey Supercomputing Centre. Galaxy Australia and Pawsey are able to jointly deliver this data analytics platform through the COVID-19 Accelerated Access Initiative, in which Australia’s national HPC facilities responded quickly to the pandemic with streamlined, prioritised and expedited access to computation and data resources. NCI Australia and the Pawsey Supercomputing Centre have now announced that the Galaxy COVID-19 compute node will be hosted on Pawsey’s newly deployed Nimbus Cloud, guaranteeing tailored resources for urgent public health research.
The new resource will deliver the benefits highlighted in the recent joint publication by the international Galaxy team from Australia, Germany, Belgium and the USA. That publication demonstrated how the Galaxy platform can facilitate the exchange of data and reproducible workflows between authorities, institutes and laboratories, ensuring that progress is no longer limited by access to samples and data. The compute allocation at Pawsey has been set up exclusively to underpin the use, on Galaxy Australia, of the COVID-19 related tools and workflows outlined in that publication.
A national call out for Australian researchers tackling the COVID-19 pandemic resulted in seven projects receiving access to computation and expertise. Read the Leaders in Australian Computing Research Begin Battle with COVID-19 media release here.
Progress on national plan for Containers and K8s
Members of our team recently advanced the Australian BioCommons Software and Containers project by participating in two strategic meetings at Pawsey Supercomputing Centre in Perth. The March events drew together partners who are contributing to the national rollout of a common bioinformatics software containerisation and metadata standard, and a common implementation standard of the open-source container-orchestration system Kubernetes for use by Australian life scientists.
Kubernetes (K8s) is an open-source system for automating deployment, scaling, and management of containerized applications. It groups containers that make up an application into logical units for easy management and discovery. Kubernetes is currently used by many individual research groups in Australia, but there is currently no communication, coordination or local support for the service.
Two days of strategic discussions took place during the Australian BioCommons Software and Containers Workshop and the ARDC-funded National Kubernetes Core Services Planning Meeting. The workshop reviewed the ongoing progress of a large collaborative project led by the Australian BioCommons to identify the use cases for a coherent environment around containers and to coordinate the essential elements required to build the interoperable national infrastructure. The Planning Meeting discussed the implementation of a national eResearch program for supporting container orchestration and the use of Kubernetes.
A large group representing significant stakeholders, including Pawsey, Australian BioCommons, ARDC, NCI, AARNet, ELIXIR Europe (remote attendance), CSIRO, Monash University, Melbourne University, Intersect, University of NSW, University of Tasmania/TPAC, University of Queensland/RCC, QCIF and University of Sydney, worked together to define their next steps in the national Kubernetes plan.
These strategic meetings followed on from ten days of hands-on training by HPCNow! to upskill the Pawsey operations and end-user support teams in the Kubernetes system. Read more about Pawsey’s growing expertise in container orchestration solutions here.
We’ll keep you updated on how this coordinated effort will provide new bioinformatics solutions for the Australian research community.
Rapid, collaborative and transparent analysis of novel coronavirus on Galaxy Australia
Researchers from universities in Germany, Belgium, Australia and the USA have used publicly available novel coronavirus (COVID-19) genome data and published their analyses using Galaxy, an open source research platform.
The joint paper, written by the international Galaxy team, demonstrates how the COVID-19 genome data can be shared, analysed and reproduced in an efficient and transparent way. In the wake of the COVID-19 pandemic, the researchers showed how Galaxy could facilitate the exchange of data and reproducible workflows between authorities, institutes and laboratories dealing with the virus. The international Galaxy platform, through the provision of highly accessible, globally shared data and analytics platforms, has the potential to transform the way biomedical research is performed. By offering access to data and an open and reproducible analytics environment, the Galaxy platform ensures that progress is no longer limited by access to samples and data.
Published in partnership with ARDC
The Australian Research Data Commons (ARDC) and Bioplatforms Australia have partnered with Australian BioCommons to ensure that Galaxy Australia maintains tools, workflows and reference datasets for the Australian research community. The ARDC investments have seen the Genomics Virtual Laboratory and Galaxy Australia become essential services for training and analysis in data-intensive biomedical research. The total investment in grants and compute allocations to all Genomics Virtual Laboratory and Galaxy related projects since 2012 was approximately $6.43M, of which $5.18M was in project grants and an estimated $1.25M in underpinning compute and data storage resourcing. This support was augmented by national coinvestment of $6.69M ($4.23M project co-investment plus an estimated $2.46M compute provisioning) from the University of Melbourne, Queensland Cyber Infrastructure Foundation, Research Computing Centre (UQ) and Bioplatforms Australia.
Through coinvestment from the ARDC’s Platforms Programs over the next three years, work is underway to broaden Galaxy Australia’s capabilities by increasing the communities that can use the platform and the types of analyses the platform can perform. The most recent "BioCommons BYOD [Bring Your Own Data] Expansion Project" Platforms grant of $2.21M will bolster the contributions of Australian BioCommons, University of Melbourne, Bioplatforms Australia, Australia’s Academic and Research Network, Australian Access Federation, National Computational Infrastructure, Pawsey Supercomputing Centre, Queensland Cyber Infrastructure Foundation, Melbourne Bioinformatics, and Sydney Informatics Hub.
Associate Professor Andrew Lonie, Director of Australian BioCommons, says digital technologies are proving transformational for researchers in the life sciences domain:
“The enhanced Galaxy Australia platform will position Australia at the forefront of bioinformatics infrastructure and substantially improve Australian researchers’ access to bioinformatics.”
ARDC’s Director of Platforms and Engagement, Dr Andrew Treloar says the breadth of ARDC investment in research platforms ensures Australia's world class research system continues to improve productivity, create jobs, lift economic growth and support a healthy environment:
“It’s fantastic to be investing in research-orientated platforms and services that integrate and provide access to a range of resources to researchers and industry. This is a great opportunity to collaborate with our partners and universities at the cutting-edge of research to provide Australian researchers with competitive advantage through data.”
During an outbreak like COVID-19, the development and implementation of effective infection control and prevention measures rely on the global research community’s ability to share data in a timely manner and perform fast and reproducible analyses. Platforms like Galaxy Australia can enable and accelerate this process. Read more about Galaxy Australia’s role in the recent COVID-19 study below and visit the ARDC website to find out how the ARDC is supporting an expansion of the Galaxy Australia project.
Galaxy Australia contributes to global research effort into COVID-19
The recent public health emergency arising from the COVID-19 outbreak has demonstrated the necessity for a rapid, collaborative and international response. The development of fast and effective countermeasures relies on the global research community’s ability to share data and perform fast and reproducible analyses.
A joint paper by Galaxy teams from Australia, Europe and the United States demonstrated how the COVID-19 genome data can be shared, analysed and reproduced in an efficient and transparent way. The study “No more business as usual: agile and effective responses to emerging pathogen threats require open data and open analytics” re-analysed all COVID-19 genomic data available in the public domain using Galaxy platforms and open software tools. The publication highlighted the inadequate accessibility of raw data associated with COVID-19 research, and described how the work completed on Galaxy opened up the possibility for any researcher worldwide to perform their own analyses with the data, analysis pipelines and public computational infrastructure freely available.
The international Galaxy network provides highly accessible, globally shared data and analysis platforms, and offers the potential to transform the way biomedical research is done. By offering access to data and an open and reproducible analytics environment, it can ensure that progress is no longer hampered by access to samples and data. The various Galaxy facilities provide community-based infrastructure for research in recognition of exactly what the paper’s authors found: “there is a global need to ensure access to free, open, and robust analytical approaches that can be used by anyone in the world to analyze, interpret, and share data.”
This recent work involved Galaxy Europe (operated by ELIXIR Germany), usegalaxy.be (operated by ELIXIR Belgium and the Flemish Supercomputer Center), Galaxy US and Galaxy Australia. The Galaxy Australia team rapidly implemented the COVID-19 analysis workflows across Galaxy Australia’s distributed national compute infrastructure, operated by Melbourne Bioinformatics and the Queensland Cyber Infrastructure Foundation. Galaxy Australia is widely accessible through its deployment on the Australian Research Data Commons’ (ARDC) Nectar Research Cloud and receives funding from the Australian BioCommons.
All analyses performed by Galaxy teams are fully documented and accessible at:
https://github.com/galaxyproject/SARS-CoV-2/; https://doi.org/10.5281/zenodo.3685264
Accessible on Galaxy Australia at: https://usegalaxy.org.au/workflows/list_published; Tag: covid-19
Full paper: “No more business as usual: agile and effective responses to emerging pathogen threats require open data and open analytics”. Galaxy and HyPhy developments teams, Anton Nekrutenko, Sergei L Kosakovsky Pond. bioRxiv 2020.02.21.959973; doi: https://doi.org/10.1101/2020.02.21.959973
See also:
Coronavirus data analysis: Galaxy Europe
Open collaborative infrastructure to tackle public health emergencies: ELIXIR Europe
Galaxy Australia is hosted by the University of Melbourne and the Queensland Cyber Infrastructure Foundation and is enabled by NCRIS via Australian BioCommons funding through Bioplatforms Australia and the ARDC.
Seamless sharing of childhood cancer data and analysis between researchers across international borders
The development of personalised treatments that target rare paediatric cancer subtypes can be enhanced through global collaboration. Comparing an Australian patient's tumour to a larger group of other tumours allows insights that lead to better outcomes. But geography and rules to protect personal data in different jurisdictions can make the sharing and comparing of essential data difficult or even impossible.
In an effort to fix this, the Australian BioCommons is part of an international collaboration that came together in Sydney this month. Members of the partnership between the Australian BioCommons, Bioplatforms Australia, ARDC, Children’s Cancer Institute, D3b and Seven Bridges have been working to provide an integrated bioinformatics research platform with compute, storage, and file metadata tagging all in one place.
“This multinational project is establishing internationally federated computational infrastructure to enable the harmonisation of pediatric cancer genomic data from Australia’s ZERO Childhood Cancer Program and the Gabriella Miller Kids First Data Resource Center in the United States.”
Australian initiatives such as Zero Childhood Cancer will leverage the benefits provided by Cavatica, through its expansion to AWS Sydney. Cavatica is a cloud-based platform for collaboratively accessing, sharing, and analysing cancer data.
Cavatica was launched in 2016 as a partnership between the Center for Data Driven Discovery in Biomedicine (D3b) at the Children’s Hospital of Philadelphia (CHOP), Seven Bridges, the Children’s Brain Tumor Tissue Consortium (CBTTC) and the Pacific Pediatric Neuro-Oncology Consortium (PNOC). Since then, it has expanded to being a collaborative platform for a number of initiatives, including the Gabriella Miller Kids First Data Resource Center (KFDRC).
AWS compute is leveraged to allow high-throughput analysis, while workflows are written in Common Workflow Language (CWL) with Docker to maximise portability and reusability. Additionally, via Cavatica’s Data Cruncher, analyses using various open-source R and Python packages can be shared through Jupyter Notebook. The KFDRC has used this platform to harmonise and process over 15,000 whole genomes, whole exomes, and RNA-seq samples, including alignment, somatic variant calling, copy number calls, structural variants, RNA expression and fusions.
During the visit, key members of the D3b team provided training in using Cavatica. Dr Allison Heath, Director of Data Technology and Innovation at the Center for Data Driven Discovery in Biomedicine (D3b), Children's Hospital of Philadelphia, kindly delivered an overview of Cavatica's features for the Australian BioCommons webinar series while she was in our time zone!
The intensive days together brought closer the reality of the platform’s readiness for use by Australian researchers in coming months. Stay tuned as Cavatica will soon be enabling seamless sharing of data and analysis methods between researchers in Australia and the United States.
See also: https://ardc.edu.au/news/developing-personalised-treatment-for-kids-with-cancer/
Watch the webinar Cavatica - the cloud-based platform for collaboratively accessing, sharing, and analysing cancer data
New investment to tackle the data challenges of bioscience researchers
A new investment from the Australian Research Data Commons (ARDC) will enable significant expansion of the Australian BioCommons ‘Bring Your Own Data (BYOD)’ Platform.
Earlier this year, discipline-focussed research-orientated platforms were invited to apply for investment to support better connections between data-related resources, industry and researchers. The Australian BioCommons submitted an application involving eight partner organisations: Bioplatforms Australia, Australian Access Federation, AARNet, National Computational Infrastructure, University of Sydney, University of Melbourne, QCIF and Pawsey Supercomputing Centre. The proposal, BioCommons Bring Your Own Data (BYOD) Expansion Project, detailed how this group would work together to build on the foundational work already being coordinated through the BioCommons.
We are delighted that the Australian BioCommons proposal has been successful in securing investment through the ARDC Platforms program.
The investment will enable the integration of data-generating instruments across genomics, proteomics and metabolomics, enhance accessibility to high-priority reference data, and manage access to compute infrastructures. It will support a wide range of Australian life science researchers by:
Improving and expanding the established, highly accessible Graphical User Interface (GUI)-based BYOD platform (Galaxy Australia), which gives all life sciences researchers, including informaticians, access to a well-structured, world’s-best-practice bioinformatics workbench for research and training.
Developing a complementary Command Line Interface (CLI)-focussed BYOD platform, which will provide a scalable and flexible set of open programmatic resources to create, access and exchange workflows, tools and training across any national and institutional compute infrastructures.
Developing a pan-national data infrastructure that will connect -omics instruments and reference datasets to the analysis infrastructure, underpinned by a capability to transport data nationally and internationally.
More information on the ARDC Platforms program is available on the ARDC website.