Summary
Large geophysical data has traditionally been difficult to manage in a consistent, open, and efficient manner. The demands of modern, large-scale computing techniques, coupled with the need for sound data and metadata management, mean that established data formats and access methods are no longer adequate.
Geoscience Australia (GA) has been working with its partners to leverage and extend existing data standards to represent various geophysical data in modern scientific container formats including netCDF & HDF. The new data encodings support rapid and efficient data subsetting, either directly from a file or remotely via web services. These will underpin GA’s future data delivery pipelines for Australian government-funded geophysical data.
NetCDF efficiently handles multi-variate raster, line, and point data, as well as n-dimensional data structures supporting more demanding applications such as AEM and airborne gravity data. Structural and metadata standards deliver interoperability, and existing and emerging data types are supported without loss of precision or other information.
This extended abstract will cover:
- The rationale for modernising GA’s geophysical data holdings into modern open standard container formats
- An outline of the netCDF4 file format and associated tools, and some of the benefits they provide
- The open-source tools and methodology used to translate grid, line, point and other data into netCDF4, and to perform metadata synchronisation
- A brief description of a live use case exploiting web services
Following extreme flooding in eastern Australia in 2011, the Australian Government established a programme to improve access to flood information across Australia. As part of this, a project was undertaken to map the extent of surface water across Australia using the multi-decadal archive of Landsat satellite imagery. Water was detected using an algorithm based on a decision tree classifier, together with a comparison methodology based on logistic regression. This approach provided an understanding of the confidence in the water observations. The results were used to map the presence of surface water across the entire continent from every observation in 27 years of satellite imagery. The Water Observation from Space (WOfS) product provides insight into the behaviour of surface water across Australia through time, demonstrating where water is persistent, such as in reservoirs, and where it is ephemeral, such as on floodplains during a flood. In addition, the WOfS product is useful for studies of wetland extent, aquatic species behaviour, hydrological models, land surface process modelling and groundwater recharge. This paper describes the WOfS methodology and shows how similar time-series analyses of nationally significant environmental variables might be conducted at the continental scale.
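The distinction between persistent and ephemeral water can be illustrated with a minimal sketch. It is not the WOfS decision tree classifier: it assumes each observation of a pixel has already been labelled wet, dry, or masked (for example by cloud), and it simply reports the fraction of clear observations that were wet. The label encoding, the function names, and the 0.8 threshold for "persistent" are all hypothetical choices made for illustration.

```python
# Hypothetical per-observation labels; the actual WOfS encoding is not given in the text.
WET, DRY, MASKED = 1, 0, None


def water_frequency(series):
    """Return the fraction of clear (unmasked) observations labelled wet.

    Returns None when the pixel has no clear observations at all.
    """
    clear = [obs for obs in series if obs is not MASKED]
    if not clear:
        return None
    return sum(1 for obs in clear if obs == WET) / len(clear)


def describe(freq, persistent_at=0.8):
    """Turn a frequency into a coarse label; the threshold is an assumption."""
    if freq is None:
        return "no clear observations"
    if freq >= persistent_at:
        return "persistent"
    if freq > 0:
        return "ephemeral"
    return "dry"


# Two illustrative pixel time series (invented values, not real data).
reservoir = [WET, WET, MASKED, WET, WET]
floodplain = [DRY, DRY, WET, DRY, MASKED, DRY]

for name, series in [("reservoir", reservoir), ("floodplain", floodplain)]:
    print(name, describe(water_frequency(series)))
```

Dividing by the number of clear observations rather than by the length of the series keeps cloudy dates from being counted as dry.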
The effort and cost required to convert satellite Earth Observation (EO) data into meaningful geophysical variables has prevented the systematic analysis of all available observations. To overcome these problems, we utilise an integrated High Performance Computing and Data environment to rapidly process, restructure and analyse the Australian Landsat data archive. In this approach, the EO data are assigned to a common grid framework that spans the full geospatial and temporal extent of the observations – the EO Data Cube. This approach is pixel-based and incorporates geometric and spectral calibration and quality assurance of each Earth surface reflectance measurement. We demonstrate the utility of the approach with rapid time-series mapping of surface water across the entire Australian continent using 27 years of continuous, 25 m resolution observations. Our preliminary analysis of the Landsat archive shows how the EO Data Cube can effectively liberate high-resolution EO data from their complex sensor-specific data structures and revolutionise our ability to measure environmental change.
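The idea of assigning every observation to a common grid can be sketched in a few lines. This is a toy illustration under stated assumptions, not the EO Data Cube implementation: the 25 m pixel size comes from the text, while the tile size, the grid origin at (0, 0) in a projected coordinate system, and the names `grid_index`, `add_observation` and `time_series` are hypothetical.

```python
import math
from collections import defaultdict

PIXEL_SIZE_M = 25.0  # resolution stated in the text
TILE_PIXELS = 4000   # assumption: tiles of 4000 x 4000 pixels (100 km square)


def grid_index(x_m, y_m):
    """Map projected coordinates in metres to (tile_x, tile_y, col, row).

    The grid origin is assumed to sit at (0, 0); floor division keeps the
    mapping consistent for negative coordinates as well.
    """
    col = math.floor(x_m / PIXEL_SIZE_M)
    row = math.floor(y_m / PIXEL_SIZE_M)
    tile_x, px = divmod(col, TILE_PIXELS)
    tile_y, py = divmod(row, TILE_PIXELS)
    return tile_x, tile_y, px, py


def add_observation(cube, x_m, y_m, date, value):
    """File one measurement under its grid cell, whatever scene it came from."""
    cube[grid_index(x_m, y_m)].append((date, value))


def time_series(cube, x_m, y_m):
    """Return all measurements for the cell containing (x_m, y_m), oldest first."""
    return sorted(cube.get(grid_index(x_m, y_m), []))


cube = defaultdict(list)
# Two invented measurements from different dates that fall in the same 25 m cell.
add_observation(cube, 100012.0, 37.5, "1999-03-01", 0.08)
add_observation(cube, 100020.0, 40.0, "1987-06-15", 0.12)
print(time_series(cube, 100015.0, 30.0))
```

Because every measurement is keyed by its grid cell rather than by its source scene, a per-pixel time series becomes a single lookup, which is the property the pixel-based approach relies on.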
The Australian Geoscience Data Cube (AGDC) aims to realise the full potential of Earth observation data holdings by addressing the Big Data challenges of volume, velocity, and variety that otherwise limit the usefulness of Earth observation data. There have been several iterations, and AGDC version 2 is a major advance on previous work. The foundations and core components of the AGDC are: (1) data preparation, including geometric and radiometric corrections to Earth observation data to produce standardised surface reflectance measurements that support time-series analysis, and collection management systems which track the provenance of each Data Cube product and formalise re-processing decisions; (2) the software environment used to manage and interact with the data; and (3) the supporting high performance computing environment provided by the Australian National Computational Infrastructure (NCI). A growing number of examples demonstrate that our data cube approach allows analysts to extract rich new information from Earth observation time series, including through new methods that draw on the full spatial and temporal coverage of the Earth observation archives. To enable easy uptake of the AGDC, and to facilitate future cooperative development, our code is developed under the open-source Apache License, Version 2.0. This open-source approach is enabling other organisations, including the Committee on Earth Observation Satellites (CEOS), to explore the use of similar data cubes in developing countries.