Abstract

Vast amounts of data are generated in our society, comprising an array of measurements accompanied by location and time information. These multivariate spatio-temporal data are used to study important problems facing today’s world, including climate change and disease spread. This research introduces new methods and frameworks, with accompanying open-source software, to support the analysis of multivariate spatio-temporal data.

The first contribution (Chapter 2) delivers diagnostic plots for understanding the behaviour of optimisers in high-dimensional projection pursuit. It provides computational tools to track the progress of the optimisation and its coverage of the parameter space. The second contribution (Chapter 3) develops a new data structure, called cubble, for organising spatio-temporal data. Spatio-temporal data are often either split into multiple tables, each with different observation units, or organised into a single, memory-inefficient table combining all the data. The new data structure efficiently organises the spatial and temporal components of the data into a single object and allows pivoting separately into the spatial and temporal tables for different analyses. The third contribution (Chapter 4) introduces a data pipeline for constructing indexes from multivariate spatio-temporal data. While indexes from various domains use similar statistical methodologies to summarise data into an index, a standardised workflow for assembling indexes systematically is absent. This work bridges the gap by offering a unified framework and infrastructure that modularise the steps in constructing an index.