Research Abstract

The healthcare landscape faces critical and compounding challenges in managing, analysing, and democratising access to rapidly expanding biomedical datasets. Despite advances in high-throughput sequencing, multi-omics platforms, and electronic health records, existing infrastructures remain fragmented and fail to address scalability, efficiency, privacy, and accessibility for non-technical users simultaneously. Furthermore, the inherently multi-dimensional nature of biomedical data, spanning genomic, clinical, and imaging modalities, exceeds what traditional two-dimensional desktop interfaces can effectively communicate. This motivates immersive three-dimensional environments that enable spatial reasoning and intuitive exploration of complex data relationships. This thesis addresses four field-level research gaps: (1) the absence of validated hybrid edge-cloud architectures for real-time biomedical analytics across portable and immersive environments, (2) the lack of adaptive, domain-aware compression methods that exploit patterns across patient cohorts, (3) the limited integration of natural language interfaces with immersive analytics for biomedical data exploration, and (4) the absence of a unified evaluation framework for systematically assessing AI-driven immersive visualisation systems beyond traditional static metrics.

This research presents a unified framework comprising four interrelated contributions that transform how biomedical data are stored, compressed, analysed, and visualised. First, a novel hybrid edge-cloud framework distributes computation between edge devices, which handle local pre-processing, and cloud infrastructure, which provides scalable analytics. Evaluated on real-world B-ALL and CLL patient cytometry datasets, it achieves a 56% data reduction, 57% faster upload times, enhanced privacy through local pseudonymisation, and projected annual cost savings of approximately US$4,679 for large-scale genomic projects (40 TB datasets). Second, a trie-based shared dictionary compression methodology exploits global redundancy patterns across entire patient cohorts rather than within individual files. Evaluated on B-ALL (178 patients, 7.92 GB) and CLL (162 patients, 3.98 GB) datasets, it achieves a 68.2% reduction in file size, compared with 57.9% for traditional ZSTD, and an additional saving of approximately US$676 per year over ZSTD alone at this scale, while maintaining compatibility with standard bioinformatics pipelines. Third, a generalised voice-driven immersive analytics architecture, instantiated as MediVerse, an AI-powered virtual reality analytics platform, addresses the inherent limitations of two-dimensional interfaces for exploring high-dimensional biomedical data by enabling non-computational researchers to query and visualise such data interactively through natural language. Evaluated on paediatric leukaemia (247 patients) and de-identified paediatric tumour gene expression (101 tumour samples) datasets, it achieves 95% retrieval accuracy, an end-to-end latency of 1,100 to 1,740 ms, and 95% correct visualisation selection. Fourth, IVEM (Immersive Visualisation Evaluation Model) introduces a unified metric suite of six operationalised metrics across three dimensions (Rendering Quality, AI Decision Quality, and Explanation Quality). Validated on the same paediatric leukaemia and de-identified paediatric tumour gene expression datasets, it enables diagnostic precision and objective cross-system comparison for AI-driven immersive visualisation systems.

Collectively, these contributions demonstrate that complex biomedical workflows, from data storage and compression to interactive voice-driven visualisation, can be completed efficiently in immersive environments, eliminating technical barriers for frontline clinicians and researchers without computational backgrounds. They also show that immersive three-dimensional environments offer measurable advantages over traditional interfaces for exploring the spatial and relational complexity inherent in multi-dimensional biomedical datasets. This work advances biomedical informatics, distributed computing, data compression, and human-computer interaction, providing validated architectural patterns, compression strategies, and evaluation frameworks that together enable efficient, interpretable, and accessible biomedical data ecosystems aligned with the goals of precision medicine and data democratisation.

Candidate: Rani Adam
Primary supervisor: Assoc. Prof. Quang Vinh Nguyen
Institution: Western Sydney University
Year: 2026