BIGDATA: IA: Collaborative Research: Asynchronous Distributed Machine Learning Framework for Multi-Site Collaborative Big Brain Data Mining

Project: Research project


The research objective of this proposal is to address the computational challenges of emerging multi-site collaborative big brain data mining. A new asynchronous distributed machine learning framework is proposed that integrates distributed machine learning and multi-level data-intensive computing with emerging key computational techniques: asynchronous doubly stochastic proximal gradient, biological domain knowledge guided additive learning models, collaborative multi-dimensional data integration, asynchronous communication-efficient distributed algorithms, decoupled parallel backpropagation for faster deep learning, graph convolutional generative adversarial networks, and asynchronous stochastic zeroth-order algorithms. To this end, this project focuses on designing principled multi-site collaborative big data mining algorithms for analyzing multi-modal brain imaging genomics and human connectomics data to yield mechanistic understanding from gene to brain structure and circuitry, to function, and to phenotypic outcomes, with the potential of leading to the next major brain science discoveries.
Specifically, the PIs will investigate: 1) collaborative genotype and phenotype association studies using new asynchronous doubly stochastic proximal gradient algorithms and efficient additive learning models; 2) communication-efficient multi-site collaborative data integration models that integrate imaging genomics data to accurately predict outcomes of interest; 3) collaborative deep learning algorithms accelerated by asynchronous mini-batch gradient descent and decoupled parallel backpropagation, with applications in temporal cognitive change prediction; 4) new graph convolutional deep learning models for brain network mining; 5) evaluation and validation in multi-site collaborative brain imaging genomics and connectomics studies. The innovation lies in bringing new distributed machine learning and data-intensive computing to brain imaging genomics and connectomics, which hold great promise for a systems biology of the brain. Given their rich research experience in machine learning, big data mining, neuroinformatics, and brain science, the PIs are in a unique position to achieve the above ambitious yet feasible goals.
Effective start/end date: 1/1/19 to 12/31/22


  • National Science Foundation (IIS-1837999)


Data mining
Learning systems
Imaging techniques
Data integration
Parallel algorithms
Learning algorithms
Deep learning