Statistical inference for random functions and measures
03 August 2018

How does one infer the dynamics of a DNA minicircle in solution? How does one align the firing patterns of several neurons across individuals? These questions are intrinsically statistical, yet they escape the traditional tools of statistics. The ComplexData project investigated such questions from both a mathematical and an applied perspective.


The data revolution produces not only big data, but often complex data too: objects whose intrinsic structure requires a more sophisticated mathematical formalism than usual to analyse statistically. For instance, such objects may lie in spaces that are infinite dimensional and/or curved. The ComplexData project, funded by the ERC, developed theory and methodology for such data arising in contexts from biophysics to neuroscience.

Among its main contributions was a novel spectral framework to analyse time series of curves or surfaces (time series in Hilbert spaces), where the observation at each time point is itself a function. This was applied to provide the first rigorous statistical evidence that the base-pair composition of DNA strands influences their dynamical behaviour.

“While our work here was motivated by the dynamics of DNA in solution (seen as a curve moving in space, hence giving rise to a time series of curves), one can imagine a plethora of other situations where data fall in this framework,” says Prof. Victor Panaretos.

Another of the project’s key contributions was a novel framework to analyse physiological processes, such as neuronal firing patterns, across several individuals, where each individual has their own unknown time scale. This corresponded to viewing the data as random elements of the so-called Wasserstein space, and using tools from optimal measure transport to define appropriate notions of means and common time scales.
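
To make the idea of a mean in Wasserstein space concrete, here is a minimal sketch for the simplest case, distributions on the real line. It relies on a standard fact: in one dimension, the 2-Wasserstein mean of several distributions has a quantile function equal to the average of their quantile functions. This is an illustration only, not the project's own method; the function names, the simulated "firing times" and the shift and scale values are all hypothetical.

```python
import numpy as np

def wasserstein_mean_quantiles(samples, levels):
    """Return each sample's empirical quantiles and their pointwise average.

    In one dimension, averaging quantile functions gives the quantile
    function of the 2-Wasserstein (Frechet) mean of the distributions.
    """
    q = np.stack([np.quantile(np.asarray(s, dtype=float), levels) for s in samples])
    return q, q.mean(axis=0)

# Hypothetical data: three individuals whose event times follow the same
# pattern, but each on its own (unknown) shifted and stretched time scale.
rng = np.random.default_rng(0)
shifts = np.array([-1.0, 0.0, 1.0])
scales = np.array([0.5, 1.0, 1.5])
firing_times = [scales[i] * rng.normal(size=500) + shifts[i] for i in range(3)]

levels = np.linspace(0.01, 0.99, 99)
individual_q, mean_q = wasserstein_mean_quantiles(firing_times, levels)

def warp_to_common_scale(t, i):
    """Map times of individual i onto the common time scale.

    This interpolates the one-dimensional optimal transport map that sends
    the i-th quantile function onto the mean quantile function.
    """
    return np.interp(t, individual_q[i], mean_q)

aligned = [warp_to_common_scale(firing_times[i], i) for i in range(3)]
```

After warping, the three sets of times in `aligned` share approximately the same distribution, which is one simple way to read the phrase "common time scale" in this toy setting.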

The project extended existing techniques, but also generated and applied new ones. “We were able, for instance, to provide rigorous statistical evidence on the nature and degree of association between the base-pair composition and mechanical behaviour of DNA minicircles at persistence length,” Prof. Panaretos explains.

Victor Panaretos is Associate Professor of Mathematical Statistics at EPFL (Switzerland). He obtained his PhD at UC Berkeley (USA) in 2007, receiving the Lehman Award for his thesis. His research focusses on functional and geometrical statistics. An elected fellow of the International Statistical Institute, he serves on the board of several journals, and has been named the 2019 Bernoulli Society Forum Lecturer.

Project information

COMPLEXDATA
Statistics for Complex Data: Understanding Randomness, Geometry and Complexity with a view Towards Biophysics
Researcher: Victor Michael Panaretos
Host institution: Ecole polytechnique fédérale de Lausanne, Switzerland
Call details: ERC-2010-StG, PE1
ERC funding: 681 146 €