Dr. Daniel Nickelsen

Ph.D.
Computational Statistics and Data Analysis
Phone: +49 821 598 - 2242
Fax: +49 821 598 - 2236
E-mail:
Room: 3028 (L1)
Office hours: by appointment or via e-mail
Address: Universitätsstraße 14, 86159 Augsburg

research projects

I did my PhD in statistical physics on stochastic processes in the non-equilibrium thermodynamics of microscopic systems as well as the energy cascade in fully developed turbulence. I have always worked at the intersection of theoretical physics and applied mathematics. Because of that, my field of research has become quite broad over the years and has steadily moved towards applied mathematics. Today, most of my research falls under statistical learning, with a particular interest in Bayesian inference.

 

Below is a list of active research projects; projects marked with a star (*) are my current main focus. Student projects are possible for all topics.

 

Due to the rise of renewable energy, the energy sector is witnessing a transformation towards decentralised and fluctuating energy production as well as participative consumers. The electricity market is the financial link between these two sides, and as such is becoming an increasingly dynamic marketplace. This is particularly apparent on the intraday market, where electricity can be traded and delivered with practically zero lead time. In contrast to the day-ahead market, where (e.g. hourly) prices are determined for the following day, forecasts on the intraday market turn out to be notoriously difficult. In fact, reliable forecasting methods do not exist yet, and it is even debated whether useful forecasts are possible at all.
To tackle this problem, we turn to Bayesian models, which allow for a complete probabilistic description. Forecasts made in this way have the advantage that their uncertainty is estimated as well, which practitioners can use to decide to what extent they want to incorporate forecasts into their trading.
Python code and data available.
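As an illustration of the general idea (not the project's actual model), here is a minimal sketch of a probabilistic forecast using a conjugate Bayesian linear regression on synthetic prices; the linear trend model and all numerical values are assumptions made for the example only.

# Minimal sketch: Bayesian linear regression on synthetic intraday prices.
# The predictive distribution provides both a point forecast and its uncertainty.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "prices": a linear trend plus noise stands in for real intraday data.
t = np.linspace(0.0, 1.0, 50)
y = 40.0 + 15.0 * t + rng.normal(scale=5.0, size=t.size)

X = np.column_stack([np.ones_like(t), t])      # design matrix (intercept, time)
sigma2 = 25.0                                  # assumed observation noise variance
tau2 = 100.0                                   # prior variance of the weights

# Conjugate Gaussian prior -> Gaussian posterior over the weights.
precision = X.T @ X / sigma2 + np.eye(2) / tau2
cov_post = np.linalg.inv(precision)
mean_post = cov_post @ X.T @ y / sigma2

# Predictive distribution for a future time point.
x_new = np.array([1.0, 1.1])
mean_pred = x_new @ mean_post
var_pred = sigma2 + x_new @ cov_post @ x_new
print(f"forecast: {mean_pred:.1f} +/- {np.sqrt(var_pred):.1f} (1 sigma)")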

Insurance companies use the statistics of historical climate data to estimate risks related to natural hazards like floods and storms. This works fine as long as the data are approximately stationary in time. Particularly in the last decade or so, due to climate change, it has become clear that maintaining the assumption of stationarity would be grossly negligent for risk estimates. Therefore, statistical methods are called for that can not only estimate the risk of extreme weather events, but also predict the increasing trend of such risks.
The first step in tackling this problem is to obtain high-resolution geological data that span a sufficiently long time period. With enough data, extreme value statistics becomes applicable for estimating the risks of extreme events. To capture the increasing trend, we plan to use autoregressive or equation learning methods to replace the parameters of the extreme value distributions with estimated functions of time.
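To illustrate the second step, the following sketch fits a generalised extreme value (GEV) distribution with a linearly drifting location parameter to synthetic annual maxima by maximum likelihood; the linear trend and all numbers are placeholders for the learned time functions mentioned above, not results of the project.

# Minimal sketch: a GEV distribution whose location parameter drifts linearly in
# time, fitted by maximum likelihood to synthetic annual maxima.
import numpy as np
from scipy.stats import genextreme
from scipy.optimize import minimize

rng = np.random.default_rng(1)
years = np.arange(60)
# Synthetic annual maxima with an increasing location parameter.
true_loc = 30.0 + 0.2 * years
data = genextreme.rvs(c=-0.1, loc=true_loc, scale=5.0, random_state=rng)

def neg_log_lik(params):
    mu0, mu1, log_scale, shape = params
    loc = mu0 + mu1 * years
    return -np.sum(genextreme.logpdf(data, c=shape, loc=loc,
                                     scale=np.exp(log_scale)))

fit = minimize(neg_log_lik, x0=[25.0, 0.0, np.log(5.0), -0.1], method="Nelder-Mead")
mu0, mu1, log_scale, shape = fit.x
print(f"estimated trend in location: {mu1:.3f} per year")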

Bayesian inference builds upon Bayes' formula, which combines a model likelihood describing the observations with a prior distribution on the parameters of the likelihood to yield a posterior distribution for these parameters. For the exact posterior distribution, a normalisation factor is required, which is known as the prior-predictive value, marginal likelihood, or model evidence. The last term already hints at a useful property: rewriting Bayes' formula on the level of models shows that the evidence is proportional to the probability of the likelihood being the correct model for the given observations. Due to this property, the model evidence is often attributed the principle of Occam's razor, i.e. striking the right balance between model expressivity and interpretability. However, the model evidence is often the result of a particularly nasty integral; the overwhelming majority of Bayesian inference applications therefore omit the normalisation step and are content with an unnormalised posterior distribution.
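In formulas, with parameters $\theta$ and observed data $D$, Bayes' formula and the evidence read

$$ p(\theta \mid D) = \frac{p(D \mid \theta)\, p(\theta)}{p(D)}, \qquad p(D) = \int p(D \mid \theta)\, p(\theta)\, \mathrm{d}\theta , $$

and on the level of models $M$, the same formula gives $p(M \mid D) \propto p(D \mid M)\, p(M)$, with the evidence $p(D \mid M)$ playing the role of the likelihood.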
To overcome this difficulty, we are developing a non-equilibrium integrator (NEQI) that generalises thermodynamic integration such that the required sampling can do without burn-in and thinning. Furthermore, adapting stochastic variants of the second law of thermodynamics (so-called fluctuation theorems) makes it possible to combine a multitude of sampling trajectories into a single estimate, with a deep exploration of the sample space as a consequence of the non-equilibrium nature of the procedure. The method is not limited to determining the model evidence, but can simultaneously estimate posterior averages at practically no added computational cost. We have a first Python implementation of the method, building on TensorFlow Probability with the acceleration package JAX, which is used for project 1) on probabilistic intraday forecasts.
Python code available.
Possible student project (MSc level or higher): The estimation of the model evidence is connected to a method called Annealed Importance Sampling (AIS). The exact correspondence between NEQI and AIS is unclear and could be the subject of the project, as could a comparison of the two methods in a simulation study (e.g. mixture models with a multimodal sample space).
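For orientation, here is a minimal sketch of standard AIS (not of NEQI itself) for a toy model whose evidence is known in closed form; the prior, likelihood, annealing schedule and proposal step size are all choices made for the example.

# Minimal sketch of Annealed Importance Sampling (AIS) for the model evidence
# of a toy model: prior N(0, 1), likelihood N(y | theta, 1) with a single
# observation y, so the exact evidence is N(y | 0, 2) for comparison.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(2)
y = 1.5
betas = np.linspace(0.0, 1.0, 101)        # annealing schedule from prior to posterior
n_chains = 2000

log_lik = lambda th: norm.logpdf(y, loc=th, scale=1.0)
log_prior = lambda th: norm.logpdf(th, loc=0.0, scale=1.0)

theta = rng.normal(size=n_chains)          # draws from the prior (beta = 0)
log_w = np.zeros(n_chains)                 # accumulated importance log-weights

for b_prev, b in zip(betas[:-1], betas[1:]):
    log_w += (b - b_prev) * log_lik(theta)
    # One random-walk Metropolis step targeting prior(theta) * likelihood(theta)**b
    prop = theta + 0.5 * rng.normal(size=n_chains)
    log_acc = (log_prior(prop) + b * log_lik(prop)
               - log_prior(theta) - b * log_lik(theta))
    accept = np.log(rng.uniform(size=n_chains)) < log_acc
    theta = np.where(accept, prop, theta)

log_evidence = np.log(np.mean(np.exp(log_w)))
print(f"AIS estimate: {log_evidence:.3f}, exact: {norm.logpdf(y, 0.0, np.sqrt(2)):.3f}")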

Deep learning strives for ever larger models with millions of parameters, at the cost of interpretability. In many applications, however, small models that adapt quickly to data shifts are more practical. In these cases the discipline of equation learning, or symbolic regression, is gaining popularity: instead of putting our trust in the generality argument of deep networks, the nonlinear equations themselves are estimated from the data. These equations include dynamical systems in particular. One approach to equation learning uses a library of basis functions to consider a vast number of candidate models. Greedy algorithms like stepwise regression are then used to single out the most likely model. We propose a much less greedy algorithm which builds on a combination of the computationally very cheap coefficient of determination, $R^2$, and an explicit expression for the Bayesian model evidence.
Python code available.
Possible student project (BSc-MSc level or higher): The method has not yet been applied to real data, which could be the subject of a student project. An improvement of the method in which the identified terms are iteratively subtracted from the target data as the learning proceeds is another idea for a student project.
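The following sketch illustrates the library-based setting on toy data: small candidate models drawn from a basis function library are ranked by $R^2$ and by the closed-form evidence of a Bayesian linear regression with a Gaussian prior. The library, the noise and prior variances, and the known-noise simplification are assumptions for the example, not the proposed algorithm itself.

# Minimal sketch of library-based equation learning: rank small candidate models
# by R^2 and by the closed-form evidence of Bayesian linear regression with a
# Gaussian prior on the coefficients (known noise variance for simplicity).
import itertools
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(3)
x = np.linspace(-2.0, 2.0, 100)
y = 1.0 * x - 0.5 * x**3 + rng.normal(scale=0.1, size=x.size)   # "measured" dynamics

# Basis function library of candidate terms.
library = {"1": np.ones_like(x), "x": x, "x^2": x**2, "x^3": x**3, "sin(x)": np.sin(x)}
sigma2, tau2 = 0.1**2, 10.0     # noise variance, prior variance of coefficients

def score(terms):
    Phi = np.column_stack([library[t] for t in terms])
    coef, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    r2 = 1.0 - np.sum((y - Phi @ coef) ** 2) / np.sum((y - y.mean()) ** 2)
    # Closed-form log-evidence: y ~ N(0, sigma2*I + tau2*Phi Phi^T).
    cov = sigma2 * np.eye(y.size) + tau2 * Phi @ Phi.T
    log_ev = multivariate_normal.logpdf(y, mean=np.zeros(y.size), cov=cov)
    return r2, log_ev

for k in (1, 2):
    for terms in itertools.combinations(library, k):
        r2, log_ev = score(terms)
        print(f"{'+'.join(terms):<12} R^2 = {r2:6.3f}   log-evidence = {log_ev:8.1f}")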

Large deviation theory (LDT) is devoted to asymptotic log-densities in the limit of large sums of random variables. These log-densities are, more precisely, known as rate functions or large deviation functions, and are often easier to determine or approximate than the full density, yet still capture the full tail information. A simple example is the central limit theorem with a quadratic rate function for the normal distribution. LDT builds on the convexity of the rate function and a linear scaling with the limit variable, both of which recently turned out to be violated for higher integrated moments of the Ornstein-Uhlenbeck process, a surprisingly simple exception to the otherwise broad applicability of LDT. Current research aims at determining the rate function for this and other examples of such anomalous scaling.
Python code available.
Possible student project (MSc level or higher): We have run an importance sampling simulation that reaches far into the tails of the densities in question. The results of this simulation are waiting to be compared with a controversially discussed claim of an exact rate function for the higher integrated moments of the Ornstein-Uhlenbeck process.
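As a starting point, the sketch below simulates Ornstein-Uhlenbeck paths with an Euler-Maruyama scheme and tabulates the empirical finite-time rate function of a time-integrated higher moment; plain Monte Carlo as used here only resolves the bulk, which is exactly why the project relies on importance sampling for the tails. All parameters are example choices.

# Minimal sketch: Euler-Maruyama simulation of the Ornstein-Uhlenbeck process
# dX = -X dt + sqrt(2) dW and the empirical finite-T rate function of the
# time-integrated fourth moment A_T = (1/T) * integral of X_t^4 dt.
import numpy as np

rng = np.random.default_rng(4)
dt, T, n_paths, power = 0.01, 20.0, 10000, 4
n_steps = int(T / dt)

x = rng.normal(size=n_paths)                    # start in the stationary state
a = np.zeros(n_paths)                           # accumulates the time integral
for _ in range(n_steps):
    x += -x * dt + np.sqrt(2.0 * dt) * rng.normal(size=n_paths)
    a += x**power * dt
a /= T

hist, edges = np.histogram(a, bins=30, density=True)
centers = 0.5 * (edges[1:] + edges[:-1])
rate = -np.log(hist, where=hist > 0, out=np.full_like(hist, np.nan)) / T
for c, r in zip(centers[::6], rate[::6]):
    print(f"a = {c:5.2f}   empirical rate = {r:6.3f}")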

A fundamental assumption of statistical physics is that, loosely speaking, our world is a giant Laplace experiment, in the sense that any system chooses its available microscopic states with equal probability. The probabilities of macroscopic states therefore follow directly from counting the number of microstates forming that macrostate. The thought ensemble of systems in equally likely microstates corresponding to a macrostate is known as the microcanonical ensemble.
This foundation of statistical physics and thermodynamics is challenged by attempts to derive the Laplace assumption from first principles, that is, from quantum mechanics. Intriguingly, in these attempts, distributions other than the uniform distribution arise, summarised under the term "diagonal ensemble". In quantum mechanics, the Schrödinger equation is in essence the time evolution equation for the occupancies of microstates, which can be phrased as a large set of coupled linear ordinary differential equations. As the preparation of a quantum system in an initial state is generally an inherently random process, the initial values for the Schrödinger equation are also set randomly. This deterministic process with random initial values, also known as a Liouville process, exhibits a family of stationary distributions for which the equal-probability condition can be checked in a novel way. An interesting mathematical structure emerges, which includes the uniform distribution and the distributions of the diagonal ensemble.
Python code available.
Possible student project (MSc level or higher): This project is almost done and has been hibernating on my shelf for some time now. Outstanding steps are a careful check of some tricky statistics for soundness and running some examples. The manuscript is half written. Students with an inclination towards statistical physics are welcome to help finish this project for publication as part of a thesis.
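To make the setup concrete, here is a minimal sketch of the Liouville-process picture: the Schrödinger equation as coupled linear ODEs with a random Hermitian matrix standing in for a physical Hamiltonian and a random initial state, comparing the long-time averaged occupancies with the diagonal-ensemble prediction. The matrix size and time grid are arbitrary example choices.

# Minimal sketch: the Schroedinger equation as coupled linear ODEs, i dc/dt = H c,
# with a random real symmetric H and a random initial state. The long-time
# average of the occupancies |c_n|^2 approximately reproduces the
# diagonal-ensemble prediction built from the eigenbasis of H.
import numpy as np

rng = np.random.default_rng(5)
N = 8
A = rng.normal(size=(N, N))
H = (A + A.T) / 2.0                       # random symmetric "Hamiltonian"

c0 = rng.normal(size=N) + 1j * rng.normal(size=N)
c0 /= np.linalg.norm(c0)                  # random normalised initial state

E, V = np.linalg.eigh(H)                  # eigenvalues and eigenvectors
a = V.conj().T @ c0                       # amplitudes in the energy eigenbasis

# Diagonal-ensemble occupancies of the original basis states n.
diag_ensemble = (np.abs(V) ** 2) @ (np.abs(a) ** 2)

# Long-time average of |c_n(t)|^2 from the exact solution c(t) = V exp(-iEt) a.
times = np.linspace(0.0, 500.0, 2000)
occ = np.zeros(N)
for t in times:
    c_t = V @ (np.exp(-1j * E * t) * a)
    occ += np.abs(c_t) ** 2
occ /= times.size

print("diagonal ensemble :", np.round(diag_ensemble, 3))
print("time average      :", np.round(occ, 3))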
