# Statistics

## Comparing the goodness of fit of regression models

### a) Questions to be addressed

How can the goodness of fit of regression models be assessed?
Different in-sample and out-of-sample measures of regression quality are to be presented and applied to a data set.
For this purpose, several regression models (e.g. with different numbers of explanatory variables) have to be estimated. Their explanatory power is first assessed over an in-sample period; they are then used to forecast an out-of-sample period.

### b) Data set to be used

A selection of different data sets on the topic will be provided by the supervisor.
From these, 2-3 data sets (e.g. stock returns and commodity market returns) are to be selected and analysed empirically.
Based on the data set, a factor model has to be estimated: returns at time t+1 (e.g. of the DAX, the EUR/USD exchange rate, or a commodity) are explained by macroeconomic and microeconomic factors at time t.

### c) Additional information for processing

Both the in-sample fit and the out-of-sample forecast accuracy must be analysed.
The data set is split into two parts in a ratio of 2/3 to 1/3. The first part forms the in-sample period, in which the best regression model is selected based on goodness-of-fit criteria (AIC, BIC, R²). The choice of criteria has to be justified.
The second part of the data set (1/3) forms the out-of-sample period. For it, "real" forecasts are made: the value of the dependent variable at t+1 is predicted using only information available at t. This requires a true out-of-sample forecasting methodology, i.e. a rolling window or an expanding window, which has to be programmed in the statistical software R.
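
The difference between the two window schemes can be sketched as follows. This is a minimal illustration in Python of the logic only (the assignment itself requires R), and it assumes a single explanatory factor and simulated data. The names `ols_fit`, `one_step_forecasts`, `factor`, and `ret` are hypothetical and not part of the assignment.

```python
import random


def ols_fit(x, y):
    """Fit y = a + b * x by ordinary least squares and return (a, b)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def one_step_forecasts(factor, ret, n_in, scheme):
    """Forecast ret[target] from factor[target - 1] for each out-of-sample target."""
    forecasts = []
    for target in range(n_in, len(ret)):
        # Training pairs are (factor[s], ret[s + 1]) with s + 1 < target,
        # i.e. only data already observed when the forecast is made.
        start = 0 if scheme == "expanding" else target - n_in
        x = factor[start:target - 1]
        y = ret[start + 1:target]
        a, b = ols_fit(x, y)
        forecasts.append(a + b * factor[target - 1])
    return forecasts


# Simulated data (an assumption for illustration): ret[t + 1] depends on factor[t].
rng = random.Random(1)
n = 300
factor = [rng.gauss(0.0, 1.0) for _ in range(n)]
# ret[0] has no lagged factor and is never used in any training pair.
ret = [0.0] + [0.5 * factor[t] + rng.gauss(0.0, 1.0) for t in range(n - 1)]

n_in = (2 * n) // 3  # first 2/3 of the sample is the in-sample period
actual = ret[n_in:]

forecasts = {}
rmse = {}
for scheme in ("rolling", "expanding"):
    fc = one_step_forecasts(factor, ret, n_in, scheme)
    forecasts[scheme] = fc
    rmse[scheme] = (sum((f - a) ** 2 for f, a in zip(fc, actual)) / len(actual)) ** 0.5
    print(scheme, round(rmse[scheme], 3))
```

The rolling window keeps the estimation sample at a fixed length and drops the oldest observation at each step, while the expanding window keeps all past observations. In both cases the model is re-estimated before every forecast, so no out-of-sample information enters the estimation.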

### Introductory literature

• Introductory Statistics with R by Dalgaard (available in the library)
• Time Series Models for Business and Economic Forecasting by Franses, van Dijk and Opschoor, Cambridge University Press (available via interlibrary loan)
• Probability and Statistics with R by Ugarte, Militino and Arnholt (available in the library)
• Angewandte Statistik: Methodensammlung mit R by Sachs/Hedderich (available in the library)
• Introductory Econometrics by Wooldridge (available in the library)
• The R Book by Crawley (available in the library)

Resources for getting started in R are provided by the supervisor.

## Tests for the normal distribution

### a) Questions to be worked on

• Different tests for normality (Jarque-Bera, Kolmogorov-Smirnov, χ² goodness-of-fit test) are to be presented and explained in the students' own words, covering both the underlying idea and the mathematical implementation.
• The tests to be used have to be selected by the students themselves: three different tests have to be described and applied. Since there are about 40 different tests for the normal distribution, there is a wide choice. Tests that are already implemented in R packages should be selected.
• Graphical analyses (QQ plots etc.) should be used as a starting point for the analysis of normality.
• In the literature section, it must be discussed why the normal distribution is of great importance in general and in the financial market context in particular. Relevant literature has to be selected, summarized in the students' own words, and supported by the students' own conclusions.
• The tests must be carried out empirically in R. In addition, a small simulation study conducted in R must examine the type I error and the type II error of each test.
• The plots for the tests and the graphical analyses are to be produced in R.
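
For orientation, two of the test statistics named above can be written as follows, using standard textbook definitions rather than anything specific to this assignment. Here $n$ is the sample size, $\hat S$ and $\hat K$ are the sample skewness and kurtosis, $F_n$ is the empirical distribution function, and $F_0$ is the hypothesised normal distribution function:

$$
JB = \frac{n}{6}\left(\hat S^2 + \frac{(\hat K - 3)^2}{4}\right), \qquad D_n = \sup_x \left| F_n(x) - F_0(x) \right|
$$

Under the null hypothesis, $JB$ is asymptotically $\chi^2$-distributed with 2 degrees of freedom. The classical Kolmogorov-Smirnov critical values assume a fully specified $F_0$; if the mean and variance are estimated from the same sample, a correction such as the Lilliefors test is needed.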

### b) Data set and simulation to be used

• A selection of different data sets on the topic will be provided by the supervisor.
• From these, 2-3 data sets (e.g. stock returns and commodity market returns) are to be selected and analysed empirically.
• In a small simulation study, the quality of the tests is to be discussed, in particular the methodological and empirical aspects of type I and type II errors.
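
One possible structure for such a simulation study is sketched below, in Python purely to show the logic (the assignment itself requires R). It assumes the Jarque-Bera test with its asymptotic critical value, a sample size of 200, and a Student-t distribution with 3 degrees of freedom as the non-normal alternative. All of these choices, and the function names, are illustrative assumptions.

```python
import math
import random


def jarque_bera(x):
    """Return the Jarque-Bera statistic n/6 * (S^2 + (K - 3)^2 / 4)."""
    n = len(x)
    mean = sum(x) / n
    m2 = sum((v - mean) ** 2 for v in x) / n
    m3 = sum((v - mean) ** 3 for v in x) / n
    m4 = sum((v - mean) ** 4 for v in x) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    return n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)


def student_t(rng, df):
    """Draw one Student-t variate as Z / sqrt(chi2_df / df)."""
    chi2 = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df))
    return rng.gauss(0.0, 1.0) / math.sqrt(chi2 / df)


def rejection_rate(draw, n, reps, alpha, rng):
    """Share of simulated samples for which the test rejects normality."""
    # For 2 degrees of freedom the chi-square survival function is exp(-x / 2),
    # so the critical value has the closed form -2 * ln(alpha).
    critical = -2.0 * math.log(alpha)
    rejections = 0
    for _ in range(reps):
        sample = [draw(rng) for _ in range(n)]
        if jarque_bera(sample) > critical:
            rejections += 1
    return rejections / reps


rng = random.Random(42)
n, reps, alpha = 200, 1000, 0.05

# Null hypothesis true: the rejection rate estimates the type I error.
type_1 = rejection_rate(lambda r: r.gauss(0.0, 1.0), n, reps, alpha, rng)
# Null hypothesis false (heavy tails): one minus the rejection rate
# estimates the type II error.
power = rejection_rate(lambda r: student_t(r, 3), n, reps, alpha, rng)
type_2 = 1.0 - power

print(f"estimated type I error:  {type_1:.3f}")
print(f"estimated type II error: {type_2:.3f}")
```

Repeating the experiment for several sample sizes and alternatives shows how the type II error of each test depends on both.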

## Tests for the i.i.d. property of time series data

### a) Questions to be worked on

• Different tests of the i.i.d. property are to be presented and explained, covering both the underlying idea and the mathematical implementation.
• The tests have to be carried out empirically on a real data set using the statistical software R.
• A small simulation study should also check whether the tests detect an existing dependence and which of the selected tests is "better".

### b) Data set and simulation to be used

• A selection of different data sets on the topic will be provided by the supervisor.
• From these, 2-3 data sets (e.g. stock returns and commodity market returns) are to be selected and analysed empirically.
• A small simulation study (in R) should check whether the tests can detect an existing dependence.
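
A minimal sketch of such a check, in Python purely to show the logic (the assignment itself requires R). As an assumed example it uses the Ljung-Box statistic $Q = n(n+2)\sum_{k=1}^{h} \hat\rho_k^2/(n-k)$ with $h = 2$ lags, and an AR(1) process as the dependent alternative. The choice of test, the parameter values, and the function names are illustrative, not prescribed by the topic.

```python
import math
import random


def ljung_box(x, lags):
    """Return Q = n(n+2) * sum_{k=1..lags} rho_k^2 / (n - k)."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    denom = sum(d * d for d in dev)
    q = 0.0
    for k in range(1, lags + 1):
        rho_k = sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        q += rho_k ** 2 / (n - k)
    return n * (n + 2) * q


def ar1_series(rng, n, phi, burn_in=50):
    """Simulate x_t = phi * x_{t-1} + e_t; phi = 0 gives i.i.d. noise."""
    x, prev = [], 0.0
    for t in range(n + burn_in):
        prev = phi * prev + rng.gauss(0.0, 1.0)
        if t >= burn_in:
            x.append(prev)
    return x


def rejection_rate(phi, n, reps, alpha, rng):
    """Share of simulated series for which the test rejects at level alpha."""
    # With 2 lags, Q is asymptotically chi-square with 2 degrees of freedom,
    # whose critical value has the closed form -2 * ln(alpha).
    critical = -2.0 * math.log(alpha)
    hits = sum(ljung_box(ar1_series(rng, n, phi), 2) > critical for _ in range(reps))
    return hits / reps


rng = random.Random(7)
size = rejection_rate(0.0, 200, 1000, 0.05, rng)   # no dependence present
power = rejection_rate(0.3, 200, 1000, 0.05, rng)  # AR(1) dependence present

print(f"rejection rate without dependence: {size:.3f}")
print(f"rejection rate with dependence:    {power:.3f}")
```

Comparing the two rejection rates across tests gives one way to judge which test is "better". Applying the same statistic to squared observations targets dependence in the variance (volatility clustering), which a test on the raw series can miss.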

### Introductory literature

• Asset Price Dynamics, Volatility and Prediction by Taylor (available in the library)
• Probability and Statistics with R by Ugarte, Militino and Arnholt (available in the library)
• Angewandte Statistik: Methodensammlung mit R by Sachs/Hedderich (available in the library)
• The R Book by Crawley (available in the library)

Resources for getting started with R will be provided by the supervisor.

## Contact

Ordinarius (Full Professor)
Faculty of Business and Economics
• Room 2317 (Building J)