# Archive 2014

**December 18, 2014: Marjan Sjerps (Dutch Forensic Institute and University of Amsterdam)**

When: Thursday December 18th, 16.00

Where: TU Delft, Faculty EWI, Mekelweg 4, Snijderzaal (LB 01.010).

*How strong is the evidence?*

Forensic statistics is concerned with questions like: how can we evaluate the strength of forensic evidence such as DNA profiles, tire marks, or chemical analyses? How can we derive the combined evidential strength and how can we communicate this to a lay judicial decision maker (a judge or jury member)?

In this talk I will focus on various topics that caused some debate, such as the way to combine evidence. I will illustrate these topics with a few examples based on casework by the Netherlands Forensic Institute.

**December 11, 2014: Luca Avena (Leiden University)**

When: Thursday December 11th, 16.00

Where: TU Delft, Faculty EWI, Mekelweg 4, Snijderzaal (LB 01.010).

*A random walk driven by an exclusion process: hydrodynamic limits and path large deviations*

We consider a RW (Random Walk) in a dynamic random environment given by a simple exclusion process on the integer lattice.

Under proper space-time rescaling, we first prove a hydrodynamic limit theorem for the exclusion as seen by the walker when starting out of equilibrium.

Consequently, we derive an ODE describing the macroscopic evolution of the RW. Then, we prove a large deviation principle with an explicit rate functional for the paths of this process. We obtain these results without explicit knowledge of the invariant measure of the exclusion seen by this walker.

This talk is based on two joint papers with T. Franco, M. Jara, and F. Voellering.

**December 4, 2014: Izabela Petrykiewiz (Institut Fourier - Grenoble)**

When: Thursday December 4th, 16.00

Where: TU Delft, Faculty EWI, Mekelweg 4, Snijderzaal (LB 01.010).

*Analytic properties of certain Fourier series arising from modular forms and their connection to continued fractions*

In my talk, I will speak about the local behaviour (differentiability and Hölder regularity exponent) of certain Fourier series, which are related to modular forms. I will focus on functions arising from Eisenstein series. In our analysis we apply two methods which were developed in the course of the study of the Riemann series: the method of Itatsu (1980) and the wavelet methods proposed by Jaffard (1996). We find that both the differentiability properties and the Hölder regularity exponent at an irrational x are related to the continued fraction expansion of x, in a very precise way.

I will explain all the concepts and give the main ideas of the methods in an accessible way; no advanced knowledge of analysis or number theory is required to follow the talk.

**November 27, 2014: Lutz Dümbgen (University of Bern)**

This talk is also part of the Van Dantzig seminar series.

When: Thursday November 27th, **14.15** (note the different time!)

Where: TU Delft, Faculty EWI, Mekelweg 4, Snijderzaal (LB 01.010).

*Confidence bands for distribution functions: The Law of the iterated logarithm and shape constraints*

In the first part I'll present new goodness-of-fit tests and confidence bands for a distribution function. These are based on suitable versions of the Law of the Iterated Logarithm for stochastic processes on the unit interval. It is shown that these procedures share the good power properties of the Berk and Jones (1979) test and Owen's (1995) confidence band in the tail regions while gaining considerably in the central region.

In the second part these confidence bands are combined with bi-log-concavity, a new shape constraint on a distribution function F: Both log(F) and log(1-F) are assumed to be concave functions. A sufficient condition for bi-log-concavity is log-concavity of the density f = F', but the new constraint is much weaker and includes, for instance, distributions with multiple modes. We present some characterizations of bi-log-concavity. Then it is shown that combining this constraint with suitable confidence bands leads to nontrivial confidence regions for various functionals of F such as moments of arbitrary order.

Finally I'll explain briefly how to extend logistic regression models via bi-log-concavity.

This is joint work with Jon A. Wellner (Seattle), Petro Kolesnyk (Bern) and Ralf Wilke (Copenhagen).

**November 20, 2014: Rik Lopuhaä (TU Delft)**

When: Thursday November 20th, 16.00

Where: TU Delft, Faculty EWI, Mekelweg 4, Snijderzaal (LB 01.010).

*Functional central limit theorems in survey sampling with applications to economic indicators*

In survey sampling one samples n individuals from a fixed population of size N according to some sampling design, and for each individual in the sample one observes the value of a particular quantity. On the basis of the observed sample, one is interested in estimating population features of this quantity, such as the population total or the population average. Well-known estimators for population features that take into account the inclusion probabilities corresponding to the specific sampling design are the Horvitz-Thompson estimator and Hájek's estimator. Distribution theory for these estimators is somewhat limited, partly due to the dependence that is inherent to several sampling designs and partly due to the more complex nature of particular population features. In this talk I will present a number of functional central limit theorems for different types of empirical processes obtained from suitably centering the Horvitz-Thompson and Hájek empirical distribution functions. Basically, these results are obtained merely under conditions on higher order inclusion probabilities corresponding to the sampling design at hand. This makes the results generally applicable and allows more complex sampling designs that go beyond the classical simple random sampling or Poisson sampling. As an application I will use the results in combination with the functional delta method to establish the limit distribution of estimators for certain economic indicators, such as the poverty rate and the Gini index.
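As a minimal illustration of the two estimators named in this abstract (not taken from the talk itself), the Python sketch below computes the Horvitz-Thompson estimate of a population total and the Hájek-type estimate of a population mean from sampled values and their first-order inclusion probabilities. The function names and the toy numbers are assumptions made for illustration only.

```python
def horvitz_thompson_total(values, inclusion_probs):
    """Estimate the population total as the sum of y_i / pi_i over the sample."""
    return sum(y / p for y, p in zip(values, inclusion_probs))


def hajek_mean(values, inclusion_probs):
    """Estimate the population mean: HT total divided by the HT estimate of N."""
    estimated_size = sum(1.0 / p for p in inclusion_probs)
    return horvitz_thompson_total(values, inclusion_probs) / estimated_size


# Toy sample (hypothetical numbers): three sampled units with unequal
# first-order inclusion probabilities.
sample_values = [10.0, 20.0, 30.0]
sample_probs = [0.5, 0.25, 0.25]

ht_total = horvitz_thompson_total(sample_values, sample_probs)
hajek = hajek_mean(sample_values, sample_probs)
```

Units with a small inclusion probability receive a large weight, which is how both estimators compensate for unequal sampling chances.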

**November 13, 2014: No seminar.**

There is no seminar scheduled on this date, because of the Stochastics Meeting in Lunteren.

**November 6, 2014: Peter Spreij (University of Amsterdam)**

When: Thursday November 6th, 16.00

Where: TU Delft, Faculty EWI, Mekelweg 4, Timmanzaal (LB 01.170).

*Approximation of nonnegative systems by finite impulse response convolutions*

We pose the deterministic, nonparametric, approximation problem for scalar nonnegative input/output systems via finite impulse response convolutions, based on repeated observations of input/output signal pairs. The problem is converted into a nonnegative matrix factorization with special structure for which we use Csiszár's I-divergence as the criterion of optimality. Conditions are given, on the input/output data, that guarantee the existence and uniqueness of the minimum. We propose a standard algorithm of the alternating minimization type for I-divergence minimization, and study its asymptotic behavior. We also provide a statistical version of the minimization problem and give its large sample properties.
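For readers unfamiliar with the optimality criterion mentioned above, here is a small Python sketch of the I-divergence between two nonnegative vectors, using the standard definition as the sum of a log(a/b) - a + b over the entries. The function name and the example vectors are hypothetical, and the sketch does not attempt the alternating minimization algorithm from the talk.

```python
import math


def i_divergence(a, b):
    """I-divergence between nonnegative vectors a and b (b must be positive
    wherever a is positive)."""
    total = 0.0
    for x, y in zip(a, b):
        if x > 0:
            total += x * math.log(x / y) - x + y
        else:
            # Limit of x*log(x/y) - x + y as x -> 0.
            total += y
    return total


same = i_divergence([1.0, 2.0], [1.0, 2.0])   # identical vectors
different = i_divergence([1.0], [2.0])        # mismatched vectors
```

Unlike the Kullback-Leibler divergence between probability vectors, this criterion does not require the entries to sum to one, which suits nonnegative signals.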

Joint work with Lorenzo Finesso.

**October 30, 2014: Piet Groeneboom (TU Delft)**

When: Thursday October 30th, 16.00

Where: TU Delft, Faculty EWI, Mekelweg 4, Snijderzaal (LB 01.010).

*How many words did Shakespeare know?*

Questions as alluded to in the title have been studied by several statisticians, for example by Good and Efron. One could say that the statistical issue is the problem of estimating the support of a distribution. Traditional maximum likelihood tends to underestimate this support.
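The underestimation can be seen in a toy simulation. The sketch below is an illustration under stated assumptions (a uniform distribution over a hypothetical vocabulary and a plug-in count of distinct observed types), not the method discussed in the talk.

```python
import random


def observed_support_size(sample):
    """Plug-in estimate of the support size: the number of distinct types seen."""
    return len(set(sample))


rng = random.Random(0)        # fixed seed so the sketch is reproducible
true_support = 1000           # hypothetical vocabulary size
sample = [rng.randrange(true_support) for _ in range(500)]

estimate = observed_support_size(sample)
```

Because unseen types are never counted, the plug-in estimate can only fall at or below the true support size, and it falls strictly below whenever the sample is smaller than the vocabulary.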

In recent years the matter has been picked up again by computer scientists, in particular by Orlitsky and his colleagues in San Diego. They developed a method, called "pattern maximum likelihood", which is claimed not to suffer the underestimation problems of the traditional methods. However, the computation of these estimates is a rather non-trivial matter.

I will present some results on this problem, which has been brought to my attention by Richard Gill.

**October 29, 2014: Andrea Krajina (Georgia Augusta University Göttingen)**

When: Wednesday October 29th, 14.00

Where: TU Delft, Faculty EWI, Mekelweg 4, Poelzaal (LB 01.220)

*An M-estimator of spatial tail dependence*

Tail dependence models for distributions attracted to a max-stable law are fitted using observations above a high threshold.

To cope with spatial, high-dimensional data, a rank-based M-estimator is proposed relying on bivariate margins only.

A data-driven weight matrix is used to minimize the asymptotic variance.

Empirical process arguments show that the estimator is consistent and asymptotically normal.

Its finite-sample performance is assessed in simulation experiments involving popular max-stable processes perturbed with additive noise.

An analysis of wind speed data from the Netherlands illustrates the method.

**October 23, 2014: Laurens de Haan (Universidade de Lisboa & Erasmus University)**

When: Thursday October 23rd, 16.00

Where: TU Delft, Faculty EWI, Mekelweg 4, Snijderzaal (LB 01.010).

*On the block maxima method in extreme value theory*

In extreme value theory there are two fundamental approaches, both widely used: the block maxima (BM) method and the peaks-over-threshold (POT) method. Whereas much theoretical research has gone into the POT method, the BM method has not been studied thoroughly. The present paper aims at providing conditions under which the BM method can be justified. We also provide a theoretical comparative study of the two methods, which is somewhat consistent with the vast literature on comparing the methods all based on simulated data and fully parametric models. The results indicate that the BM method is a rather efficient method under practical conditions.
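To make the difference between the two approaches concrete, the following Python sketch shows how each one selects extreme observations from a series. The function names, the exponential toy data, the block size and the threshold are all assumptions for illustration, and no tail estimator is fitted.

```python
import random


def block_maxima(series, block_size):
    """Split the series into consecutive full blocks and keep each block's maximum."""
    n_blocks = len(series) // block_size
    return [max(series[i * block_size:(i + 1) * block_size]) for i in range(n_blocks)]


def peaks_over_threshold(series, threshold):
    """Keep the excesses (x - threshold) of all observations above the threshold."""
    return [x - threshold for x in series if x > threshold]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
data = [rng.expovariate(1.0) for _ in range(1000)]  # i.i.d. toy data

maxima = block_maxima(data, block_size=50)            # one value per block
excesses = peaks_over_threshold(data, threshold=3.0)  # only values above 3.0
```

The BM method always keeps exactly one observation per block, whereas the POT method keeps however many observations happen to exceed the threshold.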

In this paper we restrict attention to the i.i.d. case and focus on the probability weighted moment (PWM) estimators of Hosking, Wallis and Wood (1985).

Joint work with Ana Ferreira.

**October 16, 2014: Laurens van der Maaten (TU Delft)**

When: Thursday October 16th, 16.00

Where: TU Delft, Faculty EWI, Mekelweg 4, Snijderzaal (LB 01.010).

*Learning with Marginalized Corrupted Features*

The goal of machine learning is to develop predictors that generalize well to test data. Ideally, this is achieved by training on very large (infinite) training data sets that capture all variations in the data distribution. In the case of finite training data, an effective solution is to extend the training set with artificially created examples - which, however, is also computationally costly.

In this talk, we propose to corrupt training examples with noise from known distributions within the exponential family and present a novel learning algorithm, called marginalized corrupted features (MCF), that trains robust predictors by minimizing the expected value of the loss function under the corrupting distribution - essentially learning with infinitely many (corrupted) training examples. We show empirically on a variety of data sets that MCF classifiers are practical, can be trained efficiently, may generalize substantially better to test data, and are also more robust to feature deletion at test time.

The talk describes joint work with Kilian Weinberger, Minmin Chen, and Stephen Tyree (Washington University in St. Louis).

**October 9, 2014: No seminar.**

There is no seminar scheduled on this date, because of the Van Dantzig Seminar in Amsterdam.

**October 2, 2014: Anne Ruiz-Gazen (Université de Toulouse 1 Capitole)**

When: Thursday October 2nd, 16.00

Where: TU Delft, Faculty EWI, Mekelweg 4, Snijderzaal (LB 01.010).

*Taking into account auxiliary information for estimating complex finite population parameters in surveys.*

Precise estimation of complex parameters such as Gini indices or other inequality measures is currently a problem of interest. In order to improve on the precision of existing estimators, we propose a general class of estimators for complex parameters which allows one to take into account univariate auxiliary information assumed to be known for each unit in the population. We will begin the presentation with some basics on estimation in survey sampling theory with and without auxiliary information when the inference is based on the design. Model-assisted and calibration approaches will be detailed in a simple framework. Then, we will present our new approach, which is a nonparametric model-assisted method. Its advantage is that it leads to a unique system of sampling weights, whatever the parameter to estimate and the variable of interest.

The asymptotic properties of this class of estimators will be given in the particular case of B-splines nonparametric estimators. Simulations using data from the French Employment survey will illustrate the good properties of the proposal. Some comparisons with existing methods will also be presented.

Joint work with Camelia Goga, Institut de Mathématiques de Bourgogne, France.

**September 25, 2014: Otilia Boldea (Tilburg University)**

When: Thursday September 25th, 16.00

Where: TU Delft, Faculty EWI, Mekelweg 4, Timmanzaal (LB 01.170).

*Bootstrap-based Tests for Multiple Structural Changes in Linear Models with Endogenous Regressors*

In this paper, we derive the limiting distributions of break-point tests in models estimated via two-stage least squares, when the first stage is unstable. We show that the presence of break-points in the first stage renders the asymptotic distribution of the second-stage break-point tests non-pivotal. To obtain critical values, we propose bootstrapping these tests via the recursive-design wild bootstrap and the fixed-design wild bootstrap, and prove their asymptotic validity. We provide simulation results that show good finite sample performance of our procedures, even in cases where the tests remain pivotal. Via an application to the US interest rate reaction function that illustrates our methods, we confirm recent findings of instability in the Taylor rule.

**September 18, 2014: Yuan Ju (University of York)**

When: Thursday September 18th, 16.00

Where: TU Delft, Faculty EWI, Mekelweg 4, Snijderzaal (LB 01.010).

*Resolve queueing conflicts: a non-cooperative approach*

Complementary to the prolific axiomatic and mechanism design studies on queueing problems, this paper proposes a strategic bargaining approach to resolve queueing conflicts.

Given a situation where players with different waiting costs have to wait in a queue in order to be served, they firstly compete with each other to win a specific position in the queue. Then, the winner can decide to take up the position or sell it to the others. In the former case, the rest of the players will proceed to compete for the remaining positions; whereas for the latter case the seller can propose a queue with corresponding payments to the others which can be accepted or not. Depending on which position players are going to compete for, the subgame perfect equilibrium outcome of the corresponding mechanism coincides with one of the two best known and axiomatized allocation rules for queueing problems: the maximal and the minimal transfer rules, while an efficient queue is always formed in equilibrium.

The analysis discovers a striking relationship between pessimism and optimism in this type of decision making. The robustness of the result is also discussed.

**September 11, 2014: Tobias Mueller (Utrecht University)**

When: Thursday September 11th, 16.00

Where: TU Delft, Faculty EWI, Mekelweg 4, Snijderzaal (LB 01.010).

*Hyperbolic random geometric graphs*

Random geometric graphs are constructed by sampling n points at random from some probability distribution on the plane and connecting two points when the distance is less than some parameter r.
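The construction just described can be sketched in a few lines of Python. This is the standard Euclidean version only, assuming points drawn uniformly from the unit square; the function name and parameter values are hypothetical, and the hyperbolic model of the talk is not attempted here.

```python
import math
import random


def random_geometric_graph(n, r, seed=None):
    """Sample n uniform points in the unit square and join pairs closer than r."""
    rng = random.Random(seed)
    points = [(rng.random(), rng.random()) for _ in range(n)]
    edges = [
        (i, j)
        for i in range(n)
        for j in range(i + 1, n)
        if math.dist(points[i], points[j]) < r
    ]
    return points, edges


points, edges = random_geometric_graph(n=100, r=0.15, seed=1)
```

Each edge is stored once as a pair `(i, j)` with `i < j`; checking all pairs costs quadratic time, which is fine for a sketch of this size.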

In this talk I will discuss some preliminary results on what happens when the points of the random geometric graph live in the hyperbolic plane rather than the ordinary, Euclidean plane.

This variation on the model leads to spectacularly different behaviour from the standard, Euclidean version.

(Based on ongoing joint works with Bode, Broman, Fountoulakis and Tykesson)

**Summer Break**

**June 4, 2014: Frank van der Meulen**

When: Wednesday June 4th, 15.45h

Place: TU Delft, Faculty EWI, Mekelweg 4, Timmanzaal (LB 01.170)

*Data augmentation for estimating parameters in diffusion processes*

In this talk I will review some basics of Markov Chain Monte Carlo algorithms and a particular case of it, known as data-augmentation (DA). DA is popular in Bayesian statistics for dealing with missing data problems. The problem of estimating parameters in diffusion processes when only discrete time observations are available can naturally be treated as a missing data problem. However, it has been reported that a standard implementation of DA will not work if unknown parameters are in the diffusion coefficient. I will present a variation of the standard DA algorithm that efficiently deals with this issue. Particular instances of this algorithm have appeared before, but without rigorous proof. The presently proposed algorithm is more general and a fully rigorous proof is available. This concerns joint work with Moritz Schauer (TU Delft).

**May 28, 2014: Prof. Jurg Husler**

When: Wednesday May 28, 15.45h

Place: TU Delft, Faculty EWI, Mekelweg 4, Snijderzaal (LB 01.010)

*How can we use interval censored data in the estimation of extreme value parameters?*

When historical extreme measurements exist, they are typically not accurate. They are usually known to be bounded by upper and lower values. It means we have interval-censored data. These data are combined with accurate measurements which are not historical and are more recently collected. This situation is motivated by a real data example with historical data. We deal with several approaches for the estimation of the parameters of the distribution tail in such a case. We discuss their theoretical properties and show some simulations.

**May 21, 2014: Gioia Carinci**

When: Wednesday May 21, 15.45h

Place: TU Delft, Faculty EWI, Mekelweg 4, Snijderzaal (LB 01.010)

*Mass transport via current reservoirs: a microscopic model for a free boundary problem*

Stationary non-equilibrium states are characterized by the presence of steady currents flowing through the system, and a basic question in statistical mechanics is to understand their structure. Many papers have been devoted to the subject in the context of stochastic interacting particle systems. Usually current density is produced by fixing two different densities at the boundary. We want instead to implement the mass transport by introducing current reservoirs that produce a given current by sending in particles from the left at some rate and taking out particles from the rightmost occupied site at the same rate. The removal mechanism is therefore of topological rather than metric nature, since the determination of the rightmost occupied site requires a knowledge of the entire configuration. This prevents the use of correlation function techniques. I will discuss recent results obtained in the study of these topics, whose final purpose is to provide a particle version of a free boundary-type problem.

**May 14, 2014: Frank Redig**

When: Wednesday May 14, 15.30h

Place: TU Delft, Faculty EWI, Mekelweg 4, Timmanzaal (LB 01.170)

*Abstract duality theory, via examples*

Abstract: Duality is a technique in the theory of Markov processes that allows one to link two processes via a so-called duality function. It has important applications in interacting particle systems and in diffusion processes of population genetics.

Recently, starting from models of heat conduction we discovered several new dualities, and a new machinery to generate processes having good duals. The method is based on using different representations of a Lie algebra naturally associated to the generator.

I will explain the method starting from simple examples of diffusion processes which can be used in genetics and statistical physics. I will also illustrate how the method can be implemented constructively, yielding a new process which is a natural extension of the asymmetric exclusion process, constructed from the Casimir operator and a coproduct of the quantum Lie algebra SU_2(q).

**May 7, 2014: Francesca Collet**

When: Wednesday May 7, 15.30h

Place: TU Delft, Faculty EWI, Mekelweg 4, Timmanzaal (LB 01.170)

*Self-sustained oscillations in some dissipative mean field interacting particle systems*

We consider various mean field interacting particle systems, whose dynamics are subject to a sort of dissipation. We show that, in the infinite volume limit, limit cycles may appear even though the particles have no intrinsic periodicity and no periodic force is applied.

**March 26, 2014: Carlo Carminati**

When: Wednesday March 26th, 15.45h

Place: TU Delft, Faculty EWI, Mekelweg 4, Lipkenszaal (LB 01.150)

*Continued fractions with SL(2,Z)-branches: combinatorics and entropy.*

We study the dynamics of a family $K_\alpha$ of discontinuous interval maps whose (infinitely many) branches are Moebius transformations in SL(2,Z), and which arise as the critical-line case of the family of (a, b)-continued fractions recently studied by S. Katok and I. Ugarcovici.

We provide an explicit construction of the bifurcation locus E for this family, showing it is parametrized by Farey words and it has Hausdorff dimension zero.

As a consequence, we prove that the metric entropy of $K_\alpha$ is analytic outside the bifurcation set but not differentiable at points of E, and that the entropy is monotone as a function of the parameter.

(Joint work with G. Tiozzo and S. Isola.)

**March 19, 2014: Max Welling**

When: Wednesday March 19th, 15.45h

Place: TU Delft, Faculty EWI, Mekelweg 4, Snijderszaal (LB 01.010)

*Austerity in MCMC-Land: Cutting the computational Budget*

Will MCMC survive the “Big Data revolution”? Current MCMC methods for posterior inference compute the likelihood of a model *for every data-case* in order to make a single binary decision: to accept or reject a proposed parameter value. Compare this with stochastic gradient descent that uses O(1) computations per iteration. In this talk I will discuss two MCMC algorithms that cut the computational budget of an MCMC update. The first algorithm, “stochastic gradient Langevin dynamics” (and its successor “stochastic gradient Fisher scoring”), performs updates based on stochastic gradients and ignores the Metropolis-Hastings step altogether. The second algorithm uses an approximate Metropolis-Hastings rule where accept/reject decisions are made with high (but not perfect) confidence based on sequential hypothesis tests. We argue that for any finite sampling window, we can choose hyper-parameters (stepsize, confidence level) such that the extra bias introduced by these algorithms is more than compensated by the reduction in variance due to the fact that we can draw more samples. We anticipate a new framework where bias and variance contributions to the sampling error are optimally traded off.
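A rough sketch of the first idea, stochastic gradient Langevin dynamics, is given below for a deliberately simple model: Gaussian data with unit variance and unknown mean, with a Gaussian prior. The model, the constant step size, the batch size and all names are assumptions chosen for illustration; the sketch is not the speaker's implementation, and a decreasing step-size schedule would be needed for the usual convergence guarantees.

```python
import random


def sgld_gaussian_mean(data, n_steps, step_size, batch_size, prior_var=100.0, seed=0):
    """SGLD for the mean theta of N(theta, 1) data with a N(0, prior_var) prior.

    Each update uses a mini-batch gradient plus Gaussian noise with variance
    equal to the step size; there is no accept/reject step.
    """
    rng = random.Random(seed)
    n_data = len(data)
    theta = 0.0
    samples = []
    for _ in range(n_steps):
        batch = rng.sample(data, batch_size)
        grad_log_prior = -theta / prior_var
        # Mini-batch gradient of the log-likelihood, rescaled to the full data set.
        grad_log_lik = (n_data / batch_size) * sum(x - theta for x in batch)
        noise = rng.gauss(0.0, step_size ** 0.5)
        theta += 0.5 * step_size * (grad_log_prior + grad_log_lik) + noise
        samples.append(theta)
    return samples


rng = random.Random(42)
data = [rng.gauss(2.0, 1.0) for _ in range(1000)]  # toy data, true mean 2.0

samples = sgld_gaussian_mean(data, n_steps=5000, step_size=1e-4, batch_size=10)
kept = samples[1000:]  # discard an initial burn-in period
posterior_mean_estimate = sum(kept) / len(kept)
```

Each update touches only `batch_size` data points rather than the full data set, which is the computational saving the abstract refers to; the price is extra bias from the constant step size and the mini-batch noise.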

**March 12, 2014: Anca Hanea**

When: Wednesday March 12th, 15.45h

Place: TU Delft, Faculty EWI, Mekelweg 4, van der Poelzaal (LB 01.220)

*The asymptotic distribution of the determinant of a random correlation matrix*

Random correlation matrices are studied both for their theoretical interest and for their importance in applications. The determinant of a matrix is one of the most basic and important matrix functions, and this makes studying the distribution of the determinant of a random correlation matrix of paramount importance. Our main result gives the asymptotic distribution of the determinant of a random correlation matrix sampled from a uniform distribution over the space of d x d correlation matrices. Several spin-off results are proven along the way, and a connection with the law of the determinant of general random matrices is investigated.

**March 5, 2014: Dorota Kurowicka**

When: Wednesday March 5th, 15.45h

Place: TU Delft, Faculty EWI, Mekelweg 4, van der Poelzaal (LB 01.220)

*Joint density of correlations in correlation matrix with chordal sparsity patterns*

Regular vines, a class of graphical models, have recently gained popularity as a backbone for specifying d-dimensional copulae using d-1 bivariate and (d-2)(d-1)/2 conditional bivariate copulae arranged according to the vine structure.

We give a short introduction to vines and concentrate on another application of vines: the partial correlation parameterization of a correlation matrix. Bedford and Cooke (2001) showed that partial correlations on a regular vine are algebraically independent and that they determine the correlation matrix. This property has been essential in determining a joint density of correlations in the correlation matrix as the product of densities for partial correlations on the vine (Joe (2006) and Lewandowski et al. (2009)). Choosing symmetric beta densities, linearly transformed to the interval (-1,1), with parameters depending on the level of the vine leads to the uniform density over the set of correlation matrices and allows one to compute the volume of the set of correlation matrices in d(d-1)/2-dimensional space. In this talk we present an extension of the results in the above papers to partially specified matrices with a chordal sparsity pattern.

**February 26, 2014: Ross Kang**

When: Wednesday February 26th, 15.45h

Place: TU Delft, Faculty EWI, Mekelweg 4 – van der Poelzaal (room LB 01-220)

*Colouring with sparse or dense sets in random graphs*

Determination of the chromatic number (and related structural parameters like the clique and stability numbers) of the binomial random graph is a classic problem in probabilistic combinatorics. It also has close links to statistical physics and computer science, and is certainly core to random graph theory. The original question about the chromatic number of the random graph, one that appeared in a seminal paper of Erdős and Rényi in 1960, is only now (close to being) answerable. The first half of the talk will be a survey of this area.

Afterwards, I will describe a line of research I have been active in, which pursues the above where the colour sets instead satisfy certain natural conditions (rather than necessarily being a stable set or clique). Time permitting, I will discuss applications in extremal combinatorics, mentioning joint work with Colin McDiarmid, Bruce Reed and Alex Scott, and with János Pach, Viresh Patel and Guus Regts.

**February 19, 2014: Geurt Jongbloed**

When: Wednesday February 19th, 15.45h

Place: TU Delft, Faculty EWI, Mekelweg 4 – van der Poelzaal (room LB 01-220)

*Nonparametric function estimation: imposing shape and smoothness*

In many inverse problems and censoring models, the aim is to estimate a distribution function nonparametrically based on data from a distribution that is related (but not equal) to the distribution of interest. There are methods to incorporate order- and more general shape constraints on estimators of the distribution function. In addition, it can also be reasonable to assume smoothness properties of the underlying distribution function.

In this presentation I will give an overview of methods currently used to combine shape constraints and smoothness for function estimation. The methods will be illustrated using various censoring models and inverse problems.

**February 12, 2014: John Einmahl**

When: Wednesday February 12th, 15.45h

Place: TU Delft, Faculty EWI, Mekelweg 4 – van der Poelzaal (room LB 01-220)

*Bridging centrality and extremity: refining empirical data depth using extreme value theory*

A data depth is a measure of centrality of a point with respect to a given distribution or data set. It provides a natural center-outward ordering of multivariate data points and yields systematic nonparametric multivariate analyses. In particular, the approaches derived from geometric depths (e.g. halfspace depth and simplicial depth) are especially useful since they generally reflect accurately the true probabilistic geometry underlying the data. However, the empirical geometric depths are defined to be zero outside the convex hull of the data set. This property has greatly restricted the utility of depth approaches in applications where the extreme outlying probability mass may be the focal point, such as in problems of classification or quality control charts with small false alarm rates. To overcome this shortcoming, we propose to apply extreme value theory to refine the empirical estimator of half-space depth for points in the tails of the data set. This proposal provides an important linkage between data depth, which is useful for inference on centrality, and extreme value theory, which is useful for inference on extremity. The refined estimator of the half-space depth can thus extend depth utilities beyond the data hull and greatly broaden the applicability of data depth. We show that the refined depth is uniformly "ratio-consistent" on a very large region and that it substantially improves upon the original empirical estimator. This can be immediately translated into improvement for inference approaches based on depth. In a detailed simulation study we show the improvement of the estimator and how this leads to much better performance of depth applications, especially in the directions of multivariate classification and construction of nonparametric control charts. This is joint work with Jun Li (UC Riverside) and Regina Y. Liu (Rutgers University).
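The shortcoming described above, zero empirical depth outside the convex hull, is easy to reproduce. The Python sketch below approximates the empirical halfspace depth in the plane by minimizing over a finite grid of directions; the grid, the function name and the uniform toy data are assumptions for illustration, and the extreme-value refinement proposed in the talk is not attempted.

```python
import math
import random


def approx_halfspace_depth(x, data, n_directions=360):
    """Approximate empirical halfspace depth of point x in the plane.

    For each direction u on a finite grid, compute the fraction of data points
    in the closed halfplane {z : u . (z - x) >= 0}; the depth is the minimum.
    Using finitely many directions can only overestimate the exact depth.
    """
    depth = 1.0
    for k in range(n_directions):
        angle = 2.0 * math.pi * k / n_directions
        ux, uy = math.cos(angle), math.sin(angle)
        count = sum(
            1 for (px, py) in data
            if ux * (px - x[0]) + uy * (py - x[1]) >= 0.0
        )
        depth = min(depth, count / len(data))
    return depth


rng = random.Random(0)  # fixed seed so the sketch is reproducible
data = [(rng.random(), rng.random()) for _ in range(200)]  # toy data in the unit square

inside = approx_halfspace_depth((0.5, 0.5), data)     # central point
outside = approx_halfspace_depth((10.0, 10.0), data)  # far outside the data hull
```

The point far outside the data receives depth exactly zero, so the empirical depth cannot rank two such outlying points against each other; this is the gap the refined estimator is meant to fill.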