# Archive 2009

**December 9, 2009: Frank Redig (Universiteit Leiden)**

*Duality and hidden symmetries in interacting particle systems*

We consider a model of heat conduction, the so-called Brownian momentum process. We show that this model is dual to a particle system which is a natural analogue of the symmetric exclusion process, with attractive instead of repulsive interaction. We further show that this model is self-dual, and give some applications of these duality relations.

We also show, in a general context, how to obtain duality functions from symmetries of the generator.

The talk is based on joint work with C. Giardina, J. Kurchan and K. Vafayi.

**December 2, 2009: Shota Gugushvili (Eurandom / TU Eindhoven)**

*Nonparametric inference for discretely sampled Lévy processes*

Given a sample from a discretely observed Lévy process $X=(X_t)_{t\geq 0}$ of finite jump activity, we study the problem of nonparametric estimation of the Lévy density $\rho$ corresponding to the process $X$. We propose an estimator of $\rho$ that is based on a suitable inversion of the Lévy-Khintchine formula and a plug-in device. Our main result deals with an upper bound on the mean square error of the estimator of $\rho$ at a fixed point $x$. We also show that the estimator attains the minimax convergence rate over a suitable class of Lévy densities.

**November 25, 2009: Roberto Fernandez (Universiteit Utrecht)**

*Loss of Gibbsianness in un-quenching evolutions*

If an Ising ferromagnet at low temperature is subjected to a high-temperature spin-flip dynamics, the evolved measure may cease to be Gibbsian after a finite time. In this talk I will discuss this phenomenon. The talk will include a survey of the notion of Gibbsian and non-Gibbsian measures and a discussion of what is known and expected regarding the dynamical loss of Gibbsianness.

**November 11, 2009: Rik Lopuhaa (TU Delft)**

*Central limit theorem and influence function for the MCD estimators at general multivariate distributions*

The minimum covariance determinant (MCD) estimators of multivariate location and scatter are robust alternatives to the ordinary sample mean and sample covariance matrix. Nowadays they are used to determine robust Mahalanobis distances in a reweighting procedure, and as robust plug-ins in all sorts of multivariate statistical techniques which need a location and/or covariance estimate, such as principal component analysis, factor analysis, discriminant analysis and linear multivariate regression. For this reason, the distributional and the robustness properties of the MCD estimators are essential for conducting inference and performing robust estimation in several statistical models. Butler, Davies and Jhun (1993) show asymptotic normality only for the MCD location estimator, whereas the MCD covariance estimator is only shown to be consistent. Croux and Haesbroeck (1999) give the expression for the influence function of the MCD covariance functional and use this to compute limiting variances of the MCD covariance estimator. However, the expression is obtained under the assumption of existence, continuity and differentiability of the MCD functionals at perturbed distributions, which is not proven. Moreover, the computation of the limiting variances relies on the von Mises expansion of the estimator, which has not been established.

In this presentation we define the MCD functional by means of trimming functions which are in a wide class of measurable functions. The class is very flexible and allows a uniform treatment at general probability measures, including empirical measures and perturbed measures. We prove existence of the MCD functional for any multivariate distribution P and provide a separating ellipsoid property for the functional. Furthermore, we prove continuity of the functional, which also yields strong consistency of the MCD estimators. Finally, we derive an asymptotic expansion of the functional, from which we rigorously derive the influence function, and establish a central limit theorem for both MCD estimators. All results are obtained under very mild conditions on P, and essentially all conditions are automatically satisfied for distributions with a density. For distributions with an elliptically contoured density that is unimodal we do not need any extra condition, and one may recover the results in Butler, Davies and Jhun (1993) and Croux and Haesbroeck (1999).

**October 28, 2009: Nelly Litvak (Universiteit Twente)**

*The power law behaviour of the PageRank distribution.*

The PageRank algorithm was designed by Google to rank Web pages according to their importance. According to this algorithm, the importance scores of pages depend on the quantity and the quality of incoming links. We study and explain the properties of PageRank scores in complex information networks characterized by power laws. It is well known that on the Web the distributions of PageRank and in-degree follow power laws with the same exponent. We explain this similarity by presenting the PageRank distribution as a solution of a stochastic equation. Using this model, we apply analytical methods to derive the asymptotic behavior of the PageRank distribution. The obtained results are in good agreement with experimental data. Next, we suggest measuring the dependencies between power law parameters using the notion of angular measure developed within extreme value theory. This technique reveals that the WWW, the Wikipedia and the Growing Network graphs have completely different dependence structures. In our stochastic model, we can also derive the angular measure analytically, which allows us to quantify the proportion of pages that receive a high ranking due to a large in-degree. This is joint work with Yana Volkovich, Debora Donato, Werner Scheinhardt and Bert Zwart.
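
As a rough illustration of how scores depend on the quantity and quality of incoming links, the following is a minimal sketch of PageRank by power iteration. It is not taken from the talk: the damping factor 0.85, the uniform treatment of pages without out-links, the function name `pagerank` and the four-page toy graph are all assumptions made for this example.

```python
def pagerank(links, damping=0.85, tol=1e-12, max_iter=1000):
    """Power iteration for PageRank.

    links[i] is the list of pages that page i links to.
    The damping value and the dangling-page rule are illustrative assumptions.
    """
    n = len(links)
    rank = [1.0 / n] * n
    for _ in range(max_iter):
        # Every page receives a uniform "teleportation" share.
        new_rank = [(1.0 - damping) / n] * n
        for page, targets in enumerate(links):
            if targets:
                # A page passes its score on in equal parts to the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dangling page (no out-links): spread its score uniformly.
                for target in range(n):
                    new_rank[target] += damping * rank[page] / n
        if sum(abs(a - b) for a, b in zip(new_rank, rank)) < tol:
            return new_rank
        rank = new_rank
    return rank


# Hypothetical toy graph: pages 0, 1 and 3 link to page 2; page 2 links to page 0.
toy_links = [[2], [2], [0], [2]]
scores = pagerank(toy_links)
```

In this toy graph page 2 has the largest in-degree and obtains the highest score, while page 0 ranks second with a single incoming link because that link comes from a highly ranked page.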

**October 21, 2009: Judith Timmer (Universiteit Twente)**

*Cooperation in Tandem Lines*

We consider a number of servers in a tandem line. The servers may improve the efficiency of the system by redistributing their service capacities. This improvement is due to the reduction in the steady-state mean total number of customers in the tandem line. We investigate how the cost of the system after redistribution should be divided among the servers. For this we use tools from cooperative game theory.

**October 14, 2009: Noel van Erp & Pieter van Gelder (TU Delft)**

*Finding Proper Non-informative Priors for Regression Coefficients*

It is a known fact that in problems of Bayesian model selection improper priors may lead to biased conclusions. In this presentation we first give a short introduction to the procedure of Bayesian model selection. We then demonstrate for a simple model selection problem, involving two regression models, how improper uniform priors for the regression coefficients will automatically exclude the model with the most regression coefficients. Having established the problematic nature of improper priors for this particular case, we proceed to derive a parsimonious proper uniform prior, first for univariate regression models, and then generalize this result to multivariate regression models.

**October 7, 2009: Sasha Gnedin (Universiteit Utrecht)**

*Quasi-exchangeability and generalizations of de Finetti's theorem*

A random sequence is quasi-exchangeable if its distribution is quasi-invariant under permutations. We discuss generalizations of de Finetti's theorem for this setting, and make connections to a boundary problem for lattice random walks. Major attention is given to the q-exchangeability, for which it is shown that all ergodic sequences are obtainable from a single measure on the space of infinite permutations.

**September 30, 2009: Charles Berger (Nederlands Forensisch Instituut)**

*Modern forensic methodology*

In this talk we will introduce the modern forensic methodology, where the aim is to find objective evidence using the likelihood ratio as a measure for the strength of that evidence given a set of hypotheses. We will illustrate this approach with some example projects, such as the inference of identity of source for varied traces such as ballpoint ink, speech, paper structure and fingermarks.

**September 23, 2009: Dierk Schleicher (Jacobs-Universitaet Bremen)**

*Dynamics of transcendental entire functions and a dimension paradox.*

**September 16, 2009: Paul Vitanyi (CWI), at 4 o'clock**

*Similarity by Compression*

We survey a new area of parameter-free similarity distance measures useful in data-mining, pattern recognition, learning and automatic semantics extraction. Given a family of distances on a set of objects, a distance is universal up to a certain precision for that family if it minorizes every distance in the family between every two objects in the set, up to the stated precision (we do not require the universal distance to be an element of the family). We consider similarity distances for two types of objects: literal objects that as such contain all of their meaning, like genomes or books, and names for objects. The latter may have literal embodiments like the first type, but may also be abstract like "red" or "christianity". For the first type we consider a family of computable distance measures corresponding to parameters expressing similarity according to particular features between pairs of literal objects. For the second type we consider similarity distances generated by web users corresponding to particular semantic relations between the (names for the) designated objects. For both families we give universal similarity distance measures, incorporating all particular distance measures in the family. In the first case the universal distance is based on compression and in the second case it is based on Google page counts related to search terms. In both cases experiments on a massive scale give evidence of the viability of the approaches.
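
To give a feel for a compression-based distance between literal objects, here is a minimal sketch of the normalized compression distance, NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C denotes compressed length. The abstract does not state this formula; using it here, and using `zlib` as a stand-in for the compressor, are assumptions of this example, as are the sample byte strings.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed data (zlib is an assumed stand-in compressor)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance between two byte strings."""
    cx = compressed_size(x)
    cy = compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


# Hypothetical sample objects: two similar texts and one unrelated byte pattern.
english_a = b"the quick brown fox jumps over the lazy dog " * 20
english_b = b"the quick brown fox leaps over the lazy cat " * 20
binary = bytes(range(256)) * 4

similar = ncd(english_a, english_b)
dissimilar = ncd(english_a, binary)
```

Objects that share structure compress well together, so `similar` comes out smaller than `dissimilar`. No feature of the objects has to be specified in advance, which is the sense in which such a measure is parameter-free.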

**September 9, 2009: Anne Fey (TU Delft)**

*A splitting headache.*

We consider the following game. Start with a pile of mass n at the origin of the rectangular grid, and mass h<1 at every other site. Mass may be moved by 'splitting piles', that is, one may take all the mass from one site and divide it evenly among the neighbors. There are restrictions though: one may only split piles of mass at least 1, and one may not stop before all the piles have mass less than 1. We call T the set of sites where at least one split was performed. Will the mass spread over the whole grid, or will T be finite? In the first case, how does it spread? In the latter case, what is the size and shape of T, depending on h and n? How many splits are needed? Do any of these answers depend on the order of splitting? This game is related to the abelian sandpile growth model, for which a number of limiting shape results are known. The splitting game, however, is more difficult to analyse because it is not abelian. Nevertheless, we found several limiting shape results for the splitting game, some of which have no counterpart in the abelian sandpile growth model.
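
The rules above can be tried out with a small simulation. The sketch below is only an illustration under stated assumptions: it takes the grid to be the square lattice with four neighbors per site, it splits piles in first-in first-out order (one arbitrary choice, and the abstract stresses that the order may matter), and it stops after `max_splits` splits because the game need not terminate. The names `splitting_game`, `max_splits` and the return format are hypothetical.

```python
from collections import deque
from fractions import Fraction

# Assumption: square lattice, each site has four nearest neighbours.
NEIGHBOURS = ((1, 0), (-1, 0), (0, 1), (0, -1))


def splitting_game(n, h, max_splits=10_000):
    """Play the splitting game with mass n at the origin and mass h elsewhere.

    Returns (mass, split_sites, splits, finished), where split_sites plays
    the role of the set T. Exact fractions avoid rounding errors near mass 1.
    """
    h = Fraction(h)
    mass = {(0, 0): Fraction(n)}      # sites missing from the dict hold mass h
    split_sites = set()
    queue = deque([(0, 0)])           # candidate piles, handled first-in first-out
    splits = 0
    while queue and splits < max_splits:
        site = queue.popleft()
        m = mass.get(site, h)
        if m < 1:
            continue                  # only piles of mass at least 1 may be split
        mass[site] = Fraction(0)      # take all the mass from the site ...
        split_sites.add(site)
        splits += 1
        for dx, dy in NEIGHBOURS:     # ... and divide it evenly among the neighbours
            nb = (site[0] + dx, site[1] + dy)
            mass[nb] = mass.get(nb, h) + m / 4
            if mass[nb] >= 1:
                queue.append(nb)
    finished = all(m < 1 for m in mass.values())
    return mass, split_sites, splits, finished


final_mass, T, num_splits, done = splitting_game(n=4, h=0)
```

With h = 0 the total mass in `final_mass` always equals n, which gives a simple sanity check. Replacing the queue by a different selection rule is one way to explore how the outcome depends on the order of splitting.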

**August 17, 2009: Schmuel Gal (University of Haifa)**

*Coordinated Linear Search*

**June 17, 2009: Ian Melbourne (University of Surrey)**

*Statistical properties of Lorentz gases*

**June 10, 2009: Henk Bruin (TU Delft and University of Surrey)**

*Li-Yorke chaos and Cantor attractors for interval maps.*

Whereas most unimodal interval maps are either chaotic in any mathematical sense of the word, or have periodic attractors attracting almost every point, there are unimodal maps with more interesting attractors. Such attractors are Cantor sets, and the dynamics on them is less chaotic: e.g. entropy and Lyapunov exponents are 0. In this talk, I want to explain that regarding the existence of Li-Yorke pairs (i.e., points $x,y$ such that $0 = \liminf_n |f^n(x)-f^n(y)| < \limsup_n |f^n(x)-f^n(y)|$), these attractors can still be quite interesting. (This work is joint with Victor Lopez-Jimenez, Murcia.)

**June 3, 2009: Guus Balkema (UvA and ETH)**

**Room F (different location!)**

*Level sets and dependence*

The behaviour of multivariate extremes is determined by the tail behaviour of the marginals and by the (asymptotic) dependence structure of the distribution.

Multivariate dfs tell us little about the distribution of the probability mass and the structure of large sample clouds. In this talk we focus on densities, which are assumed to be continuous and unimodal, and to have level sets which are of the same shape asymptotically. This gives a good impression of what sample clouds from the distribution look like.

I will discuss recent work with Natalia Lysenko and Paul Embrechts (both ETH, Zurich) on the relation between the shape of level sets and asymptotic dependency.

**May 27, 2009: Bert van Es (UvA)**

*Two dimensional uniform kernel deconvolution*

In a general deconvolution model we have a sample of n independent X_{i} which are equal to the sum of independent and unobserved Y_{i} and Z_{i}, so X_{i}=Y_{i}+Z_{i}. We assume that the Z_{i} have a known distribution. The aim is to estimate the probability density f of the Y_{i} from this sample of X_{i}. Since the density of the observed X_{i} is equal to the convolution of the densities of the Y_{i} and Z_{i}, one can derive a density estimator of f by Fourier inversion and kernel estimation of the density of the observations. This approach has proven to be useful in many deconvolution models, i.e. for different known distributions of the Z_{i}. However, it fails in the model where the known density of the Z_{i} is uniform. This model is usually called uniform deconvolution. We will present an alternative method based on kernel density estimation and a different, non-Fourier, type of inversion of the convolution operator in this model. Following earlier work for the one dimensional model, cf. Van Es (2002), we will use the same approach in the two dimensional model where the X_{i}, Y_{i} and Z_{i} are two dimensional random vectors and where the distribution of the Z_{i} is uniform on the unit square. We will derive expansions for the bias and variance and present some simulated examples.

**Reference**

[1] B. van Es (2002), *Combining kernel estimators in the uniform deconvolution model*, arXiv:math.PR/0211079.

**May 20, 2009: Ruud van Ommen (TU Delft)**

*Monitoring of complex chemical systems*

*- a chemical engineer's view on applying non-linear signal analysis*

In the chemical process industry, various pieces of equipment show complex behaviour. A good example is a fluidized bed: a vessel filled with powder, through which a gas is blown upward at such a velocity that the particles (the grains of the powder) become fluidized (i.e., they are 'floating' on the gas stream). These particles are constantly colliding with each other and with the vessel wall, and form density patterns on a scale much larger than the particle diameter.

In various industrial operations (e.g., burning biomass for "green electricity" or production of polymers) the particles can become sticky and form large lumps of material. This can eventually lead to an unscheduled shut-down of the bed. Currently, the process industry typically measures just average properties such as pressure or temperature. These measurements often do not give an early warning of the operational problem.

We propose to base the analysis on dynamic signals (typically pressure measurements at ~200 Hz) to obtain more information from the system. We reconstruct an 'attractor' in the state space, a technique borrowed from chaos analysis. By following this attractor in time, it can be observed whether the attractor, and thus the monitored system, is changing, which is an indication that particles are starting to stick together.

The above described technique of 'attractor comparison' is very sensitive, but we would like to improve the selectivity of the monitoring: unimportant events in the system should not be detected. To this end, we recently developed a screening methodology to find the most selective combination of filtering method and analysis method. Some practical results from this methodology will be shown.

**May 13, 2009: Michael Schroeder (VU Amsterdam and Universität Mannheim)**

*A perspective on the integral of geometric Brownian motion, with applications to finance*

The talk will concentrate on exponential functionals of Brownian motion, with the values of Asian options in the Black-Scholes (BS) model as typical examples from finance. We survey how their structure theory emerges by way of establishing interconnections between stochastics and complex analysis on the upper half plane, taking work of Yor and Dufresne as starting points. Explicit valuation of Asian options furnishes the points of reference for measuring the progress here; we demonstrate how this finds expression in at least two "arbitrary precision" methods, whose potential we illustrate by way of numerical examples.

**April 22, 2009: Nelly Litvak (UT)**

*The power law behaviour of the PageRank distribution.*

The PageRank algorithm was designed by Google to rank Web pages according to their importance. According to this algorithm, the importance scores of pages depend on the quantity and the quality of incoming links. We study and explain the properties of PageRank scores in complex information networks characterized by power laws. It is well known that on the Web the distributions of PageRank and in-degree follow power laws with the same exponent. We explain this similarity by presenting the PageRank distribution as a solution of a stochastic equation. Using this model, we apply analytical methods to derive the asymptotic behavior of the PageRank distribution. The obtained results are in good agreement with experimental data. Next, we suggest measuring the dependencies between power law parameters using the notion of angular measure developed within extreme value theory. This technique reveals that the WWW, the Wikipedia and the Growing Network graphs have completely different dependence structures. In our stochastic model, we can also derive the angular measure analytically, which allows us to quantify the proportion of pages that receive a high ranking due to a large in-degree.

This is joint work with Yana Volkovich, Debora Donato, Werner Scheinhardt and Bert Zwart.

**April 15, 2009: Eric Cator (TU Delft)**

*Hammersley process with random weights: stationary measures and Busemann functions (joint work with Leandro Pimentel)*

In this talk I will discuss the Hammersley process with random weights, both as a longest path percolation and as an interacting fluid process. We will show how we can use the Busemann function, as defined in geometry for metric spaces, to find mixing stationary measures for the Hammersley process. This gives us insight into the Busemann function for the classical Hammersley process, and a new description of the multi-class process. If time permits, we will discuss how these methods could be used to prove the cube-root behavior of the Hammersley process with random weights.

**April 1, 2009: Christophe Croux (K.U.Leuven)**

*Classification Efficiencies for Robust Discriminant Analysis*

Linear discriminant analysis is typically carried out using Fisher's method. This method relies on the sample averages and covariance matrices computed from the different groups constituting the training sample. Since sample averages and covariance matrices are not robust, it has been proposed to use robust estimators of location and covariance instead, yielding a robust version of Fisher's method. In this paper relative classification efficiencies of the robust procedures with respect to the classical method are computed. Second order influence functions appear to be useful for computing these classification efficiencies. It turns out that, when using an appropriate robust estimator, the loss in classification efficiency at the normal model remains limited.

**March 11, 2009: Ronald Geskus (Academisch Medisch Centrum)**

*Three Equivalent Forms of the Cause-Specific Cumulative Incidence Estimator and the Fine and Gray Model*

The standard estimator for the cause-specific cumulative incidence in a competing risks setting with left-truncated and/or right-censored data can be written in two alternative forms: one as an inverse probability weighted estimator and another as a product-limit estimator. The product-limit estimator is based on a simple estimator of the subdistribution hazard, i.e. the hazard that corresponds to the cause-specific cumulative incidence curve. As a consequence, estimation of the cause-specific cumulative incidence and regression on the subdistribution hazard can be performed using standard software for survival analysis if the software allows for inclusion of time-dependent weights.

**March 4, 2009: Bas Kleijn (UvA)**

*Differentiability and the semiparametric Bernstein-Von Mises theorem*

The Bernstein-Von Mises theorem provides a detailed relation between frequentist and Bayesian statistical methods in smooth, parametric models. It states that the posterior distribution converges to a normal distribution centred on the maximum-likelihood estimator with covariance proportional to the inverse Fisher information. In this talk we consider conditions under which the analogous assertion holds for the marginal posterior of a parameter of interest in smooth semiparametric estimation problems. Central is a no-bias condition, requiring that the nuisance posterior converges at a rate fast enough to suppress the bias. (Joint work with P. Bickel.)

**February 25, 2009: Kees Oosterlee (TU Delft / CWI)**

*On Efficient Methods for Pricing Options with and Without Early Exercise*

In this presentation we will discuss option pricing techniques in the context of numerical integration. Based on a Fourier-cosine expansion of the density function we can efficiently obtain European option prices and corresponding hedge parameters. Moreover a whole vector of strike prices can be valued in one computation. This technique can speed up calibration to plain vanilla options substantially. This pricing method, called the COS method, is generalized to pricing Bermudan and barrier options. We explain this and present a calibration study based on CDS spreads (if time permits).
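
The Fourier-cosine expansion of a density can be sketched in a few lines. The example below only recovers a density from its characteristic function on a truncated interval [a, b]; it does not price options. The cosine coefficients are computed as 2/(b-a) times the real part of φ(kπ/(b-a)) exp(-ikπa/(b-a)), with the first term halved, which is the usual cosine series on [a, b] expressed through the characteristic function φ. The standard normal test case, the interval [-10, 10], the 64 terms and the function names are assumptions chosen for illustration.

```python
import cmath
import math


def cos_density(char_fn, x, a, b, n_terms=64):
    """Approximate a density at x by a truncated Fourier-cosine series on [a, b].

    char_fn(u) must return the characteristic function E[exp(i*u*X)].
    """
    width = b - a
    total = 0.0
    for k in range(n_terms):
        u = k * math.pi / width
        # Cosine coefficient obtained from the characteristic function.
        coeff = (2.0 / width) * (char_fn(u) * cmath.exp(-1j * u * a)).real
        if k == 0:
            coeff *= 0.5              # the first term of the series is halved
        total += coeff * math.cos(u * (x - a))
    return total


def std_normal_cf(u):
    """Characteristic function of the standard normal distribution."""
    return cmath.exp(-0.5 * u * u)


approx_at_zero = cos_density(std_normal_cf, 0.0, a=-10.0, b=10.0)
exact_at_zero = 1.0 / math.sqrt(2.0 * math.pi)
```

Because only the characteristic function is needed, the same routine applies to models whose density is not available in closed form. Pricing then amounts to integrating a payoff against the cosine terms, a step this sketch leaves out.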

**February 18, 2009: Mark Veraar (TU Delft)**

*Probability and moment estimates for centered random variables*

In this talk we discuss some sharp inequalities for centered random variables and their applications. In particular we give sharp lower bounds for P(X>0) in terms of moment estimates on X. Here X is a nonzero centered random variable.

**February 11, 2009: Ionica Smeets (UL)**

*The LLL-algorithm: how it originated and how we can use it as a multidimensional continued fraction algorithm*

Hendrik Lenstra, Arjen Lenstra and László Lovász published their famous LLL-algorithm for basis reduction in 1982. In 2007 the 25th birthday of this algorithm was celebrated in Caen, France with a three-day conference. Lenstra, Lenstra, Lovász and close bystander Peter van Emde Boas started the conference by telling how the algorithm emerged from misunderstandings, errors and coincidences. Ionica Smeets wrote down these memories for an upcoming book from Springer about the LLL-algorithm. The first part of her talk will be this nice historic story. In the second part she will talk about her own research on continued fractions and explain how you can iterate the LLL-algorithm to find a series of multidimensional continued fractions.

**February 4, 2009: Gerard Hooghiemstra (TU Delft)**

*First passage percolation on random graphs with finite mean degrees*

*joint work with: Shankar Bhamidi and Remco van der Hofstad*

This talk is about first passage percolation on the configuration model. Assuming that each edge of the graph has an independent exponentially distributed edge weight, we derive distributional asymptotics for the minimum weight between two randomly chosen vertices in the network, as well as for the number of edges on the least weight path.