Archive 2018

December 17, 2018: Barbara Terhal (TU Delft, QuTech)

When: Monday December 17th, 16:00
Where: TU Delft, Faculty EWI, Mekelweg 4, EWI-Lecture hall F 

The sign problem in quantum physics: room for algorithms and optimization

Quantum physical problems are described by a sparse Hermitian matrix called a Hamiltonian, which obeys a certain locality structure allowing for an efficient description. A subset of such Hamiltonians has been called sign-problem free or "stoquastic", as the eigenvector of their smallest eigenvalue is the eigenvector of the largest eigenvalue of a nonnegative matrix. The connection with nonnegative matrices has allowed the quantum physics community to develop various Quantum Monte Carlo methods for simulating features of these Hamiltonians over the past 30 years. As the non-negativity of a matrix is basis-dependent, local basis changes which preserve the locality structure of the Hamiltonian can remove the sign problem. We report on our new results on finding such local basis changes algorithmically for subclasses of Hamiltonians.
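As a small illustration of the statements above, the following sketch works in the usual convention that a real symmetric matrix is stoquastic in a given basis when all its off-diagonal entries are non-positive. The function names and the toy matrices are assumptions made for this example and are not taken from the talk. The sketch checks the sign condition, recovers the lowest eigenvector by power iteration on the nonnegative matrix c*I - H, and shows a diagonal sign change removing the sign problem of a 2x2 matrix.

```python
import math

def is_stoquastic(h, tol=1e-12):
    """True if every off-diagonal entry of the real symmetric matrix h is <= 0."""
    n = len(h)
    return all(h[i][j] <= tol for i in range(n) for j in range(n) if i != j)

def sign_conjugate(h, signs):
    """Return D h D for D = diag(signs) with entries +1 or -1 (a simple basis change)."""
    n = len(h)
    return [[signs[i] * h[i][j] * signs[j] for j in range(n)] for i in range(n)]

def perron_vector(h, iterations=500):
    """Power iteration on c*I - h, which is entrywise nonnegative when h is stoquastic.

    The dominant eigenvector of c*I - h is the eigenvector of the smallest
    eigenvalue of h. Returns a unit vector.
    """
    n = len(h)
    c = max(h[i][i] for i in range(n)) + 1.0  # shift keeps the diagonal positive
    m = [[(c if i == j else 0.0) - h[i][j] for j in range(n)] for i in range(n)]
    v = [1.0] * n
    for _ in range(iterations):
        w = [sum(m[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

# Toy example: a 3-site hopping matrix with non-positive off-diagonal entries.
h = [[2.0, -1.0, 0.0], [-1.0, 2.0, -1.0], [0.0, -1.0, 2.0]]
ground_state = perron_vector(h)  # all entries are positive

# A matrix with a sign problem in one basis, cured by flipping one basis vector.
x = [[0.0, 1.0], [1.0, 0.0]]
cured = sign_conjugate(x, [1, -1])
```

The sign flip is the simplest possible basis change; the talk concerns finding local basis changes of this kind for much larger, structured Hamiltonians.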
 

December 03, 2018: Frank van der Meulen (TU Delft)

When: Monday December 3rd, 16:00
Where: TU Delft, Faculty EWI, Mekelweg 4, EWI-Lecture hall D@ta

Simulation of hypo-elliptic conditioned diffusions

Consider a diffusion that is constructed as a strong solution to a stochastic differential equation (SDE). Parameter estimation from discrete-time data is hard, due to intractability of the likelihood. To deal with this problem, a popular approach is to use a data-augmentation method, where the latent path is imputed. This imputation boils down to simulation of conditioned diffusions. Over the past decade this area of research has received considerable attention. Virtually all of the proposed methods assume that each component of the SDE is driven by its own Brownian motion. Unfortunately, if this is not the case, they break down. I will discuss how this problem can be dealt with for a wide class of SDE-models.
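To make the setting concrete, here is a minimal sketch of an SDE in which only one component is driven by Brownian motion. The model (an integrated Ornstein-Uhlenbeck process), the function name and all parameters are assumptions chosen for illustration. The code performs plain forward Euler-Maruyama simulation; it is not the conditioned-diffusion method of the talk.

```python
import math
import random

def simulate_integrated_ou(x0, v0, sigma, dt, steps, rng=None):
    """Euler-Maruyama path of dX = V dt, dV = -V dt + sigma dW.

    Only V receives noise directly; X is driven through V, which makes the
    diffusion hypo-elliptic. Returns a list of (x, v) pairs of length steps + 1.
    """
    rng = rng or random.Random()
    x, v = x0, v0
    path = [(x, v)]
    for _ in range(steps):
        dw = rng.gauss(0.0, math.sqrt(dt))  # Brownian increment over one step
        x, v = x + v * dt, v - v * dt + sigma * dw
        path.append((x, v))
    return path

path = simulate_integrated_ou(0.0, 1.0, sigma=0.5, dt=0.01, steps=1000,
                              rng=random.Random(1))
```

Methods that assume every coordinate has its own noise term cannot be applied to the first coordinate here, which is the difficulty the abstract refers to.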
 

November 19, 2018: Márton Balázs (Bristol University)

When: Monday November 19th, 16:00
Where: TU Delft, Faculty EWI, Mekelweg 4, EWI-Lecture hall D@ta

Jacobi triple product via the exclusion process

I will give a brief overview of very simple, hence maybe less investigated
structures in interacting particle systems: reversible product blocking
measures. These turn out to be more general than most people would think, in
particular asymmetric simple exclusion and nearest-neighbour asymmetric zero
range processes both enjoy them. But a careful look reveals that these two are
really the same process. Exploitation of this fact gives rise to the Jacobi
triple product formula - an identity previously known from number theory and
combinatorics. I will show you the main steps of deriving it from pure
probability this time, and I hope to surprise my audience as much as we got
surprised when this identity first popped up in our notebooks.
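For reference, one standard form of the Jacobi triple product identity states that, for |x| < 1 and y nonzero, the product over m >= 1 of (1 - x^(2m)) (1 + x^(2m-1) y^2) (1 + x^(2m-1) / y^2) equals the sum over all integers n of x^(n^2) y^(2n). The short sketch below checks this numerically with truncated product and sum; the function names and the truncation level are assumptions made for this example, and the check is no substitute for the probabilistic derivation of the talk.

```python
def triple_product_lhs(x, y, terms=40):
    """Truncated infinite product side of the Jacobi triple product identity."""
    product = 1.0
    for m in range(1, terms + 1):
        product *= (1 - x ** (2 * m))
        product *= (1 + x ** (2 * m - 1) * y ** 2)
        product *= (1 + x ** (2 * m - 1) / y ** 2)
    return product

def triple_product_rhs(x, y, terms=40):
    """Truncated two-sided sum side of the Jacobi triple product identity."""
    return sum(x ** (n * n) * y ** (2 * n) for n in range(-terms, terms + 1))

lhs = triple_product_lhs(0.3, 1.2)
rhs = triple_product_rhs(0.3, 1.2)
```

Both sides converge very quickly for |x| well below 1, so a few dozen terms suffice.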
 

October 29, 2018:  Joe P. Chen (Colgate University)

When: Monday October 29th, 16:00
Where: TU Delft, Faculty EWI, Mekelweg 4, EWI-Lecture hall D@ta

Hydrodynamic limit of the boundary-driven exclusion process on the Sierpinski gasket

The exclusion process is a well-known interacting particle system which exhibits a rich variety of phenomena. In particular, limit theorems for the asymmetric exclusion process in 1D have been studied intensely over the past few years. A natural question is whether similar analysis can be made on higher-dimensional state spaces.

To be precise, fix a state space Χ with a finite boundary set ∂Χ. Let (G_N)_N be a sequence of graph approximations of Χ. Consider the (asymmetric) exclusion process on G_N, and in addition, add to each boundary point α ∈ ∂Χ a birth-and-death chain, which models the injection or extraction of particles at well-defined rates. Then under the diffusive scaling limit (if known), the empirical density will concentrate on a hydrodynamic trajectory governed by a nonlinear heat equation. In addition, one may also establish a large deviations principle, characterizing the fluctuations about the hydrodynamic trajectory.

When Χ = [0,1] and ∂Χ = {0,1}, this has been proved by Bertini, De Sole, Gabrielli, Jona-Lasinio, and Landim. In this talk I will describe joint work with Michael Hinz (Bielefeld) and Alexander Teplyaev (Connecticut) on establishing the corresponding results when Χ is the Sierpinski gasket and ∂Χ is the set of 3 boundary vertices of the gasket. Our results pave the way for further studies of non-equilibrium fluctuations in the presence of 3+ boundary components.

Our proofs are based on the "entropy method" of Guo, Papanicolaou, and Varadhan, and in adopting this method we had to overcome a number of technical difficulties, most notably the lack of translational invariance. This led to an auxiliary study of functional inequalities for the exclusion process on a "resistance space," which includes trees, fractals, and random graphs as examples. I will briefly mention my own work proving a "moving particle lemma" on a finite connected weighted graph, which is used to facilitate a local coarse-graining argument in passing to the macroscopic limit.


October 22, 2018: Bert Kappen (Radboud University Nijmegen, Gatsby Computational Neuroscience Unit UCL London)

When: Monday October 22nd, 16:00
Where: TU Delft, Faculty EWI, Mekelweg 4, Lecture hall F.

The Quantum Boltzmann Machine 

We propose to generalise classical maximum likelihood learning to density matrices. As the objective function, we propose a quantum likelihood that is related to the cross entropy between density matrices. We apply this learning criterion to the quantum Boltzmann machine (QBM), previously proposed by Amin et al. We demonstrate for the first time learning a quantum Hamiltonian from quantum statistics using this approach. For the anti-ferromagnetic Heisenberg and XYZ model we recover the true ground state wave function and Hamiltonian. The second contribution is to apply quantum learning to learn from classical data. Quantum learning uses, in addition to the classical statistics, also quantum statistics for learning. These statistics may violate the Bell inequality, as in the quantum case. Maximizing the quantum likelihood yields results that are significantly more accurate than the classical maximum likelihood approach in several cases. We give an example of how the QBM can learn a strongly non-linear problem such as the parity problem. The solution shows entanglement, quantified by the entanglement entropy.
https://arxiv.org/abs/1803.11278

Bio: http://www.snn.ru.nl/~bertk/biograph2


October 08, 2018: Extreme TiDE seminar

When: Monday October 8th, 15:00 - 17:00
Where: TU Delft, Faculty EWI, Mekelweg 4, Lecture hall Snijderszaal (LB 01.010).


15:00-15:45 Hanna Ahmed (Tilburg University)

Improved estimation of the extreme value index using related variables

Heavy tailed phenomena are naturally analyzed by extreme value statistics. A crucial step in such an analysis is the estimation of the extreme value index, which describes the tail heaviness of the underlying probability distribution. We consider the situation where we have next to the n observations of interest another n + m observations of one or more related variables, like, e.g., financial losses due to earthquakes and the related amounts of energy released, for a longer period than that of the losses. Based on such a data set, we present an adapted version of the Hill estimator that shows greatly improved behavior and we establish the asymptotic normality of this estimator. For this adaptation the tail dependence between the variable of interest and the related variable(s) plays an important role. A simulation study confirms the substantially improved performance of our adapted estimator relative to the Hill estimator. We also present an application to the aforementioned earthquake losses.
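For readers unfamiliar with the baseline, the classical Hill estimator that the talk adapts can be sketched as follows. The function name, the choice of k and the simulated Pareto sample are assumptions made for this example; the adapted estimator using related variables is not reproduced here.

```python
import math
import random

def hill_estimator(sample, k):
    """Classical Hill estimate of a positive extreme value index.

    Averages the log-excesses of the k largest observations over the
    (k+1)-th largest observation, which serves as the threshold.
    """
    if not 0 < k < len(sample):
        raise ValueError("k must satisfy 0 < k < len(sample)")
    ordered = sorted(sample, reverse=True)
    threshold = ordered[k]
    if threshold <= 0:
        raise ValueError("the threshold observation must be positive")
    return sum(math.log(value / threshold) for value in ordered[:k]) / k

# Pareto data with tail index 2 has extreme value index 1/2.
rng = random.Random(0)
sample = [rng.paretovariate(2.0) for _ in range(20000)]
estimate = hill_estimator(sample, k=500)
```

The choice of k trades bias against variance; the value used here is arbitrary.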

This is a joint work with John H.J. Einmahl.


16:00-17:00  Clément Dombry (University of Franche-Comté)

Analysis of the proportional tail model for extreme quantile regression via a coupling approach

Extreme quantile regression is a fundamental problem in extreme value theory. Assume that we observe an n-sample (x_1,y_1), ..., (x_n,y_n) of a random variable Y ∈ R together with covariates X ∈ R.
Our goal is to estimate the conditional quantile of order 1-p of Y given X = x. When p is small, there are not enough observations and extrapolation further into the tail of the distribution is needed. We face an extreme value problem.
The purpose of the talk is to present ongoing joint work with B. Bobbia and D. Varron on the proportional tail model, where we assume that the conditional tails are asymptotically proportional to the unconditional tail, that is, P(Y>y|X=x) ∼ σ(x)P(Y>y) as y → y*, where y* denotes the upper endpoint of the distribution. This framework was introduced in the slightly different context of heteroscedastic extremes in Einmahl et al. (JRSSB 2016), and the function σ was coined the skedasis function. Assuming an extreme value condition for Y together with the proportional tail model, extreme quantile regression is reduced to the estimation of the skedasis function and the extreme value index. We present our results for such an estimation. Interestingly, our proof techniques differ from those of Einmahl et al.: we introduce coupling arguments relying on total variation and Wasserstein distances, whereas the original proof relies mostly on empirical process theory.

The seminar room will be Snijderszaal (LB 01.010) on the first floor of the EWI Faculty (Building 36). Full address: Mekelweg 4, 2628 CD Delft.


September 24, 2018: Pierre Nyquist (TU Eindhoven)

When: Monday September 24th, 16:00
Where: TU Delft, Faculty EWI, Mekelweg 4, Lecture hall D@ta.

The infinite swapping algorithm: Properties and applications.

Infinite swapping is a Monte Carlo method originally developed by Doll, Dupuis and co-authors using empirical measure large deviations. The method is designed to overcome the problem of rare-event sampling, and thus speed up convergence, in the setting of computing integrals with respect to Gibbs measures. In our subsequent work the properties of infinite swapping were studied further using a combination of empirical measure large deviations and stochastic ergodic control problems, revealing explicit information on the convergence properties of the sampling scheme.  This talk will focus on the properties of infinite swapping, and the associated large deviations analysis, and (non-standard) applications in quantum dynamics and machine learning. Given time we will also discuss briefly the use of empirical measure large deviations to study convergence rates for MCMC methods, and possible implications of the comparison between parallel tempering and infinite swapping for training RBMs.

This is based on joint work with Paul Dupuis, Jim Doll (Brown University), Henrik Hult and Carl Ringqvist (KTH).


September 17, 2018: Maki Furukado (Yokohama National University)

When: Monday September 17th, 16:00
Where: TU Delft, Faculty EWI, Mekelweg 4, Lecture hall D@ta.

The generation of the stepped surface in terms of the modified Jacobi-Perron algorithm

For a substitution σ satisfying certain conditions, it is known how to define a map that sends a unit face to a collection of unit faces on a stepped surface of a plane associated with σ; the stepped surface is built from three kinds of unit faces U. It seems that we can generate the stepped surface by applying σ to U infinitely many times. In this talk, we treat such substitutions in terms of the modified Jacobi-Perron algorithm and discuss the generation of the stepped surface.
This work is in collaboration with Shunji Ito (Kanazawa University) and Shin-ichi Yasutomi (Toho University).


September 10, 2018: Mike Keane (TU Delft)

When: Monday September 10th, 16:00
Where: TU Delft, Faculty EWI, Mekelweg 4, Lecture hall D@ta.

The World of Bernoulli Schemes

The study of random processes is a central element of
probability and statistics. In this lecture I wish to concentrate on the
understanding which has been developed of Bernoulli schemes, named after
the seminal work of Jacob and Johan Bernoulli in the 17th century in
Basel and Groningen, and in particular to explain simple methods which
algorithmically convert sequences of random tosses of one "coin" into
random sequences of another, different "coin", both in the
classical and quantum domains. There is at present a hope that the new
"coins", sometimes called "quantum coins", will provide solutions to as yet
intractable questions using classical Bernoulli techniques.
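A classical example of converting tosses of one coin into tosses of another is von Neumann's procedure, which turns independent tosses of a coin with unknown bias into fair bits. It is shown here only as a familiar illustration of the idea and is not necessarily one of the methods discussed in the lecture; the function name is an assumption made for this example.

```python
def von_neumann_fair_bits(tosses):
    """Convert independent biased 0/1 tosses into fair bits.

    Tosses are read in non-overlapping pairs: the pair (0, 1) yields 0,
    the pair (1, 0) yields 1, and equal pairs are discarded. A trailing
    unpaired toss is ignored.
    """
    fair_bits = []
    for first, second in zip(tosses[0::2], tosses[1::2]):
        if first != second:
            fair_bits.append(first)
    return fair_bits

bits = von_neumann_fair_bits([0, 1, 1, 0, 1, 1, 0, 0, 1, 0])
```

The two unequal pairs have the same probability p(1-p) whatever the bias p, which is why the output bits are fair.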


June 11, 2018: Rajat Hazra (Indian Statistical Institute, Kolkata)

When: Monday June 11th, 16:00
Where: TU Delft, Faculty EWI, Mekelweg 4, Lecture hall D@ta.

Branching random walk: a tale of two tails.

Branching random walk with regularly varying displacements is a heavy-tailed random field indexed by a Galton-Watson tree. We shall survey some results on the position of the rightmost particle and also describe the point process associated to it. These results were obtained when the mean of the branching random variable is finite and satisfies the Kesten-Stigum condition. We extend these results to the case when the mean is infinite in the underlying GW tree (which can happen when the branching random variable has heavy tails). The weak limit of the rightmost particle, the associated cloud speed and the point process of displacements can all be described explicitly. If time permits we shall also describe the case when displacements have exponentially decaying tails.

The talk is based on the master's thesis of Souvik Ray (ISI, Kolkata) and also joint works with Ayan Bhattacharya (CWI, Amsterdam), Parthanil Roy (ISI Bangalore), Philippe Soulier (Université Paris Nanterre).
 

May 14, 2018: Eni Musta (TU Delft)

When: Monday May 14th, 16:00
Where: TU Delft, Faculty EWI, Mekelweg 4, Lecture hall D@ta.

Central limit theorems for global errors of smooth isotonic estimators

A typical problem in nonparametric statistics is estimation of an unknown function on the real line, which may, for instance, be a probability density, a regression function or a failure rate. In many cases, it is natural to assume that this function is monotone, and incorporating such prior knowledge in the estimation procedure leads to more accurate results. We consider smooth isotonic estimators, constructed by combining an isotonization step with a smoothing step. We investigate the global behavior of these estimators and obtain central limit theorems for their L_p losses. We also perform a simulation study for testing monotonicity on the basis of the L_2 distance between the kernel estimator and a smooth isotonic estimator.
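An isotonization step of the kind mentioned above is commonly computed with the pool-adjacent-violators algorithm. The sketch below gives the equal-weight least-squares version for a nondecreasing fit; the function name is an assumption made for this example, and the smoothing step and the specific estimators of the talk are not reproduced.

```python
def pool_adjacent_violators(values):
    """Least-squares nondecreasing fit to a sequence, with equal weights."""
    blocks = []  # each block stores [sum of values, number of values]
    for value in values:
        blocks.append([float(value), 1])
        # Merge backwards while the previous block mean exceeds the last one.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    fit = []
    for total, count in blocks:
        fit.extend([total / count] * count)
    return fit

fitted = pool_adjacent_violators([1, 3, 2, 4])
```

Each violation of monotonicity is resolved by replacing the offending neighbours with their common mean.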

Joint work with Hendrik P. Lopuhaä.
 

May 07, 2018: Federico Sau (TU Delft)

When: Monday May 7th, 16:00
Where: TU Delft, Faculty EWI, Mekelweg 4, Lecture hall D@ta.

Exclusion process in symmetric dynamic environment: quenched hydrodynamics.

For the simple exclusion process evolving in a symmetric dynamic random environment, we derive the hydrodynamic limit from the quenched invariance principle of the corresponding random walk. For instance, if the limiting behavior of a test particle resembles that of Brownian motion on a diffusive scale, the empirical density, in the limit and suitably rescaled, evolves according to the heat equation.

In this talk we make this connection explicit for the simple exclusion process and show how self-duality of the process enters the problem. This allows us to extend the result to other conservative particle systems (e.g. IRW, SIP) which share a similar property. 

Work in progress with F. Collet, F. Redig and E. Saada.
 

April 23, 2018: Tim van Erven (Leiden University)

When: Monday April 23rd, 16:00
Where: TU Delft, Faculty EWI, Mekelweg 4, Lecture hall D@ta.

MetaGrad: Multiple Learning Rates in Online Learning

In online sequential prediction it is well known that certain subclasses
of loss functions are much easier than arbitrary convex functions.
We are interested in designing adaptive methods that can automatically
get fast rates in as many such subclasses as possible, without any
manual tuning. Previous adaptive methods are able to interpolate between
strongly convex and general convex functions. We present a new method,
MetaGrad, that adapts to a much broader class of functions, including
exp-concave and strongly convex functions, but also various types of
stochastic and non-stochastic functions without any curvature. For
instance, MetaGrad can achieve logarithmic regret on the unregularized
hinge loss, even though it has no curvature, if the data come from a
favourable probability distribution. MetaGrad's main feature is that it
simultaneously considers multiple learning rates. Unlike all previous
methods with provable regret guarantees, however, its learning rates are
not monotonically decreasing over time and are not tuned based on a
theoretically derived bound on the regret. Instead, they are weighted
directly proportional to their empirical performance on the data using a
tilted exponential weights master algorithm.

References:
T. van Erven and W.M. Koolen. MetaGrad: Multiple Learning Rates in
Online Learning. NIPS 2016.
W.M. Koolen, P. Grünwald and T. van Erven. Combining Adversarial
Guarantees and Stochastic Fast Rates in Online Learning. NIPS 2016.
 

April 09, 2018: Alessandra Cipriani (TU Delft)

When: Monday April 09th, 16:00
Where: TU Delft, Faculty EWI, Mekelweg 4, Lecture hall D@ta.

On two Gaussian random interfaces

Random interfaces arise naturally as separating surfaces between two different thermodynamic phases or states of matter, for example oil and water. In this talk we will introduce Gaussian models for random interfaces by focusing on two of them: the discrete Gaussian free field (DGFF) and the membrane model (MM). We will present their similarities and differences, in particular we will discuss their 1-dimensional representations, the recent developments on the description of their extremal behavior, and their scaling limits. We will also pose some open problems that arise in the MM setting.
 

March 26, 2018: Arne Huseby (University of Oslo)

When: Monday March 26th, 16:00
Where: TU Delft, Faculty EWI, Mekelweg 4, Lecture hall D@ta.

Environmental contours

Environmental contours are used as an efficient way of summarizing uncertainty about various environmental variables relevant to marine constructions. They are typically used in the design phase where precise knowledge about the construction is not yet available. The traditional approach to environmental contours is based on the well-known Rosenblatt transformation. However, due to the effects of this transformation the probabilistic properties of the resulting environmental contour can be difficult to interpret. An alternative approach to environmental contours uses Monte Carlo simulations on the joint environmental model, and thus obtains a contour without the need for the Rosenblatt transformation. This contour has well-defined probabilistic properties, but may sometimes be overly conservative in certain areas. In this presentation we review the main approaches to environmental contours, and show how to evaluate the probabilistic properties of such contours.
 

March 19, 2018: Bruno Kimura (TU Delft)

When: Monday March 19th, 16:00
Where: TU Delft, Faculty EWI, Mekelweg 4, Lecture hall D@ta.

Phase transition in the one-dimensional long range Ising model under the presence of decaying external fields

Long range Ising models differ from the classical ones in that the spin-spin interaction decays spatially. Since long range models often behave like short range models in higher dimension, we studied the phase diagram of long range Ising models in d=1 and proved that even in the presence of certain external fields there is a phase transition at low temperatures. The techniques used to prove this result are based on contour arguments developed by Froehlich, Spencer, Cassandro, Presutti et al.
 

March 12, 2018: Dennis Dobler (VU Amsterdam)

When: Monday March 12th, 16:00
Where: TU Delft, Faculty EWI, Mekelweg 4, Lecture hall D@ta.

Nonparametric inference on transition probabilities in right-censored multi-state models

When analyzing times of occurrences of certain events, e.g. individuals' transitions from a healthy to a disease state or vice versa, many medical studies need to face the problem of incomplete data due to independent right-censoring. That is, from certain points in time, observation of some events is not possible any more. Because of this, the use of empirical distributions for estimating event probabilities would lead to a systematic bias.

Instead, products of estimated hazard functions yield estimators of probabilities of state transitions. This well-known Aalen-Johansen estimator is applicable if the Markov assumption for the processes of individual state trajectories is satisfied, i.e. given the present state, visited past and future states are independent. Considered as a process in time, this estimator has a complicated Gaussian limit distribution when the sample size tends to infinity. This makes the use of resampling techniques necessary. We will discuss related conditional central limit theorems which enable the construction of confidence bands for state transition probabilities.

When there is reason to discard the Markov assumption, the Aalen-Johansen estimator loses its justification. We will discuss recently proposed estimators of transition probabilities in this situation as well as their drawbacks and advantages.
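In the simplest two-state model (alive to dead), the product-of-hazards construction described above reduces to the Kaplan-Meier estimator of the survival probability. The sketch below implements only that special case; the function name and the toy data are assumptions made for this example, and it does not cover general multi-state models or the resampling methods of the talk.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate under right-censoring.

    events[i] is 1 if an event was observed at times[i] and 0 if the
    observation was censored there. Returns a list of (time, survival)
    pairs, one per distinct time at which at least one event occurred.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        observed = 0
        leaving = 0
        while i < len(data) and data[i][0] == t:
            observed += data[i][1]
            leaving += 1
            i += 1
        if observed > 0:
            survival *= 1.0 - observed / at_risk  # one minus the estimated hazard
            curve.append((t, survival))
        at_risk -= leaving
    return curve

curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
```

The censored observation at time 2 contributes no factor of its own, but it shrinks the risk set for later times, which is how the bias of the naive empirical distribution is avoided.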


March 05, 2018: Cees de Valk (Royal Netherlands Meteorological Institute KNMI)

When: Monday March 5th, 16:00
Where: TU Delft, Faculty EWI, Mekelweg 4, Lecture hall D@ta.

Tail behaviour of a transport-based multivariate quantile

The talk addresses joint work with Johan Segers at the Université catholique de Louvain. Recently, an attractive multivariate quantile based on Monge-Kantorovich optimal transport was proposed: a map transforming a spherical reference distribution function to the distribution function of interest is sought which minimises the mean square of the size of the displacement. We explore this idea further, in particular to define and estimate a multivariate tail quantile. Starting from the assumption of multivariate regular variation, we derive limit relations satisfied by the optimal map and the associated quantile. Estimators and examples of applications are discussed.


February 26, 2018: Jan H. van Schuppen (TU Delft)

When: Monday February 26th, 16:00
Where: TU Delft, Faculty EWI, Mekelweg 4, Lecture hall D@ta.

Control and Prediction of Traffic Flow in a Large-Scale Road Network

The research is motivated by control of the motorway network of The Netherlands. Six traffic control centers receive online data about the traffic flow and activate the control measures of routing control, dynamic speed limits, incident warnings, and ramp-metering.

There is first a need for an algorithm to predict the traffic flow in a large-scale network for a period of 30 minutes into the future. The predictions are needed to evaluate control measures. A short computation time for each prediction, less than a second, is needed for control purposes.

The prediction algorithm is based on parallel processing at the local level with coordination at the network level. For this an iterative procedure has been formulated. It is then possible to predict the traffic flow in a 145 km network in about 0.04 s with an elementary computer.

Control objectives for a large-scale road network require a new approach. For the scale of the network, the reader may think of the range of 100 to 2,000 km, covering most of the Western part of The Netherlands with about 6 to 10 million inhabitants. For this purpose, a multilevel control system is proposed that is currently in development. So far one distinguishes the levels of a section, a link, a subnetwork, a network, a region, etc. Control objectives are assigned to each level and control algorithms for each level are synthesized. Control synthesis of a multilevel control system requires the development of an appropriate procedure.

The research is based on scientific cooperation with Yubin Wang and Jos L.M. Vrancken.


February 19, 2018: Bas  Janssens  (TU Delft)

When: Monday February 19th, 16:00
Where: TU Delft, Faculty EWI, Mekelweg 4, Lecture hall D@ta.

Reflection positivity for parafermions

If you interchange two bosons, nothing happens. If you interchange two fermions, you get a minus sign. More generally, if you interchange two parafermions (or "anyons"), you get a complex pth root of unity. Inspired by recent interest in the use of parafermions in statistical physics, we give necessary and sufficient conditions on parafermionic Hamiltonians in order for the corresponding Gibbs state to be reflection positive. (Joint w. Arthur Jaffe, J. Funct. Analysis 272:8, 3506-3557)


February 12, 2018:  Cristian Giardinà (Università di Modena e Reggio Emilia)

When: Monday February 12th, 16:00
Where: TU Delft, Faculty EWI, Mekelweg 4, Lecture hall D@ta.

Boundary driven 2D Ising model

I will discuss properties of the non-equilibrium stationary state of the two-dimensional Ising model coupled to magnetization reservoirs. When the boundaries impose magnetization values corresponding to opposite phases, a free boundary problem (of Stefan type) describes the evolution of the interface in the hydrodynamic limit. I will argue that the stationary solutions in the metastable or unstable phase may sustain a current that is going uphill, i.e. from the reservoir with lower magnetization to the one with larger magnetization.


February 05, 2018: Ivan S. Yaroslavtsev  (TU Delft)

When: Monday February 05th, 16:00
Where: TU Delft, Faculty EWI, Mekelweg 4, Lecture hall D@ta.

The canonical decomposition of local martingales in infinite dimensions

The canonical decomposition of local martingales was invented by Yoeurp in 1976 and partly by Meyer in 1976, and it has the following form: a local martingale has the canonical decomposition if it can be decomposed into a sum of a continuous local martingale (a Wiener-like part), a purely discontinuous quasi-left continuous local martingale (a Poisson-like part, which does not jump at predictable stopping times), and a purely discontinuous local martingale with accessible jumps (a discrete-like part, which jumps only at certain predictable stopping times). Due to Yoeurp the canonical decomposition of a real-valued local martingale always exists and is unique. In the current talk we will extend this result to infinite dimensions and show that the canonical decomposition of an arbitrary $X$-valued local martingale exists if and only if $X$ is a UMD Banach space; moreover, if this is the case, then the corresponding $L^p$ ($1<p<\infty$) and weak $L^1$ estimates for the decomposition terms hold.
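In symbols, the decomposition described above might be written as follows. The names $M^c$, $M^q$ and $M^a$ are notation assumed here for illustration and do not appear in the abstract.

```latex
\[
  M = M^c + M^q + M^a ,
\]
% M^c : continuous local martingale (Wiener-like part)
% M^q : purely discontinuous quasi-left continuous local martingale (Poisson-like part)
% M^a : purely discontinuous local martingale with accessible jumps (discrete-like part)
```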

The canonical decomposition plays a significant role in vector-valued stochastic integration theory. For instance one can show sharp estimates for an $L^p$-norm of an $L^q$-valued stochastic integral with respect to a general local martingale. An important tool for obtaining these estimates are the recently proven Burkholder-Rosenthal-type inequalities for discrete $L^q$-valued martingales.

This talk is partly based on joint work with Sjoerd Dirksen (RWTH Aachen University).


January 29, 2018: Wolter Groenevelt (TU Delft)

When: Monday January 29th, 16:00
Where: TU Delft, Faculty EWI, Mekelweg 4, Lecture hall D@ta.

Stochastic duality, orthogonal polynomials and Lie algebras

Recently it has been shown that stochastic duality functions for several Markov processes can be given in terms of orthogonal polynomials. In this talk I explain how these results can be obtained from representation theory of Lie algebras.


January 22, 2018: No seminar


January 15, 2018: Rob van den Berg (CWI Amsterdam)

When: Monday January 15th, 16:00
Where: TU Delft, Faculty EWI, Mekelweg 4, Lecture hall D@ta.

Near-critical percolation and applications

Motivated by sol-gel transitions, David Aldous (2000) introduced and  analysed an interesting percolation model on a tree where clusters  stop growing (`freeze') as soon as they become infinite. I will discuss recent work with Demeter Kiss  and Pierre Nolin (and current work with Pierre Nolin) on processes of similar flavour on planar lattices.  We focus on the question whether or not the `gel' (i.e. the union of the frozen clusters)  occupies a negligible fraction of space. Sharp versions of some classical scaling results by Harry Kesten for near-critical percolation were needed, and developed, to answer this question.   I also plan to present a modification of the model which may be interpreted as a `self-organized critical' sensor/communication network. The study of it leads naturally to a more general framework of near-critical percolation with  `heavy-tailed' impurities.


January 08, 2018: Marjan Bakker (Tilburg University)

When: Monday January 08th, 16:00
Where: TU Delft, Faculty EWI, Mekelweg 4, Lecture hall D@ta.

Human factors in statistics: an example with outliers. 

I’m part of the Meta-Research Center of Tilburg University, School of Social and Behavioral Sciences and our research focuses on investigating research practices and problems in the scientific system. I will first present a short overview of our work, after which I will continue to discuss a specific research practice, namely the detection and handling of outliers. In psychology, outliers are often excluded before running an independent sample t test and data are often non-normal because of the use of sum scores based on tests and questionnaires. After reviewing common practice, I present results of simulations of artificial and actual psychological data, which show that the removal of outliers based on commonly used Z value thresholds severely increases the Type I error rate. Results show that removing outliers with a threshold-value of Z of 2 in a short and difficult test increases Type I error rates up to .222. Inflations of Type I error rates are particularly severe when researchers are given the freedom to alter threshold-values of Z after having seen the effects thereof on outcomes. I recommend the use of non-parametric or robust tests instead.
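The outlier rule discussed above can be sketched as follows: observations whose absolute Z value exceeds a threshold are dropped before testing. The function name and the toy data are assumptions made for this example; the sketch shows only the removal step, not the simulations reported in the talk.

```python
import statistics

def remove_outliers(values, z_threshold=2.0):
    """Drop observations whose absolute Z value exceeds z_threshold.

    Z values are computed from the sample mean and the sample standard
    deviation of the full data set, outliers included.
    """
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    if sd == 0:
        return list(values)
    return [v for v in values if abs(v - mean) / sd <= z_threshold]

cleaned = remove_outliers([1, 2, 3, 4, 5, 6, 7, 8, 9, 100])
```

Applying such a rule separately within each group before a t test shrinks the estimated variances, which is one way to see why the Type I error rate can rise.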