
st: Conference on Large Data Sets

From   Kit Baum <[email protected]>
To   [email protected]
Subject   st: Conference on Large Data Sets
Date   Mon, 6 Oct 2003 09:21:38 -0400

(reposted from Ox-Users list)

November 6th, 2003
Amsterdam, The Netherlands

Section Statistical Software of The Netherlands Society for Statistics and Operations Research

Program Committee
Dr. Ruud Koning, University of Groningen
Prof.dr. Arno Siebes, University of Utrecht
Dr. Siem Heisterkamp, National Institute of Public Health and the Environment (RIVM)
Prof.dr. Patrick Groenen, Erasmus University Rotterdam

Large Data Sets
Fifteen years ago, handling large datasets, let alone analyzing them, was a nearly impossible task for researchers. The data were often stored on tape, and even the process of reading the dataset into the memory of a mainframe was slow. Memory was scarce, and so it was difficult to save intermediate results. Such datasets were analyzed using either tailor-made statistical software or self-written programs using routines from numerical libraries like NAG or IMSL. Maximum-likelihood estimation of non-linear models was non-trivial if not impossible, and researchers often had to be satisfied with one-step improvements over some consistent estimator.

Things have changed for the better, from a technical point of view. Huge datasets are routinely available to researchers in different fields, like finance, marketing, biomedical sciences, particle physics, astronomy, life sciences, and social sciences. Datasets used to be large in the sense of containing many observations on a small number of variables. But nowadays, e.g. in the life sciences, we are confronted with datasets with a small number of observations and a huge number of variables. Data can be transported on media that can be read by most personal computers, and the computing power on the desk of a statistical researcher is impressive. Instead of focusing on the mechanics of handling datasets, researchers can focus on the actual statistical analysis. Thus the question has turned into: Now that we have a lot of data, what could we do with it?

This conference addresses the analysis of very large datasets, both from the point of view of statisticians who work with such datasets and from that of practitioners in various fields. By presenting several applications and tools available to a modern-day statistical researcher, we want to show that large datasets offer unique opportunities for researchers to answer questions that were difficult to tackle before. The program committee is delighted to be able to present a selection of the top researchers on this topic.

Please register via email to [email protected] or online via:

9:30 registration and coffee
10:00 opening
10:05 Yoav Benjamini
Tel-Aviv University
Multiplicity issues related to complex research questions in microarrays analysis
10:55 Philip Hans Franses
Erasmus University, Rotterdam
More, but also better?
11:40 Paul Eilers
Leiden University Medical Centre
Low Memory, High Speed Smoothing on Large Multidimensional Grids
12:30 Lunch
13:30 Andreas Buja
University of Pennsylvania
Hands-On Experiences with Mining Telecom Data
14:15 Jos Roerdink
University of Groningen
Visualization of large data sets with applications in life science
15:00 coffee/tea break
15:15 Geert Wets
Limburg University, Belgium
Large data sets in traffic safety
16:30 Drinks

Nieuwpoortkade 25
1055 RX Amsterdam
The Netherlands

T +31 (0)20 5608410
F +31 (0)20 5608448
E [email protected]
