Kaggle. A series of data
competitions in which teams compete to model the data and predict the
future. A USD500 prize for each competition.
Ready for Teaching
OzDASL. A library of data sets for teachers of
statistics in Australia and New Zealand. Gordon Smyth, Walter and Eliza Hall Institute of Medical Research.
Data and Story Library. DASL (pronounced
"dazzle") is an online library of datafiles and stories that illustrate the use
of basic statistical methods. Stories are classified according to statistical methods and
major topics of interest. Well organized. Perhaps the best single source of data sets for
teaching an introductory class. DASL Project, Cornell University.
Peter J Diggle Data Sets.
Geostatistical and spatial point pattern data sets. Peter Diggle, University of Lancaster.
SPSS Data Sets. Data sets for SPSS
and SYSTAT, and a selection of other public data sets. SPSS Inc.
Statistical Reference Datasets. The
purpose of this project is to improve the accuracy of statistical software by providing
reference datasets with certified computational results that enable the objective
evaluation of statistical software. NIST.
Case Studies in Biometry. Data
diskette for the book by Nicholas Lange, Louise Ryan, Lynne Billard, David Brillinger,
Loveday Conquest, Joel Greenhouse. Wiley, 1994.
Data Expositions. Data sets used for
the annual ASA Statistical Graphics and Computing Data Expositions.
Disease Data. From the 1991 Statistics in
Public Health Surveillance Exposition.
JASA Data. Contributed datasets from
articles published in the Journal of the American Statistical Association.
King Crab Data. A large but patchy data set.
Although the topic is in principle an interesting one, my students have had trouble
assembling any useful data set from the various files associated with this project. 1990
Data Expo.
University of Wisconsin Data Archive.
Data sets from master's exams and several books, including Box, Hunter & Hunter;
Devore; Milliken & Johnson's Analysis of Messy Data; Yandell's Practical
Data Analysis for Designed Experiments. Douglas Bates, University of Wisconsin.
Council of European Social Science Data
Archives. Provides a clickable map of social science data archives all over the world,
and an integrated data catalogue for social science data archives.
Documents Center.
An excellent index to government statistical data on the Web, both United States and
international, maintained by the Documents Center of the University of Michigan.
Data Zoo. California coastal data
collection programs. Organized by experiment, instrument type and geographical region.
Center for Coastal Studies, University of California, San Diego.
Project Gutenberg. Full text online for a huge number
of books, including such things as the World Factbook. Major public domain books or
classics for which copyright has expired are likely to be here.
VIMS Pier Ambient Monitoring Data.
Local conditions on the York River at Gloucester Point, VA. You can download water
parameters and meteorological variables measured at 6-minute intervals for the past 10
days, or view graphs of the same variables for the current and past years. Virginia
Institute of Marine Science, College of William & Mary, Gloucester Point, VA.