Sparse coded images from Caltech256

Images are represented using "visterms", a combination of color and texture histograms. Download

Gene expression time courses ("Impulse")

76 time courses, each with 5-13 time points. Gene expression was measured in yeast following changes in environment. Download
See webpage

NIPS 1-17 data

Distribution of words in all NIPS papers from 1988 to 2003, including the joint distribution of words and authors. This extends a database prepared by Sam Roweis that used papers from NIPS 1-12, which were OCR'ed by Yann LeCun. Word occurrences were calculated from PDF or PS files, most of which are available on the NIPS web site. The data was processed with the help of Amir Globerson.

Distribution of words, documents and authors
Download the data in MATLAB (V6, R12) format (35Mb) Download
Format description
Data preprocessing description.

If you use this data, please cite "Euclidean Embedding of Co-occurrence Data", Amir Globerson, Gal Chechik, Fernando Pereira and Naftali Tishby, JMLR 8, 2007. bibtex

Additional data
Raw text files
