Netflix Prize Data Set
Netflix

nf_prize_dataset.tar.gz (697.55MB)
Type: Dataset
Tags:

Bibtex:
@article{,
title= {Netflix Prize Data Set},
journal= {},
author= {Netflix},
year= {2009},
url= {http://archive.ics.uci.edu/ml/datasets/Netflix+Prize},
license= {},
abstract= {This is the official data set used in the Netflix Prize competition. The data consists of about 100 million movie ratings, and the goal is to predict missing entries in the movie-user rating matrix.

| Attribute | Value |
|----|---|
| Data Set Characteristics: | Multivariate, Time-Series |
| Attribute Characteristics: | Integer |
| Associated Tasks: | Clustering, Recommender-Systems |
| Number of Instances: | 100480507 |
| Number of Attributes: | 17770 |
| Missing Values? | Yes |
| Area: | N/A |


#Data Set Information:

This dataset was constructed to support participants in the Netflix Prize. 

There are over 480,000 customers in the dataset, each identified by a unique integer id. 

The title and release year for each movie are also provided. There are over 17,000 movies in the dataset, each identified by a unique integer id. 

The dataset contains over 100 million ratings. The ratings were collected between October 1998 and December 2005 and reflect the distribution of all ratings received during this period. Each rating has a customer id, a movie id, the date of the rating, and the value of the rating. 
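
To give a sense of how many entries of the movie-user rating matrix are missing, the figures quoted in this description can be combined into a rough density estimate. This is only a sketch: the customer count of 480,000 is an assumed lower bound (the text says "over 480,000"), the movie count assumes every MovieID in the documented range is in use, and the helper `matrix_density` is a hypothetical name, not part of the dataset.

```python
# Figures quoted in this description.
NUM_RATINGS = 100480507         # "Number of Instances"
NUM_MOVIES = 17770              # assumption: every MovieID in [1..17770] is used
APPROX_NUM_CUSTOMERS = 480000   # assumption: the text only says "over 480,000"


def matrix_density(num_ratings, num_rows, num_cols):
    """Fraction of cells in a num_rows x num_cols matrix that hold a rating."""
    return num_ratings / (num_rows * num_cols)


density = matrix_density(NUM_RATINGS, APPROX_NUM_CUSTOMERS, NUM_MOVIES)
print(round(density, 4))
```

Under these assumptions only a little over 1 percent of the customer-movie pairs carry a rating. Because the real customer count is somewhat higher than 480,000, the true density is slightly lower than this estimate.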

As part of the original Netflix Prize, a set of ratings was identified whose rating values were not provided in the original dataset. The object of the Prize was to predict the ratings in this 'qualifying' set accurately. These missing ratings are now available in the grand_prize.tar.gz dataset file.


#Attribute Information:

The format of the data is described fully in the README files contained in the dataset tar files. 


|Attribute| Value|
|-|-|
|MovieID: | Arbitrarily assigned unique integer in the range [1..17770]. |
|CustomerID: | Arbitrarily assigned unique integer in the range [1..2649429] (with gaps). |
|Rating: | Number of 'stars' assigned to a movie by a customer; an integer from 1 to 5. |
|Title: | English language title of the movie on the Netflix website. |
|YearOfRelease: | Year a movie was released, in the range [1890..2005]. May correspond to the release of the corresponding DVD, not necessarily its theatrical release. |
|Date: | Timestamp of a rating in the form YYYY-MM-DD, in the range 1998-11-01 to 2005-12-31. |
|NetflixID: | Integer ID of a movie as currently used in the Netflix developer API. |
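
The ranges in the attribute table above can serve as a sanity check once the ratings have been parsed. The following is a minimal sketch, assuming each rating is already available as four separate fields. The function name `is_valid_rating` and the constants are hypothetical, and the on-disk file layout (described in the README files) is deliberately not assumed here.

```python
from datetime import date, datetime

# Ranges copied from the attribute table.
MOVIE_ID_RANGE = (1, 17770)
CUSTOMER_ID_RANGE = (1, 2649429)  # IDs have gaps, so this is only a bounds check
RATING_RANGE = (1, 5)
DATE_RANGE = (date(1998, 11, 1), date(2005, 12, 31))


def in_range(value, bounds):
    """Return True if value lies within the inclusive (low, high) bounds."""
    low, high = bounds
    return low <= value <= high


def is_valid_rating(movie_id, customer_id, rating, date_str):
    """Return True if all four fields fall inside the documented ranges."""
    try:
        rated_on = datetime.strptime(date_str, "%Y-%m-%d").date()
    except ValueError:
        # The date string is not in YYYY-MM-DD form.
        return False
    return (
        in_range(movie_id, MOVIE_ID_RANGE)
        and in_range(customer_id, CUSTOMER_ID_RANGE)
        and in_range(rating, RATING_RANGE)
        and in_range(rated_on, DATE_RANGE)
    )
```

For example, `is_valid_rating(1, 1488844, 3, "2005-09-06")` passes all four checks, while a rating of 6 or a date in 2006 is rejected. A check like this cannot detect an unused CustomerID, since the documented range has gaps.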

#Relevant Papers:
James Bennett and Stan Lanning. 'The Netflix Prize', 2007. 
http://rexa.info/paper/4755326FDAE3929649348DC380A46D3882A98198},
keywords= {},
terms= {}
}
