Netflix Prize Complete Dataset

Name: Netflix Prize Complete Dataset
Rating: 4.5 (97 reviews)
Author: qq_14275567

Uploader: qq_14275567 | Uploaded: 2018-12-19 00:47:32 | TORRENT file, 26.68 KB | Popularity: 97

This is the dataset used in the famous Netflix Prize, the million-dollar recommendation competition. Because the competition has closed, the data can no longer be downloaded from the official Netflix website.

Netflix provided a training data set of 100,480,507 ratings that 480,189 users gave to 17,770 movies. Each training rating is a quadruplet of the form (user, movie, date of grade, grade). The user and movie fields are integer IDs, while grades are integers from 1 to 5 stars.[3] The qualifying data set contains 2,817,131 triplets of the form (user, movie, date of grade), with grades known only to the jury. A participating team's algorithm must predict grades on the entire qualifying set, but the team is told the score for only half of the data, the quiz set of 1,408,342 ratings. The other half is the test set of 1,408,789 ratings, and performance on it is used by the jury to determine potential prize winners. Only the judges know which ratings are in the quiz set and which are in the test set; this arrangement is intended to make it difficult to hill climb on the test set.

Submitted predictions are scored against the true grades in terms of root mean squared error (RMSE), and the goal is to reduce this error as much as possible. Note that while the actual grades are integers in the range 1 to 5, submitted predictions need not be.

Netflix also identified a probe subset of 1,408,395 ratings within the training data set. The probe, quiz, and test data sets were chosen to have similar statistical properties.

In summary, the data used in the Netflix Prize looks as follows:

Training set (99,072,112 ratings not including the probe set, 100,480,507 including the probe set)
    Probe set (1,408,395 ratings)
Qualifying set (2,817,131 ratings) consisting of:
    Test set (1,408,789 ratings), used to determine winners
    Quiz set (1,408,342 ratings), used to calculate leaderboard scores

For each movie, title and year of release are provided in a separate dataset. No information at all is provided about users.
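As a quick sanity check on the figures quoted above, the following sketch recomputes the split sizes and the mean number of ratings per user and per movie. It uses only the counts stated in the description; the variable names are illustrative and are not part of the dataset.

```python
# Counts quoted in the dataset description above.
TRAINING_WITH_PROBE = 100_480_507  # all training ratings, probe included
PROBE = 1_408_395                  # probe subset inside the training set
QUIZ = 1_408_342                   # half of the qualifying set (leaderboard)
TEST = 1_408_789                   # other half (used to pick winners)
USERS = 480_189
MOVIES = 17_770

# Derived figures, to compare against the summary in the description.
training_without_probe = TRAINING_WITH_PROBE - PROBE
qualifying = QUIZ + TEST
ratings_per_user = TRAINING_WITH_PROBE / USERS
ratings_per_movie = TRAINING_WITH_PROBE / MOVIES

print(f"training without probe: {training_without_probe:,}")
print(f"qualifying set:         {qualifying:,}")
print(f"mean ratings per user:  {ratings_per_user:.1f}")
print(f"mean ratings per movie: {ratings_per_movie:.1f}")
```

The derived values agree with the summary: the training set without the probe has 99,072,112 ratings, and the quiz and test sets together make up the 2,817,131-rating qualifying set.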
In order to protect the privacy of customers, "some of the rating data for some customers in the training and qualifying sets have been deliberately perturbed in one or more of the following ways: deleting ratings; inserting alternative ratings and dates; and modifying rating dates".[2]

The training set is such that the average user rated over 200 movies, and the average movie was rated by over 5,000 users. But there is wide variance in the data: some movies in the training set have as few as 3 ratings,[4] while one user rated over 17,000 movies.[5]

There was some controversy as to the choice of RMSE as the defining metric. Would a reduction of the RMSE by 10% really benefit the users? It has been claimed that even as small an improvement as 1% RMSE results in a significant difference in the ranking of the "top-10" most recommended movies for a user.[6]
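The scoring rule described above, RMSE between predicted and true grades, can be sketched as follows. This is a minimal illustration, not the competition's actual scoring code; the function name `rmse` and the sample grades are made up for the example. It also shows that predictions may be fractional even though true grades are integers from 1 to 5.

```python
import math


def rmse(predicted, actual):
    """Root mean squared error between two equal-length sequences of grades."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("at least one rating is required")
    squared_error = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared_error / len(actual))


# Hypothetical sample: true grades are integer stars, predictions are not.
actual = [4, 3, 5, 1]
predicted = [3.8, 3.4, 4.5, 2.1]

score = rmse(predicted, actual)
print(f"RMSE: {score:.4f}")

# A 10% reduction relative to this score would correspond to this target.
print(f"10% lower: {score * 0.9:.4f}")
```

Because the error is squared before averaging, a single badly wrong prediction (such as 2.1 for a true grade of 1) contributes more to the score than several small misses.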

