Yahoo! Labs


KDD CUP 2011 Official Rules


THE FOLLOWING CONTEST IS OFFERED IN THE UNITED STATES AND SHALL ONLY BE CONSTRUED, GOVERNED AND EVALUATED ACCORDING TO UNITED STATES LAW. BY VISITING http://kddcup.yahoo.com/ (THE "CONTEST WEBSITE"), YOU AGREE THAT YOU ARE PARTICIPATING IN A PROMOTION GOVERNED BY UNITED STATES LAW. THE TERMS OF SERVICE OF YAHOO! INC., LOCATED AT http://info.yahoo.com/legal/us/yahoo/utos/utos-173.html APPLY TO ALL MATERIAL ON THE CONTEST WEBSITE.

PLEASE DO NOT PROCEED ANY FURTHER IF YOU DO NOT AGREE TO THE TERMS OF THE CONTEST WEBSITE AND THE PROMOTION, OR IF ACCESS TO THE CONTEST WEBSITE AND/OR ENTRY INTO THE PROMOTION IS CONTRARY TO THE LAWS OF YOUR COUNTRY.

NO ENTRY FEE OR PURCHASE IS REQUIRED TO PARTICIPATE. THE CONTEST IS VOID WHERE PROHIBITED BY LAW.

ENTRY IN THIS CONTEST CONSTITUTES YOUR ACCEPTANCE OF THESE OFFICIAL RULES.


  1. About the Contest: The KDD Cup 2011 (the "Contest") is a promotion in which participants from industry and academia will compete to see whose machine learning algorithm performs best on two tasks, described below. Using one or both of two datasets supplied by Yahoo!, competitors will process the data using their algorithms, and return the test set predictions to Yahoo! through the Contest Website. On the basis of a given metric set forth in these Official Rules, Yahoo! will select the top three scoring Submissions (as defined below) in two different Contest tracks, as described below. In addition to these Official Rules, more information about how to compete in the Contest (the "Contest Instructions") is available at the Contest Website. The Contest Instructions are incorporated into these Official Rules by reference; however, in the event of a conflict between the Contest Instructions and these Official Rules, these Official Rules will control.
  2. Eligibility: All participants in this Contest must be at least 18 years or the age of majority in their respective jurisdiction(s) of legal residence at the time of entry. Void where prohibited by law. Employees and agents of Yahoo! Inc. ("Sponsor" or "Yahoo!"), its affiliates, subsidiaries, advertising and promotional agencies, any other prize sponsor, and any entity involved in the development, production, implementation, administration or fulfillment of the Contest (all of the foregoing, together with Yahoo!, collectively referred to as "Contest Entities"), and the immediate family members and persons living in the same household as such individuals, whether related or not, are not eligible to participate or win.
  3. How To Enter
    • i. Team Membership: Participation in this Contest is done on an individual or a team basis (both individual entrants and team entrants are hereafter referred to as a "Team"). Teams may be comprised of 1-10 members. Each individual participating in this Contest may belong to one team ONLY; you may not participate in more than one Team in the Contest and you may not participate as both an individual entrant and a Team member. Each Team will be required to name one representative for the Team (the "Team Representative"), who will be responsible for submitting the Team's registration in this Contest, and serving as the Sponsor's primary contact for matters relating to this Contest. Further, when a particular Team is named as a winner in this Contest, that Team's prize will be awarded to the Team Representative, who will be responsible for splitting the prize among Team members and for any tax obligations associated with winning the prize. Refer to Rule 6 below for more information.
    • ii. Registration: The Contest begins at 12:00:00 AM PT on March 15, 2011 and ends at 11:59:59 PM PT on June 30, 2011 (the "Entry Period"). To enter the Contest, the Team Representative must visit the Contest Website during the Entry Period. The Team Representative will be asked to sign up for a Yahoo! ID for the team. Yahoo! recommends signing up for a new account and not using any individual's existing Yahoo! ID/account. The Team Representative will also be asked to create a team name that will be used in connection with submitting or updating the Team's Contest submission, and to provide the names and email addresses of all Team members on the registration form (the "Team Members"). Each Team Member will receive an email invitation from the Team Representative inviting him/her to participate in the Team. The Team Member MUST acknowledge the Team invitation and agree to the Official Rules, including any eligibility restrictions and the terms of service governing the download of data in connection with this Contest, described below, before he/she will be allowed to participate in the Team. If a person is invited to be a Team Member and does not so agree, the Team will consist of the remaining Team Members who have so agreed. The Team Representative is primarily responsible for adding and/or dropping Team Members from the Team; however, should a Team Member wish to withdraw from the Contest for any reason, he/she may do so by visiting the Contest Website. Any Team Member who wishes to change Teams must first withdraw from his or her original Team.
    • iii. Contest Tracks: There are two independent tracks offered in the Contest (each, a "Contest Track"). A Team may participate in one or both Contest Tracks, at its discretion. Except where specifically indicated, instructions for one Contest Track in these Official Rules apply equally to the other Contest Track. The first Contest Track ("First Track") employs a dataset containing over 260 million ratings. For this dataset the task is to predict test set ratings as accurately as possible. The second Contest Track ("Second Track") concentrates on a smaller training set with 62 million train ratings. The goal in this track is to separate items rated highly by the users from items never rated by the users.
    • iv. Download, Processing and Output: There are two datasets offered in connection with the Contest (each, a "Dataset"): a larger Dataset, titled "Set1" and a smaller Dataset, titled "Set2". Set1 is further divided into three subsets: training (the "Training Subset"), validation (the "Validation Subset") and test (the "Test Subset"). Set2 is further divided into two subsets: training (the "Training Subset") and test (the "Test Subset"). Each test set is further randomly split into two subsets of a similar size, in a way undisclosed to the participants. One subset ("Test1") is used for deciding the winners in the contest, whereas the other subset ("Test2") is used for reporting results back to the competitors and for publishing teams' standing on the public Leaderboard. Refer to the Contest Instructions for more details about the format of the Datasets. The Team Representative, and any other Team Members wishing to access the Dataset(s), must visit the Contest Website, check where indicated to agree to and accept these Official Rules and the terms governing the download of the Datasets (the "Data Sharing Agreement", or "DSA"), and follow the instructions to download the Datasets. Then, the Team must use its algorithms (its "Algorithms") to process the Dataset(s). Once the Dataset(s) have been processed, a member of the Team must return to the Contest Website, where, for each Contest Track in which the Team is competing, he/she will be required to complete the Team's submission by uploading a file containing predictions on the Test Subset (the "Test Subset Predictions"). The Team's Output files must follow the format guidelines provided in the Contest Instructions. Together, the Team's Registration Information and Output constitute a "Submission" in the Contest. In each Contest Track in which the Team is competing, the Team's performance will be evaluated by their last submission (its "Primary Entry").
Predictions will be ranked in increasing order (lower score is better) according to the performance measure for each track: Root Mean Squared Error for First Track and error rate for Second Track. Refer to the Contest Instructions for more information. In addition, a Team's Test1 Subset Predictions for each Contest Track are the only predictions that will be evaluated by the Sponsor for purposes of the Team's performance in either Contest Track. For purposes of these Official Rules, "receipt" of a Submission occurs when Yahoo!'s servers record both the Team's Registration Information and Output upon clicking the "Submit" button at each phase. In the event of a dispute about the identity of any individual Entrant, Team Representative or Team Member, each Primary Entry will be declared made by the authorized email account holder of the email address submitted at the time of entry. "Authorized email account holder" is defined as the natural person who is assigned to an email address by an internet access provider, online service provider, or other organization (e.g. business, educational institution, etc.) that is responsible for assigning email addresses for the domain associated with the submitted email address. Any potential winning Team will be required to provide Contest Entities with proof that all members of the Team are the respective authorized account holders of the email address associated with the winning Team's entry. In the event that a dispute over the identity of any individual Entrant, Team Member or Team Representative cannot be resolved, that Primary Entry and any related Submissions may be disqualified.

      LIMIT ONE TEAM PER INDIVIDUAL CONTESTANT; LIMIT ONE SUBMISSION PER INDIVIDUAL ENTRANT OR TEAM IN AN EIGHT HOUR PERIOD. Subsequent attempts to submit outside these limits will not be accepted. A team's last entry in time will be considered its final entry to be judged. The upload of a Submission for an Entrant/Team is solely the responsibility of the Entrant or Team Representative, acting on behalf of the Team. Entries may only be made according to the method described above. Proof of sending (such as an automated computer receipt confirming delivery of email, "thanks for entering" message, or post office receipt) does not constitute proof of actual receipt by Yahoo! of an entry for purposes of these Official Rules. Automated entries (including but not limited to entries submitted using any bot, script, macro, or service), copies, third party entries, facsimiles and/or mechanical reproductions are not permitted and will be disqualified. Only eligible Submissions actually received by Yahoo! before the end of the Entry Period will be evaluated for ranking in the Contest. Unintelligible, incomplete, improperly formatted, or garbled Submissions will not be accepted and may be disqualified. Subject to Section 9, all Submissions become the property of Yahoo!, and none will be acknowledged or returned; however, Yahoo! claims no ownership or rights in the underlying Algorithm used to create any Submissions submitted in this Contest. Refer to Rule 4 below for more information.
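
      For illustration only, the two performance measures named above (Root Mean Squared Error for First Track, error rate for Second Track) could be computed roughly as in the sketch below. The function names and input layout are assumptions made for this example; the authoritative definitions and file formats are those in the Contest Instructions.

```python
import math


def rmse(predicted, actual):
    """Root Mean Squared Error between two equal-length rating sequences."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_error = sum((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(squared_error / len(actual))


def error_rate(predicted_labels, actual_labels):
    """Fraction of items whose predicted label differs from the true label."""
    if len(predicted_labels) != len(actual_labels) or not actual_labels:
        raise ValueError("inputs must be non-empty and of equal length")
    wrong = sum(1 for p, a in zip(predicted_labels, actual_labels) if p != a)
    return wrong / len(actual_labels)
```

      For both measures a lower value is better, which matches the increasing-order ranking described above.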

  4. General Conditions of Entry: By entering this Contest, each Team Representative (and, if there are other members of the Team, each Team Representative, on behalf of all Team Members):
    • A. WARRANTS AND REPRESENTS THAT THE ENTRANT OR TEAM REPRESENTATIVE (EITHER ALONE OR JOINTLY WITH THE OTHER TEAM MEMBERS ONLY) IS/ARE THE ORIGINAL AUTHOR(S) AND INVENTOR(S) OF THE ALGORITHM RESPONSIBLE FOR GENERATING THE OUTPUT CONTAINED IN THE TEAM'S PRIMARY ENTRY AND SUBMISSION(S), AND OWN(S) ALL RIGHTS TO, OR HAS/HAVE OBTAINED ALL LICENSES AND OTHER PERMISSIONS NECESSARY TO, USE THE ALGORITHM AND SUBMIT THE OUTPUT CONTAINED IN THE TEAM'S SUBMISSION(S), CAN PROVIDE WRITTEN CONFIRMATION OF ANY OF THE ABOVE UPON REQUEST, AND THAT, TO THE TEAM'S KNOWLEDGE, THE ALGORITHM AND ANY SUBMITTED OUTPUT DO NOT AND WILL NOT INFRINGE THE INTELLECTUAL PROPERTY RIGHTS OR ANY OTHER RIGHTS OF ANY THIRD PARTY;
    • B. agrees to be bound by these Official Rules (which will be posted at the Contest Website throughout the Contest), the DSA, and by any other standard policies, rules and regulations governing the download and use of the Datasets made available in connection with this Contest, and acknowledges that failure to comply with any of the above will result in disqualification from the Contest;
    • C. agrees to comply with and be bound by the decisions of the Sponsor and the judges relating to this Contest, which shall in each case be final and binding in all respects;
    • D. shall, upon request, provide a list (including application or registration number, title, country of filing, and filing date) of any patents or patent applications held or applied for by the Team or any Team Members, or specifying any Team Member as an inventor, related to the Algorithm;
    • E. represents and warrants that the Output contained in the Team's Submission(s) has been produced by the Algorithm without human intervention;
    • F. grants to Sponsor and its affiliates, legal representatives, assigns and licensees, the right and permission to copyright (as appropriate), reproduce, encode, store, copy, transmit, publish, broadcast, display, publicly perform, exhibit and/or otherwise use or reuse (without limitation as to when or to the number of times used), each Team Member's name, address, image, voice, likeness, statements, and biographical material (in each case, as submitted or as edited by Sponsor, in Sponsor's sole discretion), as well as any additional photographic images, video images, portraits, interviews or other materials relating to the Team Member and arising out of his/her participation in this Contest, any other information or materials provided by the Team or any of its Team Members in connection with the Team's entry in either Contest Track, including without limitation the Registration Information, the Output, and the Presentation (as defined below) (collectively, the "Additional Materials"), with or without using any Team Member's name, in any media throughout the world for advertising and publicity purposes without additional review, compensation, or approval;
    • G. forever waives any rights of publicity, rights of privacy, intellectual property rights, and any other legal or moral rights that might preclude Yahoo!'s use of the Team's Primary Entry, Submission(s) or the Additional Materials, or require any Team Member's permission for Yahoo! to use them for promotional purposes, and agrees to never sue or assert any claim against the Contest Entities relating to the Contest Entities' use of those materials; and
    • H. agrees to indemnify and hold the Contest Entities and their respective subsidiaries, affiliates, officers, directors, agents, co-branders or other partners, and any of their employees (collectively, the "Contest Indemnitees"), harmless from any and all claims, damages, expenses, costs (including reasonable attorneys' fees) and liabilities (including settlements), brought or asserted by any third party against any of the Contest Indemnitees due to or arising out of the Team's Submissions or Additional Materials, or any Entrant or Team Member's conduct during or in connection with this Contest, including but not limited to trademark, copyright, or other intellectual property rights, right of publicity, right of privacy and defamation.

  5. Verification: For each Contest Track, the top three Primary Entries (as ranked by their RMSE for First Track and error rate for Second Track as calculated by Yahoo!) will be named as potential first, second, and third prize winners in each Contest Track, subject to verification. Following the close of the Entry Period, Sponsor will contact the Entrants or Team Representatives for each of the top three highest-scoring submissions in each Contest Track for verification. In the event of a tie for any prize in the Contest, the affected prize will be divided among those Teams tied for that prize. The judges reserve the right to select some or no winners if they determine, in their sole opinion, that there are an insufficient number of eligible, complete, or appropriate Submissions or Primary Entries. Odds of winning depend on the number of eligible Submissions received. Judges' decisions will be final. PLEASE NOTE: EVEN IF YOU OR YOUR TEAM IS IDENTIFIED AS A POTENTIAL WINNER ACCORDING TO THE ONLINE LEADERBOARD FOR THE CONTEST, OR ARE CONTACTED AS A POTENTIAL WINNER BY THE SPONSOR, YOUR TEAM HAS NOT YET WON A PRIZE. ALL POTENTIAL WINNING TEAMS AND THEIR RESPECTIVE MEMBERS ARE SUBJECT TO VERIFICATION AND A POSSIBLE BACKGROUND CHECK CONDUCTED ON BEHALF OF THE CONTEST ENTITIES. ALL POTENTIALLY WINNING TEAM MEMBERS MUST MEET ALL ELIGIBILITY REQUIREMENTS BEFORE THE TEAM WILL BE CONFIRMED AS A WINNER OR BEFORE ANY PRIZE WILL BE AWARDED. As a condition for receiving any prize, the potential winning teams are required to submit a manuscript describing the winning Team's Algorithm and methods used to generate the Team's output (the "Manuscript").

    The Manuscript must be written in English, with pseudo-code and mathematical formulae as necessary. The Manuscript must be written at a level sufficient for a practitioner in computer science to reproduce the results obtained by the contestant. It must describe substantially all the steps performed both by any learning component and by the prediction component that produced the submitted predictions. If applicable, it should detail any initialization and convergence properties of the method as well as tuning procedures for setting any free parameters of the algorithm.

    The Manuscript will be evaluated by a panel of three experts (the "Judges"), who will verify that the Algorithm and methods described in the Manuscript could reasonably have generated the Team's Submission. Application of the foregoing judging criteria to eligible entries in the Contest shall be at the Judges' reasonable discretion and, as to elements of the judging criteria involving matters of subjectivity, at the Judges' sole discretion.

    The Judges are members of Yahoo! Labs, closely familiar with the dataset and the related scientific field: (1) Gideon Dror; (2) Yehuda Koren; (3) Markus Weimer.

    The Team agrees in advance that the Manuscript submitted by a winning Team will be published in the KDD Workshop proceedings, if the Workshop organizers choose to do so.

  6. Prizes: Three prize winning Teams will be named for each Contest Track, as follows:
    First Track:
    • First Prize: One First Place Prize Winner in the First Track will receive $5,000 cash.
    • Second Prize: One Second Place Prize Winner in the First Track will receive $2,000 cash.
    • Third Prize: One Third Place Prize Winner in the First Track will receive $1,000 cash.

    Second Track:
    • First Prize: One First Place Prize Winner in the Second Track will receive $5,000 cash.
    • Second Prize: One Second Place Prize Winner in the Second Track will receive $2,000 cash.
    • Third Prize: One Third Place Prize Winner in the Second Track will receive $1,000 cash.

    The following applies to all prizes: All prizes will be delivered in the form of a check. Limit one prize per winning Team in each Track, for a maximum of two possible prizes per winning Team in the Contest. Prizes cannot be used in conjunction with any other promotion or offer. Prizes may not be transferred or assigned except by Sponsor. Only listed prizes will be awarded and no substitutions will be made, except that Sponsor reserves the right to substitute any prize with another prize of equal or greater value in its sole discretion. Expenses not specifically stated above, together with the reporting and payment of all applicable taxes, fees, and/or surcharges, if any, arising out of, or resulting from, acceptance or use of a prize, are the sole responsibility of the winner(s) of that prize. Yahoo! expressly disclaims any responsibility or liability for injury or loss to any person or property relating to the delivery and/or subsequent use of the prizes awarded, or for any dispute that may arise among winning Team Members relating to the division of a Prize. Yahoo! makes no representations or warranties concerning the appearance, safety, or performance of any prize awarded. Restrictions, conditions, and limitations apply. Contest Entities will not replace any lost or stolen prize items.
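
    For illustration only, the sketch below shows one possible reading of how the per-Track prizes ($5,000, $2,000, $1,000) and the tie rule in Rule 5 might combine. It assumes that Teams with exactly equal scores split the prize for their place and that the next distinct score receives the next prize; these Official Rules do not spell out that second point, so treat it as an assumption. The function name and data layout are hypothetical.

```python
from itertools import groupby

# Per-Track prize amounts in US dollars, as listed in Rule 6.
PRIZES = [5000, 2000, 1000]


def allocate_prizes(scores, prizes=PRIZES):
    """Map each prize-winning team to its share of a prize.

    scores: dict of team name -> score on Test1 (lower is better).
    Assumption: tied teams share one prize equally, and the next
    distinct score takes the next prize in the list.
    """
    ranked = sorted(scores.items(), key=lambda item: item[1])
    tie_groups = groupby(ranked, key=lambda item: item[1])
    awards = {}
    for prize, (_, group) in zip(prizes, tie_groups):
        teams = [team for team, _ in group]
        for team in teams:
            awards[team] = prize / len(teams)
    return awards
```

    Under that assumption, two Teams tied for the best score would each receive $2,500, and the Team with the next-best score would receive $2,000.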

  7. Winner Notification: The Team Representative of a potentially winning Team will be notified following the close of the Contest via email. Each verified prize winning Team Representative may be required to sign, notarize and return an Affidavit of Eligibility and Liability/Publicity Release (except where prohibited by law) and provide any additional information (such as social security number), that may be required by Contest Entities. Except where prohibited by law, potential winning Team Representatives must return all such required documents within ten days following attempted notification or prize may be forfeited. Return of any prize/prize notification as undeliverable, or inability of Contest Entities to contact the Team Representative of a potentially winning Team may result in disqualification and selection of an alternate winner according to the judging criteria above. Winners may be issued an IRS Form 1099 which documents the value of the prize for tax purposes. Any difference between actual value of a prize and the approximated value of a prize as stated in these Official Rules will not be awarded.

  8. General Conditions: This Contest is governed by the laws of the United States. All federal, state and local laws and regulations apply.

    Sponsor reserves the right at its sole discretion to disqualify the entry of any individual or Team found to be (a) tampering or attempting to tamper with the entry process or the operation of the Contest; or (b) violating the Official Rules or DSA. CAUTION: ANY ATTEMPT BY ANY INDIVIDUAL TO DELIBERATELY DAMAGE ANY WEBSITE OR UNDERMINE THE LEGITIMATE OPERATION OF THE CONTEST MAY BE A VIOLATION OF CRIMINAL AND CIVIL LAWS. SHOULD SUCH AN ATTEMPT BE MADE, SPONSOR RESERVES THE RIGHT TO SEEK DAMAGES FROM ANY SUCH PERSON TO THE FULLEST EXTENT PERMITTED BY LAW.

    By accepting a prize, each winning Team Member agrees to release and hold harmless Contest Entities, and their respective parent companies, subsidiaries, affiliates, directors, officers, employees, and agents from any and all liability for any injuries, loss or damage of any kind to persons, including but not limited to death, or property, resulting in whole or in part, directly or indirectly, from acceptance, possession, misuse or use of any prize, participation in this promotion, or while traveling to, preparing for or participating in any prize-related activity.

  9. Intellectual Property: As between Yahoo! and any Team participating in this Contest, the Team will retain all intellectual property rights in its Submissions, subject to the licenses granted by the Team herein.

  10. Limitations of Liability: Contest Entities assume no responsibility for lost, late, misdirected, or unintelligible submissions, or for theft, destruction or unauthorized access to, or alteration of, entries. Contest Entities are not responsible for any incorrect or inaccurate information, whether caused by website users, any of the equipment or programming associated with or utilized in the Contest, or any technical or human error which may occur in the processing of Submissions in the Contest. Contest Entities assume no responsibility for any error, omission, interruption, deletion, defect, delay in operation or transmission, failures or technical malfunction of any telephone network or lines, computer online systems, servers, providers, computer equipment, software, email, players or browsers, whether on account of technical problems, traffic congestion on the Internet or at any website, or on account of any combination of the foregoing (including but not limited to any such problems which may result in the inability to access the Contest Website or to submit or modify Submissions or Primary Entries in this Contest). Contest Entities are not responsible for any injury or damage to participants or to any computer related to or resulting from participating or downloading materials in this Contest. If, for any reason, the Contest is not capable of running as planned, including infection by computer virus, bugs, tampering, unauthorized intervention, fraud, technical failures, or any other causes beyond the control of Contest Entities which corrupt or affect the administration, security, fairness, integrity or proper conduct of this Contest, Yahoo! reserves the right at its sole discretion to cancel, terminate, modify or suspend the Contest and select winners from among all eligible entries received prior to the cancellation according to the same judging criteria provided above.

  11. Privacy: By entering the Contest, you agree to Yahoo!'s use of your personal information as described in Yahoo!'s Privacy Policy, located at http://privacy.yahoo.com/privacy/us/promo/details.html.

  12. Winners' List: For a list of the Contest Winners, send a self-addressed, stamped envelope to KDD Cup 2011, Winners List Requests, Attn: Tiffany Argueta, 701 First Avenue, Sunnyvale, CA 94089. Requests received after August 1, 2011 will not be honored.

  13. Official Rules: For a copy of the Official Rules, print this web page or send a self-addressed, stamped envelope to: KDD Cup 2011, Official Rules Requests, Attn: Tiffany Argueta, 701 First Avenue, Sunnyvale, CA 94089. Vermont residents may omit return postage. Requests received after the close of the Entry Period will not be honored.

  14. Sponsor: Yahoo! Inc., 701 First Avenue, Sunnyvale, CA 94089.