Datathon@LISH FAQs

We will update this section with FAQs as we get them. If your question is not answered here, please reach out directly to the contest organizers.

  1. Questions about Eligibility

  2. Prizes and Benefits

  3. Contest Logistics

  4. Contest Problem & Rules

Questions about Eligibility

I'm not a Harvard student, but I'm affiliated with _____ University / College. Am I eligible to compete?

The contest is open to any US university affiliate with a valid university email address. Affiliates at any level (undergraduate / graduate / post doc / RAs / other staff) are welcome. We encourage all interested, qualified people to sign up for the contest. Note that to receive any of the top 3 monetary prizes, you must have a Social Security Number and be eligible to work in the USA, as you will be required to complete a W2 tax form to receive payment.

I've only taken ____ classes in data science / statistics. Am I qualified to participate in the contest?

Anyone who is interested may sign up as long as they are currently a student and have a university email address. However, the contest will require the use of the Python programming language, and we recommend that students have at least one semester of data science education and familiarity with common Python data science techniques and libraries.

What type of computer do I need to have to participate in the contest?

As this contest will be remote-first, you'll need your own computer with a web browser in order to participate. However, we will require that all participants work within our custom development environment / notebook (hosted through Google Colab), so there are no computational requirements for your machine beyond that.

Prizes and Benefits

Why should I participate in the contest?

  1. We hope that, like us, contestants enjoy data science problem solving and will have fun working on our selected problem. This contest also represents a great chance to further develop your problem-solving skills.

  2. We are offering significant cash prizes for high-placing participants. We will also publicize the results of the contest, which will give winners some visibility in the broader data science community.

  3. This contest will contribute to a study on data-science problem solving. By participating in this contest, you will contribute to the science of data science.

Who am I competing against?

You will be competing against other students and university affiliates who register for the contest per our eligibility criteria given above. Note that we will split our registrants into two separate competition "tracks" in order to ensure competitiveness; each track will have slightly different rules and its own contest rankings and prize winners. Your team will be randomly assigned to one of these tracks at the start of the contest. This doubles your chances of winning the contest, as you will be competing directly against only half of our participants.

What are the prizes?

We will award prizes to eligible participants based on the rank (in your competition track) of your best model submitted. As noted above, you will be assigned to one of two parallel competition tracks. This means you will only directly compete against half of our participants, which doubles your likelihood of winning a prize. Prizes will be as follows:

  • $1000 cash for first place

  • $500 cash for second place

  • $250 cash for third place

  • $50 Amazon Gift Card for a top-10 finish.

  • Participation prize ($10 Amazon gift card) for eligible teams that beat our benchmark score in the prediction contest (which should be easy).

Note that we previously advertised a contest T-shirt, but unfortunately we will be unable to provide it due to the logistics of delivering shirts by mail.

What are my chances of winning a prize?

We will give a participation prize to anyone who beats our benchmark score in the prediction contest (which should be easy). As described above, cash prizes will be given out to the top 10 placers in each contest that we host. We aspire for roughly the top 10% of participants to receive a monetary prize of some form, and for everyone to be able to receive a participation prize provided they invest enough effort in solving the problem.

Are there any other benefits to participating?

Possible benefits of participating include getting to hone your data science skills and public recognition among the data science community. In particular, top-three winners in each track will be asked to post a short overview of their solution approach, which will be made public on the LISH website. The top placing team in each track will also get to present their solution at the Awards Ceremony.

Contest Logistics

Can I enter the contest with teammates? If so, how can I indicate this?

Yes, our datathon is open to individuals and pairs. However, we will not allow teams larger than this to participate.

You can let us know who your contest partner is when you register for the competition. Even if you originally registered as an individual, you can still choose to participate with a partner at a later date. Just be sure to have your partner register via the datathon website before the event. At the very start of the competition, we'll ask you to confirm your final plans for who to partner with in the competition. (We unfortunately cannot help you find a partner for the competition, though.)

What times would I need to be available for the Datathon during the weekend of February 12th?

The contest is open from Friday 02/11 5pm to Sunday 02/13 5pm, a total of 48 hours. The contest is fully accessible to virtual participants, and you can work on it asynchronously at any point within that window. The only synchronous event will be the awards ceremony, which will be held on Wednesday 02/16 at 5:30PM on Zoom. See the contest timeline for more details.

Do I need to be physically present for the competition?

No - the competition will be held virtually. You can participate from any location of your choosing (just make sure you have a decent internet connection!).

Will I get a chance to meet other participants in the Datathon?

The contest will be virtual, meaning that you will likely not be co-located with other participants. Due to restrictions from the Omicron variant, we are also unfortunately unable to offer in-person networking opportunities as part of the contest. However, winners will be able to share their problem approach during our awards ceremony.

Contest Problem & Rules

When will the details on the contest problem be released?

The contest problem details are given in our contest notebook. Take the pre-contest survey to get a link to the contest notebook!

How can I review / prepare for the competition?

We do not have any dedicated materials to help prepare for our competition, but standard data science courses taught in Python will provide the basic tools necessary for succeeding on our problem. Consider participating in Kaggle competitions or taking openly available data science courses such as those provided by Coursera.

Can I use a language other than Python? Can I use my own development environment?

Unfortunately, no to both. In this contest, we are requiring the use of Python and Google Colab for three reasons:

  1. To verify compliance with contest rules (given that the contest is running remotely).

  2. To support our research goals by standardizing the code that we collect from the competition.

  3. To ensure parity of computational resources available to participants.