WSU students named finalists in NFL data competition

If you’ve never watched American football, it can look like organized chaos. But for Washington State University graduate students Jugal Marfatia and Namrata Ray, data snapshots of plays revealed patterns hidden inside the chaos. That work eventually led the duo to a trip to Indianapolis later this month for the NFL’s Scouting Combine.

22 dots, 11 red and 11 green, are plotted on a rectangle. Each has an arrow coming out of it going in different directions.
Here’s an example of what each play looked like when given to competitors in the Big Data Bowl.

Marfatia, a Ph.D. student in Economics and master’s student in Statistics at WSU, and Ray, a Ph.D. student in Sociology and master’s student in Statistics, entered the NFL’s 2020 Big Data Bowl competition to answer a question: when a running back takes a handoff, how many yards should we expect him to gain?

The WSU team was named a finalist in the collegiate event, earning a trip to the combine.

“We’ll get to meet with coaches and league officials to talk about what we found when breaking down all the data,” Marfatia said.

The NFL posted the contest on Kaggle, an online community of data scientists, and over 2,000 people competed. The WSU student team is one of six finalists in the college portion.

The contest and submission

Each competing team or individual received the positions of all 22 players on the field for over 23,000 rushing plays from real NFL games in 2018 and 2019. For the college competition, the goal was to derive valuable insights about rushing plays using the data provided.

Posed portrait photos of Marfatia and Ray, both in outdoor settings.
Jugal Marfatia, left, and Namrata Ray, right.

“Our submission focused on looking at the open area that a rusher has, and if that is an indication of how many yards they will gain,” said Marfatia, a native of India.

The graduate students found that if you have the speed and direction of each player, it’s possible to predict where all 22 players will be 1 second after the provided snapshot.
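One simple way to picture this kind of projection is to assume each player keeps the same speed and direction for the next second. The sketch below is only an illustration under that assumption, not the students’ actual method. The function name, the sample values, and the angle convention (degrees measured counterclockwise from the x-axis) are all hypothetical.

```python
import math

def project_position(x, y, speed, direction_deg, dt=1.0):
    """Return (x, y) after dt seconds, assuming constant speed and heading.

    Assumption: direction_deg is measured counterclockwise from the x-axis.
    """
    theta = math.radians(direction_deg)
    return (x + speed * math.cos(theta) * dt,
            y + speed * math.sin(theta) * dt)

# Hypothetical snapshot rows: (x in yards, y in yards, speed in yards/second, heading in degrees)
players = [
    (30.0, 25.0, 6.0, 0.0),   # moving straight along the x-axis
    (35.0, 20.0, 4.0, 90.0),  # moving straight along the y-axis
]

# Estimated location of each player one second after the snapshot
projected = [project_position(*p) for p in players]
```

Real players speed up, slow down and cut, so a straight-line estimate like this becomes less reliable the further ahead it looks.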

“Our results indicate that an open area at any time interval is not a good predictor of the yards gained,” Ray said. “However, the difference in the open space between the time of handoff and after 0.5/1.0 second(s) is an extremely good predictor of the yards gained.”
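The article does not say how the team measured open space, so the sketch below uses a stand-in: the distance from the rusher to the nearest defender. It compares that distance at the handoff with the same distance half a second later, assuming every player holds a constant speed and heading. All names, sample values and the angle convention (degrees counterclockwise from the x-axis) are hypothetical.

```python
import math

def step(player, dt):
    """Move a (x, y, speed, heading_deg) player forward dt seconds in a straight line."""
    x, y, speed, heading_deg = player
    theta = math.radians(heading_deg)
    return (x + speed * math.cos(theta) * dt, y + speed * math.sin(theta) * dt)

def open_space(rusher_xy, defenders_xy):
    """Stand-in measure of open space: distance to the nearest defender, in yards."""
    return min(math.dist(rusher_xy, d) for d in defenders_xy)

def open_space_change(rusher, defenders, dt=0.5):
    """Open space after dt seconds minus open space at the handoff."""
    now = open_space(rusher[:2], [d[:2] for d in defenders])
    later = open_space(step(rusher, dt), [step(d, dt) for d in defenders])
    return later - now

# Hypothetical play: one defender closing head-on, one standing still off to the side
rusher = (20.0, 25.0, 6.0, 0.0)
defenders = [(30.0, 25.0, 4.0, 180.0), (25.0, 37.0, 0.0, 0.0)]
change = open_space_change(rusher, defenders, dt=0.5)
```

In this made-up play the value is negative, meaning the gap in front of the rusher is shrinking. A change over time like this, rather than the gap at any single moment, is the kind of quantity Ray describes as the useful predictor.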

That finding makes sense to people familiar with football, but looking further reveals benefits for coaches and players.

“It’s possible for a rusher to be going too fast when approaching a defender, or he could be going in slightly the wrong direction,” Marfatia said. “Our findings would let a coach know if the rusher was in the best possible position based on where the other players are. That’s something they can then work on with their players to help them find the most likely open spots.”

Non-football background

Marfatia isn’t a big sports fan and had never watched American football before entering the Big Data Bowl. But he’s been doing competitions on Kaggle for a few years and loves to take on new challenges.

“I felt I could use the knowledge I’ve learned on econometrics here at WSU to find something useful and interesting,” he said. “I want to know how data can be applied in different fields, not just economics or business, but sports or health. That’s what fascinates both Namrata and I, the application of data in different aspects of our lives.”

And although he’d never followed the NFL before, he and Ray have learned the game since diving in. They are both surprised by the success of their entry in the contest, but plan to enjoy themselves when they arrive and meet their fellow contestants.

“It will be great to meet with and see how other people from different backgrounds thought about this competition,” Marfatia said. “And we’re looking forward to learning even more about the sports analytics field.”