‘Big data’ leads to better trees in $3 million grant

Scientists at Washington State University are harnessing the power of ‘big data’ to help growers create the next generation of healthy, sustainable forests and tree crops.

Dorrie Main, professor in the WSU Department of Horticulture, is leading a $3 million effort to create cyber-infrastructure that helps researchers and breeders share and use tree data. The National Science Foundation funded the grant in July.

Dorrie Main, WSU horticulture professor, leads a $3 million effort to help tree researchers and breeders share ‘big data’ (Seth Truscott/WSU Photo).

Main’s team includes Sook Jung, associate research professor; Stephen Ficklin, WSU associate professor of horticulture; and colleagues at the Universities of Connecticut, Tennessee and Kentucky. The project will use servers at Main’s lab as well as WSU’s new Kamiak supercomputer.

Scientists are already generating a wealth of data on tree genomes, genetics and breeding. The problem, says Main, is making sense of it all, especially once you add environmental and geographic data.

“Due to advances in sequencing technology, even small labs are now putting out mountains of data,” she said. “Cumulatively, we’re talking about billions of data points. The big challenge is analysis and interpretation.”

To help people generate and use big data, Main and her colleagues aim to unify access to tree crop data through a network of community-driven databases, data-mining and analysis tools, and educational modules. Such a network would allow scientists, students and tree breeders to share, filter and ultimately use their data in meaningful ways, from basic discoveries to new varieties.

“We want to give breeders more tools to make good decisions,” Main said.

Over the past decade, Main’s team created seven public, open-source databases for 25 crops, including the rose family (almond, apple, cherry, peach, pear, raspberry and strawberry), citrus, cotton, cacao, legumes and blueberries. Those online databases act as clearinghouses for information on genomics, genetics and breeding.

The National Science Foundation project is the next step.

“This new grant will build a unified system of tree databases, help people build their own databases, and then connect them in a way that’s currently unavailable,” Main said. These resources could help breeders create new, more adaptable varieties more quickly.

Building the network is urgent, says Main. Scientists are in a race with evolving diseases and a changing climate. She is excited about how shared information can help breeders.

“These tools could help scientists create crops that use the genetic diversity that already exists in their wild relatives,” Main said. “We can develop new cultivars that require less chemicals, can grow in marginal land, and are adapted for harsher climates while still producing quality yields.”