Every year, billions of birds migrate to different parts of the globe. These same birds feast on insects, mice and rats, helping to keep those populations in check. Birds eat 400 to 500 million tons of insects each year, which protects crops. They quite literally play an essential role in protecting humanity from being overrun by all manner of bugs and pests. Keeping tabs on birds helps scientists monitor the health of bird populations, but it’s an epic task. By some estimates, there are hundreds of billions of birds sharing the planet with humans. That’s a lot of birds to track.
Scientists at Cornell knew they couldn’t complete this work alone, so they enlisted amateur enthusiasts to help them watch birds. The bird-tracking project is called eBird, and the collaboration with bird watchers has yielded mountains of data. By 2021, eBird had more than a billion bird observations to sift through. However, the observational data was only part of the picture. Each sighting showed only that a particular bird was in a particular location – not that the bird had traveled from point A to point B. The data is far from useless, though, as even these snapshots could be used to train an AI to plot migratory paths.
Before researchers could use the data, it needed to be cleaned up. You might not think of bird sightings as being biased, but the data is surprisingly skewed if you examine it closely. Even though birds are everywhere, bird watchers tend to be concentrated in affluent countries. Bird watchers also might not make identical observations; two birders in the same place at the same time, watching the same birds, might record two distinctly different observations depending on their skills as birders.
To clean up this data for use in training a machine-learning AI, researchers needed the powerful resources provided by ACCESS. For each of the 2,300 bird species they would analyze over the course of a year, two to eight gigabytes (GB) of computer memory and 3,000 to 4,000 CPU hours would be needed. Enter Bridges-2, a powerful supercomputer at the Pittsburgh Supercomputing Center (PSC). “Turning those very messy observations from lots and lots of participants into reliable information relies on a lot of computing,” says Daniel Fink, a member of the eBird research team from the Cornell Lab of Ornithology.
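To get a feel for the scale, the per-species figures above can be multiplied out. The short Python sketch below is only a back-of-the-envelope estimate: it assumes the quoted 3,000 to 4,000 CPU hours apply to each of the 2,300 species and simply add up, which the article does not state outright. The function names are illustrative and not part of any eBird or Bridges-2 tooling.

```python
# Back-of-the-envelope estimate using the figures quoted in the article.
# Assumption: the per-species CPU-hour range applies to every species
# and the totals simply add up.

SPECIES = 2_300
CPU_HOURS_PER_SPECIES = (3_000, 4_000)  # (low, high) estimate per species
HOURS_PER_YEAR = 24 * 365               # one core running nonstop for a year


def total_cpu_hours(species, per_species_range):
    """Return the (low, high) total CPU hours across all species."""
    low, high = per_species_range
    return species * low, species * high


def core_years(cpu_hours):
    """Convert CPU hours to years of nonstop work on a single core."""
    return cpu_hours / HOURS_PER_YEAR


low, high = total_cpu_hours(SPECIES, CPU_HOURS_PER_SPECIES)
print(f"Total: {low:,} to {high:,} CPU hours")
print(f"Roughly {core_years(low):,.0f} to {core_years(high):,.0f} single-core years")
```

Under these assumptions the total comes to between 6.9 million and 9.2 million CPU hours, far more than a single core could deliver in a researcher's lifetime, which is why the work had to be spread across a supercomputer.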
“We couldn’t do this without the scale of Bridges-2. Access to memory allocations is important too … [Also], the stability of Bridges-2 was better than any of the commercial options.”
Tom Auer, Cornell University
The research team has published some of its findings in the journal Methods in Ecology and Evolution.
You can read more about this story here: BirdFlow AI “Connects the Dots” in Massive Volunteer Database to Track Migratory Birds
Institution: PSC (Pittsburgh Supercomputing Center)
University: Cornell University, University of Massachusetts
Funding Agency: NSF
The science story featured here, allocated through August 31, 2022, was enabled through Extreme Science and Engineering Discovery Environment (XSEDE) and supported by National Science Foundation grant number #1548562. Projects allocated September 1, 2022 and beyond are enabled by the ACCESS program, which is supported by National Science Foundation grants #2138259, #2138286, #2138307, #2137603, and #2138296.