An international research collaboration trained computers to sift through millions of images for cosmic treasure.
On a mountaintop in Chile, 7,200 feet above sea level, the universe is unfolding image by image. The Cerro Tololo Inter-American Observatory is home to the 4-meter Blanco Telescope, situated in a giant dome under which scientists have studied the sky since 1976. As advances in technology result in exponentially more data, one giant constraint remains: how fast people can sift through it all.
An international collaboration of astronomers has developed a machine-learning model to search through the telescope’s images to find faint galaxies. Using the model, researchers whittled down 11 million images — all from the southern sky — to just 581 that show a phenomenon called strong lensing.
When a massive cosmic object lies in front of a more distant galaxy, the massive object’s gravity bends and warps the background galaxy’s light. This gravitational lensing magnifies the background galaxy, revealing unseen features, but also often distorts it into a stretched-out image. Such strongly lensed galaxies enable astronomers to unearth clues about the history and evolution of our universe that would otherwise be invisible.
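For a rough sense of scale, the standard textbook idealization of a single point-mass lens (an assumption made here for illustration, not a model taken from the study) gives the angular radius of the ring into which a perfectly aligned background source would be stretched:

```latex
\theta_E = \sqrt{\frac{4 G M}{c^2}\,\frac{D_{LS}}{D_L\, D_S}}
```

Here M is the mass of the lens, G is the gravitational constant, c is the speed of light, and D_L, D_S, and D_LS are the angular-diameter distances to the lens, to the source, and between the two. Real lensing galaxies and clusters have extended mass distributions, so this is only an order-of-magnitude guide.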
In total, out of the millions of galaxies already cataloged, astronomers have only been able to confirm and model about 1,000 that are lensed. “Strong lenses are really rare,” says graduate student Erik Zaborowski (Ohio State University), who led the study published in the September 1st Astrophysical Journal. As observatories like the DECam-equipped Blanco Telescope open up the southern sky for study, astronomers will likely find more lenses lurking there. “It’s really nice to have this really unsearched area of the sky become available now,” he says.
Strongly lensed galaxies help scientists interrogate some of astronomy’s biggest questions, from the nature of dark matter to the expansion rate of the universe. But with many of the world’s largest astronomical searches happening in the Northern Hemisphere, only recently have astronomers begun rigorously surveying the full southern sky.
That effort has been aided by the Blanco Telescope’s Dark Energy Camera (DECam), which saw first light in 2012. The camera is specifically designed to view large swaths of sky at once; a single image captures an area the size of 20 Moons as seen from Earth.
As researchers have more sky to examine, “the amount of data that we’re going to have to sift through is going to be impossible for a human to go by eye,” says Greg Mosby (NASA Goddard). He wasn’t part of the study but uses machine-learning algorithms in his own work. Automating searches through that data is “a sign of the times,” he adds.
Enter machine learning, an increasingly popular subfield of artificial intelligence in which computer algorithms learn what to look for from training data rather than from direct human instruction. It is the driving force behind familiar tools such as language translation, targeted advertising, and image identification, the last of which is gaining traction in astronomy.
Zaborowski and colleagues were the first to apply machine learning to public data from the DECam Local Volume Exploration Survey (DELVE). DELVE began in 2019 and has since cataloged a whopping 520 million cosmic sources.
Out of those, Zaborowski’s team chose 11 million extended sources (that is, objects that don’t appear as point sources in images) to begin the search for strongly lensed galaxies. They then fed those sources to a five-layered convolutional neural network, a type of machine-learning model often used to classify images.
But first, before the network could look for real lensed galaxies, the scientists had to train it. They fed it over 80,000 real galaxy images from DELVE, adding an artificial lensing effect to half of them. They also included over 3,200 false positives — images that look like they might feature strongly lensed galaxies but don’t — to improve the model’s accuracy.
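The makeup of that training set can be sketched in a few lines of Python. This is a minimal illustration, not the team’s pipeline: the function name, the image identifiers, and the round counts of 80,000 and 3,200 (the article says “over” both) are assumptions, and the step that would actually paint a simulated lensing arc onto an image is left out.

```python
import random


def build_training_labels(n_galaxies, n_false_positives, seed=0):
    """Return shuffled (image_id, label) pairs: 1 = simulated lens, 0 = non-lens.

    Hypothetical helper for illustration only; it assigns labels and does
    not touch any image data.
    """
    rng = random.Random(seed)
    ids = list(range(n_galaxies))
    rng.shuffle(ids)
    half = n_galaxies // 2

    # Half of the real galaxy images would receive an artificial lensing effect.
    examples = [(f"galaxy_{i}", 1) for i in ids[:half]]
    # The other half are left unaltered and labeled as non-lenses.
    examples += [(f"galaxy_{i}", 0) for i in ids[half:]]
    # Lens look-alikes are added as extra non-lens examples.
    examples += [(f"false_positive_{j}", 0) for j in range(n_false_positives)]

    rng.shuffle(examples)
    return examples


examples = build_training_labels(80_000, 3_200)
n_lens = sum(label for _, label in examples)
print(len(examples), n_lens)  # 83200 40000
```

Because the look-alikes join the non-lens class, simulated lenses make up slightly less than half of the examples the network sees in this sketch.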
However, humans weren’t taken totally out of the equation. Out of the 11 million DELVE sources, the model spit out the 50,000 it rated most likely to be lenses. The scientists then went old school, checking each one by eye. When they’d finished, they had 581 extended sources likely to be strongly lensed galaxies, 562 of which had never been reported. They also had eight potentially lensed quasars, which are extremely luminous galaxy cores. If those new lensed galaxies check out (and there’s a long waiting list of potential lenses for scientists to study), then the number of known strongly lensed galaxies will have increased by more than 50%.
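The numbers in that funnel are easy to check. The short Python sketch below uses only figures quoted in this article; the variable names are illustrative, and the baseline of 1,000 known lenses is the article’s approximate count, so the final percentage is approximate too.

```python
# Figures quoted in the article; the baseline of known lenses is "about 1,000".
sources_searched = 11_000_000
model_candidates = 50_000
lens_candidates = 581
new_candidates = 562
known_lenses = 1_000

frac_flagged = model_candidates / sources_searched  # share the network passed on
frac_kept = lens_candidates / model_candidates      # share that survived inspection by eye
increase = new_candidates / known_lenses            # growth if every new candidate is confirmed

print(f"Flagged by the network: {frac_flagged:.2%}")     # 0.45%
print(f"Kept after checking by eye: {frac_kept:.2%}")    # 1.16%
print(f"Potential growth in known lenses: {increase:.1%}")  # 56.2%
```

In other words, the network discarded more than 99% of the sources, and only about one in 86 of the images it flagged passed human inspection.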
While the number of new candidate strong lenses is formidable, it also forces astronomers to think about what this new era of giant databases and machine-learning models entails. For one, the models are only as good as the data on which we train them. For strong lenses, there are so few real images that scientists are forced to train models on artificial ones.
“I do have some reservations,” says Ben Metcalf (University of Bologna, Italy), who was not involved in the study. “If there’s something in the real data that you haven’t put in the simulated data, you just don’t know how [the model] reacts to that at all.”
In other words, the model may be great at finding run-of-the-mill strong lenses, but unique ones could slip by undetected. Mosby seconds that concern, suggesting that humans stay involved in searching until we can train machine-learning models on rarer lenses.
A related limitation is that astronomers’ ideas of what visually counts as a strong lens are subjective. Scientists must reach a consensus that tells models what to select for in the first place.
Although machine learning isn’t foolproof, its strength is speed. Tedious image classifications that take researchers months to do take models mere hours. With ever more data at our disposal, this study demonstrates that machine learning can leapfrog people to the next step: figuring out what strong lenses tell us about the universe.
“There are a lot of kinds of analyses that you can really only do with a lot of these [lenses],” Zaborowski says. “With that data, you can start answering the science questions.”