In the future, it will not be a human with years of scientific training who identifies fossil remains.
It will be a machine.
At least, that is the vision of the new Endless Forams online image database, which uses a kind of facial recognition software to identify different species of foraminifera—microscopic marine organisms that come in a multitude of shapes—in a sample based on the creatures’ skeletal remains.
“A lot of experts are lower than the machine accuracies,” said computational paleobiologist Allison Hsiang, the database’s architect. “On average, the human accuracies are something like 70% to 75%, and these are experts in the field. Our model in this case was 87%.”
Eos asked Hsiang if the new software will one day replace humans who are experts at identifying foram species.
“I don’t think so,” she said, laughing.
Eos then asked foram expert and micropaleontologist Michal Kucera of the University of Bremen the same question.
“Yes,” he said.
Kucera, who was not involved in Endless Forams’ creation, explained that the software cannot yet replace the work of experts. The forams in the database do not capture the full diversity of forams living in the Atlantic Ocean, where the database’s species come from. “The bulk of the images are not taken in a way that represents the full diversity of specimens in a sample,” he said.
But it is only a matter of time, he added, before software like Endless Forams overcomes that hurdle and becomes as skilled as the telescreens in George Orwell’s dystopian novel Nineteen Eighty-Four, which kept a close watch on everyone in society, or today’s social media platforms, which do the same thing.
Rise of the Machine
Kucera thinks that, rather than putting experts like himself out of a job, Endless Forams will be a boon for his and others’ research projects. Because the software can identify forams more accurately than experts can, researchers like him will be able to work with more accurate data sets than ever before.
To make this possible, Hsiang, who is currently at the Ludwig Maximilian University of Munich, trained the software by showing it about 24,000 photographs of forams that four independent experts had identified beforehand. The trained program can then scan samples and identify the forams it thinks it recognizes. Hsiang and her team reported on their work in June in Paleoceanography and Paleoclimatology.
The forams they cataloged can be treasure troves of information on modern and ancient oceans because their shells record information about, among other things, the chemical composition of the oceans. Forams thus serve as one of the main lines of evidence that allow researchers to infer what Earth’s climate was like in the deep past.
There are about 4,000 modern species alone, and Endless Forams accounts for about 50 of those species. It can be hard and time-consuming for a scientist to spot and identify the individuals that they need for their work. With the database, Hsiang explained, researchers can not only train themselves to identify species but also use the software to quickly identify which forams they want to analyze.
And Endless Forams, more than any other foram database, also contains a huge variety of photographs of individual species, like Globigerina bulloides, which has over 2,000 photographs on the website. Such a resource will let scientists survey the full range of shapes that a single species can have—and because shape often defines what a species is, having access to all the shapes that a single species can take on will help researchers tackle one of the most fundamental questions in biology: What is a species, exactly?
It pays to keep a close watch on every foram, it seems.
“You can congratulate the authors,” Kucera said. “They put a lot of work into this.”
—Lucas Joel, Freelance Journalist