Folded and sealed with a dollop of red wax, the will of Catharuçia Savonario Rivoalti lay in Venice’s State Archives, unread, for more than six and a half centuries. Scholars don’t know why the document, written in 1351, was never opened. But to physicist Fauzia Albertin, the three-page document—six pages, folded—was the perfect thickness for an experiment.
Albertin, who now works at the Enrico Fermi Research Center in Italy, wanted to read the will without unsealing it. Her approach: X-ray vision. In a 2017 demonstration, Albertin and her team beamed X-rays at the document to photograph the text inside. Then, using algorithms, they digitally peeled apart the six pages to legibly reproduce handwritten words.
They haven’t figured out entirely what the document says. Rivoalti used an old form of Italian, which their archivist collaborators are still interpreting, says Albertin. (They did decipher part of it: In one passage, Rivoalti notes that her will is written on high-quality paper, possibly to remind the reader that she was wealthy.) But the technique should help historians study texts without damaging the physical objects themselves. “The only other way to read [the will] is to cut it open,” says Albertin.
Albertin is collaborating on a larger project known as the Time Machine, which aims to create a Google-like search engine spanning 2,000 years of European history. To do this, researchers plan to digitize and organize the archives of Europe’s cities into one database, says Frédéric Kaplan, a computer scientist at the École Polytechnique Fédérale de Lausanne, who leads the Time Machine collaboration. Eventually, Kaplan thinks that historians could scan libraries of closed tomes using Albertin’s X-ray techniques in a mostly automated process. They could then feed those scans to an AI-driven text recognition algorithm their team is developing, which would automatically enter the text into a database.
In its grandest form, Kaplan envisions, the Time Machine would include a maps function, where you could zoom in on the street view of a 19th-century Parisian neighborhood, for example. His team has high-quality aerial photographs of Paris during that era. To propagate the city further back in time, Kaplan thinks they can use AI, trained with historical urban planning information, to make educated guesses on how the street layouts evolved. Last month, the European Union awarded the team a million euros in seed funding to continue developing these methods, and the Time Machine is one of six scientific projects currently competing for a billion euros of European funding over the next decade.
But the Time Machine won’t just be flashy apps. Its huge database should allow historians to study societal patterns over longer time spans and geographical scales. The project is part of a recent trend in which more historians are trying to use data science to mine new information out of old texts. When historians propose projects for grant funding, “it’s almost a [requirement] that you make a database and do some network analysis,” says historian Johannes Preiser-Kapeller of the Austrian Academy of Sciences.
For example, Hilde De Weerdt, a historian at Leiden University, and her team have built a tool that automatically tags names, places, and times in digitized Chinese and Korean texts. They’ve designed the database so it can link up to map-plotting software, and they can more easily visualize how people and ideas migrate in space and time.
This data-based approach can offer a fresh perspective on the past. Traditionally, historians use narratives to understand the past and focus their study on “big men and big places,” says Preiser-Kapeller. This framework can lead to cherry-picking, where scholars highlight only the cases that support their narrative. Cherry-picking still happens often in historical scholarship, he says.
Data science guards against some of this subjectivity. “When you systematically collect evidence and put it into a database, you get away from cherry-picking,” says Preiser-Kapeller. It can shift historians away from dominant narratives. For example, Preiser-Kapeller has mapped networks to identify crucial players, unremarkable at first reading, in Byzantine Empire documents. “They were not the loudest people, but they were always in the background, connecting groups of people who might not otherwise be connected,” he says.
Even relatively simple data projects can yield new historical insights. Máirín MacCarron, a historian at the University of Sheffield, has manually entered the 600 characters in an 8th-century text, The Ecclesiastical History of the English People, into a giant Excel spreadsheet. She and her team have also recorded every single interaction between these characters. “We even have a category for postmortem interactions,” says MacCarron. “Because these are medieval religious texts, we have saints coming back and performing miracles.”
In particular, MacCarron is studying how women interact in the text. Conventional scholarship casts women as “peace-weavers”—preventing conflict by marrying a neighboring kingdom’s ruler. But network analysis can reveal more complex roles, says MacCarron. In some early analysis, she has found that of the 12 most socially connected characters in the text, three of them are women. “When you plot the interactions mathematically, you might see that a character has a network of connections that you missed from a normal reading,” says MacCarron.
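To make "plotting the interactions mathematically" concrete, here is a minimal sketch of one of the simplest network measures: counting how many distinct characters each character interacts with. It is not MacCarron's actual method or data. The names are invented placeholders, and `degree_ranking` is a hypothetical helper written only for this illustration.

```python
from collections import defaultdict

# Hypothetical interaction records: each pair means "these two characters
# interact at least once". The names are placeholders, not data from the text.
interactions = [
    ("A", "B"),
    ("A", "C"),
    ("A", "D"),
    ("B", "C"),
    ("D", "E"),
]


def degree_ranking(pairs):
    """Rank characters by how many distinct characters they interact with."""
    neighbors = defaultdict(set)
    for first, second in pairs:
        if first == second:
            continue  # ignore self-interactions
        neighbors[first].add(second)
        neighbors[second].add(first)
    # Sort by number of distinct connections (descending), then by name.
    return sorted(
        ((name, len(linked)) for name, linked in neighbors.items()),
        key=lambda item: (-item[1], item[0]),
    )


ranking = degree_ranking(interactions)
for name, degree in ranking:
    print(f"{name}: {degree} connections")
```

A count like this is only a starting point. It shows who is most connected, but not what the connections mean, which still takes a close reading of the text.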
Not all historians are convinced of the benefits of the approach. “When I started using it, I thought it would be much more powerful than it turned out to be,” says historian Michal Biran of the Hebrew University of Jerusalem, who has created a database and mapped social networks of the Mongol Empire during the 13th and 14th centuries. The visualizations look pretty in presentations, says Biran, but they take a long time to produce, and she hasn’t been able to eke out much scholarly information from them.
Biran’s difficulties may stem from quirks in her source material. Because very little Mongol writing has survived, she primarily studies documents written in the languages of Mongols’ imperial subjects. Often that means working through Chinese, Persian, and Russian texts. The characters have different names across languages, and often even multiple names within the same language—a man might be called by different honorifics at different points in his life, for example. It takes careful study to keep the names straight, which makes them difficult to sort digitally into neat boxes, says Biran.
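The naming problem can be pictured as a lookup table that a historian has to build by hand before any network can be drawn. The sketch below is an assumption about how such a table might look, not a description of Biran's database. The identifiers, the placeholder names, and the `build_lookup` and `resolve` helpers are all hypothetical.

```python
# Hypothetical alias table: every variant spelling, translation, or honorific
# is listed under one canonical identifier. All entries are placeholders.
ALIASES = {
    "person_1": ["Name One", "Name-One", "Honorific One"],
    "person_2": ["Name Two", "Title Two"],
}


def build_lookup(aliases):
    """Map each normalized variant to its canonical identifier."""
    lookup = {}
    for canonical_id, variants in aliases.items():
        for variant in variants:
            key = variant.casefold().strip()
            if key in lookup and lookup[key] != canonical_id:
                # The same name pointing at two people needs a human decision.
                raise ValueError(f"ambiguous alias: {variant!r}")
            lookup[key] = canonical_id
    return lookup


def resolve(name, lookup):
    """Return the canonical identifier, or None if the name is unknown."""
    return lookup.get(name.casefold().strip())


LOOKUP = build_lookup(ALIASES)
print(resolve("  honorific one ", LOOKUP))
print(resolve("Unlisted Name", LOOKUP))
```

The code itself is trivial. The hard part is filling in the table, because deciding that two names in two languages refer to the same person is exactly the careful study Biran describes.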
But even if a source’s text translates easily to digital information, De Weerdt points out that data analysis still can’t stand on its own. Ultimately, historical studies are based on texts, and “the more you implement mathematical processes, the further you get from the text,” she says. To really understand the subtleties of primary documents, you still need the specialized expertise of conventional historians.
“We’re rarely definitive in history, as a general rule,” says MacCarron. “Everything is subjective. Every source reveals, to some extent, the bias of the writer.” But by pooling together historical sources and analyzing them in parallel, maybe you can average out some of those biases—and get closer to the truth.