How an Anonymous 4chan Post Helped Solve a 25-Year-Old Math Puzzle

link to original article

In September 16, 2011, an anime fan posted a math question to the online bulletin board 4chan about the cult classic television series The Melancholy of Haruhi Suzumiya. Season one of the show, which involves time travel, had originally aired in nonchronological order, and a re-broadcast and a DVD version had each further rearranged the episodes. Fans were arguing online about the best order to watch the episodes, and the 4chan poster wondered: If viewers wanted to see the series in every possible order, what is the shortest list of episodes they’d have to watch?

In less than an hour, an anonymous person offered an answer — not a complete solution, but a lower bound on the number of episodes required. The argument, which covered series with any number of episodes, showed that for the 14-episode first season of Haruhi, viewers would have to watch at least 93,884,313,611 episodes to see all possible orderings. “Please look over [the proof] for any loopholes I might have missed,” the anonymous poster wrote.

The proof slipped under the radar of the mathematics community for seven years — apparently only one professional mathematician spotted it at the time, and he didn’t check it carefully. But in a plot twist last month, the Australian science fiction novelist Greg Egan proved a new upper bound on the number of episodes required. Egan’s discovery renewed interest in the problem and drew attention to the lower bound posted anonymously in 2011. Both proofs are now being hailed as significant advances on a puzzle mathematicians have been studying for at least 25 years.

Mathematicians quickly verified Egan’s upper bound, which, like the lower bound, applies to series of any length. Then Robin Houston, a mathematician at the data visualization firm Kiln, and Jay Pantone of Marquette University in Milwaukee independently verified the work of the anonymous 4chan poster. “It took a lot of work to try to figure out whether or not it was correct,” Pantone said, since the key ideas hadn’t been expressed particularly clearly.

Now, Houston and Pantone, joined by Vince Vatter of the University of Florida in Gainesville, have written up the formal argument. In their paper, they list the first author as “Anonymous 4chan Poster.”

Permutation Cities

If a television series has just three episodes, there are six possible orders in which to view them: 123, 132, 213, 231, 312 and 321. You could string these six sequences together to give a list of 18 episodes that includes every ordering, but there’s a much more efficient way to do it: 123121321. A sequence like this one that contains every possible rearrangement (or permutation) of a collection of n symbols is called a “superpermutation.”

In 1993, Daniel Ashlock and Jenett Tillotson observed that if you look at the shortest superpermutations for different values of n, a pattern quickly seems to emerge involving factorials — those numbers, written in the form n!, that involve multiplying together all the numbers up to n (for example, 4! = 4 × 3 × 2 × 1).

If your series has just one episode, the shortest superpermutation has length 1! (also known as plain old 1). For a two-episode series, the shortest superpermutation (121) has length 2! + 1!. For three episodes (the example we looked at above), the length works out to 3! + 2! + 1!, and for four episodes (123412314231243121342132413214321), it is 4! + 3! + 2! + 1!. The factorial rule for superpermutations became conventional wisdom (even though no one could prove it was true for every value of n), and mathematicians later confirmed it for n = 5.

Then in 2014, Houston startled mathematicians by showing that for n = 6, the pattern breaks down. The factorial rule predicts that to watch six episodes in every possible order should require 873 episodes, but Houston found a way to do it in 872. And since there is a simple way to convert a short superpermutation on n symbols into a short superpermutation on n + 1 symbols, Houston’s example meant that the factorial rule fails for every value of n above 6 too.

Houston’s construction works by translating the superpermutation problem into the famous traveling salesman problem, which looks for the shortest route through a collection of cities. More specifically, superpermutations are connected to the “asymmetric” traveling salesman problem, in which each path between two cities has a cost (which is not necessarily the same in both directions), and the goal is to find the least expensive route through all the cities.

The translation is simple: Think of each permutation as a “city” and imagine a path from each permutation to each other permutation. In the superpermutation problem, we want the shortest possible sequence of digits that lists all the permutations, so the goal is to travel through the permutations in a way that adds as few digits to the starting permutation as possible. So we declare the cost of each path to be simply the number of digits we have to attach to the end of the first permutation to get the second one. In the n = 3 example, for instance, the path from 231 to 312 costs $1, since we just have to add a 2 to the end of 231 to get 312, while the path from 231 to 132 costs $2, since we have to add a 32. With this setup, the least-expensive path through the cities corresponds directly to the shortest superpermutation.

Lucy Reading-Ikkanda/Quanta Magazine

This translation meant that Houston could turn the power of traveling salesman algorithms on the superpermutation problem. The traveling salesman problem is famous as an NP-hard problem, meaning that there’s no efficient algorithm that can solve all cases of it. But there are algorithms that can solve some cases efficiently, and other algorithms that produce good approximate solutions. Houston used one of the latter to produce his 872-digit superpermutation.

Since he produced only an approximate solution, it might not be the very best superpermutation. Mathematicians are now conducting a giant computer search for the shortest superpermutation on six symbols, Pantone said. “We know our search will finish in finite time, but don’t know if that’s one week or a million years,” he said. “There’s no progress bar.”

The Wrong Order

By the time of Houston’s work, the anonymous 4chan post had been sitting in its corner of the internet for nearly three years. One mathematician, Nathaniel Johnston of Mount Allison University, had noticed a copy of the post on a different website a few days after it was posted — not because he was an anime fan, but because he had typed an assortment of superpermutation-related search terms into Google.

Johnston read the argument and thought it seemed plausible, but he didn’t invest much effort in checking it carefully. At the time, mathematicians believed that the factorial formula for superpermutations was probably correct, and when you think you know the exact answer to a question, a lower bound isn’t very interesting. In other words, the superpermutation research episodes were playing out of order.

Johnston later mentioned the lower bound on a couple of websites, but “I don’t think anyone really gave it any particular heed,” Houston said.

Then on September 26 of this year, the mathematician John Baez of the University of California, Riverside, posted on Twitter about Houston’s 2014 finding, as part of a series of tweets about apparent mathematical patterns that fail. His tweet caught the eye of Egan, who was a mathematics major decades ago, before he launched an award-winning career as a science fiction novelist (his breakthrough 1994 novel, in a happy coincidence, was called Permutation City). “I’ve never stopped being interested in [mathematics],” Egan wrote by email.

Egan wondered if it was possible to construct superpermutations even shorter than Houston’s. He scoured the literature for papers on how to construct short paths through permutation networks, and after a few weeks found exactly what he needed. Within a day or two, he had come up with a new upper bound on the length of the shortest superpermutation for n symbols: n! + (n – 1)! + (n – 2)! + (n – 3)! + n – 3. It’s similar to the old factorial formula, but with many terms removed.

“It absolutely smashed the [previous] upper bound,” Houston said.

The anonymous 4chan poster’s lower bound, meanwhile, was tantalizingly close to the new upper bound: It works out to n! + (n – 1)! + (n – 2)! + n – 3. When Egan’s result became public, Johnston reminded other mathematicians about the anonymous poster’s proof, and Houston and Pantone soon showed it was correct. As with Houston’s work, the new lower and upper bounds both come at superpermutations via the traveling salesman problem: The lower bound shows that a route through all the cities must travel along some minimum number of paths that cost more than $1, while the upper bound constructs a specific route for each n that uses only $1 and $2 connections.

For Haruhi fans, Egan’s construction gives explicit instructions for how to watch all possible orderings of season one in just 93,924,230,411 episodes. Viewers could start binge-watching immediately, or they could wait and see whether mathematicians can whittle this number down. The anonymous poster’s lower bound proves that this whittling down couldn’t possibly save more than about 40 million episodes — but that’s enough to get a nice start on season two.

Original story reprinted with permission from Quanta Magazine, an editorially independent publication of the Simons Foundation whose mission is to enhance public understanding of science by covering research developments and trends in mathematics and the physical and life sciences.


More Great WIRED Stories