In 2007, I spent the summer before my junior year of college removing little bits of brain from rats, growing them in tiny plastic dishes, and poring over the neurons in each one. For three months, I spent three or four hours a day, five or six days a week, in a small room, peering through a microscope and snapping photos of the brain cells. The room was pitch black, save for the green glow emitted by the neurons.
I was looking to see whether a certain growth factor could protect the neurons from degenerating the way they do in patients with Parkinson’s disease. This kind of work, which is common in neuroscience research, requires time and a borderline pathological attention to detail. Which is precisely why my PI trained me, a lowly undergrad, to do it—just as, decades earlier, someone had trained him.
Now, researchers think they can train machines to do that grunt work.
In a study described in the latest issue of the journal Cell, scientists led by Gladstone Institutes and UC San Francisco neuroscientist Steven Finkbeiner collaborated with researchers at Google to train a machine learning algorithm to analyze neuronal cells in culture.
The researchers used a method called deep learning, the machine learning technique driving advancements not just at Google, but Amazon, Facebook, Microsoft. You know, the usual suspects. It relies on pattern recognition: Feed the system enough training data—whether it’s pictures of animals, moves from expert players of the board game Go, or photographs of cultured brain cells—and it can learn to identify cats, trounce the world’s best board-game players, or suss out the morphological features of neurons.
Two of the most difficult things about training an AI in this fashion are generating a sufficiently large dataset and getting people to annotate that dataset. Fortunately, most neuroscience labs have an abundance of cell cultures to convert into training data (Finkbeiner’s lab, which has automated various other parts of the microscopy process, already produces more images than it can analyze), and plenty of lab hands to label that data for training purposes.
“Basically it came down to having a lot of summer students, graduate students, and postdocs do manual annotation, to feed into the computer,” says molecular neuroscientist Margaret Sutherland, program director at the National Institute of Neurological Disorders and Stroke, which helped fund the study. (Even with AI in the picture, students and postdocs always seem to draw the short straw.)
Finkbeiner’s team developed a deep neural network and trained it on images of cells with and without fluorescent tags. These glowing probes are helpful for distinguishing between cell types, and can make it easier to tell where the body of a neuron ends and where its axons and dendrites—the projections that carry electrochemical impulses to and from other neurons—begin. But many labeling methods can also damage the very cells you’re trying to observe. With training, however, the researchers’ algorithm was able to identify specific types of brain cells in images it had never seen before. It could also distinguish dead cells from live ones, locate a cell’s nucleus, and differentiate between axons and dendrites—all without the aid of fluorescent labels. Finkbeiner and his team call their machine-learning approach in silico labeling, or ISL for short.
Because analyzing the cells doesn’t require the addition of fixatives or fluorescent dyes, ISL could be more consistent and less harmful to cultures than traditional methods, and could enable longer-term monitoring of cellular health. And since humans are only required to train the algorithm, the approach could provide researchers a way to analyze hordes of data without conscripting an army of lab technicians to toil away at microscopes in the dark.
That could be great news for biomedical researchers, whether they work in a well-funded lab at a major research university or a tiny startup. “Techniques like this tend to have a democratizing effect,” says computational biologist Molly Maleckar, director of mathematical modeling at the Allen Institute for Cell Science. Together with her colleagues, Maleckar, who was unaffiliated with Finkbeiner’s study, has explored similar label-free machine learning techniques for identifying subcellular structures. By combining machine learning approaches, she says, smaller biomedical research outfits could accelerate every step of the drug discovery process. “If you understand the limitations of your algorithm and make a clear point of understanding how you can interpret and improve its performance, you don’t need so many humans collecting and analyzing large amounts of data.”
Of course, you’ll still need humans to train the algorithms. For that, there will always be summer interns.