Deep Learning Powers Drug Discovery for Rare Diseases


Cancer, diabetes, heart disease. These diseases attract a ton of research effort and funding, and for good reason: They afflict tens of millions of people each year.

But there are about 7,000 known rare diseases that rarely get attention. Also called “orphan” diseases, these conditions collectively affect about 400 million people worldwide. The drug industry historically neglected them because it could not justify the cost of developing drugs for so few patients per disease.

That’s slowly changing, thanks to new funding options as well as new drug discovery methods.

Salt Lake City-based Recursion Pharmaceuticals focuses on drug discovery across several therapeutic areas, including hundreds of rare diseases that currently lack treatments. One example is Sandhoff disease, an inherited, often-fatal disorder that destroys neurons in an infant’s brain and spinal cord. The condition affects fewer than 1 in 100,000 people in Europe.

A member of the NVIDIA Inception virtual accelerator program, Recursion is using deep learning to analyze biological images. The startup, which has raised more than $85 million in venture funding, aims to discover 100 new treatments by 2025.

Recursion has a laboratory with robotic arms conducting around 100,000 miniature experiments each week. These lab experiments create about 2 million high-resolution biological images weekly.

“A single human being couldn’t look at all of that,” said Lina Nilsson, Recursion’s senior director of data science product.

Deploying more than a hundred GPUs, Recursion’s researchers train dozens of neural networks on terabytes of data each week. The company runs 250 terabytes of data through its machine learning algorithms each month to identify promising drug candidates for rare conditions like hereditary hemorrhagic telangiectasia, a genetic disorder that causes malformed blood vessels and potential hemorrhaging.

“Normally companies work on one disease and one hypothesis, one potential drug at a time,” Nilsson said. “We’re multiplexing it, screening hundreds of different diseases at a time, thousands of drug molecules.”

The Big Picture: Drug Discovery with Biological Images

When a cell is diseased, it looks structurally different from healthy ones. So one way to determine if a drug candidate is effective is to add the compound to a diseased cell and observe what happens.

If the cell structure changes to more closely resemble the healthy cells, the drug candidate is promising. But these morphological changes to the cell can be extremely subtle to detect, and researchers want to analyze as many drug candidates as possible, as quickly as possible.
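The idea of a treated cell moving back toward a healthy appearance can be made concrete with a small sketch. The Python below is an illustration only, not Recursion's method: it assumes each cell population has already been summarized as a list of numeric morphology features, and the function name `rescue_score` and the example values are hypothetical. It measures how far a treated profile has moved along the line from the diseased profile to the healthy one.

```python
def rescue_score(healthy, diseased, treated):
    """Fraction of the diseased-to-healthy shift recovered by treatment.

    All three arguments are equal-length lists of morphology features
    (hypothetical values, e.g. nucleus area or texture measurements).
    Returns about 0.0 if the treated cells still look diseased and about
    1.0 if they match the healthy profile along the disease axis.
    """
    # Direction in feature space that separates healthy from diseased cells.
    axis = [h - d for h, d in zip(healthy, diseased)]
    # How the treated cells moved relative to the diseased starting point.
    shift = [t - d for t, d in zip(treated, diseased)]

    axis_len_sq = sum(a * a for a in axis)
    if axis_len_sq == 0:
        raise ValueError("healthy and diseased profiles are identical")

    # Scalar projection of the treatment shift onto the disease axis.
    return sum(a * s for a, s in zip(axis, shift)) / axis_len_sq


# Made-up two-feature profiles: the treated cells sit halfway between
# the diseased and healthy profiles.
score = rescue_score(healthy=[1.0, 1.0], diseased=[0.0, 0.0], treated=[0.5, 0.5])
print(score)
```

A score like this ignores changes that are perpendicular to the disease axis, which is one reason a real screen would look at many features and possible side effects rather than a single number.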

In the traditional drug discovery process, scientists focus on one disease at a time, performing research over many years. With this approach, it can cost more than $2 billion to discover a new drug and bring it to market.

“That’s a very sequential approach that is slow and hampered by the world’s limited understanding of biology,” said Mason Victors, the startup’s chief technology officer. “The platform we’ve built here allows us to probe many diseases that would be almost impossible to go after in this sequential, target-driven approach.”

Scientists generally design experiments to measure one or a few features from microscopy images of cells, testing a limited number of hypotheses per experiment. With a deep learning algorithm built on convolutional neural networks (CNNs), in contrast, Recursion can analyze hundreds of features from more than 10 million cells in a week and test dozens of hypotheses at once.
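For readers unfamiliar with the building block behind a CNN, the following Python sketch shows a single convolution filter sliding over a tiny image and producing one feature. It is a toy illustration under stated assumptions, not Recursion's model: the 3x4 "image", the hand-written edge filter, and the helper names `conv2d` and `max_feature` are all hypothetical, and a real network learns many such filters from data instead of using a fixed one.

```python
def conv2d(image, kernel):
    """Slide `kernel` over `image` with no padding (the cross-correlation used in CNN layers)."""
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = len(image) - k_rows + 1
    out_cols = len(image[0]) - k_cols + 1
    output = []
    for i in range(out_rows):
        row = []
        for j in range(out_cols):
            # Weighted sum of the image patch under the kernel.
            total = 0
            for di in range(k_rows):
                for dj in range(k_cols):
                    total += image[i + di][j + dj] * kernel[di][dj]
            row.append(total)
        output.append(row)
    return output


def max_feature(feature_map):
    """Apply ReLU (clip negatives to zero), then take the strongest response."""
    return max(max(0, value) for row in feature_map for value in row)


# Toy image: dark on the left (0), bright on the right (1).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Hand-written filter that responds to a dark-to-bright vertical edge.
edge_kernel = [
    [-1, 1],
    [-1, 1],
]

feature_map = conv2d(image, edge_kernel)
print(feature_map)               # [[0, 2, 0], [0, 2, 0]]
print(max_feature(feature_map))  # 2
```

The filter responds only where the brightness changes, which is the sense in which a convolutional layer turns raw pixels into features. Stacking many learned filters is what lets a network track hundreds of features per image.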

The team relies on a large cluster of NVIDIA GPUs for both training and inference. Multi-GPU training with NVIDIA V100 Tensor Core GPUs and NVLink allowed them “to accelerate from a single researcher taking a couple days to train a network to being able to do that in a matter of hours,” Victors said.

Deep learning helps the researchers look at hundreds of cell features and diseases at a time, allowing them to quickly pursue new therapeutic areas of interest and look at drug compounds in previously unexplored fields.

These tools also shed light on how a drug compound might interact with other cells in the body, and what potential toxicities it may have — like damaging the liver or causing arrhythmias.

While Recursion primarily focuses on in-house drug development, it also works with pharmaceutical companies that want to screen a large set of compounds.

The company’s pipeline includes compounds tested against more than 80 disease models. One compound has been cleared by the FDA for a Phase 1 clinical trial and is now being tested in humans, while several additional compounds have been selected by Recursion’s pharmaceutical partners for further development.
