The power of small numbers

There is a revolution underway at the interface between biology and computer science. We are becoming much quicker at analysing increasing numbers of individual human cells, which enables us to sort and profile these cells. It allows us to gain an understanding of the types of cells that form a tumour and target efforts to combat them. This is very welcome news, since current therapies are insufficiently targeted, making them unnecessarily aggressive. Marcel Reinders, professor at TU Delft, conducts research in bioinformatics and explains the developments in single cell data analysis. Because the power is in the small numbers.

In the field of bioinformatics, we get to see increasing amounts of raw data. We cluster it, make sense of it and make predictions. In the future, we hope to be able to use this data to build a model of a cell; possibly ultimately of an entire tissue.

How did you end up in the field of bioinformatics?

“I was conducting research in image processing, compressing video images, when TU Delft started several major research projects around 1997. One project involved making a protein chip as a follow-up to the micro-array, an RNA chip that you can use to measure the expression of 20,000 genes simultaneously. The people running the project thought that if you could achieve it with RNA, it should also be possible for proteins. That's much more interesting, because proteins are the chemicals in the cell that do the real work. The researchers were then looking for someone to interpret the data. They started making the chip while I worked on analysing the data.”

“When we started work on the protein chip project, there was a lot we didn't know. Although I knew a lot about machine learning, no one knew about applying it to this new type of data. For every micro-array, you get information from 20,000 genes. Based on these measurements, you need to discover what role each individual gene plays, for example whether it is involved in the formation of cancer. That development, finding your way through a large quantity of biological data, is ultimately what became the field of bioinformatics.”

How far have we advanced over the years?

“We now know that bulk material, from a tumour biopsy, for example, is heterogeneous. A tumour consists of several types of cells, each of which has evolved in its own way. It is actually several tumours. This is also why many of the current therapies for cancer are ineffective: 99% of the tumour may die, but the odd clone or individual group continues to survive. As a result, the tumour returns, often worse than before. This heterogeneity therefore stands in the way of effective treatment. Around five years ago, researchers began looking at each cell individually, rather than the bulk.”

How does that work?

“We can now sequence DNA: cutting it into pieces and analysing it directly. It took more than ten years to sequence the human genome and now you can do the same thing within a single day for a fraction of the cost. The resolution of the equipment used is also sufficient for small quantities. You no longer need the bulk in order to measure DNA, but can do it for each individual cell. The trick is to obtain single cells from the bulk. Drop-seq is the method generally used for this: you allow the cell to run through several narrow channels, isolating it. You can then stick a barcode onto the DNA in the cell and put the DNA from that cell back into a pool of several cells, which you go on to sequence. Because you have marked the DNA of each cell, you can trace the measurements back to the individual cell. Since the introduction of Drop-seq, two or three years ago, things have moved very fast. Initially, we could only distinguish a few cells in a pool, but it's now already possible to pool millions of cells. In the years ahead, the numbers will only increase.”
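
The barcoding idea described above can be illustrated with a minimal Python sketch. It is a simplification for illustration only and not the actual Drop-seq pipeline: it assumes every read in the pool starts with a fixed-length barcode that identifies its cell of origin. The function name `demultiplex`, the barcode length of 4 and the toy reads are all hypothetical.

```python
from collections import defaultdict

# Hypothetical barcode length, kept short so the toy reads stay legible.
BARCODE_LENGTH = 4


def demultiplex(pooled_reads, barcode_length=BARCODE_LENGTH):
    """Group pooled reads by the barcode prefix that marks their cell of origin."""
    reads_per_cell = defaultdict(list)
    for read in pooled_reads:
        # Assumption: the barcode is the first `barcode_length` characters.
        barcode = read[:barcode_length]
        sequence = read[barcode_length:]
        reads_per_cell[barcode].append(sequence)
    return dict(reads_per_cell)


# Toy pool: reads from two cells, mixed together before sequencing.
pooled = ["AAAAGATTACA", "CCCCTTGACCA", "AAAACGTACGT", "CCCCGGGTTTA"]
cells = demultiplex(pooled)

for barcode, sequences in cells.items():
    print(barcode, sequences)
```

Although the reads are sequenced as one pool, the barcode lets each measurement be assigned back to a single cell, which is the point made in the answer above.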

What does the data look like after merging these cells and what can we do with it?

“We can cluster the cells visually based on the profile measured, for example by giving them a colour code. In this way, researchers have discovered that there are more than 140 different cell types in blood: many more than we had previously thought. Biological insights of this kind will help us to advance, for example in identifying rare cell types that play an essential role in the body's defences.”
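
As a rough illustration of clustering cells by their measured profile and giving each cluster a colour code, here is a small Python sketch. It uses plain k-means as a stand-in, which is an assumption: the interview does not name the clustering method used. The expression values, the number of clusters and the colour names are invented for the example.

```python
import math


def nearest(profile, centres):
    """Return the index of the centre closest to this cell's profile."""
    distances = [math.dist(profile, centre) for centre in centres]
    return distances.index(min(distances))


def kmeans(profiles, k, iterations=10):
    """Very small k-means: returns one cluster label per cell."""
    # Naive initialisation (assumption): use the first k cells as centres.
    centres = [list(p) for p in profiles[:k]]
    labels = [0] * len(profiles)
    for _ in range(iterations):
        labels = [nearest(p, centres) for p in profiles]
        for j in range(k):
            members = [p for p, label in zip(profiles, labels) if label == j]
            if members:
                # New centre is the per-gene mean of the cluster's members.
                centres[j] = [sum(v) / len(members) for v in zip(*members)]
    return labels


# Toy expression profiles: 6 cells, 3 genes each (invented values).
profiles = [
    [9.0, 1.0, 0.5],
    [0.5, 8.0, 7.5],
    [8.5, 0.8, 0.3],
    [0.7, 7.5, 8.2],
    [9.2, 1.2, 0.4],
    [0.4, 8.3, 7.9],
]

labels = kmeans(profiles, k=2)
colours = ["red", "blue"]  # hypothetical colour code per cluster
colour_per_cell = [colours[label] for label in labels]
print(colour_per_cell)
```

Real single cell data has thousands of genes per cell, so in practice the profiles are usually reduced to two dimensions before the colour-coded clusters are plotted.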

Returning to the example of the tumour: how will cancer patients benefit from this new insight?

“If you can screen the individual cells from a tumour biopsy completely, you can identify exactly which types of cell there are in the tumour. You can then adjust your therapy accordingly. If you can analyse the tumour completely, you also know the cocktail of drugs that will be most effective. But it's not simply a case of ‘grabbing the drugs, mixing them together and that's it’: there is also the risk of toxicity when using several drugs.”

How do you envisage the future, in view of all these developments?

“I believe that in some ten years' time, single cell data will enable us to gain a greater understanding of pathological processes and large parts of the body. There are already some major initiatives, such as The Cancer Genome Atlas, bringing together all data about cancer. The entire human brain has already been mapped using this method. The next step is the whole human body. This helps us to learn what the body is, what it's made up of, what can go wrong and what possibilities there are to intervene. Ways to intervene are also in development, a major example of which is CRISPR-Cas: it can be used to make very targeted changes to genetic material. So if you know that you have a mutation that increases your chance of breast cancer, you don't need to remove the breast, but can have the cells genetically repaired. The technology is there and we can use it to improve health. We mustn't close our eyes to the potential and shout that we shouldn't be using it. It will happen in the end anyway, so it's better if we approach it positively.”

Text: Koen Scheerders | Photo: Mark Prins | October 2017