Recently I’ve been thinking about applying deep learning techniques to the problem of CT image segmentation. Various deep learning methods have already been used for segmentation of 2D images (see, e.g., Jonathan Long’s work and also some older work on combining graph-based approaches with convolutional nets) and there’s even been some work on 3D segmentation.
There are two main issues:
- 3D imaging datasets are large and require a lot of computing power for deep learning to work.
- Training deep architectures typically needs a lot of labelled/segmented training data.
The first issue is less of a problem nowadays than it was when deep learning started becoming popular a few years ago. It’s pretty common now to train on datasets of millions of images.
The second issue is the main sticking point. It’s hard to obtain CT data. Forget about scanning people just to collect data: because of the radiation hazard, CT scans are only done when absolutely needed, and volunteers are not accepted. There are plenty of completed CT scans out there, of course, but radiology centers usually don’t want to share them for obvious reasons. Manually segmented data has the further issue that we don’t know whether the segmentation matches the ‘ground truth’, and producing it is time-consuming (and expensive if you’re hiring people to do it). You could do CT scans on phantoms (these are physical ‘dummies’ made out of materials designed to mimic human tissue) but doing so is expensive because time on CT machines is expensive, and the phantoms are time-consuming to set up and scan. Plus, phantoms don’t accurately reflect the real-world anatomical geometries we’re interested in. Often they’re nothing more than a box of material with some cylindrical and spherical shapes inserted or drilled in.
It seems like the easiest way to attack the data problem is through simulated CT imaging. The idea is that you set up a simulated geometry and a simulated CT machine, and produce entirely computer-generated CT scans, all in software. If one can get this to work and accurately simulate how a CT machine operates, one could produce an endless supply of training data virtually for free. Even better, 3D geometries of human organs are available and can be deformed and modified ad infinitum, and in every scan we know exactly what the ground truth is without having to go through expensive manual segmentation.
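To make the idea concrete, here is a minimal 2D sketch of "known geometry in, raw scanner data out". Everything in it is an assumption for illustration: the two-shape phantom, the attenuation values in `MU` (made-up numbers, not calibrated tissue values), the idealised parallel-beam geometry, and the helper names `make_label_phantom` and `forward_project`. It ignores nearly all of the real physics and only shows that the label map we start from doubles as the segmentation ground truth.

```python
import numpy as np

def make_label_phantom(n=64):
    """Integer label map: 0 = air, 1 = soft tissue, 2 = dense insert."""
    y, x = np.mgrid[-1:1:n * 1j, -1:1:n * 1j]
    labels = np.zeros((n, n), dtype=np.int64)
    labels[(x / 0.8) ** 2 + (y / 0.6) ** 2 <= 1.0] = 1        # body ellipse
    labels[(x - 0.3) ** 2 + (y + 0.1) ** 2 <= 0.15 ** 2] = 2  # small disc
    return labels

# Attenuation per pixel for each label. Illustrative values only.
MU = np.array([0.0, 0.02, 0.05])

def forward_project(mu, angles_deg):
    """Idealised parallel-beam line integrals (nearest-neighbour sampling).

    Returns a sinogram of shape (number of angles, number of detector bins).
    """
    n = mu.shape[0]
    c = (n - 1) / 2.0
    # s indexes the detector bin, t runs along each ray.
    s, t = np.meshgrid(np.arange(n) - c, np.arange(n) - c, indexing="ij")
    sino = np.zeros((len(angles_deg), n))
    for i, a in enumerate(np.deg2rad(angles_deg)):
        # Rotate the sampling grid about the image centre.
        x = s * np.cos(a) - t * np.sin(a) + c
        y = s * np.sin(a) + t * np.cos(a) + c
        xi = np.rint(x).astype(int)
        yi = np.rint(y).astype(int)
        inside = (xi >= 0) & (xi < n) & (yi >= 0) & (yi < n)
        vals = np.zeros_like(x)
        vals[inside] = mu[yi[inside], xi[inside]]
        sino[i] = vals.sum(axis=1)  # sum along each ray
    return sino

labels = make_label_phantom(64)          # ground-truth segmentation
mu = MU[labels]                          # attenuation image
sino = forward_project(mu, [0, 45, 90])  # simulated raw data
```

Deforming the shapes in `make_label_phantom` gives a new scan and a new, exact label map at the same time, which is the property that makes simulation attractive here.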
The problems with simulating CT are that:
- The physics are non-trivial.
- Machine design parameters can be assumed to be proprietary and unavailable.
- Reconstruction algorithms are proprietary and unavailable.
In a basic CT machine setup, the radiologist/researcher doesn’t even get to see the data produced by the machine (the sinogram). All they see is the final ‘reconstruction’, and it’s unknown what process the machine went through to actually produce that reconstruction.
A few studies have looked at generating CT data. There are a number of different kinds of methods:
- There are methods that go directly from a 3D model to a simulated CT scan.
- Other methods use a high-quality (high dose) CT scan as input and generate simulated scans of lower quality (lower dose).
- Still other methods take scans from other forms of tomography (e.g. MRI) and transform them into a simulated CT scan.
The second and third kinds of method might at first seem to be of limited use (“Why do I need to generate a CT scan if I already have one?”) but they’re actually pretty useful for deep net training purposes. High-dose CTs have high enough contrast and are easy enough to segment that we don’t really need ML methods to do it. So we can take a small amount of high-quality data, segment it, and generate a large amount of lower-quality training data from it (data magnification). Now you might ask, “Can’t you just take existing data and add Gaussian noise to it?” The answer is no, it’s not that simple. Low-dose CT data isn’t merely high-dose CT data with some noise added; lower SNR in the input x-ray data affects the reconstruction in interesting and non-obvious ways. The reconstruction processes aren’t simply linear transformations of the input data.
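One way to see part of the reason, under a simplified model that is an assumption here rather than a description of any real scanner: treat each ray as monoenergetic, with $I_0$ incident photons, a line integral of attenuation $p$ along the ray, and a detector that counts $N$ photons with Poisson statistics. The scanner then estimates $p$ by taking a logarithm, and a first-order (delta method) approximation gives the variance of that estimate:

```latex
N \sim \mathrm{Poisson}\!\left(I_0\, e^{-p}\right), \qquad
\hat{p} = -\ln\frac{N}{I_0}, \qquad
\operatorname{Var}[\hat{p}] \approx \frac{1}{\mathbb{E}[N]} = \frac{e^{p}}{I_0}
```

In this model the noise is not a fixed-width Gaussian: it grows as the dose $I_0$ drops, and it is larger for rays that pass through more material (larger $p$). All of this happens in the measured data, before any reconstruction algorithm has run, so adding uniform noise to a finished image would not reproduce it.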
Those methods are also useful in conjunction with methods of the first kind. In the limit of infinite resolution and infinite dose, a CT scan represents the input geometry exactly. Thus we can ‘generate’ a synthetic ‘high-dose’ CT directly from the geometry and then gradually transform it into CT scans of lower and lower doses.
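A rough sketch of that "lower and lower doses" step, working on the raw line integrals rather than on the finished image. The assumptions are mine: a monoenergetic Poisson photon-counting model, a made-up ramp standing in for a noiseless sinogram, and the hypothetical helper name `simulate_dose`. Real low-dose simulation methods model considerably more than this (detector electronics, beam spectrum, scatter), and each noisy sinogram would still need to go through a reconstruction algorithm to become an image.

```python
import numpy as np

def simulate_dose(line_integrals, photons_per_ray, rng):
    """Noisy line integrals under an assumed Poisson counting model."""
    # Expected detected counts per ray (Beer-Lambert attenuation).
    expected = photons_per_ray * np.exp(-line_integrals)
    counts = rng.poisson(expected)
    # Clamp to one photon so the logarithm stays finite on starved rays.
    counts = np.maximum(counts, 1)
    return -np.log(counts / photons_per_ray)

rng = np.random.default_rng(0)

# Stand-in for a noiseless ("infinite dose") sinogram:
# 90 views by 64 detector bins, line integrals ramping from 0 to 3.
clean = np.tile(np.linspace(0.0, 3.0, 64), (90, 1))

doses = [1e6, 1e5, 1e4, 1e3]  # photons per ray, from high dose to low dose
rmse = []
for dose in doses:
    noisy = simulate_dose(clean, dose, rng)
    rmse.append(float(np.sqrt(np.mean((noisy - clean) ** 2))))
```

The `rmse` list gives one error figure per dose level, so the same clean input yields a whole family of progressively noisier scans that all share one ground truth.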
In the next post, I’ll go over some of the methodologies that have been used and discuss them in a bit more depth.