Tissue classification to improve tractography
The main aim of this project is to
implement an image segmentation algorithm that is able to classify the
different tissue types of the brain using structural and diffusion MRI images.
The ultimate goal is to add this algorithm to the set of tools available in
DIPY (Diffusion Imaging in Python: http://dipy.org)
and incorporate the resulting tissue probability maps in the processing
pipelines for tractography [1] (Anatomically-constrained tractography). We
decided to use a Bayesian approach for the segmentation, as has been previously
done by Zhang et al. and Avants et al. [2, 3].
The
Bayesian approach to image segmentation
Bayesian image segmentation
algorithms use a likelihood model, also known as a constant observation model,
and a prior probability model. The product of these two is equal to the
posterior probability P(x|y), where x is the label and y is the input image. The optimal label for a voxel x ̂ is found by calculating
the maximum of this posterior probability (Maximum A Posteriori or MAP). This
is defined by:
(1)
Where P(y|x) is the likelihood and P(x) is the prior and 1/p(y) is a constant. The constant observation model chosen for our algorithm is the Gaussian log-likelihood.
(3)
Here θ are the parameters, i.e., the mean denoted by μ and the standard deviation denoted by σ. This type of observation model has
been used for brain segmentation by other researchers in the past [2]. It is
important to note here that despite its simplicity, the gaussianity assumption
constituted one of the main challenges we encountered, as it will become clear
further in this report. We also used this likelihood to calculate the very
first segmentation of the input image. This initial segmentation is suboptimal
but it is already a close approximation to the optimal segmentation. It is noted that no
contextual information is used for the initial segmentation. In Figure 1 we
show a T1-weighted coronal slice of a healthy young adult and the corresponding
maximum likelihood segmentation.
Figure 1.
Markov
Random Fields
We modeled the prior probability
(second factor of equation 1) with Markov Random Fields, which do take into
account contextual information. The idea behind the MRF models is based on the
simple intuition (nowadays formalized in a mathematical theorem) that a
specific label for a voxel is favored by the label homogeneity of its neighborhood. This
assumption is formally called the “locality” assumption and together with the
“positivity” assumption determines that a random field is to be markovian. The
positivity assumption states that P(x)
> 0.
By taking this into account an MRF
distribution can be modeled as a Gibbs distribution:
(4)
(4)
A Gibbs distribution
is characterized by a constant Z
factor and an “energy” term U(x).
This energy function is defined by the neighborhood and is assumed to be the
sum of all the pairwise clique potentials (a voxel with each of its neighbor
voxels) and is based on the Ising-Potts model. In order to determine the
optimal label, the Gibbs energy needs to be minimized. This is where the Expectation Maximization
(EM) algorithm comes into play. In each iteration, we have an E-Step, in which
we update the label of the voxel by minimizing the energy function of the Gibbs
distribution and we have an M-Step in which we update the parameters (mean and
variance) to calculate the log-likelihood again, to which the Gibbs energy will
be added again, and the E-Step and M-Step take place for another iteration. The
minimization of the Gibbs energy functions is accomplished with the ICM
(Iterated Conditional Modes) algorithm. Here is where one of the most important
parameters of our algorithm comes into place. The beta parameter, also called
the “smoothing” factor, basically
determines how much weighting is given to the neighborhood according to the MRF
prior model.
Implementation
details
Our algorithm needs 4 input
variables: a)the gray-scale MRI image, b) the number of tissue classes desired
to be segmented; c) the number of iterations to go through the E and M steps;
and d) the beta value. As it is shown in Fig#2 the choice of beta makes a difference,
namely the higher the beta the “smoother” the segmentations are, which does not
necessarily mean a better segmentation.
Choosing the right amount of
iterations is also critical. With just one or two iterations the end result is
not optimal, thus usually more than ten iterations are needed. In Figure 3 we
show the resulting segmentations after many iterations with a beta value of 0.
It can be seen how the segmentation changes from iteration to iteration. Also,
by examining the total final energies it is possible to see when the algorithm
converges. In our tests we made sure that the total energies would always be
minimized after each iteration as can be seen in Figure 4. Also, we found that convergence is reached when the difference between the total energies of two consecutive iterations is of -3 orders of magnitude (10e-3). We noticed that the energy initially drops
drastically in the first two iterations and then stabilizes around 4-6
iterations, at which the total energy reaches a minimum and then it starts
oscillating with a small variation.
Figure 2.
Figure 2.
Figure 3.
One of the outputs of the algorithm
is the probability map for each tissue type. So there will be as many
probability maps as the number classes specified as input. For tractography
applications, most of the times the brain is segmented into three tissue
classes, namely cortico-spinal fluid (CSF), gray matter (GM) and white matter
(WM). Especially the white matter probability map is used for so called
“Anatomically Constrained Tractography” [1] in which this map is used as a stop
criterion for the tracking algorithm. An example of these probability maps for
the same slice shown in Figure 1 is shown in Figure #5.
The Gaussian likelihood prior,
although simple, gave some problems when accomplishing the
segmentation task. The input to our algorithm is assumed to be a skull-striped
(masked) brain image. When this step is performed and the skull is removed, the
background becomes zero. Since we wanted to include the background as one
additional “tissue class” and not have a brain mask as an input, the background
converges rapidly to a mean of zero and a variance of zero, from the very first
iterations of our algorithm, Hence, this class does not have a Gaussian
distribution anymore. Understanding the behavior of this was one of the major
roadblocks during the development of the algorithm.
Figure 5.
Tractography
We tested our algorithm on one
subject and used the output to compute tractography. The type of tractography
that we performed is called the “Anatomically Constrained Tractography”, also
known as ACT [1]. Tissue specific probability maps are required for this type
of tractography. We performed ACT by using tissue probability maps derived from
two different types of images, on a T1-weighted image and on a Diffusion Power
Map (DPM) [4].
Figure 6. Tissue classification of subject's T1-weighted image
Figure 7. Tissue probability maps of subject's T1-weighted image
Figure 8. Tissue classification of subject's Diffusion Power Map
Figure 9. Tissue probability maps of subject's Diffusion Power Map
Figure 10 shows the resulting ACT tractographies of the corpus callosum.
Conclusions
We developed and fully tested a
brain segmentation algorithm based on a Bayesian framework by using the Markov
Random Fields Theory and Expectation Maximization. This algorithm proved
to work on T1-weighted images as well as
on dMRI derived maps, such as the DPMs [4]. The algorithm could
successfully segment white from gray matter, even in troublesome areas of high
concentrations of crossing fibers. We were able to use the derived probability
maps from both the T1 and the DPMs for ACT. Further work will include an
extensive validation of the algorithm, a comparison against traditionally used
toolboxes (FSL and ANTS) and look for quantitative methods to compare the tractographies
derived from tissue maps of T1 vs. those
derived from DPM.
REFERENCES
[1] Smith, R. E., Tournier, J.D.,
Calamante, F., & Connelly, A. Anatomically-constrained tractography:
Improved diffusion MRI streamlines tractography through effective use of
anatomical information. NeuroImage, 63(3), 1924-1938, 2012.
[2] Zhang, Y., Brady, M. and Smith,
S. Segmentation of Brain MR Images Through a Hidden Markov Random Field Model
and the Expectation-Maximization Algorithm IEEE Transactions on Medical
Imaging, 20(1): 45-56, 2001
[3] Avants, B. B., Tustison, N. J.,
Wu, J., Cook, P. A. and Gee, J. C. An open source multivariate framework for
n-tissue segmentation with evaluation on public data. Neuroinformatics, 9(4):
381–400, 2011.
[4] Flavio Dell'Acqua, Luis Lacerda,
Marco Catani, and Andrew Simmons, Anisotropic Power Maps: A diffusion contrast
to reveal low anisotropy tissues from HARDI data. The International Society for
Magnetic Resonance in Medicine. Annual Meeting 2014, 10-16 May 2014, Milan,
Italy
Final comments
The most rewarding part of the program was the mentorship and the progress I made while being taught and guided by such a great team of people. The meetings were long and very into the details of the code and the testing. No single line was left out of sight as every single line was tested. It was the first time I did such a thorough testing of my code and I am sure this will help me a lot for all my future endeavors. One interesting event during this summer was that I had the chance to meet with my mentors at a conference (OHBM 2015). GSoC also helped me with $500 to go to the conference. Without this I think things would have been a bit harder. Talking directly with my mentors and getting to know the general idea of what this open source initiative for brain imaging is, was of invaluable help and very enlightening. An advice I would give to future GSoC participant is not to give up if things aren't working as expected or the way you proposed. If there is a defined end goal there are for sure many different routes to get there.
Cool stuff! You mentioned that the images you ran for your test subject were skull stripped. Could this algorithm easily be used to do the skull stripping as well?
ReplyDeleteSpecifically, if the tissue types could be computed on the images with skulls, and if the CSF map was reliable enough, perhaps you could segment using the outer layer of CSF.
I doubt this is possible, but just wondering if you had thoughts!
I was also wondering a bit more about the probability model--is it computing the probability of a label given the raw voxel intensity, or a normalized voxel intensity? Is the voxel intensity taken before or after warping to a common (template) space?
ReplyDelete