Tuesday, August 25, 2015

Final Report GSoC 2015

Tissue classification to improve tractography


The main aim of this project is to implement an image segmentation algorithm that is able to classify the different tissue types of the brain using structural and diffusion MRI images. The ultimate goal is to add this algorithm to the set of tools available in DIPY (Diffusion Imaging in Python: http://dipy.org) and incorporate the resulting tissue probability maps in the processing pipelines for tractography [1] (Anatomically-constrained tractography). We decided to use a Bayesian approach for the segmentation, as has been previously done by Zhang et al. and Avants et al. [2, 3]. 

The Bayesian approach to image segmentation

Bayesian image segmentation algorithms use a likelihood model, also known as a constant observation model, and a prior probability model. The product of these two is equal to the posterior probability P(x|y), where x is the label and y is the input image. The optimal label for a voxel x ̂ is found by calculating the maximum of this posterior probability (Maximum A Posteriori or MAP). This is defined by:

(1)
(2)



Where P(y|x) is the likelihood and P(x) is the  prior and 1/p(y) is a constant. The constant observation model chosen for our algorithm is the Gaussian log-likelihood.

(3)


Here θ are the parameters, i.e., the mean denoted by μ and the standard deviation denoted by σ. This type of observation model has been used for brain segmentation by other researchers in the past [2]. It is important to note here that despite its simplicity, the gaussianity assumption constituted one of the main challenges we encountered, as it will become clear further in this report. We also used this likelihood to calculate the very first segmentation of the input image. This initial segmentation is suboptimal but it is already a close approximation to the optimal segmentation. It is noted that no contextual information is used for the initial segmentation. In Figure 1 we show a T1-weighted coronal slice of a healthy young adult and the corresponding maximum likelihood segmentation.


Figure 1.

Markov Random Fields

We modeled the prior probability (second factor of equation 1) with Markov Random Fields, which do take into account contextual information. The idea behind the MRF models is based on the simple intuition (nowadays formalized in a mathematical theorem) that a specific label for a voxel is favored by the label homogeneity of its neighborhood. This assumption is formally called the “locality” assumption and together with the “positivity” assumption determines that a random field is to be markovian. The positivity assumption states that P(x) > 0. 


By taking this into account an MRF distribution can be modeled as a Gibbs distribution:

(4)


A Gibbs distribution is characterized by a constant Z factor and an “energy” term U(x). This energy function is defined by the neighborhood and is assumed to be the sum of all the pairwise clique potentials (a voxel with each of its neighbor voxels) and is based on the Ising-Potts model. In order to determine the optimal label, the Gibbs energy needs to be minimized.  This is where the Expectation Maximization (EM) algorithm comes into play. In each iteration, we have an E-Step, in which we update the label of the voxel by minimizing the energy function of the Gibbs distribution and we have an M-Step in which we update the parameters (mean and variance) to calculate the log-likelihood again, to which the Gibbs energy will be added again, and the E-Step and M-Step take place for another iteration. The minimization of the Gibbs energy functions is accomplished with the ICM (Iterated Conditional Modes) algorithm. Here is where one of the most important parameters of our algorithm comes into place. The beta parameter, also called the “smoothing” factor, basically determines how much weighting is given to the neighborhood according to the MRF prior model.
 
 Implementation details


Our algorithm needs 4 input variables: a)the gray-scale MRI image, b) the number of tissue classes desired to be segmented; c) the number of iterations to go through the E and M steps; and d) the beta value. As it is shown in Fig#2 the choice of beta makes a difference, namely the higher the beta the “smoother” the segmentations are, which does not necessarily mean a better segmentation.

Choosing the right amount of iterations is also critical. With just one or two iterations the end result is not optimal, thus usually more than ten iterations are needed. In Figure 3 we show the resulting segmentations after many iterations with a beta value of 0. It can be seen how the segmentation changes from iteration to iteration. Also, by examining the total final energies it is possible to see when the algorithm converges. In our tests we made sure that the total energies would always be minimized after each iteration as can be seen in Figure 4. Also, we found that convergence is reached when the difference between the total energies of two consecutive iterations is of -3 orders of magnitude (10e-3). We noticed that the energy initially drops drastically in the first two iterations and then stabilizes around 4-6 iterations, at which the total energy reaches a minimum and then it starts oscillating with a small variation.

Figure 2.

Figure 3.


Figure 4.


One of the outputs of the algorithm is the probability map for each tissue type. So there will be as many probability maps as the number classes specified as input. For tractography applications, most of the times the brain is segmented into three tissue classes, namely cortico-spinal fluid (CSF), gray matter (GM) and white matter (WM). Especially the white matter probability map is used for so called “Anatomically Constrained Tractography” [1] in which this map is used as a stop criterion for the tracking algorithm. An example of these probability maps for the same slice shown in Figure 1 is shown in Figure #5.

The Gaussian likelihood prior, although simple, gave some problems when accomplishing the segmentation task. The input to our algorithm is assumed to be a skull-striped (masked) brain image. When this step is performed and the skull is removed, the background becomes zero. Since we wanted to include the background as one additional “tissue class” and not have a brain mask as an input, the background converges rapidly to a mean of zero and a variance of zero, from the very first iterations of our algorithm, Hence, this class does not have a Gaussian distribution anymore. Understanding the behavior of this was one of the major roadblocks during the development of the algorithm.

Figure 5.

Tractography

We tested our algorithm on one subject and used the output to compute tractography. The type of tractography that we performed is called the “Anatomically Constrained Tractography”, also known as ACT [1]. Tissue specific probability maps are required for this type of tractography. We performed ACT by using tissue probability maps derived from two different types of images, on a T1-weighted image and on a Diffusion Power Map (DPM) [4].

As a first step we ran the tissue classifier on the whole 3D volumes (T1 and DPM). We used a beta value of 0.01 and 20 iterations in both cases. The images were skull-stripped in advance and in the case of the T1 weighted image we used the N4 algorithm for bias field correction. The segmentation results for the T1 can be seen in Fig.6 and the probability maps of each tissue class (CSF, GM and WM) in Fig.7.  The segmentation for the DPM can be seen in Fig.8 and the three tissue probability maps in Fig.9. In figure 10 we show the resulting tractographies when seeding only in the Corpus Callosum and by using both types of tissue probability maps.

Figure 6. Tissue classification of subject's T1-weighted image

Figure 7. Tissue probability maps of subject's T1-weighted image

Figure 8. Tissue classification of subject's Diffusion Power Map

Figure 9. Tissue probability maps of subject's Diffusion Power Map


Figure 10 shows the resulting ACT tractographies of the corpus callosum.




Conclusions


We developed and fully tested a brain segmentation algorithm based on a Bayesian framework by using the Markov Random Fields Theory and Expectation Maximization. This algorithm proved to work on T1-weighted images as well as  on dMRI derived maps, such as the DPMs [4]. The algorithm could successfully segment white from gray matter, even in troublesome areas of high concentrations of crossing fibers. We were able to use the derived probability maps from both the T1 and the DPMs for ACT. Further work will include an extensive validation of the algorithm, a comparison against traditionally used toolboxes (FSL and ANTS) and look for quantitative methods to compare the tractographies derived from tissue maps of  T1 vs. those derived from DPM.


REFERENCES

[1] Smith, R. E., Tournier, J.D., Calamante, F., & Connelly, A. Anatomically-constrained tractography: Improved diffusion MRI streamlines tractography through effective use of anatomical information. NeuroImage, 63(3), 1924-1938, 2012.

[2] Zhang, Y., Brady, M. and Smith, S. Segmentation of Brain MR Images Through a Hidden Markov Random Field Model and the Expectation-Maximization Algorithm IEEE Transactions on Medical Imaging, 20(1): 45-56, 2001

[3] Avants, B. B., Tustison, N. J., Wu, J., Cook, P. A. and Gee, J. C. An open source multivariate framework for n-tissue segmentation with evaluation on public data. Neuroinformatics, 9(4): 381–400, 2011.


[4] Flavio Dell'Acqua, Luis Lacerda, Marco Catani, and Andrew Simmons, Anisotropic Power Maps: A diffusion contrast to reveal low anisotropy tissues from HARDI data. The International Society for Magnetic Resonance in Medicine. Annual Meeting 2014, 10-16 May 2014, Milan, Italy


Final comments

The most rewarding part of the program was the mentorship and the progress I made while being taught and guided by such a great team of people. The meetings were long and very into the details of the code and the testing. No single line was left out of sight as every single line was tested. It was the first time I did such a thorough testing of my code and I am sure this will help me a lot for all my future endeavors. One interesting event during this summer was that I had the chance to meet with my mentors at a conference (OHBM 2015). GSoC also helped me with $500  to go to the conference. Without this I think things would have been a bit harder. Talking directly with my mentors and getting to know the general idea of what this open source initiative for brain imaging is, was of invaluable help and very enlightening. An advice I would give to future GSoC participant is not to give up if things aren't working as expected or the way you proposed. If there is a defined end goal there are for sure many different routes to get there.













2 comments:

  1. Cool stuff! You mentioned that the images you ran for your test subject were skull stripped. Could this algorithm easily be used to do the skull stripping as well?

    Specifically, if the tissue types could be computed on the images with skulls, and if the CSF map was reliable enough, perhaps you could segment using the outer layer of CSF.

    I doubt this is possible, but just wondering if you had thoughts!

    ReplyDelete
  2. I was also wondering a bit more about the probability model--is it computing the probability of a label given the raw voxel intensity, or a normalized voxel intensity? Is the voxel intensity taken before or after warping to a common (template) space?

    ReplyDelete