Abstract: This paper reports a score function that serves as a quantitative measure of the performance of medical image enhancement techniques. The proposed measure merges full–reference and blind–reference image enhancement measures. The score function is the average of the weighted sum of the image enhancement measures, each normalized between zero and one. The measure is validated under the hypothesis that the volumes maximizing the score function are also those maximizing the metric (Dice coefficient) used to evaluate a previously reported cardiac image segmentation approach. The score function and the Dice score reached their maximum values for the same segmented cardiac volume.
Keywords: Image enhancement, cardiac images, image quality, image enhancement assessment.
Resumen: En este artículo se presenta una función de puntuación útil como medida cuantitativa del rendimiento de técnicas de mejora de imágenes médicas. La métrica propuesta se basa en la fusión de medidas de mejora de imagen de referencia completa y referencia ciega. La función de puntuación es el promedio de la suma ponderada de las medidas de mejora de imagen normalizadas entre cero y uno. La nueva medida se valida considerando la hipótesis de que los valores que maximizan la función de puntuación tienen como máximo los valores de las métricas (coeficiente de Dice) utilizados para evaluar cierto enfoque de segmentación de imágenes cardíacas reportado previamente. Los valores de la función de puntuación y el coeficiente de Dice alcanzaron el valor máximo para los mismos volúmenes cardíacos segmentados.
Palabras clave: Realce imágenes, imágenes cardiacas, calidad de imagen, evaluación del realce de imagen.
A score function as quality measure for cardiac image enhancement techniques assessment
Una función de puntuación como medida de calidad para la evaluación de técnicas de mejora de la imagen cardíaca
Two-dimensional (2–D) or three-dimensional (3–D) medical images are acquired or generated using medical imaging systems that are prone to interference caused by random signal variation [1]. Such interference, or noise, contaminates the actual image information [2]. Moreover, the nature of the physiological systems and the acquisition/generation protocols reduce image contrast and increase information degradation, hindering the observation of subtle anatomical features. In addition, medical images are often corrupted by artifacts, which, theoretically, introduce a systematic discrepancy between the quantized values in the acquisition and the real values of the attenuation coefficients of the objects in the scene [3]. For images in which the impact of noise and artifacts is relatively high, accurate interpretation of the tissue status can be very difficult when the characteristics of healthy and diseased tissue differ only slightly.
Image enhancement is one of the most used image pre–processing techniques to attenuate the impact of unwanted signals. Image enhancement techniques modify the appearance of a scene, providing a version of the image more suitable for a specific human observer. On the other hand, these techniques constitute also a pre–processing step required for automatic image analysis. In general terms, and in the clinical context, the enhancement techniques improve image quality and facilitate the diagnosis.
The use of enhancement techniques to improve the appearance and visual quality of the images can help specialists interpret such scenes. However, the intra–specialist interpretation variability can be high because each interpretation depends on the specialist's perception, which increases the subjectivity of the diagnosis.
Image quality metrics such as mean, standard deviation, MSE, MAE and PSNR are appealing because they are simple to calculate, have clear physical meanings, and are mathematically convenient in the context of optimization. Nevertheless, these metrics do not assess perceived visual quality very well [4]. Objective image quality measures are developed mainly because subjective assessment tests are expensive and time consuming, and depend on the specific application. Objective image quality assessment requires a metric that correlates with perceived image quality, and such an assessment is necessary to validate the effectiveness of image enhancement techniques. In this sense, this work proposes a score function useful as a quantitative measure of the performance of medical image enhancement techniques.
2.1 Relative Image Enhancement Measures
The proposed image enhancement assessment score function merges full–reference and blind–reference image enhancement measures. Full–reference metrics require a complete reference image, while blind–reference metrics do not require any information about the reference image [5]. For the full–reference metrics, the original unprocessed image is taken as the complete reference image.
2.1.1 Statistical
Statistical measures have been very important in the robust analysis of images affected by noise and artifacts [6]. Although several studies consider that statistical measures of the gray level distribution based on mean, standard deviation or entropy are not meaningful for assessing local contrast enhancement, these statistics are used here to formulate the proposed image enhancement assessment score function.
Mean
The mean of an image represents its average intensity, density or gray level. Removing the mean yields an image represented by the edges and the gray level fluctuations about the mean. A decrease in the value of the mean is associated with an increase in image enhancement.
Standard deviation
The standard deviation is the deviation about the mean and represents the dynamic range of the intensity values in the image. A low standard deviation indicates that the intensities are close to the mean value. In this work, the enhancement technique preprocesses cardiac images that will then be segmented using a clustering technique; therefore, the lower the standard deviation, the less scattered the image data.
Entropy
Entropy is a statistical measure of information commonly used to express how the probability of occurrence of each gray level in an image varies over the available range [7]. Entropy measures the information content of an image; high values indicate an image with many details. Entropy therefore increases when blur diminishes, but it can also increase when a high percentage of noise is added. Consequently, this statistic is a good blind–reference enhancement measure for enhancement techniques applied to low-noise images [8]. The entropy, expressed in bits, is given by
$$H = -\sum_{i=0}^{L_{gray}-1} p(i)\,\log_2 p(i) \tag{1}$$
where $L_{gray}$ refers to the number of gray levels and $p(i)$ is the probability of occurrence of the $i$th gray level.
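As an illustration of the entropy measure, the following minimal sketch estimates $p(i)$ from the gray-level histogram of a volume given as a flat sequence of voxel values. The function name `entropy_bits` and the toy data are assumptions made for this example only; the published implementation was written in C++ with VTK and may differ.

```python
import math
from collections import Counter

def entropy_bits(voxels):
    """Shannon entropy, in bits, of the gray levels in a flat sequence of voxels."""
    counts = Counter(voxels)          # occurrences of each gray level
    n = len(voxels)
    # p(i) = count / n; gray levels that never occur contribute nothing
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# Hypothetical toy data: four gray levels, each equally likely
print(entropy_bits([0, 0, 1, 1, 2, 2, 3, 3]))  # 2.0
```

A constant image yields zero entropy, while a uniform distribution over $L_{gray}$ levels yields the maximum, $\log_2 L_{gray}$ bits.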
2.1.2 Traditional Enhancement Measures
Traditional or conventional image enhancement metrics are characterized by being parameter free, inexpensive to compute, clear in physical meaning, and useful in an optimization context.
Mean Squared Error
MSE is a full–reference enhancement measure obtained by averaging the squared intensity differences between the enhanced and the original image [9]. Despite its simple mathematical formulation, this measure is still widely used because of its long history in the optimization and evaluation of a wide variety of signal/image processing approaches [10]. For an original 3–D image $I$ of size $L \times M \times N$ and its enhanced version $I_{enh}$, the MSE between $I$ and $I_{enh}$ is defined by
$$MSE = \frac{1}{LMN}\sum_{x=1}^{L}\sum_{y=1}^{M}\sum_{z=1}^{N}\left[I(x,y,z) - I_{enh}(x,y,z)\right]^2 \tag{2}$$
Mean Absolute Error
MAE quantifies the average of the absolute differences between corresponding elements of the original image and the enhanced image, and is defined by
$$MAE = \frac{1}{LMN}\sum_{x=1}^{L}\sum_{y=1}^{M}\sum_{z=1}^{N}\left|I(x,y,z) - I_{enh}(x,y,z)\right| \tag{3}$$
Like MSE, MAE increases as the image enhancement increases. Both metrics are commonly used to evaluate prediction accuracy [11].
Peak Signal–to–Noise Ratio
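The PSNR between the two images is given by
$$PSNR = 10\,\log_{10}\!\left(\frac{MAX_I^2}{MSE}\right) \tag{4}$$
where $MAX_I$ is the maximum possible intensity value of the image.
PSNR has provided useful baseline comparisons in a large number of image processing works [5, 8, 11–13].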
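The three traditional measures can be sketched in a few lines. This is a hedged illustration, not the authors' code: volumes are assumed to be flat sequences of equal length, `max_value=4095` assumes the 12-bit quantization mentioned for the MSCT data, and `cap_db=99.0` mirrors the convention stated later in the paper of assigning 99 dB when the MSE is zero.

```python
import math

def mse(orig, enh):
    """Mean of the squared voxel differences."""
    return sum((a - b) ** 2 for a, b in zip(orig, enh)) / len(orig)

def mae(orig, enh):
    """Mean of the absolute voxel differences."""
    return sum(abs(a - b) for a, b in zip(orig, enh)) / len(orig)

def psnr(orig, enh, max_value=4095, cap_db=99.0):
    """PSNR in dB; identical images get the cap instead of infinity."""
    err = mse(orig, enh)
    if err == 0:
        return cap_db
    return 10.0 * math.log10(max_value ** 2 / err)

# Hypothetical toy volumes, flattened
original = [0, 10, 20, 30]
enhanced = [0, 12, 20, 26]
print(mse(original, enhanced), mae(original, enhanced))  # 5.0 1.5
```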
2.1.3 Complex Measures of Enhancement
A set of complex measures of enhancement based on the Weber-law contrast measure or on the Michelson contrast law is also considered. These measures split the image into small regions or blocks; the maximum and minimum intensities in each block, together with the intensity of the central pixel/voxel of the block, are determined and then used to calculate the measure for each block. The block results are averaged to obtain the value of the measure for the enhanced image.
Measure of Enhancement
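EME [14] scores the contrast and the clarity of the information contained in an image based on the Weber–Fechner law [15]. High EME values denote high contrast and increased clarity of information, which corresponds to an improvement in image enhancement. To compute the metric, the enhanced image is split into $k_1 k_2 k_3$ blocks of size $l_1 \times l_2 \times l_3$. EME is defined by
$$EME = \frac{1}{k_1 k_2 k_3}\sum_{i=1}^{k_1}\sum_{j=1}^{k_2}\sum_{k=1}^{k_3} 20\,\ln\frac{I_{\max}^{ijk}}{I_{\min}^{ijk}} \tag{5}$$
where $I_{\min}^{ijk}$ and $I_{\max}^{ijk}$ represent the minimum and maximum of the image intensities inside the block denoted by the superscripts $ijk$. The ratio between the maximum and the minimum is known as the contrast ratio [16]. In this paper, the contrast ratio is considered for each block and is referred to hereinafter as
$$CR^{ijk} = \frac{I_{\max}^{ijk}}{I_{\min}^{ijk}} \tag{6}$$
Measure of Enhancement by Entropy
EMEE [14] is defined as the entropy of the contrast ratio scaled by $\alpha$. For each enhanced image block, EMEE is given by
$$EMEE = \frac{1}{k_1 k_2 k_3}\sum_{i=1}^{k_1}\sum_{j=1}^{k_2}\sum_{k=1}^{k_3} \alpha\left(CR^{ijk}\right)^{\alpha}\ln CR^{ijk} \tag{7}$$
Michelson law measure of enhancement
AME was proposed as an improvement of EME by introducing the Michelson contrast [17]. This measure is the average of the logarithmic form of the Michelson contrast evaluated in each block $ijk$ of the split image [18].
$$AME = -\frac{1}{k_1 k_2 k_3}\sum_{i=1}^{k_1}\sum_{j=1}^{k_2}\sum_{k=1}^{k_3} 20\,\ln\frac{I_{\max}^{ijk} - I_{\min}^{ijk}}{I_{\max}^{ijk} + I_{\min}^{ijk}} \tag{8}$$
Michelson law measure of enhancement by entropy
AMEE is the entropy of the Michelson contrast in (8) for each block denoted by the superscripts $ijk$, scaled by $\alpha$. AMEE is described by (9).
$$AMEE = -\frac{1}{k_1 k_2 k_3}\sum_{i=1}^{k_1}\sum_{j=1}^{k_2}\sum_{k=1}^{k_3} \alpha\left(\frac{I_{\max}^{ijk} - I_{\min}^{ijk}}{I_{\max}^{ijk} + I_{\min}^{ijk}}\right)^{\alpha}\ln\frac{I_{\max}^{ijk} - I_{\min}^{ijk}}{I_{\max}^{ijk} + I_{\min}^{ijk}} \tag{9}$$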
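The four block-based contrast measures above share the same ingredients, so they can be sketched together. The code below is an assumption-laden illustration rather than the published implementation: it follows the forms commonly given in the literature for EME, EMEE, AME and AMEE, expects the image to be already split into blocks (each a flat list of non-negative intensities), and uses a hypothetical offset `eps` to avoid division by zero in blocks whose minimum is zero. Flat blocks, whose Michelson contrast is zero, are skipped in the AME and AMEE sums.

```python
import math

def block_measures(blocks, alpha=1.0, eps=1e-6):
    """Return (EME, EMEE, AME, AMEE) averaged over the given blocks."""
    eme = emee = ame = amee = 0.0
    for block in blocks:
        lo, hi = min(block) + eps, max(block) + eps
        cr = hi / lo                     # contrast ratio of the block
        mc = (hi - lo) / (hi + lo)       # Michelson contrast of the block
        eme += 20.0 * math.log(cr)
        emee += alpha * cr ** alpha * math.log(cr)
        if mc > 0:                       # flat blocks contribute nothing
            ame -= 20.0 * math.log(mc)
            amee -= alpha * mc ** alpha * math.log(mc)
    n = len(blocks)
    return eme / n, emee / n, ame / n, amee / n

# Hypothetical toy blocks: one with contrast, one flat
print(block_measures([[1, 2, 4], [3, 3, 3]]))
```

With `alpha=1.0` the entropy-based variants weight each block by its own contrast; other values of `alpha` are a tuning choice that the text does not specify.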
Second–Derivative–like Measure of Enhancement
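SDME was initially proposed as a visibility operator based on the second derivative of the contrast ratio [19]. Its usefulness as an enhancement measure has been reported previously [20], and it has also been formulated as an image enhancement approach [21]. For each block in the split enhanced image, the second–derivative–like visibility operator is given by
$$V^{ijk} = \frac{I_{\max}^{ijk} - 2 I_{c}^{ijk} + I_{\min}^{ijk}}{I_{\max}^{ijk} + 2 I_{c}^{ijk} + I_{\min}^{ijk}} \tag{10}$$
where $I_{\max}^{ijk}$ and $I_{\min}^{ijk}$ are the maximum and minimum intensities in the block and $I_{c}^{ijk}$ is the gray level of the central voxel in each block. The block sizes should be odd. Then, the SDME is defined by
$$SDME = -\frac{1}{k_1 k_2 k_3}\sum_{i=1}^{k_1}\sum_{j=1}^{k_2}\sum_{k=1}^{k_3} 20\,\ln\left|V^{ijk}\right| \tag{11}$$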
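A minimal sketch of the SDME computation follows, assuming the form usually given in the literature (a logarithm of the ratio between the second-derivative-like numerator and its sum-based denominator). Blocks are assumed to be flat lists in raster order with odd side lengths, so the central voxel is simply the middle element; `eps` is a hypothetical guard against a zero numerator or denominator.

```python
import math

def sdme(blocks, eps=1e-6):
    """Second-derivative-like measure of enhancement averaged over blocks."""
    total = 0.0
    for block in blocks:
        lo, hi = min(block), max(block)
        center = block[len(block) // 2]   # middle element of an odd-sized block
        num = abs(hi - 2 * center + lo)
        den = abs(hi + 2 * center + lo)
        total -= 20.0 * math.log((num + eps) / (den + eps))
    return total / len(blocks)

# Hypothetical 1x1x3 block: min 1, center 2, max 5
print(sdme([[1, 2, 5]]))
```

The odd block size matters here: only then does a block have a single well-defined central voxel.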
Structural SIMilarity
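The SSIM index is a metric based on the degradation of structural similarity [5]. SSIM considers that the structure of an image is represented by statistics such as mean and variance. This metric measures the quality of the enhanced image with respect to the original image and is given by
$$SSIM(I, I_{enh}) = \frac{(2\mu_I\,\mu_{I_{enh}} + c_1)(2\sigma_{I I_{enh}} + c_2)}{(\mu_I^2 + \mu_{I_{enh}}^2 + c_1)(\sigma_I^2 + \sigma_{I_{enh}}^2 + c_2)} \tag{12}$$
where $\mu_I$ and $\sigma_I^2$ are the mean and variance of $I$, respectively, $\mu_{I_{enh}}$ and $\sigma_{I_{enh}}^2$ are the mean and variance of $I_{enh}$, and $\sigma_{I I_{enh}}$ represents the covariance of $I$ and $I_{enh}$. The constants $c_1$ and $c_2$ stabilize the equation when the denominators are very close to zero. A single measure is required to evaluate the overall image quality; for this purpose, the mean of the SSIM index (13) is used.
$$MSSIM = \frac{1}{K}\sum_{k=1}^{K} SSIM_k \tag{13}$$
where $SSIM_k$ is the index computed on the $k$th block and $K$ is the number of blocks of the image.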
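The block-wise SSIM and its mean can be sketched as follows. Several choices here are assumptions not stated in the text: population (1/n) variance and covariance, non-overlapping blocks supplied as paired flat lists, and stabilizing constants derived from the dynamic range with the factors 0.01 and 0.03 that are commonly used in the SSIM literature (named `s1` and `s2` to avoid confusion with the block counts $k_1$, $k_2$, $k_3$).

```python
def ssim_block(x, y, c1, c2):
    """SSIM index between two blocks given as flat lists of equal length."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    vx = sum((a - mx) ** 2 for a in x) / n
    vy = sum((b - my) ** 2 for b in y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    num = (2 * mx * my + c1) * (2 * cov + c2)
    den = (mx ** 2 + my ** 2 + c1) * (vx + vy + c2)
    return num / den

def mean_ssim(blocks_x, blocks_y, dynamic_range=4095, s1=0.01, s2=0.03):
    """Average of the block-wise SSIM indices over all K blocks."""
    c1, c2 = (s1 * dynamic_range) ** 2, (s2 * dynamic_range) ** 2
    values = [ssim_block(x, y, c1, c2) for x, y in zip(blocks_x, blocks_y)]
    return sum(values) / len(values)

# Hypothetical toy blocks: the first enhanced block is the reverse of the original
orig_blocks = [[10, 20, 30, 40], [5, 5, 6, 6]]
enh_blocks = [[40, 30, 20, 10], [5, 5, 6, 6]]
print(mean_ssim(orig_blocks, enh_blocks))
```

Identical volumes give a mean SSIM of one; any structural change lowers it.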
2.2 Identifying the Expected Variation of the Metrics
The image enhancement measures described above, computed from the enhanced cardiac image, are merged to construct a score function. Each measure has the same weighting in the score function, but whether its value is added or subtracted depends on its variation (increase/decrease) with respect to the value obtained from the original unprocessed image. Therefore, the expected variation of each metric must be identified in order to establish whether an increase or a decrease is added or subtracted.
2.2.1 The Procedure
The basic idea is to develop a procedure for analyzing and identifying the variation of the image enhancement measures calculated from cardiac medical images. To this end, the measures described in section 2.1 are estimated for both the original cardiac images and their enhanced versions. The measures calculated from the original images are referred to hereinafter as reference measures, while those calculated from the enhanced images are referred to as improvement measures. Two smoothing filters are considered for enhancing the cardiac medical images: the binomial filter [22] and the weighted median filter [23]. The proposed procedure has six steps, as follows.
Step 1. For each cardiac image, calculate the reference measures.
Step 2. For each enhanced cardiac image obtained using the binomial filter, calculate the binomial improvement measures; for each enhanced cardiac image obtained using the weighted median filter, calculate the weighted median improvement measures.
Step 3. For each cardiac image, compute the difference between its binomial improvement measures and the respective reference measures, and store these differences in the vector of differences d1. Likewise, compute the difference between its weighted median improvement measures and the respective reference measures, and store these differences in the vector of differences d2.
Step 4. Identify the negative values in the vectors d1 and d2 and label them as "decrease with increasing enhancement". Label the positive values as "increase with increasing enhancement".
Step 5. Over all cardiac images, count the occurrences of the labels for each measure.
Step 6. For each measure, take as the expected variation the label with the highest frequency of occurrence.
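Steps 3 to 6 of the procedure amount to a sign test followed by a majority vote, which can be sketched as below. The data layout (one dictionary of measure values per image), the function name `expected_variation` and the decision to ignore zero differences are assumptions of this example; the numeric values are toy numbers, not results from the study.

```python
from collections import Counter

def expected_variation(reference, improved_sets):
    """Majority label ('increase' or 'decrease') per measure.

    reference: list of {measure: value}, one dict per original image.
    improved_sets: one list per filter, aligned with `reference`.
    """
    labels = {}
    for improved in improved_sets:
        for ref, imp in zip(reference, improved):
            for name, ref_value in ref.items():
                diff = imp[name] - ref_value          # enhanced minus reference
                if diff == 0:
                    continue                          # assumption: ties are ignored
                label = "decrease" if diff < 0 else "increase"
                labels.setdefault(name, Counter())[label] += 1
    return {name: c.most_common(1)[0][0] for name, c in labels.items()}

# Hypothetical toy measures for two images and two filters
reference = [{"entropy": 5.0, "std": 40.0}, {"entropy": 5.2, "std": 38.0}]
binomial = [{"entropy": 4.8, "std": 35.0}, {"entropy": 5.0, "std": 36.0}]
wmedian = [{"entropy": 4.9, "std": 37.0}, {"entropy": 5.3, "std": 30.0}]
print(expected_variation(reference, [binomial, wmedian]))
```

In this toy case entropy decreases in three of four comparisons and the standard deviation in all four, so both are labeled as decreasing.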
2.2.2 Evaluation of the image enhancement measures
The procedure proposed above is applied to 4–D (3–D + time) cardiac image sequences acquired using a multi–slice computerized tomography (MSCT) scanner (Philips Brilliance 64 Host–10236). Each sequence consists of 10 volumes describing the anatomical information of the heart over a complete cardiac cycle. The resolution of each volume is 512×512×324 voxels. The spacing between pixels in each slice is 0.429688 mm and the slice thickness is 0.400024 mm. The image volume is quantized to 12 bits per voxel. In this experiment, a total of 125 patients are considered; therefore, a total of 1250 MSCT cardiac volumes are evaluated. Figure 1 shows an MSCT image.
The cardiac volumes are enhanced with a 3×3×3 binomial filter by convolving them with the 3–D kernel shown in (14). The weighted median filter with non-negative weights [23] is applied to the MSCT volumes using the replication factors contained in the same 3–D mask described in (14).
$$B = \frac{1}{64}\,(b \otimes b \otimes b), \qquad b = (1, 2, 1) \tag{14}$$
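To make the two filters concrete, the sketch below builds the 3×3×3 binomial mask as the outer product of (1, 2, 1) with itself and applies it to a single 27-voxel neighborhood, once as a weighted average and once as a weighted median in which each voxel is replicated according to its mask coefficient. The dictionary-based neighborhood, the function names and the choice of the lower median (the total weight, 64, is even) are assumptions for illustration only.

```python
from itertools import product

B = (1, 2, 1)  # 1-D binomial coefficients

def replication_factors():
    """Integer 3x3x3 binomial mask, keyed by (i, j, k); coefficients sum to 64."""
    return {(i, j, k): B[i] * B[j] * B[k] for i, j, k in product(range(3), repeat=3)}

def binomial_response(neigh):
    """Binomial filter output for one neighborhood {(i, j, k): intensity}."""
    w = replication_factors()
    return sum(w[p] * neigh[p] for p in w) / 64

def weighted_median_response(neigh):
    """Weighted median: replicate each voxel by its coefficient, take the lower median."""
    w = replication_factors()
    samples = sorted(v for p in w for v in [neigh[p]] * w[p])
    return samples[(len(samples) - 1) // 2]

# Hypothetical neighborhood: an isolated bright voxel at the center
spike = {p: 0 for p in replication_factors()}
spike[(1, 1, 1)] = 64
print(binomial_response(spike), weighted_median_response(spike))  # 8.0 0
```

The example shows the usual contrast between the two filters: the binomial filter spreads an isolated spike, whereas the weighted median removes it.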
A total of 2500 enhanced volumes are available to evaluate the performance of the 12 image enhancement measures considered: 1250 volumes generated using the binomial filter and 1250 generated by means of the weighted median filter.
The reference measures are calculated from the original MSCT images (MSE and MAE are zero, while PSNR is set to 99 dB), and the binomial and weighted median improvement measures are estimated from the enhanced volumes. For the complex measures of enhancement, $k_1$, $k_2$ and $k_3$ are fixed at 32, 32 and 16, respectively. Then, the differences between the measures (enhanced minus reference) are computed and tabulated, yielding two arrays of dimensions 1250 × size(d1) and 1250 × size(d2), respectively.
Once the arrays are labeled according to step 4 of the proposed procedure, the occurrences of the labels are quantified. For every measure, a single label was obtained with a frequency of occurrence of 100%, which gives the expected variation indicated in Table 1. The performance of the image enhancement measures is evaluated with the aim of developing a score function for the assessment of the proposed image enhancement technique.
2.3 Construction of Score Function
Let $\mathbf{m}$ be the vector whose components are the considered image enhancement measures (see second column of Table 1). Let $\mathbf{w}$ be the vector of weights whose components $w_i$ take the values −1 or +1:
$w_i$ is +1 if the value of the corresponding metric calculated for the enhanced image has the expected variation (↓/↑) with respect to the original image indicated in Table 1.
$w_i$ is −1 otherwise.
The score function is the average of the weighted sum of the image enhancement measures normalized between zero and one. This score function is defined by
$$SF = \frac{1}{l}\sum_{i=1}^{l} w_i\,\hat{m}_i \tag{15}$$
where $\hat{\mathbf{m}}$ is the vector of the normalized image enhancement measures and $l$ is the number of image enhancement measures, with the index $i$ varying from 1 to $l = 12$.
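The score function can be sketched directly from its definition. In this hedged example the normalized measures are supplied by the caller, since the text does not specify how the normalization to [0, 1] is performed; the measure names, the expected directions and all numeric values are hypothetical stand-ins for the contents of Table 1.

```python
def score_function(original, enhanced, normalized, expected):
    """Average of the signed, normalized measures.

    expected maps each measure to 'increase' or 'decrease'; the weight is +1
    when the enhanced value moved in the expected direction and -1 otherwise.
    """
    total = 0.0
    for name, direction in expected.items():
        diff = enhanced[name] - original[name]
        matches = diff > 0 if direction == "increase" else diff < 0
        weight = 1 if matches else -1
        total += weight * normalized[name]
    return total / len(expected)

# Hypothetical values for three of the measures
expected = {"eme": "increase", "std": "decrease", "entropy": "increase"}
original = {"eme": 10.0, "std": 40.0, "entropy": 5.0}
enhanced = {"eme": 12.0, "std": 35.0, "entropy": 5.5}
normalized = {"eme": 0.5, "std": 0.25, "entropy": 0.75}
print(score_function(original, enhanced, normalized, expected))  # 0.5
```

Because every weight is ±1 and every normalized measure lies in [0, 1], the score is bounded between −1 and +1, and it can be reported as a percentage by multiplying by 100.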
2.4 Technical Implementation and Run–Time
A desktop computer with an Intel Pentium Dual CPU (3.40 GHz), 16 GB of memory and the Linux operating system is used for the experiments. The score function is implemented in C++ using the Visualization Toolkit (VTK) [24].
In order to measure the run–time, each quality measure associated with the proposed score function was computed ten times for the twenty pairs (original volume and enhanced volume, each pair) of cardiac volumes described in section 3.
Computation of the SSIM was the most time–consuming step, accounting for more than 80% of the run–time of the score function. Nevertheless, the overall computational time of the proposed score function is low.
In all cases, the score function run–time depends on the number of voxels of the three–dimensional images. The average run–time of the score function was about 7.8 s for assessing the enhancement of each cardiac volume.
In order to validate the score function, an experiment that builds upon a previous work [25] is proposed. The main objective of that work [25] was to develop an enhancement scheme useful as an image processing procedure for attenuating artifacts in MSCT sequences and improving the segmentation of the heart cavities. A methodology for evaluating the intra–subject variability of the complete approach was considered: the segmented shapes obtained from the enhanced images were compared with manual segmentations performed by a cardiologist. A particular filter based on a similarity criterion was applied to enhance the cardiac images.
This similarity enhancement merges two preprocessed versions of an original image according to a similarity criterion. One is a high pass filtered image ($I_{HPF}$) and the other is a low pass filtered image ($I_{LPF}$). $I_{HPF}$ was generated using a scheme based on a morphological filter applied to a smoothed version of the original image, obtained by means of a combination of a Gaussian filter and a multi–scale Gaussian filter. $I_{LPF}$ corresponded to a smoothed image obtained with an average filter. One volume of the analyzed cardiac sequence (the volume in diastole phase) was used to set the morphological filter parameters as follows. The segmentation process was applied while varying each parameter value. For each set of parameters, the resulting volume was compared with the ground truth volume traced by the cardiologist, using the Dice score and both volume and surface errors. The optimal parameters obtained with this procedure yielded a Dice score of 95.36%.
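For reference, the Dice score used in this comparison measures the overlap between two binary segmentations as twice the size of their intersection divided by the sum of their sizes. The sketch below assumes masks given as flat sequences of 0/1 values; the function name and the convention of returning 1.0 for two empty masks are choices made for this example.

```python
def dice_score(segmented, truth):
    """Dice coefficient between two binary masks of equal length."""
    overlap = sum(1 for s, t in zip(segmented, truth) if s and t)
    size = sum(1 for s in segmented if s) + sum(1 for t in truth if t)
    return 2 * overlap / size if size else 1.0

# Hypothetical toy masks sharing one of their two foreground voxels
print(dice_score([1, 1, 0, 0], [1, 0, 1, 0]))  # 0.5
```

Multiplying the result by 100 gives the percentage form reported in the text.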
The objective of the experiment is to validate the developed score function (section 2.3). The hypothesis is that the volumes maximizing the score function are also those maximizing the metrics used to assess the reported segmentation process [25], at least at end-diastole, since that was the only instant used to set the high pass filter parameters.
The data used in this study were obtained as 4–D (3–D + time) sequences of cardiac images acquired with a scanner (LightSpeed VCT, General Electric Medical System). The database consisted of 20 volumes representing the anatomical information of one patient over a complete cardiac cycle. Each volume contains 326 slices with a slice thickness of 0.625 mm. In all volumes, each slice has 512×512 pixels with an isotropic pixel size of 0.488 mm.
This experiment considers the 20 volumes of the cardiac sequence processed using the high pass filter, the low pass filter and the optimal parameters previously reported [25]. Moreover, the complete formulation of the similarity criterion (case 4) is used, with n = 4 as the number of neighbors of the cross-shaped neighborhood. Figure 2 shows the score function and the Dice score computed from the cardiac dataset. The score function is indicated by a black solid line and the Dice score by a gray dash-dotted line.
The score function obtained (mean ± standard deviation) for the cardiac MSCT sequence of 20 volumes is 50.45% ± 6.54%, with a maximum value of 67.13% and a minimum value of 26.95%. When the segmentation is performed for the same sequence, the Dice score is 91.12% ± 3.60%, with a maximum value of 95.36% and a minimum value of 81.25%.
The most relevant aspect of the values of the score function and the Dice score is that both metrics reach their maximum at volume 18, which represents the instant at end-diastole, as can be seen in Figure 2. In the previously reported work [25], the best segmentation was expected at this instant, since the parameters were set for this volume. The segmentation with the lowest Dice score is obtained for volume 13, whereas the least enhanced image corresponds to volume 12. The results thus validate the hypothesis that the score function is maximal at the same cardiac instant where the Dice score is maximal.
Furthermore, Figure 2 shows that the scores are less spread out for the instants around volume 18, specifically between volumes 14 and 20. Since MSCT images are obtained by reconstruction procedures that take into account the complete information acquired by the X–ray multi–detectors, these images are at high risk of artifacts and noise. The artifacts in MSCT heart images are theoretically considered as the differences between the computerized tomography (CT) values obtained after the tomographic reconstruction and the true values of the attenuation coefficients of the cardiac tissues [26]. Theoretically, the noise in CT is directly related to the number of detected X-ray photons, which can be modeled using a Poisson distribution [27]. The use of appropriate technical factors in the CT acquisition is not sufficient to diminish the impact of the noise and artifacts, which varies from beat to beat. In any case, MSCT images are strongly sensitive to changes in heart rate, so each volume is expected to require a custom enhancement technique.
A new score function for evaluating the performance of cardiac medical image enhancement is proposed. It is a hybrid of other quality measures proposed in the literature and appears to be the score best suited to the general problem. A high value of this score function is associated with an effective enhancement.
In any event, the process for enhancing the information associated with structures from cardiac medical images is gaining increased importance in the diagnosis of cardiac diseases and in guiding minimally invasive surgical and therapeutic procedures. This is basically due to the fact that researchers continue to develop new segmentation approaches that require the development of robust preprocessing techniques to improve the quality of the image. And, consequently, the existence of an appropriate measure of effectiveness of the quality of image enhancement is required to quantify and report the accuracy, precision and efficiency obtained by applying the enhancement algorithms.
Acknowledgements
Authors would like to thank the Universidad Simón Bolívar, Colombia, and the Investigation Dean’s Office of Universidad Nacional Experimental del Táchira, Venezuela for their support to this research. This work was supported by the Universidad Simón Bolívar, Colombia, Grant C2011720117
j.chacon@unisimonbolivar.edu.co