The purpose of this research was to create a four-chamber model of the heart based on a 3D ultrasound image. Simultaneous segmentation of the chambers has two advantages over individual segmentation: good visibility and segmentation of one chamber can be used to improve the placement of the others, and fewer user-placed landmarks are needed to perform the segmentation.

Discussion

(a) Bland-Altman plot of LV EF errors. (b) Bland-Altman plot of RV EF errors.

(c) Bland-Altman plot of LA EF errors. (d) Bland-Altman plot of RA EF errors.

Figure III.4: Bland-Altman plot of the four chamber estimate and ground truth of ejection fractions.

Table III.6: Table of signed median, signed average, standard deviation and Pearson correlation coefficient of the ejection fraction error for each chamber. A positive value means the four chamber model estimate was larger than ground truth, negative means the opposite.

                 LV error   RV error   LA error   RA error
Signed Median    0.5 pts    0.1 pts    -2.7 pts   -9.3 pts
Signed Average   -1.2 pts   -1.9 pts   -2.7 pts   -9.3 pts
SD               6.0 pts    6.3 pts    4.4 pts    9.9 pts
Pearson          0.73       0.67       0.78       0.57
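The statistics reported in Table III.6 can be illustrated with a minimal sketch. It assumes the error is defined as estimate minus ground truth, that the sample standard deviation is used, and that Bland-Altman limits of agreement follow the common mean ± 1.96 SD convention; none of these choices are confirmed by the text. The function names and the EF values are hypothetical and are not data from this study.

```python
from statistics import mean, median, stdev


def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def ef_error_summary(estimate, ground_truth):
    """Summarise signed EF errors (estimate minus ground truth) in percentage points."""
    error = [e - g for e, g in zip(estimate, ground_truth)]
    avg = mean(error)
    sd = stdev(error)  # sample standard deviation (assumption)
    return {
        "signed_median": median(error),
        "signed_average": avg,
        "sd": sd,
        "pearson": pearson(estimate, ground_truth),
        # Conventional 95% limits of agreement for a Bland-Altman plot (assumption)
        "limits_of_agreement": (avg - 1.96 * sd, avg + 1.96 * sd),
    }


# Hypothetical EF values in percent, NOT data from this study.
model_ef = [55.0, 60.0, 48.0, 62.0, 35.0]
truth_ef = [54.0, 59.0, 50.0, 61.0, 45.0]
summary = ef_error_summary(model_ef, truth_ef)
```

With these made-up values the signed median is positive while the signed average is negative, because a single large underestimate pulls the average down without moving the median.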

GE’s AutoRVQ requires 6 points to be manually placed, but when the RV is segmented together with the LV, the LV’s placement can be used as a guide. This is true even if the entirety of the LV is not visible, as long as the septal area is well visualized.

In the same way, the LA and RA can more easily be placed in the image if the placement of other features can be used. This algorithm required only one input, which is needed because of the large variation in how the heart is positioned in the image.

Some restrictions on the model were introduced in Section III.2.2. These included restrictions to make sure that the model did not self-intersect, and to stop the apex from going above the top of the ultrasound image. The reason for these restrictions was to avoid the outer RV wall being attracted towards the septum, which can happen if the outer wall is not properly visualized. The same reasoning applies to not allowing the LA and RA apex to move too high up towards the base.

Figure III.5: A comparison of a single chamber and four chamber approach in a case with partially obscured LV. The four chamber approach is shown on the left, the single on the right. The four chamber version of the LV is better at determining the apex and the base, likely helped by the RV and LA placement.

The presuppositions about the cardiac geometry are useful for robustness against reduced view or image quality. In general, adding restrictions to the Kalman filter output based on what output is physically possible can be a good way of improving accuracy. This is especially true if the restrictions are based on different information than what is used in the prediction step, as it means that more of the system's characteristics can be put into the algorithm. Such a stage could have advantages in other applications of the Kalman filter.
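The idea of restricting the filter output to physically possible values can be sketched for a scalar state. This is only an illustration: it is not the constraint formulation of Section III.2.2, simple clamping is just one possible form of restriction, and the function names and numbers (an apex height in mm with the image top at 100 mm) are hypothetical.

```python
def kalman_update(x_pred, p_pred, z, r):
    """Scalar Kalman measurement update; returns posterior state and variance."""
    k = p_pred / (p_pred + r)            # Kalman gain
    x_post = x_pred + k * (z - x_pred)   # blend prediction and measurement
    p_post = (1.0 - k) * p_pred          # reduced uncertainty after the update
    return x_post, p_post


def constrain(x, lower, upper):
    """Project the estimate onto the physically feasible interval [lower, upper]."""
    return min(max(x, lower), upper)


# Hypothetical values: a noisy measurement pulls the apex above the image top.
x_post, p_post = kalman_update(x_pred=95.0, p_pred=4.0, z=110.0, r=4.0)
x_feasible = constrain(x_post, lower=0.0, upper=100.0)
```

Here the unconstrained update gives 102.5 mm, which lies outside the image, and the constraint stage moves it back to 100.0 mm. The constraint uses knowledge (the image extent) that neither the prediction nor the measurement model carries.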

It is possible that this four-chamber model could be helpful in cases where the field of view is partially limited for a chamber. The strong presuppositions on the final shape inherent to a deformable model mean that it could be better at guessing the partially seen shape. An example of this is seen in Figure III.5.

Validating and exploring this approach is an avenue for further research.

The average EF error was negative for all chambers, meaning that the four-chamber model underestimated the EF on average. In addition, the average end-diastolic volume was underestimated for all chambers while the end-systolic volume was overestimated, which suggests that the model is too rigid and does not deform enough between end-systole and end-diastole. In median values, however, the signed LV and RV EF errors were positive, pointing to a slight overestimation of EF instead. The negative averages therefore seem to be driven by a few images with a large underestimation. These are likely cases where the image quality was poor and there was high uncertainty about proper surface placement.
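The link between the volume errors and the EF error follows from the standard definition EF = (EDV − ESV) / EDV. The sketch below uses hypothetical volumes, not values from this study, to show that underestimating EDV while overestimating ESV lowers the EF.

```python
def ejection_fraction(edv_ml, esv_ml):
    """Ejection fraction in percent from end-diastolic and end-systolic volume."""
    if edv_ml <= 0:
        raise ValueError("EDV must be positive")
    return 100.0 * (edv_ml - esv_ml) / edv_ml


# Hypothetical volumes in ml, NOT data from this study.
true_ef = ejection_fraction(edv_ml=120.0, esv_ml=50.0)   # reference volumes
rigid_ef = ejection_fraction(edv_ml=115.0, esv_ml=55.0)  # EDV too small, ESV too large
ef_error_pts = rigid_ef - true_ef                        # negative: EF is underestimated
```

In this example a 5 ml error in each volume gives an EF of about 52.2 % instead of about 58.3 %, because both errors shrink the stroke volume in the numerator.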

Segmentation of the RA has been less studied than that of the other chambers, and in this work a 2D biplane Simpson method was used for determining the ground truth volume instead of a true 3D segmentation method. Use of the biplane Simpson method for atrial volumes is considered standard, and it has been used by for instance Wang et al. [28] and Aune et al. [1]. The latter researched 3D methods of atrial volume quantification, but as that is not considered standard it was not used as a ground truth in this article. Aune et al. do conclude that a 3D method has higher reproducibility, and the high error of the RA in this article could partially be explained by the ground truth being a 2D method instead of 3D like for the other chambers. In addition, the Simpson method is not designed for RA volume quantification in particular, and there is no commercial software specifically designed for RA assessment. Using a more generic method rather than a specially designed one as ground truth could further explain the high error.
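For reference, the biplane Simpson (method of disks) volume is commonly written as V = (π/4) · Σ aᵢ·bᵢ · (L/n), where aᵢ and bᵢ are the disk diameters measured in two roughly orthogonal views and L is the long-axis length. The sketch below is a generic implementation of that formula, not the software used to produce the ground truth in this work; the function name and inputs are hypothetical.

```python
import math


def biplane_simpson_volume(diam_a_cm, diam_b_cm, length_cm):
    """Volume in ml from disk diameters (cm) in two views and long-axis length (cm)."""
    if len(diam_a_cm) != len(diam_b_cm) or not diam_a_cm:
        raise ValueError("need the same non-zero number of diameters in both views")
    n = len(diam_a_cm)
    disk_height = length_cm / n
    # Each disk is an elliptical cylinder with area (pi / 4) * a * b.
    return math.pi / 4.0 * disk_height * sum(
        a * b for a, b in zip(diam_a_cm, diam_b_cm)
    )


# Sanity check on a cylinder of diameter 4 cm and length 5 cm, split into 20 disks.
cylinder_ml = biplane_simpson_volume([4.0] * 20, [4.0] * 20, 5.0)
```

For the cylinder the result equals π · 2² · 5 ≈ 62.8 ml, as expected. Because only two views enter the formula, any chamber shape that is not well described by stacked ellipses is approximated, which is one reason a 2D ground truth can differ from a 3D segmentation.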

This work could be considered an extension of the biventricular method by Bersvendsen et al. [3], which was also made for segmentation of ultrasound images.

The implementation of the multichamber algorithm and the models used are different, even though the mathematics of thickness adjustment is the same.

Bersvendsen et al. reported errors of -0.7±5.2 pts for the LV ejection fraction and 2.4±7.2 pts for the RV. This is close to the values found in this paper.

A comparison can also be made to works on one-chamber cardiac segmentation of ultrasound images. Orderud et al. [21] used Kalman filters to segment the LV, and had a 3.6±21.4 ml error for ED and a 9±17.4 ml error for ES. The algorithm presented in that paper used fewer control nodes, and that, combined with the increased focus on an accurate apex in this paper, might explain the difference. In general, the four-chamber model is as accurate as other segmentation algorithms using similar techniques.

There are several previous works dealing with four-chamber models, for instance by Pace et al. [23], Zheng et al. [31], Zhen et al. [30] and Jafari et al. [13].

These works focus on CT or MR images and use different methods than those found in this work. In terms of ultrasound, Medvedofsky et al. [17] did a four-chamber segmentation of 3D ultrasound images using an automated adaptive analytics algorithm. While all four chambers were segmented, the only metrics were based on LV and LA volumes.

Comparing Pearson correlations between this work and that of Medvedofsky et al., the values are similar for LV EDV, LV ESV and LAV, but our work has a lower EF correlation. LA EF correlation is not listed in the other paper, and it only has a single value for LA volume instead of EDV and ESV.

In terms of signed average error for LV EDV, LV ESV and LV EF, this work obtained better results. Comparison of LA values is slightly more difficult, as they did not provide LA ESV or LA EDV but instead LA volume at LV ES. The signed average error they list is nevertheless higher than either LA volume error in this paper.

Deep learning methods have risen in popularity in recent years, and have proven a viable tool in many fields including image analysis [25]. Li et al. [16] used deep learning on multi-view images of the LV, and compared it with several other algorithms on volumetric clinical indices. The signed average errors of their MV-RAN model for LV EDV and ESV were -7.5±11.0 ml and -3.8±9.2 ml respectively. The signed average error on EF was -0.9±6.8 pts. This means that their error was smaller in terms of both LV EDV and ESV, but the EF error was essentially the same, with the algorithm in this paper being a small improvement.

Li et al. also calculated signed errors for clinical indices in the U-Net, U-Net++ and ACNN models. Much like with MV-RAN, their signed error was better than that of our model for LV EDV and ESV, but comparable or slightly worse in terms of EF.

A possible reason for the difference is the use of multi-view (MV) images instead of the full 3D images used in this work. As both ground truth and model relied on only three 2D images, good accuracy in those directions alone would lead to good results, so it might be slightly easier to get low errors with multi-view than with 3D. The advantage of our model in terms of EF might be that if the model overestimates a chamber at EDV, it might make a similar overestimation at ESV, cancelling the error out in the EF calculation. This is because the segmentation for each frame depends on the previous one, so if an area has low image quality, an error is likely to persist and be segmented the same way across frames.
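The cancellation argument can be made explicit under a simplifying assumption that is not claimed in the text: that a persistent segmentation error scales both volumes by the same relative factor $(1+\varepsilon)$. The symbol $\varepsilon$ is introduced here only for illustration.

```latex
\mathrm{EF} = \frac{\mathrm{EDV} - \mathrm{ESV}}{\mathrm{EDV}},
\qquad
\widehat{\mathrm{EF}}
  = \frac{(1+\varepsilon)\,\mathrm{EDV} - (1+\varepsilon)\,\mathrm{ESV}}
         {(1+\varepsilon)\,\mathrm{EDV}}
  = \frac{\mathrm{EDV} - \mathrm{ESV}}{\mathrm{EDV}}
  = \mathrm{EF}
```

Under that assumption the volume errors are $\varepsilon\,\mathrm{EDV}$ and $\varepsilon\,\mathrm{ESV}$, yet the EF error is zero. In practice the errors are only approximately proportional, so the cancellation is partial.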

For deep learning used to segment a fully 3D image, Dong et al. [8] used a random forest to determine LV volumes, with LV EDV, LV ESV and LV EF as metrics. Their results seem to have wider confidence intervals, with LV EF having a mean error of 4 pts and a confidence interval of (-19 pts, 12 pts), compared to a mean error of -1.2 pts with a confidence interval of (-12.7 pts, 10.5 pts) in this work.

Nillesen et al. [19] used a fully automatic algorithm based on deformable models to segment the RV in 3D TEE images. The signed errors found there are slightly larger, which could in part be due to the difference between a fully automatic model and the model in this work, which requires a manual translation.

This difference is important to consider in all the above comparisons.

The deep learning articles referred to above have some commonalities in comparison to this article. They do not require user input, as they evaluate the entire image, but they appear to have weaker consistency across time. They all focus on one chamber, whereas the main novelty of this paper is the evaluation of all chambers. This also allows for evaluation of the less-studied LA and RA chambers. The promising results in this paper suggest that multi-chamber evaluation is a good method. While the EDV and ESV results are slightly worse in this work than the deep learning ones, EF values are mostly better.

Pearson correlation can be used to compare the output of the algorithm with the ground truth. The results in this paper can be compared with Moradi et al. [18] for the LV and Nillesen et al. [19] for the RV. Our algorithm obtained similar correlation coefficients, with the exception of slightly better values for RV quantification and worse results for the LV ejection fraction correlation, where Nillesen et al. achieved a value of 0.85.

The model could easily be expanded to determine important landmarks on the model, or to include more features, like the aortic outflow tract. This would lead to an even more extensive evaluation and could be used to evaluate for instance aortic valve size. This could be a direction of further research.

There are some limitations to the study. Ultrasound images often have a limited field of view, and the anterior part of the RV free wall in particular can be difficult to delineate [22]. This could indicate that there is some uncertainty in the ground truth values, especially for the right chambers.

Some chambers had a high error in either end-diastolic or end-systolic volume estimation. In particular, determining the LV apex and the anterior RV wall proved difficult, as can be expected from the sometimes poor image quality in that area [22].

This model was not made to run in real time. Segmenting a single frame takes an average of 0.1 seconds. By removing nodes in areas considered less important, real-time running could be achieved, but for this work accuracy was considered more important.