This work aims to create 3D freehand ultrasound reconstructions from 2D probes using image-based tracking, thereby removing the need for expensive or cumbersome external tracking hardware. Existing model-based approaches such as speckle decorrelation only partially capture the underlying complexity of ultrasound image formation, and thus produce reconstruction accuracies incompatible with current clinical requirements. Here, we introduce an alternative approach that relies on statistical analysis rather than physical models, and use a convolutional neural network (CNN) to directly estimate the motion between successive ultrasound frames in an end-to-end fashion. We demonstrate how this technique relates to prior approaches, and show how to further improve its predictive capabilities by incorporating additional information such as data from inertial measurement units (IMUs). This novel method is thoroughly evaluated and analyzed on a dataset of 800 in vivo ultrasound sweeps, yielding unprecedentedly accurate reconstructions with a median normalized drift of 5.2%. Even on long sweeps exceeding 20 cm with complex trajectories, it yields length measurements with median errors of 3.4%, paving the way toward translation into clinical routine.
Keywords: 3D freehand ultrasound; Deep learning; Inertial measurement unit; Motion estimation.
Copyright © 2018 Elsevier B.V. All rights reserved.