Monocular depth estimation is an essential step in understanding scene geometry. However, the depth maps predicted by existing methods lose small-target detail and produce blurred object edges. To address this, we propose a monocular depth estimation method based on multilevel feature fusion and edge optimization that yields depth maps with rich detail and precise edges. First, we improve the encoder to better adapt it to the depth estimation task. Second, we insert a dense feature fusion layer (Dense-FL) into the U-shaped network to capture global context information, and combine it with the proposed structure attention module to further enhance the extracted features, so the predicted depth map retains more detail. Finally, we design an edge optimization module that incorporates edge features into the training process, and constrain the network with the proposed reweighted loss and image edge detail loss, further improving the model's ability to learn object edges. Experimental results on the KITTI dataset show that our method performs better on small targets and object edges in the scene, producing more complete object structures. Generalization experiments on the Cityscapes and SceneFlow datasets further confirm the effectiveness and superiority of the proposed method.
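The abstract does not give the exact formulation of the image edge detail loss, but the idea of constraining a depth network at object boundaries is commonly realized by penalizing differences between the spatial gradients of the predicted and ground-truth depth maps. The sketch below illustrates that idea with simple forward finite differences on plain 2D lists; the function names and the L1 form are assumptions for illustration, not the paper's implementation.

```python
# Hypothetical sketch of an edge detail loss for depth estimation:
# compare spatial gradients of predicted vs. ground-truth depth maps.
# Pure Python (no external libraries); maps are lists of lists of floats.

def gradients(depth):
    """Return (dx, dy): horizontal and vertical forward differences."""
    h, w = len(depth), len(depth[0])
    dx = [[depth[i][j + 1] - depth[i][j] for j in range(w - 1)] for i in range(h)]
    dy = [[depth[i + 1][j] - depth[i][j] for j in range(w)] for i in range(h - 1)]
    return dx, dy

def edge_detail_loss(pred, gt):
    """Mean absolute difference between gradient maps (an L1 edge loss)."""
    pdx, pdy = gradients(pred)
    gdx, gdy = gradients(gt)
    total, count = 0.0, 0
    for p_rows, g_rows in ((pdx, gdx), (pdy, gdy)):
        for p_row, g_row in zip(p_rows, g_rows):
            for p, g in zip(p_row, g_row):
                total += abs(p - g)
                count += 1
    return total / count

# Identical depth maps have identical gradients, so the loss is zero.
pred = [[1.0, 2.0], [3.0, 4.0]]
gt   = [[1.0, 2.0], [3.0, 4.0]]
print(edge_detail_loss(pred, gt))  # -> 0.0
```

Because the loss acts on gradients rather than raw depths, it is insensitive to a constant depth offset but penalizes blurred or displaced edges, which matches the paper's stated goal of sharpening object boundaries.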
Keywords: Convolution, Image fusion, Image processing, Computer programming, Cameras, Image restoration, RGB color model