Bi-LSTM sequence modeling for on-the-fly fine-grained sketch-based image retrieval
IEEE Transactions on Artificial Intelligence, 2022 • ieeexplore.ieee.org
Fine-grained sketch-based image retrieval (FG-SBIR) addresses the problem of retrieving a particular photo for a given query sketch. However, its widespread applicability is limited because drawing a complete sketch is difficult, and the drawing process often takes more time than the text- or tag-based method. On-the-fly FG-SBIR was proposed to address this problem by performing image retrieval after each stroke, with the aim of retrieving the target photo using as few strokes as possible. Each photo corresponds to a sketch drawing episode, and a significant correlation exists among the incomplete sketches within an episode. This study exploits that correlation to learn a more efficient embedding space for incomplete sketches. First, a triplet network, as used in the classical FG-SBIR framework, is designed to learn the joint embedding space shared between a photo and its corresponding complete sketch. Second, assuming strong temporal correlation, each sketch drawing episode is treated as a sequence, and a feature vector is extracted from each incomplete sketch in the episode. A learnable Bi-LSTM module, trained with a triplet loss function, maps the features of incomplete sketches obtained from the base model to a more efficient representation. In the experiments, we propose more realistic challenges, and our method achieves superior early retrieval efficiency over state-of-the-art baseline methods on two publicly available fine-grained sketch retrieval datasets.
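To make the on-the-fly setting concrete, the following is a minimal sketch of two ideas named in the abstract: a triplet loss over (sketch, matching photo, non-matching photo) embeddings, and retrieval repeated after each stroke to see how early the target photo reaches rank 1. It is illustrative only and is not the authors' implementation. The Euclidean distance, the margin value of 0.2, all function names, and the toy 2-D embeddings are assumptions; the Bi-LSTM that would produce the per-stroke embeddings is not modeled here.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Standard hinge-style triplet loss; the margin value is hypothetical."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)

def rank_of_target(query, gallery, target_index):
    """1-based rank of the target photo when the gallery is sorted by distance."""
    d_target = euclidean(query, gallery[target_index])
    closer = sum(
        1 for i, photo in enumerate(gallery)
        if i != target_index and euclidean(query, photo) < d_target
    )
    return 1 + closer

def ranks_per_stroke(episode, gallery, target_index):
    """Run retrieval once per incomplete-sketch embedding in a drawing episode."""
    return [rank_of_target(q, gallery, target_index) for q in episode]

def strokes_to_top1(ranks):
    """Number of strokes drawn before the target first reaches rank 1, or None."""
    for step, rank in enumerate(ranks, start=1):
        if rank == 1:
            return step
    return None

# Toy data (hypothetical): three photo embeddings; the target is index 1.
gallery = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
# One drawing episode: the embedding of the partial sketch after each stroke.
episode = [[0.1, 0.2], [0.4, 0.1], [0.9, 0.0]]

ranks = ranks_per_stroke(episode, gallery, target_index=1)
print(ranks, strokes_to_top1(ranks))
```

In this toy episode the target's rank improves as strokes are added. An approach with better early retrieval efficiency would reach rank 1 after fewer strokes on the same episode.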