Identification of illegal outdoor advertisements based on CLIP fine-tuning and OCR technology
H Zhang, Z Ding, M Sharid Kayes Dipu, P Lv… - IEEE …, 2024 - ieeexplore.ieee.org
H Zhang, Z Ding, M Sharid Kayes Dipu, P Lv, Y Huang, HS Abdullahi, A Zhang, Z Song…
IEEE Access, 2024 • ieeexplore.ieee.org
Recognizing unauthorized outdoor advertising is important for a city’s visual appeal, organizational structure, and adherence to regulations. This paper addresses the difficulty that traditional models have in accurately identifying illegal outdoor advertisements containing only text or its variants when recognizing outdoor advertisements that combine graphics and text. The proposed method combines CLIP fine-tuning with OCR technology in two processes. The first process uses a CLIP model, fine-tuned on a combination of graphics and text, to recognize outdoor advertisements that combine images and text. The fine-tuned model builds on the zero-shot classification ability of CLIP and integrates it with the Tip-Adapter technique and the cache model. In addition, few-shot learning is incorporated into the fine-tuning process to address the scarcity of illegal outdoor advertising data. The model is trained using features extracted from images of illegal outdoor advertisements, enabling it to understand the diversity of such advertisements, adapt to various languages, and develop reasoning and context understanding capabilities. The second process uses the PP-OCRv4 model to extract text from outdoor advertisement images and accurately matches it against keywords in a pre-established banned word database. Together, these two processes recognize outdoor advertisements in image-text form. Experimental results show that the method achieves a testing accuracy of 93.5% on a self-built dataset of outdoor advertisement images. Furthermore, the PP-OCRv4 model improves text recognition accuracy by 3.83% compared to the traditional PP-OCRv3 model, and image recognition accuracy is 15.46% higher than with the traditional ResNet50 model. Overall, combining fine-tuned CLIP with OCR improves the recognition accuracy of illegal outdoor advertisements that combine images and text.
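The two processes described in the abstract can be sketched roughly as follows. This is a minimal illustration under several assumptions, not the authors' implementation: image and text features are assumed to be already extracted by CLIP, OCR output is assumed to be a list of plain strings (for example, as produced by PP-OCRv4), the cache term follows the published training-free Tip-Adapter formulation, and combining the two processes with a logical OR is a guess that the abstract does not confirm. The function names and the `alpha`, `beta`, and `scale` values are illustrative.

```python
import math


def normalize(vec):
    """Scale a vector to unit L2 length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def tip_adapter_logits(feature, class_text_features, cache_keys, cache_labels,
                       alpha=1.0, beta=5.5, scale=100.0):
    """Add few-shot cache-model logits to zero-shot CLIP logits.

    feature: image feature of the advertisement to classify.
    class_text_features: one text feature per class.
    cache_keys: image features of the few-shot training examples.
    cache_labels: class index of each cache key.
    alpha, beta, scale: illustrative hyperparameter values (assumption).
    """
    f = normalize(feature)
    # Zero-shot term: scaled cosine similarity to each class text feature.
    logits = [scale * dot(f, normalize(w)) for w in class_text_features]
    # Cache term: each few-shot example votes for its own label,
    # weighted by exp(-beta * (1 - cosine similarity)).
    for key, label in zip(cache_keys, cache_labels):
        affinity = dot(f, normalize(key))
        logits[label] += alpha * math.exp(-beta * (1.0 - affinity))
    return logits


def find_banned_words(ocr_lines, banned_words):
    """Return banned keywords found in the OCR text (case-insensitive substring match)."""
    text = " ".join(ocr_lines).casefold()
    return sorted(w for w in banned_words if w.casefold() in text)


def is_illegal(feature, class_text_features, cache_keys, cache_labels,
               ocr_lines, banned_words, illegal_class=1):
    """Flag an advertisement if either process flags it (OR rule is an assumption)."""
    logits = tip_adapter_logits(feature, class_text_features,
                                cache_keys, cache_labels)
    image_flag = logits.index(max(logits)) == illegal_class
    text_flag = bool(find_banned_words(ocr_lines, banned_words))
    return image_flag or text_flag
```

In this sketch, class index 1 stands for "illegal" and the banned word database is a plain set of strings; a real system would likely need fuzzier matching to catch text variants and OCR errors.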