# American Institute of Mathematical Sciences

November  2020, 3(4): 219-227. doi: 10.3934/mfc.2020013

## Sketch-based image retrieval via CAT loss with elastic net regularization

1. School of Statistics and Mathematics, Big Data and Educational Statistics Application Laboratory, Collaborative Innovation Development Center of Pearl River Delta Science & Technology Finance Industry, Guangdong University of Finance & Economics, Guangzhou, Guangdong, 510320, China
2. School of Statistics and Mathematics, Guangdong University of Finance & Economics, Guangzhou, Guangdong, 510320, China
3. Information Science School, Guangdong University of Finance & Economics, Guangzhou, Guangdong, 510320, China

* Corresponding author: Jia Cai

Received December 2019; Revised March 2020; Published June 2020

Fund Project: The first author is partially supported by the National Natural Science Foundation of China (11871167, 11671171), the Science and Technology Program of Guangzhou (201707010228), the Special Support Plan for High-Level Talents of Guangdong Province (2019TQ05X571), the Foundation of Guangdong Educational Committee (2019KZDZX1023), the Project of Collaborative Innovation Development Center of Pearl River Delta Science & Technology Finance Industry (19XT01), the National Social Science Foundation (19AJY027), and the Natural Science Foundation of Guangdong (2016A030313710).

Fine-grained sketch-based image retrieval (FG-SBIR) is an important problem that uses free-hand human sketches as queries to perform instance-level retrieval of photos. Human sketches are generally highly abstract and iconic, which makes FG-SBIR a challenging task. Existing FG-SBIR approaches use a triplet loss with $\ell_2$ regularization or a higher-order energy function to perform retrieval. They neglect the feature gap between the two domains (sketches and photos) and need to select the weight layer matrix, which yields high computational complexity. In this paper, we define a new CAT loss function with elastic net regularization based on an attention model. It can close the feature gap between the different subnetworks and reflect the sparsity of sketches. Experiments demonstrate that the proposed approach is competitive with state-of-the-art methods.
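To make the ingredients named in the abstract concrete, the following is a minimal sketch of a standard triplet hinge loss combined with an elastic net penalty ($\ell_1$ plus squared $\ell_2$). It is not the CAT loss itself, whose definition does not appear on this page. The function names, the margin, and the coefficients `l1` and `l2` are illustrative assumptions.

```python
from typing import Sequence


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def triplet_hinge(sketch, pos_photo, neg_photo, margin: float = 0.3) -> float:
    """Standard triplet hinge: the matching photo should be closer to the
    sketch than the non-matching photo by at least `margin` (assumed value)."""
    return max(
        0.0,
        margin
        + squared_distance(sketch, pos_photo)
        - squared_distance(sketch, neg_photo),
    )


def elastic_net_penalty(weights: Sequence[float],
                        l1: float = 1e-4, l2: float = 1e-4) -> float:
    """Elastic net: l1 * sum|w| + l2 * sum(w^2). The l1 term encourages
    sparse weights; the l2 term keeps them small."""
    return l1 * sum(abs(w) for w in weights) + l2 * sum(w * w for w in weights)


def regularized_triplet_loss(sketch, pos_photo, neg_photo, weights,
                             margin: float = 0.3,
                             l1: float = 1e-4, l2: float = 1e-4) -> float:
    """Hypothetical combined objective: triplet hinge plus elastic net."""
    return (triplet_hinge(sketch, pos_photo, neg_photo, margin)
            + elastic_net_penalty(weights, l1, l2))
```

Compared with plain $\ell_2$ regularization, the added $\ell_1$ term is the usual way to favor sparse solutions, which is consistent with the abstract's remark about the sparsity of sketches.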

Citation: Jia Cai, Guanglong Xu, Zhensheng Hu. Sketch-based image retrieval via CAT loss with elastic net regularization. Mathematical Foundations of Computing, 2020, 3 (4) : 219-227. doi: 10.3934/mfc.2020013
Architecture of the model
Examples of stroke removal
Network structure

| Index | Layer | Type | Filter size | Filter number | Stride | Pad | Output size |
|---|---|---|---|---|---|---|---|
| 0 | | Input | - | - | - | - | $225\times225$ |
| 1 | L1 | Conv | $15\times15$ | 64 | 3 | 0 | $71\times71$ |
| 2 | | ReLU | - | - | - | - | $71\times71$ |
| 3 | | Maxpool | $3\times3$ | - | 2 | 0 | $35\times35$ |
| 4 | L2 | Conv | $5\times5$ | 128 | 1 | 0 | $31\times31$ |
| 5 | | ReLU | - | - | - | - | $31\times31$ |
| 6 | | Maxpool | $3\times3$ | - | 2 | 0 | $15\times15$ |
| 7 | L3 | Conv | $3\times3$ | 256 | 1 | 1 | $15\times15$ |
| 8 | | ReLU | - | - | - | - | $15\times15$ |
| 9 | L4 | Conv | $3\times3$ | 256 | 1 | 1 | $15\times15$ |
| 10 | | ReLU | - | - | - | - | $15\times15$ |
| 11 | L5 | Conv | $3\times3$ | 256 | 1 | 1 | $15\times15$ |
| 12 | | ReLU | - | - | - | - | $15\times15$ |
| 13 | | Maxpool | $3\times3$ | - | 2 | 0 | $7\times7$ |
| 14 | L6 | Conv (= FC) | $7\times7$ | 512 | 1 | 0 | $1\times1$ |
| 15 | | ReLU | - | - | - | - | $1\times1$ |
| 16 | | Dropout (0.55) | - | - | - | - | $1\times1$ |
| 17 | L7 | Conv (= FC) | $1\times1$ | 256 | 1 | 0 | $1\times1$ |
| 18 | | ReLU | - | - | - | - | $1\times1$ |
| 19 | | Dropout (0.55) | - | - | - | - | $1\times1$ |

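The output sizes in the network structure table can be checked with the usual convolution and pooling size formula, $\lfloor (n + 2p - k)/s \rfloor + 1$, for input size $n$, filter size $k$, stride $s$ and padding $p$. The sketch below assumes this standard floor-based formula applies to every layer. The helper names `out_size` and `trace_sizes` are illustrative, and ReLU and Dropout rows are omitted because they do not change the spatial size.

```python
def out_size(n: int, k: int, stride: int, pad: int) -> int:
    """Spatial output size of a conv or pooling layer (floor convention)."""
    return (n + 2 * pad - k) // stride + 1


# (name, filter size, stride, pad) taken from the network structure table.
LAYERS = [
    ("L1 Conv", 15, 3, 0),
    ("Maxpool", 3, 2, 0),
    ("L2 Conv", 5, 1, 0),
    ("Maxpool", 3, 2, 0),
    ("L3 Conv", 3, 1, 1),
    ("L4 Conv", 3, 1, 1),
    ("L5 Conv", 3, 1, 1),
    ("Maxpool", 3, 2, 0),
    ("L6 Conv (= FC)", 7, 1, 0),
    ("L7 Conv (= FC)", 1, 1, 0),
]


def trace_sizes(input_size: int = 225):
    """Return (layer name, output size) pairs for a square input."""
    sizes = []
    n = input_size
    for name, k, stride, pad in LAYERS:
        n = out_size(n, k, stride, pad)
        sizes.append((name, n))
    return sizes


if __name__ == "__main__":
    for name, n in trace_sizes():
        print(f"{name}: {n}x{n}")
```

Starting from a $225\times225$ input, the trace gives 71, 35, 31, 15, 15, 15, 15, 7, 1, 1, which agrees with the Output size column.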
Comparative results against baselines on QMUL-shoe dataset

| QMUL-shoe | Acc.@1 | Acc.@10 |
|---|---|---|
| HOG+BoW+RankSVM | 17.39% | 67.83% |
| Deep ISN | 20.00% | 62.61% |
| Triplet SN | 52.17% | 92.17% |
| Triplet DSSA | 61.74% | 94.78% |
| Our model | 56.52% | 96.52% |

Comparative results against baselines on QMUL-chair dataset

| QMUL-chair | Acc.@1 | Acc.@10 |
|---|---|---|
| HOG+BoW+RankSVM | 28.87% | 67.01% |
| Deep ISN | 47.42% | 82.47% |
| Triplet SN | 72.16% | 98.96% |
| Triplet DSSA | 81.44% | 95.88% |
| Our model | 81.44% | 98.97% |

Comparative results against baselines on QMUL-handbag dataset

| QMUL-handbag | Acc.@1 | Acc.@10 |
|---|---|---|
| HOG+BoW+RankSVM | 2.38% | 10.71% |
| Deep ISN | 9.52% | 44.05% |
| Triplet SN | 39.88% | 82.14% |
| Triplet DSSA | 49.40% | 82.74% |
| Our model | 54.76% | 88.69% |

Contributions of different components

| Dataset | Configuration | Acc.@1 | Acc.@10 |
|---|---|---|---|
| QMUL-shoe | Triplet loss+data aug | 50.43% | 93.91% |
| QMUL-shoe | CAT loss+no data aug | 49.57% | 94.78% |
| QMUL-shoe | Our model | 54.78% | 96.52% |
| QMUL-chair | Triplet loss+data aug | 78.35% | 97.94% |
| QMUL-chair | CAT loss+no data aug | 76.29% | 96.91% |
| QMUL-chair | Our model | 81.44% | 98.97% |
| QMUL-handbag | Triplet loss+data aug | 51.19% | 86.31% |
| QMUL-handbag | CAT loss+no data aug | 51.79% | 86.90% |
| QMUL-handbag | Our model | 54.76% | 88.69% |

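The Acc.@1 and Acc.@10 columns are not defined on this page. Assuming the usual meaning of top-K retrieval accuracy (the fraction of query sketches whose true matching photo appears among the K nearest photos), they could be computed as in the sketch below. The function name `acc_at_k` and the toy distance matrix are illustrative assumptions, not data from the paper.

```python
from typing import Sequence


def acc_at_k(distances: Sequence[Sequence[float]],
             true_index: Sequence[int], k: int) -> float:
    """Top-K retrieval accuracy.

    distances[i][j] is the distance between query sketch i and photo j;
    true_index[i] is the index of the photo that matches sketch i.
    """
    hits = 0
    for row, target in zip(distances, true_index):
        # Rank photo indices from nearest to farthest for this sketch.
        ranked = sorted(range(len(row)), key=lambda j: row[j])
        if target in ranked[:k]:
            hits += 1
    return hits / len(distances)


if __name__ == "__main__":
    # Toy example: 3 sketches, 3 photos, sketch i matches photo i.
    toy = [
        [0.1, 0.5, 0.9],
        [0.7, 0.2, 0.4],
        [0.3, 0.2, 0.8],
    ]
    print(acc_at_k(toy, [0, 1, 2], k=1))
    print(acc_at_k(toy, [0, 1, 2], k=3))
```

In the toy example the third sketch ranks its true photo last, so it counts as a hit only once K reaches 3.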