# American Institute of Mathematical Sciences

November  2020, 3(4): 219-227. doi: 10.3934/mfc.2020013

## Sketch-based image retrieval via CAT loss with elastic net regularization

1. School of Statistics and Mathematics, Big Data and Educational Statistics Application Laboratory, Collaborative Innovation Development Center of Pearl River Delta Science & Technology Finance Industry, Guangdong University of Finance & Economics, Guangzhou, Guangdong, 510320, China
2. School of Statistics and Mathematics, Guangdong University of Finance & Economics, Guangzhou, Guangdong, 510320, China
3. Information Science School, Guangdong University of Finance & Economics, Guangzhou, Guangdong, 510320, China

* Corresponding author: Jia Cai

Received December 2019; Revised March 2020; Published June 2020

Fund Project: The first author is partially supported by the National Natural Science Foundation of China (11871167, 11671171), Science and Technology Program of Guangzhou (201707010228), Special Support Plan for High-Level Talents of Guangdong Province (2019TQ05X571), Foundation of Guangdong Educational Committee (2019KZDZX1023), Project of Collaborative Innovation Development Center of Pearl River Delta Science & Technology Finance Industry (19XT01), National Social Science Foundation (19AJY027), and Natural Science Foundation of Guangdong (2016A030313710).

Fine-grained sketch-based image retrieval (FG-SBIR) is an important problem that uses free-hand human sketches as queries to perform instance-level retrieval of photos. Human sketches are generally highly abstract and iconic, which makes FG-SBIR a challenging task. Existing FG-SBIR approaches use a triplet loss with $\ell_2$ regularization or a higher-order energy function to perform retrieval; they neglect the feature gap between the two domains (sketches and photos) and need to select the weight layer matrix. This results in high computational complexity. In this paper, we define a new CAT loss function with elastic net regularization based on an attention model. It can close the feature gap between different subnetworks and embody the sparsity of the sketches. Experiments demonstrate that the proposed approach is competitive with state-of-the-art methods.
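To make the two ingredients named in the abstract concrete, the following is a minimal sketch of a generic hinge-style triplet loss combined with an elastic net penalty ($\ell_1$ plus squared $\ell_2$). It is not the CAT loss of the paper, whose definition is not given on this page; the function names, the margin, and the coefficients `l1` and `l2` are all illustrative assumptions.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def triplet_loss(sketch, positive, negative, margin=1.0):
    """Generic triplet loss (assumed form, not the paper's CAT loss).

    It is zero once the matching photo is closer to the sketch than the
    non-matching photo by at least `margin` in squared distance.
    """
    gap = squared_distance(sketch, positive) - squared_distance(sketch, negative)
    return max(0.0, margin + gap)


def elastic_net_penalty(weights, l1=0.01, l2=0.01):
    """Elastic net penalty: l1 * sum|w| + l2 * sum w^2 (coefficients are hypothetical)."""
    return l1 * sum(abs(w) for w in weights) + l2 * sum(w * w for w in weights)


def regularized_objective(triplets, weights, margin=1.0, l1=0.01, l2=0.01):
    """Mean triplet loss over (sketch, positive, negative) triplets plus the penalty."""
    data_term = sum(triplet_loss(s, p, n, margin) for s, p, n in triplets) / len(triplets)
    return data_term + elastic_net_penalty(weights, l1, l2)
```

The $\ell_1$ term is the part that encourages sparse weights, which is the property the abstract associates with the sparsity of sketches; setting `l1=0` recovers a plain $\ell_2$-regularized triplet objective.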

Citation: Jia Cai, Guanglong Xu, Zhensheng Hu. Sketch-based image retrieval via CAT loss with elastic net regularization. Mathematical Foundations of Computing, 2020, 3 (4) : 219-227. doi: 10.3934/mfc.2020013

Figure: Architecture of the model

Figure: Examples of stroke removal
Table: Network structure

| Index | Layer | Type | Filter size | Filter number | Stride | Pad | Output size |
|---|---|---|---|---|---|---|---|
| 0 | | Input | - | - | - | - | $225\times225$ |
| 1 | L1 | Conv | $15\times15$ | 64 | 3 | 0 | $71\times71$ |
| 2 | | ReLU | - | - | - | - | $71\times71$ |
| 3 | | Maxpool | $3\times3$ | - | 2 | 0 | $35\times35$ |
| 4 | L2 | Conv | $5\times5$ | 128 | 1 | 0 | $31\times31$ |
| 5 | | ReLU | - | - | - | - | $31\times31$ |
| 6 | | Maxpool | $3\times3$ | - | 2 | 0 | $15\times15$ |
| 7 | L3 | Conv | $3\times3$ | 256 | 1 | 1 | $15\times15$ |
| 8 | | ReLU | - | - | - | - | $15\times15$ |
| 9 | L4 | Conv | $3\times3$ | 256 | 1 | 1 | $15\times15$ |
| 10 | | ReLU | - | - | - | - | $15\times15$ |
| 11 | L5 | Conv | $3\times3$ | 256 | 1 | 1 | $15\times15$ |
| 12 | | ReLU | - | - | - | - | $15\times15$ |
| 13 | | Maxpool | $3\times3$ | - | 2 | 0 | $7\times7$ |
| 14 | L6 | Conv (= FC) | $7\times7$ | 512 | 1 | 0 | $1\times1$ |
| 15 | | ReLU | - | - | - | - | $1\times1$ |
| 16 | | Dropout (0.55) | - | - | - | - | $1\times1$ |
| 17 | L7 | Conv (= FC) | $1\times1$ | 256 | 1 | 0 | $1\times1$ |
| 18 | | ReLU | - | - | - | - | $1\times1$ |
| 19 | | Dropout (0.55) | - | - | - | - | $1\times1$ |
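The output sizes in the network structure table above can be checked with the usual spatial size formula $\lfloor (n + 2p - k)/s \rfloor + 1$ for input size $n$, filter size $k$, stride $s$, and padding $p$. The sketch below assumes this floor convention for both convolution and pooling layers (the page does not state which convention the authors used); the names `output_size`, `LAYERS`, and `trace_sizes` are illustrative. ReLU and dropout layers are omitted because they do not change the spatial size.

```python
def output_size(size, kernel, stride, pad):
    """Spatial output size of a conv or pooling layer (floor convention assumed)."""
    return (size + 2 * pad - kernel) // stride + 1


# (name, filter size, stride, pad), transcribed from the network structure table.
LAYERS = [
    ("L1 Conv", 15, 3, 0),
    ("Maxpool", 3, 2, 0),
    ("L2 Conv", 5, 1, 0),
    ("Maxpool", 3, 2, 0),
    ("L3 Conv", 3, 1, 1),
    ("L4 Conv", 3, 1, 1),
    ("L5 Conv", 3, 1, 1),
    ("Maxpool", 3, 2, 0),
    ("L6 Conv (= FC)", 7, 1, 0),
    ("L7 Conv (= FC)", 1, 1, 0),
]


def trace_sizes(input_size=225, layers=LAYERS):
    """Return (layer name, output size) pairs obtained by applying the layers in order."""
    sizes = []
    size = input_size
    for name, kernel, stride, pad in layers:
        size = output_size(size, kernel, stride, pad)
        sizes.append((name, size))
    return sizes


if __name__ == "__main__":
    for name, size in trace_sizes():
        print(f"{name}: {size}x{size}")
```

Starting from the $225\times225$ input, the trace gives 71, 35, 31, 15, 15, 15, 15, 7, 1, 1, which agrees with the output size column of the table.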
Table: Comparative results against baselines on QMUL-shoe dataset

| QMUL-shoe | Acc.@1 | Acc.@10 |
|---|---|---|
| HOG+BoW+RankSVM | 17.39% | 67.83% |
| Deep ISN | 20.00% | 62.61% |
| Triplet SN | 52.17% | 92.17% |
| Triplet DSSA | 61.74% | 94.78% |
| Our model | 56.52% | 96.52% |
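The Acc.@1 and Acc.@10 columns are not defined on this page. A common reading in the FG-SBIR literature, assumed here, is the percentage of sketch queries whose true matching photo appears among the top $K$ retrieved photos. The sketch below computes that quantity; the function name `acc_at_k` and the toy identifiers are hypothetical.

```python
def acc_at_k(rankings, true_ids, k):
    """Percentage of queries whose true match is within the top-k retrieved items.

    rankings: one list per query of photo ids, sorted from best to worst match.
    true_ids: the ground-truth photo id for each query, in the same order.
    """
    if len(rankings) != len(true_ids):
        raise ValueError("rankings and true_ids must have the same length")
    hits = sum(1 for ranked, truth in zip(rankings, true_ids) if truth in ranked[:k])
    return 100.0 * hits / len(true_ids)


if __name__ == "__main__":
    # Three toy queries whose true match is ranked 1st, 2nd, and 3rd respectively.
    rankings = [["a", "b", "c"], ["b", "a", "c"], ["c", "b", "a"]]
    true_ids = ["a", "a", "a"]
    for k in (1, 2, 3):
        print(f"Acc.@{k} = {acc_at_k(rankings, true_ids, k):.2f}%")
```

Under this definition Acc.@10 can never be lower than Acc.@1 for the same model, which is consistent with every row of the results tables.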
Table: Comparative results against baselines on QMUL-chair dataset

| QMUL-chair | Acc.@1 | Acc.@10 |
|---|---|---|
| HOG+BoW+RankSVM | 28.87% | 67.01% |
| Deep ISN | 47.42% | 82.47% |
| Triplet SN | 72.16% | 98.96% |
| Triplet DSSA | 81.44% | 95.88% |
| Our model | 81.44% | 98.97% |
Table: Comparative results against baselines on QMUL-handbag dataset

| QMUL-handbag | Acc.@1 | Acc.@10 |
|---|---|---|
| HOG+BoW+RankSVM | 2.38% | 10.71% |
| Deep ISN | 9.52% | 44.05% |
| Triplet SN | 39.88% | 82.14% |
| Triplet DSSA | 49.40% | 82.74% |
| Our model | 54.76% | 88.69% |
Table: Contributions of different components

| Dataset | Model | Acc.@1 | Acc.@10 |
|---|---|---|---|
| QMUL-shoe | Triplet loss + data aug | 50.43% | 93.91% |
| QMUL-shoe | CAT loss + no data aug | 49.57% | 94.78% |
| QMUL-shoe | Our model | 54.78% | 96.52% |
| QMUL-chair | Triplet loss + data aug | 78.35% | 97.94% |
| QMUL-chair | CAT loss + no data aug | 76.29% | 96.91% |
| QMUL-chair | Our model | 81.44% | 98.97% |
| QMUL-handbag | Triplet loss + data aug | 51.19% | 86.31% |
| QMUL-handbag | CAT loss + no data aug | 51.79% | 86.90% |
| QMUL-handbag | Our model | 54.76% | 88.69% |