TY - GEN
T1 - A Modulation Module for Multi-task Learning with Applications in Image Retrieval
AU - Zhao, Xiangyun
AU - Li, Haoxiang
AU - Shen, Xiaohui
AU - Liang, Xiaodan
AU - Wu, Ying
N1 - Funding Information:
Acknowledgements. This work was supported in part by National Science Foundation grant IIS-1217302, IIS-1619078, the Army Research Office ARO W911NF-16-1-0138, and Adobe Collaboration Funding.
Funding Information:
This work was supported in part by National Science Foundation grant IIS-1217302, IIS-1619078, the Army Research Office ARO W911NF-16-1-0138, and Adobe Collaboration Funding.
Publisher Copyright:
© 2018, Springer Nature Switzerland AG.
PY - 2018
Y1 - 2018
N2 - Multi-task learning has been widely adopted in many computer vision tasks to improve overall computation efficiency or boost the performance of individual tasks, under the assumption that those tasks are correlated and complementary to each other. However, the relationships between the tasks are complicated in practice, especially when the number of involved tasks scales up. When two tasks are of weak relevance, they may compete or even distract each other during joint training of shared parameters, and as a consequence undermine the learning of all the tasks. This will raise destructive interference which decreases learning efficiency of shared parameters and lead to low quality loss local optimum w.r.t. shared parameters. To address the this problem, we propose a general modulation module, which can be inserted into any convolutional neural network architecture, to encourage the coupling and feature sharing of relevant tasks while disentangling the learning of irrelevant tasks with minor parameters addition. Equipped with this module, gradient directions from different tasks can be enforced to be consistent for those shared parameters, which benefits multi-task joint training. The module is end-to-end learnable without ad-hoc design for specific tasks, and can naturally handle many tasks at the same time. We apply our approach on two retrieval tasks, face retrieval on the CelebA dataset [12] and product retrieval on the UT-Zappos50K dataset [34, 35], and demonstrate its advantage over other multi-task learning methods in both accuracy and storage efficiency.
AB - Multi-task learning has been widely adopted in many computer vision tasks to improve overall computation efficiency or boost the performance of individual tasks, under the assumption that those tasks are correlated and complementary to each other. However, the relationships between the tasks are complicated in practice, especially when the number of involved tasks scales up. When two tasks are of weak relevance, they may compete or even distract each other during joint training of shared parameters, and as a consequence undermine the learning of all the tasks. This will raise destructive interference which decreases learning efficiency of shared parameters and lead to low quality loss local optimum w.r.t. shared parameters. To address the this problem, we propose a general modulation module, which can be inserted into any convolutional neural network architecture, to encourage the coupling and feature sharing of relevant tasks while disentangling the learning of irrelevant tasks with minor parameters addition. Equipped with this module, gradient directions from different tasks can be enforced to be consistent for those shared parameters, which benefits multi-task joint training. The module is end-to-end learnable without ad-hoc design for specific tasks, and can naturally handle many tasks at the same time. We apply our approach on two retrieval tasks, face retrieval on the CelebA dataset [12] and product retrieval on the UT-Zappos50K dataset [34, 35], and demonstrate its advantage over other multi-task learning methods in both accuracy and storage efficiency.
UR - http://www.scopus.com/inward/record.url?scp=85055110596&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85055110596&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-01246-5_25
DO - 10.1007/978-3-030-01246-5_25
M3 - Conference contribution
AN - SCOPUS:85055110596
SN - 9783030012458
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 415
EP - 432
BT - Computer Vision – ECCV 2018 - 15th European Conference, 2018, Proceedings
A2 - Hebert, Martial
A2 - Ferrari, Vittorio
A2 - Sminchisescu, Cristian
A2 - Weiss, Yair
PB - Springer Verlag
T2 - 15th European Conference on Computer Vision, ECCV 2018
Y2 - 8 September 2018 through 14 September 2018
ER -