Caron M, Misra I, Mairal J, Goyal P, Bojanowski P, and Joulin A (2020). Unsupervised learning of visual features by contrasting cluster assignments, Advances in Neural Information Processing Systems, 33, 9912-9924.
Caruana R (1997). Multitask learning, Machine Learning, 28, 41-75.
Chen L, Song J, and Zhang Z (2018). Multi-faceted hierarchical multi-task learning for a large number of tasks with multi-dimensional relations, In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, 3404-3414.
Chen T, Kornblith S, Norouzi M, and Hinton G (2020). A simple framework for contrastive learning of visual representations, In International Conference on Machine Learning, 1597-1607, PMLR.
Ester M, Kriegel H-P, Sander J, and Xu X (1996). A density-based algorithm for discovering clusters in large spatial databases with noise, KDD, 96, 226-231.
Grill J-B, Strub F, Altché F et al. (2020). Bootstrap your own latent: A new approach to self-supervised learning, Advances in Neural Information Processing Systems, 33, 21271-21284.
Hastie T, Friedman J, and Tibshirani R (2001). Unsupervised learning, The Elements of Statistical Learning: Data Mining, Inference, and Prediction, 437-508.
Kohonen T (1982). Self-organized formation of topologically correct feature maps, Biological Cybernetics, 43, 59-69.
Liu Y, Li W, Li H, Zheng Z-M, and Wang S (2015). Multi-task learning for natural language processing, In Proceedings of the 28th AAAI Conference on Artificial Intelligence, 2730-2736.
MacQueen J (1967). Some methods for classification and analysis of multivariate observations, Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability, 1, 281-297.
Huang Z, Rao M, Raju A, Zhang Z, Bui B, and Lee C (2022). Multi-task learning for speaker-role adaptation in neural conversation models, In Proceedings of the 4th Workshop on NLP for Conversational AI 2022, 120-130.
Misra I, Shrivastava A, Gupta A, and Hebert M (2016). Cross-stitch networks for multi-task learning, In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 3994-4003.
Munkhdalai T and Yu H (2017). Meta networks, In Proceedings of the 34th International Conference on Machine Learning, Volume 70, 2554-2563, JMLR.org.
Murtagh F (2014). Ward’s hierarchical agglomerative clustering method: Which algorithms implement Ward’s criterion?, Journal of Classification, 31, 274-295.
Pentina A and Lampert CH (2015). Curriculum learning of multiple tasks, In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 5492-5500.
Ruder S (2017). An overview of multi-task learning in deep neural networks.
Shwartz-Ziv R and Armon A (2022). Tabular data: Deep learning is not all you need, Information Fusion, 81, 84-90.
Tan P-N, Steinbach M, and Kumar V (2016). Introduction to Data Mining, Pearson, Upper Saddle River, New Jersey.