Breiman L, Friedman J, Olshen RA, and Stone CJ (1984). Classification and Regression Trees (1st ed), Chapman & Hall, New York.
Cybenko G (1989). Approximation by superpositions of a sigmoidal function, Mathematics of Control, Signals, and Systems, 2, 303-314.
Hasan A, Wang Z, and Mahani AS (2016). Fast estimation of multinomial logit models: R package mnlogit, Journal of Statistical Software, 75, 1-19.
Hastie T, Tibshirani R, and Friedman J (2001). The Elements of Statistical Learning, Springer-Verlag, New York.
Hwang IJ, Kim HJ, Kim YJ, and Lee YD (2024). Generalized neural collaborative filtering, The Korean Journal of Applied Statistics, 37, 311-322.
Kim HJ, Kim YJ, Jang K, and Lee YD (2024a). A statistical journey to DNN, the second trip: Architecture of RNN and image classification, The Korean Journal of Applied Statistics, 37, 553-565.
Kim YJ, Hwang IJ, Jang K, and Lee YD (2024b). A statistical journey to DNN, the third trip: Language model and transformer, The Korean Journal of Applied Statistics, 37, 567-582.
Morris CN (1983). Natural exponential families with quadratic variance functions: Statistical theory, The Annals of Statistics, 11, 515-529.
Venables WN and Ripley BD (2002). Modern Applied Statistics with S (4th ed), Springer-Verlag, New York.