Automatic tuning of the pooling operation in convolutional neural networks for coal rock classification
O.A. Kozlova1, V.V. Kitov1, 2
1 Lomonosov Moscow State University, Moscow, Russian Federation
2 Plekhanov Russian University of Economics, Moscow, Russian Federation
Russian Mining Industry №6 / 2024 p. 125-134
Abstract: Convolutional neural networks are used to automate image processing tasks such as classification, segmentation, object detection, style transfer, etc. These networks are actively applied in the coal mining industry for highly accurate automatic classification of coal rocks from raw images. Accurate classification of coal rocks is important for coal quality assessment and for optimization of coal mining, preparation and processing. The main mathematical operations of convolutional networks are convolution and pooling. The paper discusses a generalization of the pooling operation. Usually the type of pooling is specified in advance as a fixed aggregation operation, e.g. average pooling or max pooling. The size of the aggregated region is also specified in advance. Both the type of pooling and the size of the aggregated region significantly affect the quality of coal rock image processing. This paper proposes several parametric generalizations of the pooling operation that cover average and max pooling as special cases. A parametric generalization of max pooling is also proposed that makes it possible to vary the size of the aggregation region. Parameters of the proposed pooling generalizations are trained automatically along with the rest of the network weights.
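For illustration only, the sketch below shows one common way such a trainable pooling generalization can be written in PyTorch: a convex combination of average and max pooling with a single learnable mixing coefficient, optimized by backpropagation together with the other network weights. The class name LearnableMixedPool2d, the sigmoid parameterization and the default kernel size are assumptions made for this sketch and are not the specific parameterizations proposed in the paper.

import torch
import torch.nn as nn
import torch.nn.functional as F

class LearnableMixedPool2d(nn.Module):
    """Illustrative learnable pooling: alpha * avg_pool + (1 - alpha) * max_pool.

    alpha = 1 recovers average pooling and alpha = 0 recovers max pooling,
    so both standard poolings are special cases. The raw parameter is passed
    through a sigmoid to keep alpha in (0, 1); it is trained together with
    the rest of the network weights.
    """

    def __init__(self, kernel_size: int = 2, stride: int = 2):
        super().__init__()
        self.kernel_size = kernel_size
        self.stride = stride
        # Unconstrained parameter; sigmoid(0) = 0.5 gives an equal initial mix.
        self.alpha_raw = nn.Parameter(torch.zeros(1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        alpha = torch.sigmoid(self.alpha_raw)
        avg = F.avg_pool2d(x, self.kernel_size, self.stride)
        mx = F.max_pool2d(x, self.kernel_size, self.stride)
        return alpha * avg + (1.0 - alpha) * mx

if __name__ == "__main__":
    pool = LearnableMixedPool2d(kernel_size=2, stride=2)
    feature_map = torch.randn(8, 16, 64, 64)  # batch of 16-channel feature maps
    out = pool(feature_map)
    print(out.shape)  # torch.Size([8, 16, 32, 32])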
Keywords: pooling generalization, auto-tuned pooling, neural architecture search, image classification, coal rocks, coal industry
Acknowledgments: The work was carried out within the framework of the state assignment in the field of scientific activity of the Ministry of Science and Higher Education of the Russian Federation, Project entitled “Models, methods and algorithms of artificial intelligence in economic problems for analyzing and transferring the style of multidimensional data sets, forecasting time series and building recommendation systems”, Grant No. FSSW-2023-0004.
For citation: Kozlova O.A., Kitov V.V. Automatic tuning of the pooling operation in convolutional neural networks for coal rock classification. Russian Mining Industry. 2024;(6):125–134. (In Russ.) https://doi.org/10.30686/1609-9192-2024-6-125-134
Article info
Received: 19.10.2024
Revised: 27.11.2024
Accepted: 04.12.2024
Information about the authors
Olga A. Kozlova – Software Engineer, Lomonosov Moscow State University, Moscow, Russian Federation; https://orcid.org/0009-0002-8271-4578
Victor V. Kitov – Cand. Sci. (Phys. and Math.), Leading Engineer, Plekhanov Russian University of Economics; Associate Professor, Lomonosov Moscow State University, Moscow, Russian Federation; https://orcid.org/0000-0002-3198-5792