Large Scale Segmentation Algorithm for Object Based Image Analysis suitable for HPC architectures in hybrid distributed-shared memory context
Abstract
The volume of remote sensing imagery is growing constantly, driven by the increasing number of satellites with Very High Resolution sensors or very short revisit times. In this context, exact large-scale OBIA segmentation algorithms have been proposed to process remote sensing images of arbitrary size [1,2] and overcome memory limitations. Here, the word exact means that the algorithms, while using a tiling strategy with a proper stability margin, guarantee that the resulting segments are identical to those obtained without tiling. However, those solutions have been proposed for sequential execution, and current implementations can use only a single processing unit on a single computer. Execution times can therefore be very long, which still prevents these approaches from being used for OBIA processing in an operational context. Meanwhile, resorting to High Performance Computing (HPC) is becoming common practice to tackle the computational complexity of Earth Observation information extraction, since it provides environments and programming facilities able to speed up processing. In this paper, we focus on the parallelization of segmentation algorithms to process large images in a reasonable time using HPC techniques. Our approach consists of a generic parallelization strategy using the Message Passing Interface (MPI) standard with the concept of remote window. While MPI enables scaling across multiple processing nodes, threaded parallelism is used in the shared memory context to optimize the core algorithms. Specific attention has also been paid to I/O operations, which are a well-known bottleneck in HPC environments. We have successfully applied our method to two state-of-the-art segmentation algorithms commonly used in remote sensing image processing, namely the Generic Region Merging [2] and the Mean Shift Smoothing based segmentation [1].
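To illustrate the tiling strategy with a stability margin mentioned above, the following is a minimal sketch in plain Python, not the implementation described in this paper. The names `Rect`, `Tile`, `split_into_tiles` and `assign_tiles` are hypothetical, and the tile size and margin values are arbitrary: the margin actually required for exactness depends on the segmentation algorithm (see [1,2]). The round-robin assignment only stands in for the distribution of tiles over MPI processes; no MPI call is made here.

```python
from typing import List, NamedTuple


class Rect(NamedTuple):
    x0: int
    y0: int
    x1: int  # exclusive
    y1: int  # exclusive


class Tile(NamedTuple):
    core: Rect    # region this tile is responsible for
    padded: Rect  # core grown by the margin, clipped to the image extent


def split_into_tiles(width: int, height: int,
                     tile_size: int, margin: int) -> List[Tile]:
    """Cut an image into non-overlapping cores, each with a padded read window."""
    tiles = []
    for y0 in range(0, height, tile_size):
        for x0 in range(0, width, tile_size):
            x1 = min(x0 + tile_size, width)
            y1 = min(y0 + tile_size, height)
            core = Rect(x0, y0, x1, y1)
            padded = Rect(max(x0 - margin, 0), max(y0 - margin, 0),
                          min(x1 + margin, width), min(y1 + margin, height))
            tiles.append(Tile(core, padded))
    return tiles


def assign_tiles(n_tiles: int, n_ranks: int, rank: int) -> List[int]:
    """Round-robin assignment of tile indices to one worker (assumed scheme)."""
    return list(range(rank, n_tiles, n_ranks))


if __name__ == "__main__":
    tiles = split_into_tiles(width=1000, height=600, tile_size=256, margin=32)
    for rank in range(4):
        print(rank, assign_tiles(len(tiles), 4, rank))
```

Each worker would read the padded window of its tiles, segment it, and keep only the results it is responsible for. How segments crossing tile borders are resolved is specific to each algorithm and is not covered by this sketch.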
The hybrid distributed-shared memory approach enables algorithms to benefit both from the multiple CPUs of a single processing node and from multiple nodes connected through a network, thanks to MPI. As our aim is to enable those algorithms for operational applications, we have implemented our approach for the two segmentation algorithms cited above in the Orfeo ToolBox, a well-known open-source library for geospatial image processing [3]. In the final paper and presentation, trends in execution times for different configurations of real remote sensing input data and HPC resources will be presented and discussed. The source code corresponding to the application presented in this paper is available for download [4], and will be integrated in forthcoming releases of the Orfeo ToolBox [5].

[1] Michel, J., Youssefi, D., & Grizonnet, M. (2015). Stable mean-shift algorithm and its application to the segmentation of arbitrarily large remote sensing images. IEEE Transactions on Geoscience and Remote Sensing, 53(2), 952-964.
[2] Lassalle, P., Inglada, J., Michel, J., Grizonnet, M., & Malik, J. (2015). A scalable tile-based framework for region-merging segmentation. IEEE Transactions on Geoscience and Remote Sensing, 53(10), 5473-5485.
[3] Grizonnet, M., Michel, J., Poughon, V., Inglada, J., Savinaud, M., & Cresson, R. (2017). Orfeo ToolBox: open source processing of remote sensing images. Open Geospatial Data, Software and Standards, 2(1), 15.
[4] LSOBIA: https://github.com/RTOBIA/LSOBIA
[5] Orfeo ToolBox: www.orfeo-toolbox.org