ROS-based ground stereo vision detection: implementation and experiments
© The Author(s) 2016
Received: 2 March 2016
Accepted: 22 July 2016
Published: 2 September 2016
This article concentrates on an open-source implementation of flying object detection in cluttered scenes, which is significant for ground stereo-aided autonomous landing of unmanned aerial vehicles. The ground stereo vision guidance system is presented with details of its system architecture and workflow. The Chan–Vese detection algorithm is then considered and implemented in the Robot Operating System (ROS) environment. A data-driven interactive scheme is developed to collect datasets for parameter tuning and performance evaluation. Outdoor flight experiments capture a dataset of stereo sequential images and record simultaneous data from the pan-and-tilt units, onboard sensors and differential GPS. Experimental results on the collected dataset validate the effectiveness of the published ROS-based detection algorithm.
Keywords: Unmanned aerial vehicle (UAV); Autonomous landing; Robot Operating System (ROS); Chan–Vese; Flying object detection
In the past decades, unmanned aerial vehicles (UAVs) have been widely used in many fields. Applications include environmental monitoring, planting and farming, remote observation and earthquake rescue. Most attention is generally paid to the recovery of fixed-wing aerial vehicles because of the relatively higher risk involved in the landing phase. Many practical applications have shown that recovery is the most challenging and hazardous period of UAV flights. Developing autonomous landing technologies has become an important trend for UAV systems that take off and land on runways. The aim is to reduce dependency on personnel and their workload while improving the adaptability and reliability of vehicle recovery. Successful navigation of flying aircraft is mostly achieved with conventional onboard sensors, such as the global positioning system (GPS), inertial measurement unit (IMU) and magnetometer. However, the autonomous landing task requires higher localization accuracy, which is still not achievable with these onboard sensors alone [3, 4].
Under such circumstances, a ground vision guidance scheme was proposed and developed [5–11]. The ground system has stronger computational resources and saves cost, because one set of equipment is installed per runway rather than per vehicle. Moreover, processing the ground-captured images is more convenient than processing onboard images with complicated backgrounds.
In this study, a synthetic data-driven scheme is developed and presented for the design, implementation, testing, evaluation and parameter tuning of target detection algorithms. The Chan–Vese approach is demonstrated as a case study. The Chan–Vese object detection algorithm is implemented on the Robot Operating System (ROS) platform to allow general multi-user usage, open-source support and inheritable development. A dataset of stereo sequential images is constructed to evaluate detection performance and to tune parameters. The ROS package is developed and published as open source on GitHub. Comparisons are made between the automatic Chan–Vese detection and manual detection on the collected dataset. The results show that the ROS-based Chan–Vese detection approach effectively extracts the aircraft coordinates with satisfactory localization accuracy.
System architecture and workflow
Architecture of ground stereo vision system
Autonomous landing of an aerial vehicle on a runway usually consists of three stages: approaching, descending and taxiing. The onboard navigation system guides the aircraft into the field of view of the stereo cameras. Once the aircraft is detected, its spatial coordinates are calculated by the stereo vision localization algorithm. The data link connects the flying aircraft with the ground system and transfers the vision-based position to the onboard autopilot. The detailed process and scenarios are presented in Fig. 1.
Stereo localization workflow
The ground stereo vision system consists of two independent modules. Each module is equipped with one camera mounted on its own pan–tilt unit (PTU), and each is connected to the computer independently. Landing image sequences are obtained by the two cameras, which are located symmetrically on both sides of the runway. The pan–tilt units are driven automatically to keep the flying aircraft near the center of the field of view. The pan and tilt angles are fed back to the computer for calculating the spatial coordinates.
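How two pan and tilt readings can yield a spatial position may be easier to see in a small sketch. The Python example below is only an illustration under stated assumptions, not the localization algorithm of the published package: it assumes the aircraft sits exactly on each camera's optical axis, that pan is an azimuth and tilt an elevation in a shared ground frame, and that the camera positions are known. All function names and coordinates are hypothetical.

```python
import math

def ray_direction(pan, tilt):
    """Unit vector of a camera's optical axis; pan is azimuth, tilt is elevation (radians)."""
    return (math.cos(tilt) * math.cos(pan),
            math.cos(tilt) * math.sin(pan),
            math.sin(tilt))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def triangulate(cam_left, angles_left, cam_right, angles_right):
    """Midpoint of the shortest segment between the two optical-axis rays."""
    d1 = ray_direction(*angles_left)
    d2 = ray_direction(*angles_right)
    w = tuple(p - q for p, q in zip(cam_left, cam_right))
    b, d, e = dot(d1, d2), dot(d1, w), dot(d2, w)
    denom = 1.0 - b * b          # d1 and d2 are unit vectors
    if denom < 1e-12:
        raise ValueError("optical axes are parallel; no unique intersection")
    s = (b * e - d) / denom      # distance along the left ray
    t = (e - b * d) / denom      # distance along the right ray
    q1 = tuple(p + s * k for p, k in zip(cam_left, d1))
    q2 = tuple(p + t * k for p, k in zip(cam_right, d2))
    return tuple((u + v) / 2.0 for u, v in zip(q1, q2))

def angles_to(cam, target):
    """Pan and tilt that point a camera at a target (used only to build the example)."""
    dx, dy, dz = (t - c for t, c in zip(target, cam))
    return math.atan2(dy, dx), math.atan2(dz, math.hypot(dx, dy))

# Hypothetical layout: cameras 10 m apart across the runway, aircraft 100 m out.
cam_left, cam_right = (0.0, -5.0, 0.0), (0.0, 5.0, 0.0)
target = (100.0, 0.0, 20.0)
estimate = triangulate(cam_left, angles_to(cam_left, target),
                       cam_right, angles_to(cam_right, target))
print(estimate)
```

In practice the detected image coordinates would add a small angular offset to each PTU reading before the rays are intersected; that correction is omitted here for brevity.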
ROS-based detection algorithm
The ground stereo vision guidance system enables UAV autonomy during the takeoff and landing phases. As shown in Fig. 2, target detection is the first step and a key element of ground vision-based guidance. The detection algorithm aims to find the flying vehicle’s coordinates in the captured sequential images. In previous works [6–9], both corner-based and skeleton-based methods were employed for target detection in the ground stereo vision system. Here, a skeleton-featured detection algorithm, namely the Chan–Vese model, is considered and implemented in the ROS environment. Such an open-source implementation is intended to attract attention and technical support from interested researchers, and it allows advanced or newly developed detection algorithms to be integrated more smoothly into the ground stereo system.
Skeleton-featured detection algorithm
The skeleton, or edge, is an important feature in images. The Chan–Vese model is a geometry-driven active contour model that combines curve evolution and level set theory. The evolving contour is expressed implicitly as the zero level set of a level set function.
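For reference, the energy minimized in the original Chan–Vese formulation (Chan and Vese, 2001, listed in the references) can be written as below. The notation is that of the original paper rather than of this article: \(u_0\) is the image, \(C\) the evolving contour, \(c_1\) and \(c_2\) the mean intensities inside and outside \(C\), and \(\mu, \nu \ge 0\), \(\lambda_1, \lambda_2 > 0\) are weights. The values used in the published ROS package are not stated here.

\[
F(c_1, c_2, C) = \mu \,\mathrm{Length}(C) + \nu \,\mathrm{Area}(\mathrm{inside}(C)) + \lambda_1 \int_{\mathrm{inside}(C)} |u_0(x,y) - c_1|^2 \,dx\,dy + \lambda_2 \int_{\mathrm{outside}(C)} |u_0(x,y) - c_2|^2 \,dx\,dy
\]

The two integral terms reward a contour that splits the image into two regions of nearly constant intensity, which is why the model does not depend on image gradients to stop the curve.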
Since the skeleton is a scale-, gray- and rotation-invariant feature, Chan–Vese model-based detection adapts to changes in object geometry or topology. Therefore, Chan–Vese detection is potentially suitable for all aircraft images captured by the ground vision system, whether the aircraft is approaching, landing or taxiing on the runway.
The employed Chan–Vese approach is a geometric active contour model. Although an improper initial contour may lead to a local minimum, the continuous movement of the cooperative target helps to overcome this problem, because the target’s position can be estimated from its movement characteristics. Combining the target’s shape transformations with its movement characteristics greatly improves object detection accuracy. The accuracy and efficiency of extraction are also expected to improve as image segmentation based on geometric active contour theory develops.
The level set method raises the dimension of the problem by one. For example, a plane curve C is expressed implicitly as a level curve of a continuous surface \(\varphi (x,y,t)\) in three dimensions, which is called the level set function.
The Chan–Vese image segmentation proceeds as follows. First, a regular closed curve is given as the assumed initial boundary. The closed curve then evolves iteratively by numerically solving partial differential equations. Finally, it converges to the target boundary.
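As a small worked illustration (the sign convention and the circle example are assumptions chosen for clarity, not taken from the article), the curve at time \(t\) is recovered from the level set function as its zero level set:

\[
C(t) = \{ (x, y) \mid \varphi(x, y, t) = 0 \}
\]

For instance, a circle of radius \(r(t)\) centered at the origin can be represented by

\[
\varphi(x, y, t) = r(t) - \sqrt{x^2 + y^2},
\]

which is positive inside the circle, negative outside, and zero exactly on the curve. Changing \(r(t)\) moves the curve without ever parameterizing it explicitly, and the same representation handles curves that split or merge.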
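The three stages above (initial closed curve, iterative evolution, convergence) can be sketched in a few lines of plain Python. This is a deliberately simplified illustration and not the code of the published ROS package: the curvature (length) term of the full model is omitted, the level set function is kept as a bounded indicator (+1 inside, -1 outside) instead of a signed distance, and the names `chan_vese_sketch` and `centroid` as well as all parameter values are hypothetical.

```python
import math

def chan_vese_sketch(image, n_iter=200, dt=0.5, lam1=1.0, lam2=1.0):
    """Simplified two-phase region evolution on a 2D list of gray values in [0, 1]."""
    rows, cols = len(image), len(image[0])
    cy, cx, radius = rows / 2.0, cols / 2.0, min(rows, cols) / 3.0
    # Initial closed curve: a centered circle, stored as phi = +1 inside, -1 outside.
    phi = [[1.0 if math.hypot(y - cy, x - cx) < radius else -1.0
            for x in range(cols)] for y in range(rows)]
    for _ in range(n_iter):
        inside = [image[y][x] for y in range(rows) for x in range(cols) if phi[y][x] > 0]
        outside = [image[y][x] for y in range(rows) for x in range(cols) if phi[y][x] <= 0]
        if not inside or not outside:
            break  # the contour vanished or filled the whole image
        c1 = sum(inside) / len(inside)     # mean intensity inside the contour
        c2 = sum(outside) / len(outside)   # mean intensity outside the contour
        for y in range(rows):
            for x in range(cols):
                u = image[y][x]
                # Region force: positive where the pixel resembles the inside mean more.
                force = -lam1 * (u - c1) ** 2 + lam2 * (u - c2) ** 2
                # Explicit step, clipped so phi stays a bounded indicator.
                phi[y][x] = max(-1.0, min(1.0, phi[y][x] + dt * force))
    return [[phi[y][x] > 0 for x in range(cols)] for y in range(rows)]

def centroid(mask):
    """Image coordinates (x, y) of the segmented region's center, or None if empty."""
    points = [(x, y) for y, row in enumerate(mask) for x, hit in enumerate(row) if hit]
    if not points:
        return None
    return (sum(p[0] for p in points) / len(points),
            sum(p[1] for p in points) / len(points))

# Synthetic frame: dark background with one bright rectangular "target".
image = [[1.0 if 5 <= y <= 9 and 8 <= x <= 13 else 0.0 for x in range(20)]
         for y in range(20)]
mask = chan_vese_sketch(image)
print(centroid(mask))
```

The centroid of the final region stands in for the aircraft image coordinates that a detection step would hand to localization. On real frames the omitted length term matters, since it is what keeps the contour smooth in the presence of noise and clutter.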
Step 1: Run the multi-camera driver to publish the captured images, and identify the camera ports (port1 and port2).
Step 2: Run the PTU states publishing nodes.
Step 3: Run the Chan–Vese detection node.
Step 4: Run the stereo vision localization node.
Experiments and discussion
Outdoor flight experiments are performed to collect images and D-GPS data for algorithm testing and parameter tuning. The experiments also demonstrate the usage and feasibility of the developed open-source ROS package.
Workflow of data-driven detection
In this study, a synthetic data-driven scheme is proposed to support the development of flying object detection algorithms. We concentrate on target detection and tracking in the UAV landing image sequences from the ground stereo vision guidance system. A manual interactive system is established to collect the aircraft coordinates in the sequential images, and datasets are constructed for training and evaluating various detection algorithms.
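One simple way to use manually collected coordinates for evaluation is to measure the per-frame pixel distance between the manual annotation and the automatic detection. The sketch below is an illustrative assumption about how such a comparison could be scored, not the evaluation procedure of the article; the function names and the sample coordinates are hypothetical.

```python
import math

def detection_errors(manual, detected):
    """Per-frame Euclidean pixel error; frames with no detection (None) count as misses."""
    errors, misses = [], 0
    for truth, found in zip(manual, detected):
        if found is None:
            misses += 1
            continue
        errors.append(math.hypot(found[0] - truth[0], found[1] - truth[1]))
    return errors, misses

def summarize_errors(errors):
    """Mean and maximum pixel error over the frames that produced a detection."""
    if not errors:
        return None
    return {"mean": sum(errors) / len(errors), "max": max(errors)}

# Hypothetical annotations (x, y) for three frames of one camera.
manual = [(100, 200), (110, 205), (120, 210)]
detected = [(103, 204), None, (120, 210)]
errors, misses = detection_errors(manual, detected)
print(summarize_errors(errors), "misses:", misses)
```

Keeping misses separate from the error statistics avoids mixing two different failure modes: a detection that is slightly off and a frame in which the target is not found at all.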
Real-time feature analysis
Table: Real-time feature of Chan–Vese detection algorithm
Image size: 720 × 576
Number of landing flights:
Average frames per flight:
Average time cost per frame: 157 ± 10
Minimum time cost for one frame:
Maximum time cost for one frame:
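Statistics of the kind listed in the table above (average, spread, minimum and maximum cost per frame) can be gathered with a small timing harness. The sketch below is an assumption about how such figures could be measured, not the procedure used in the article: it reports milliseconds and a population standard deviation, whereas the unit and the meaning of the ± value in the table are not stated. `dummy_detector` is a hypothetical stand-in for the detection call.

```python
import statistics
import time

def time_per_frame(process, frames):
    """Wall-clock cost of process(frame) for every frame, in milliseconds."""
    costs_ms = []
    for frame in frames:
        start = time.perf_counter()
        process(frame)
        costs_ms.append((time.perf_counter() - start) * 1000.0)
    return costs_ms

def summarize(costs_ms):
    """Mean, population standard deviation, minimum and maximum of the frame costs."""
    return {
        "mean": statistics.mean(costs_ms),
        "std": statistics.pstdev(costs_ms),
        "min": min(costs_ms),
        "max": max(costs_ms),
    }

def dummy_detector(frame):
    # Placeholder workload; a real run would call the detection routine here.
    return sum(frame)

frames = [[0] * 1000 for _ in range(5)]
stats = summarize(time_per_frame(dummy_detector, frames))
print("{mean:.3f} ± {std:.3f} ms (min {min:.3f}, max {max:.3f})".format(**stats))
```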
Ground vision-aided guidance is demonstrated as an effective approach for autonomous landing of runway-mode UAVs. Compared with the onboard scheme, the developed ground vision system provides greater processing power and computational capacity and removes the need for individual aircraft to carry such equipment. The ground vision system also has potential pitfalls. It has a limited distance and scope for first acquiring the flying aircraft and is significantly constrained by weather conditions. Furthermore, the instrument landing system (ILS) has been in use for decades and is deployed in almost every airport and manned airplane. That system is reported to be precise enough to allow landings in essentially zero visibility. However, its practical application is generally restricted to commercial passenger airports because of its high cost, inconvenient deployment and need for professional operation. The ground vision-based system can be assembled modularly and deployed practically for low-cost unmanned aircraft. From an engineering point of view, the ground system can be developed not only as an effective supplement to the ILS at high-level airports, but also as a low-cost substitute for the ILS in specified scenarios.
In this article, an open-source ROS implementation is employed in the ground stereo guidance system. This open scheme invites technical innovations from interested researchers, and newly developed detection algorithms can be conveniently applied to flying object detection. As a representative, the Chan–Vese approach is considered, demonstrated in the ROS Indigo version and published on GitHub. The detection approach aims to locate the flying aircraft coordinates in the sequential images captured by the ground stereo vision guidance system. The operating steps on the ROS-supported platform are given in detail. Meanwhile, a data-driven interactive scheme is constructed, since object detection in cluttered scenes requires large image collections with ground truth labels [20, 21]. The collection and use of annotated images play an important role in the training and evaluation of detection approaches. Experimental comparisons are made using the collected datasets, including the stereo sequential images, PTU angles, D-GPS positions and other flight states from onboard sensors. The results validate the effectiveness and generality of the published Chan–Vese detection package for ROS Indigo.
The open-source mode follows the present tendency in this field to draw more attention, inspiration and contributions from online users. Furthermore, the annotated images and spatial extents should benefit detection algorithm training and parameter optimization in subsequent research.
TH proposed and designed the ROS-based system architecture. DT and BZ implemented the Chan–Vese algorithm as the open-source package. DZ, WK and LS participated in the development of the stereo vision guidance system. All authors read and approved the final manuscript.
The work is supported by the Major Application Basic Research Project of NUDT under Grant No. ZDYYJCYJ20140601. The authors would like to thank Dianle Zhou and Zhiwei Zhong for their contributions to the development of the experimental prototype. Thanks are also extended to Zhaowei Ma and Chongyu Pan for implementing the corner-based and skeleton-based detection algorithms. Hongchao Yu made a great contribution to the dataset collection.
The authors declare that they have no competing interests.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
1. Kumar V, Michael N. Opportunities and challenges with autonomous micro aerial vehicles. Int J Robot Res. 2012;31:1279–91.
2. Kendoul F. Survey of advances in guidance, navigation, and control of unmanned rotorcraft systems. J Field Robot. 2012;29:315–78.
3. Cesetti A, Frontoni E, Mancini A, Zingaretti P, Longhi S. A vision-based guidance system for UAV navigation and safe landing using natural landmarks. J Intell Rob Syst. 2010;57(1–4):233–57.
4. Yang SW, Scherer SA, Zell A. An onboard monocular vision system for autonomous takeoff, hovering and landing of a micro aerial vehicle. J Intell Rob Syst. 2013;69:499–515.
5. Pebrianti D, Kendoul F, Azrad S, Wang W, Nonami K. Autonomous hovering and landing of a quad-rotor micro aerial vehicle by means of on ground stereo vision system. J Syst Des Dyn. 2010;4(2):269–84.
6. Zhang D, Wang X, Kong W. A ground-based optical system for autonomous control of running takeoff and landing for a fixed-wing unmanned aerial vehicle. In: International conference on control, automation, robotics and vision (ICARCV); 2012. p. 990–4.
7. Kong W, Zhou D, Zhang Y, Zhang D, Wang X, et al. A ground-based optical system for autonomous landing of a fixed wing UAV. In: IEEE/RSJ international conference on intelligent robots and systems (IROS); 2014. p. 4797–804.
8. Tang D, Hu T, Shen L, Zhang D, Zhou D. Chan-Vese model based binocular visual object extraction for UAV autonomous take-off and landing. In: International conference on information science and technology (ICIST); 2015. p. 67–73.
9. Tang D, Hu T, Shen L, et al. Ground stereo vision based navigation for autonomous take-off and landing of UAVs: a Chan-Vese Model approach. Int J Adv Rob Syst. 2016;13:67. doi:10.5772/62027.
10. Huh S, Shim DH. A vision-based automatic landing method for fixed-wing UAVs. J Intell Rob Syst. 2010;57:217–31.
11. Miller A, Shah M, Harper D. Landing a UAV on a runway using image registration. In: IEEE international conference on robotics and automation (ICRA); 2008. p. 182–7.
12. Laiacker M, Kondak K, Schwarzbach M, Muskardin T. Vision aided automatic landing system for fixed wing UAV. In: IEEE/RSJ international conference on intelligent robots and systems (IROS); 2013. p. 2971–6.
13. Harris C. Geometry from visual motion. In: Blake A, Yuille A, editors. Active Vision. Cambridge: MIT press; 1992. p. 263–84.
14. Lowe DG. Distinctive image features from scale-invariant keypoints. Int J Comput Vision. 2004;60(2):91–110.
15. Bay H, Tuytelaars T, Van Gool L. SURF: speeded up robust features. In: European conference on computer vision (ECCV); 2006.
16. Rublee E, et al. ORB: an efficient alternative to SIFT or SURF. In: International conference on computer vision (ICCV); 2011. p. 2564–71.
17. Trajkovic M, Hedley M. Fast corner detection. Image Vis Comput. 1998;16(2):75–87.
18. Leutenegger S, Chli M, Siegwart R. BRISK: binary robust invariant scalable keypoints. In: International conference on computer vision (ICCV); 2011. p. 2548–55.
19. Chan TF, Vese LA. Active contours without edges. IEEE Trans Image Process. 2001;10:266–77.
20. Torralba A, Russell BC, Yuen J. LabelMe: online image annotation and applications. Proc IEEE. 2010;98(8).
21. Russell BC, Torralba A, Murphy KP, Freeman WT. LabelMe: a database and web-based tool for image annotation. MIT Computer Science and Artificial Intelligence Laboratory technical report MIT-CSAIL-TR-2005-056; 2005.