This thesis presents a novel system for calibrating the extrinsic parameters of an array of cameras, 3D lidars and GPS/INS sensors without the need for markers or other calibration aids.
To achieve this, a new multi-modal metric, the gradient orientation measure, is first presented. This metric operates by minimising the misalignment of gradients between the outputs of two candidate sensors and handles the inherent differences in how sensors of different modalities perceive the world.
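As an illustrative sketch only (the precise formulation is developed in the body of the thesis), a gradient orientation measure of this kind can be written as a sum over corresponding pixels that rewards strong gradients whose orientations agree up to a 180-degree ambiguity, since the same edge may produce gradients of opposite sign in different modalities:
\[
\mathrm{GOM}(I_1, I_2) \;=\; \sum_{j} m_{1,j}\, m_{2,j}\, \bigl[\,1 + \cos\bigl(2(\phi_{1,j} - \phi_{2,j})\bigr)\,\bigr],
\]
where $m_{i,j}$ and $\phi_{i,j}$ denote the gradient magnitude and orientation of sensor $i$'s output at pixel $j$, with the lidar data projected into the image plane using the candidate extrinsic parameters. Maximising such a measure over the extrinsic parameters (equivalently, minimising gradient misalignment) brings the two modalities into registration.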
This metric is successfully demonstrated on a range of calibration problems; however, to calibrate the systems reliably the metric requires an initial estimate of the solution and a constrained search space. These constraints are required because repeated and similar structure in the environment, in combination with the limited field of view of the sensors, renders the metric's cost function non-convex. This non-convexity is an issue that affects all appearance-based markerless methods.
To overcome these limitations, a second cue to the sensors' alignment is exploited: the motion of the system. By estimating the motion that each individual sensor observes, an estimate of the extrinsic calibration of the sensors can be obtained. In this thesis, standard techniques for this motion-based calibration (often referred to as hand-eye calibration) are extended by incorporating estimates of the accuracy of each sensor's readings. This allows the development of a probabilistic approach that calibrates all sensors simultaneously. The approach also facilitates the estimation of the uncertainty in the final calibration. Finally, this motion-based approach is combined with appearance-based information to build a novel calibration framework. This framework does not require initialisation and can take advantage of all available alignment information to provide an accurate and robust calibration for the system.
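For context, the classical hand-eye formulation on which the motion-based component builds relates the relative motions observed by two rigidly mounted sensors: if $A_k$ and $B_k$ denote the $k$-th relative motions (expressed as homogeneous transforms) measured by the two sensors, the unknown extrinsic transform $X$ between them satisfies
\[
A_k X \;=\; X B_k \qquad \text{for all } k .
\]
The extension presented in this thesis incorporates estimates of each sensor's accuracy into this relationship, which is what enables the probabilistic, simultaneous calibration of all sensors and the accompanying uncertainty estimate described above.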