PhD Thesis
Automated pedestrian detection, counting and tracking has received significant attention from the computer vision community of late. Many of the person detection techniques described so far in the literature work well in controlled environments, such as laboratory settings with a small number of people. This allows various assumptions to be made that simplify this complex problem. The performance of these techniques, however, tends to deteriorate when presented with unconstrained environments where pedestrian appearances, numbers, orientations, movements, occlusions and lighting conditions violate these convenient assumptions. Recently, 3D stereo information has been proposed as a technique to overcome some of these issues and to guide pedestrian detection. This thesis presents such an approach, whereby after obtaining robust 3D information via a novel disparity estimation technique, pedestrian detection is performed via a 3D point clustering process within a region-growing framework. This clustering process avoids using hard thresholds by using biometrically inspired constraints and a number of plan-view statistics. This pedestrian detection technique requires no external training and is able to robustly handle challenging real-world unconstrained environments from various camera positions and orientations. In addition, this thesis presents a continuous detect-and-track approach, with additional kinematic constraints and explicit occlusion analysis, to obtain robust tracking of pedestrians over time. These approaches are experimentally validated using challenging datasets consisting of both synthetic data and real-world sequences gathered from a number of environments. In each case, the techniques are evaluated using both 2D and 3D ground-truth methodologies.
The experimental results of the proposed pedestrian detection and tracking system from the 10 test sequences described in the thesis can be seen here and downloaded from here. In each sequence, the right image illustrates the bounding boxes of the pedestrians overlaid onto the input images and the left image illustrates the final pedestrian regions from a plan-view orientation. For the plan-view orientations, the white lines indicate the bounds of the scene, the blue and yellow lines at the bottom centre of the image indicate the camera's position and orientation, and the position of each pedestrian detected in that frame is illustrated by a circle. In each of these images, each pedestrian is allocated a specific colour, so that their bounding box, final region and plan-view orientation circle are all depicted using the same colour.
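As a rough illustration of how a plan-view image of this kind can be laid out, the following minimal sketch maps a ground-plane position to a pixel, with the camera placed at the bottom centre of the image. It is not taken from the thesis: the function name `world_to_plan_view`, the image size, the scale `px_per_m`, and the convention that `x_m` is the lateral offset and `z_m` the distance in front of the camera are all assumptions made for this example.

```python
def world_to_plan_view(x_m, z_m, width_px=400, height_px=400, px_per_m=40.0):
    """Map a ground-plane position to a plan-view pixel (col, row).

    Hypothetical convention: the camera sits at the bottom centre of the
    image, x_m is the lateral offset in metres (positive to the right) and
    z_m is the distance in front of the camera in metres.
    """
    # Lateral offset moves the point left or right of the image centre.
    col = round(width_px / 2 + x_m * px_per_m)
    # Distance from the camera moves the point up from the bottom row.
    row = round(height_px - 1 - z_m * px_per_m)
    return col, row


def in_bounds(col, row, width_px=400, height_px=400):
    """Return True if the pixel lies inside the plan-view image."""
    return 0 <= col < width_px and 0 <= row < height_px
```

A pedestrian's circle would then be drawn at the returned pixel in that pedestrian's allocated colour, provided `in_bounds` holds for it.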
For the 3D Analysis Vicon sequences, a person twice enters the scene and claps his hands. This is done to synchronise the proposed system with the Vicon camera system. The evaluated frames are those which appear between these two synchronisation frames. In each video, the regions arising from the "synchronisation person" are not incorporated into the evaluation statistics presented in the thesis.
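The frame selection described above can be sketched as a simple filter. This is only an illustration under stated assumptions, not the evaluation code used in the thesis: the `Detection` record, the `track_id` field, and the choice to exclude the two synchronisation frames themselves are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    frame: int      # frame index within the sequence
    track_id: int   # hypothetical identifier of the detected person


def evaluation_window(detections, sync_start, sync_end, sync_track_ids):
    """Keep detections that fall between the two synchronisation frames.

    Assumptions: the two clap frames themselves are excluded, and the
    "synchronisation person" is identified by the ids in sync_track_ids.
    """
    return [
        d for d in detections
        if sync_start < d.frame < sync_end
        and d.track_id not in sync_track_ids
    ]


if __name__ == "__main__":
    dets = [Detection(5, 0), Detection(12, 1), Detection(20, 0), Detection(30, 1)]
    # Person 0 is the synchronisation person; claps occur at frames 10 and 25.
    print(evaluation_window(dets, sync_start=10, sync_end=25, sync_track_ids={0}))
```

Only the detections that survive this filter would contribute to the evaluation statistics.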
For the Grafton sequences, pedestrian faces have been removed for legal purposes.
The thesis entitled "Pedestrian Detection and Tracking using Stereo Vision Techniques" can be downloaded here.
Many of the datasets used in this thesis can be downloaded here.

