Abstract:
The introduction of Google Street View, an integral part of Google Maps, popularised the use of roof-mounted mobile cameras on vehicles, now one of the most widely known and adopted methods for capturing street-level imagery. Computer vision, one of the frontier technologies in computer science, enables the construction of artificial systems that extract valuable information from images. This approach has a broad range of applications in areas such as agriculture, business, and healthcare.
This dissertation contributes to the development and implementation of Image-Based Rendering (IBR) techniques by presenting a method that uses a hexagon-based camera configuration for image capture. After capture, each segmented image was stored in a folder corresponding to its camera number. The images were then selected by timestamp and GPS coordinates and copied to a master folder, where rendering took place.
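To make the selection step concrete, the following is a minimal sketch of this organisation, assuming a hypothetical layout of six per-camera folders and filenames that encode a timestamp and GPS coordinates; the actual naming scheme used in the study may differ.

```python
import shutil
from pathlib import Path

# Assumed layout: captures/camera_1 ... captures/camera_6, each holding
# JPEGs named <unix_timestamp>_<lat>_<lon>.jpg (hypothetical convention).
CAPTURE_ROOT = Path("captures")
MASTER = Path("master")
MASTER.mkdir(exist_ok=True)

def parse_name(path: Path):
    """Split '<timestamp>_<lat>_<lon>.jpg' into three floats."""
    ts, lat, lon = path.stem.split("_")
    return float(ts), float(lat), float(lon)

def select_frames(t_start: float, t_end: float):
    """Copy every frame whose timestamp falls in [t_start, t_end]
    from each camera folder into the master folder."""
    for cam in sorted(CAPTURE_ROOT.glob("camera_*")):
        for img in cam.glob("*.jpg"):
            ts, lat, lon = parse_name(img)
            if t_start <= ts <= t_end:
                # Prefix with the camera number so names stay unique.
                shutil.copy(img, MASTER / f"{cam.name}_{img.name}")

select_frames(1_650_000_000, 1_650_000_060)  # a one-minute capture window
```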
Before rendering took place, the master folder was loaded into the Blender3D software. Consolidating the datasets in this way ensured smooth blending of the different image datasets with fewer resources and lower computing power during rendering, since all the image datasets reside in a single folder rather than being called from multiple directories, which would increase the processing load. Subsequently, OpenCV algorithms were used for Structure from Motion and point-cloud simulation, operating on the image datasets assembled in the master folder.
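The dissertation does not reproduce its pipeline code here, but a minimal two-view Structure-from-Motion sketch in OpenCV illustrates the kind of processing involved: feature detection and matching, relative pose recovery from the essential matrix, and triangulation into a sparse point cloud. The intrinsic matrix K and the file names are placeholders, not the study's actual calibration.

```python
import cv2
import numpy as np

# Placeholder intrinsics; the hexagon rig's real calibration is not given here.
K = np.array([[1000.0, 0, 640], [0, 1000.0, 360], [0, 0, 1.0]])

img1 = cv2.imread("master/camera_1_frame_a.jpg", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("master/camera_1_frame_b.jpg", cv2.IMREAD_GRAYSCALE)

# Feature detection and matching (ORB features, brute-force Hamming matcher).
orb = cv2.ORB_create(5000)
kp1, des1 = orb.detectAndCompute(img1, None)
kp2, des2 = orb.detectAndCompute(img2, None)
matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(des1, des2)

pts1 = np.float64([kp1[m.queryIdx].pt for m in matches])
pts2 = np.float64([kp2[m.trainIdx].pt for m in matches])

# Relative pose from the essential matrix, then triangulation
# of the matched features into a sparse 3D point cloud.
E, mask = cv2.findEssentialMat(pts1, pts2, K, method=cv2.RANSAC, threshold=1.0)
_, R, t, mask = cv2.recoverPose(E, pts1, pts2, K, mask=mask)

P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])  # first camera at origin
P2 = K @ np.hstack([R, t])                         # second camera pose
pts4d = cv2.triangulatePoints(P1, P2, pts1.T, pts2.T)
cloud = (pts4d[:3] / pts4d[3]).T                   # N x 3 point cloud
print(cloud.shape)
```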
Following the optimal image rendering, image blending took place inside the Blender3D software, where the captured datasets were rendered for use in the simulator. The Structure from Motion algorithm was applied to develop the dense point cloud and to perform feature detection and matching. The process of extracting a depth map model from the three-dimensional (3D) mesh is also highlighted, together with an image restoration process based on a 3D warping approach. Once these processes were complete, the IBR technique was applied again to render scenes from the multiple datasets captured with the Hexagon Camera Configuration Model, producing scenery that allows bidirectional movement. The work in this dissertation was substantiated through simulations and through physical analysis of genuinely gathered raw data and the results derived from it. The study objectives were achieved by presenting a framework that allows virtual driving and bidirectional movement through the scene captured by the Hexagon Camera Configuration Model. Furthermore, the image datasets showed improved visuals, spatial detail, and panoramic image quality for location identification based on GPS coordinates. Additionally, the rendered images were observed to be smaller than the originally captured images.
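The 3D warping step can be sketched as follows, assuming a per-pixel depth map obtained from the reconstructed mesh (for example via a rendered Z pass); the depth format, pose conventions, and interpolation used in the study are not specified here, so this forward-warping version is illustrative only.

```python
import numpy as np

def warp_view(src: np.ndarray, depth: np.ndarray, K: np.ndarray,
              R: np.ndarray, t: np.ndarray) -> np.ndarray:
    """Re-project the pixels of a source view into a nearby target view
    using a per-pixel depth map (simple forward 3D warping)."""
    h, w = depth.shape
    K_inv = np.linalg.inv(K)

    # Back-project every pixel to a 3D point in the source camera frame.
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    pix = np.stack([u.ravel(), v.ravel(), np.ones(h * w)])
    pts3d = (K_inv @ pix) * depth.ravel()

    # Transform the points into the target camera and project to pixels.
    proj = K @ (R @ pts3d + t)
    u2 = np.round(proj[0] / proj[2]).astype(int)
    v2 = np.round(proj[1] / proj[2]).astype(int)

    # Splat source colours into the target image, keeping in-bounds points.
    out = np.zeros_like(src)
    ok = (u2 >= 0) & (u2 < w) & (v2 >= 0) & (v2 < h) & (proj[2] > 0)
    out[v2[ok], u2[ok]] = src.reshape(h * w, -1)[ok]
    return out
```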
The study's contribution centres on the GPS module, which was used to observe and project the altitude and coordinates of the scene. The resulting system allows free movement within the 3D-rendered scene, supporting both backward and forward motion, in contrast to a slide show that permits only forward motion. In evaluating the efficacy of this research, the Hexagon Camera Configuration Model permits the user to move in both the forward and reverse directions within the simulator, as opposed to one-directional movement. These results demonstrate the feasibility of an alternative model for image capture compared with the conventional 360° omnidirectional camera and image-stitching protocol. Furthermore, the results show that the more input image data are supplied, the higher the realism of the model: for a dataset of 364 images, the output scene quality is high as a result of the large number of input images, with scene realism observed for both the point cloud and the mesh based on 106,110 points.
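For context, reading altitude and coordinates from a GPS module can be sketched as below, assuming a module that streams standard NMEA sentences over a serial port (pyserial); the port name, baud rate, and use of the GGA sentence are assumptions, not details confirmed by the study.

```python
import serial  # pyserial; assumes the GPS module streams NMEA over serial

def nmea_to_deg(value: str, hemi: str) -> float:
    """Convert NMEA ddmm.mmmm notation to signed decimal degrees."""
    deg = int(float(value) / 100)
    minutes = float(value) - deg * 100
    sign = -1 if hemi in ("S", "W") else 1
    return sign * (deg + minutes / 60)

# Port and baud rate are placeholders for whatever the rig actually uses.
with serial.Serial("/dev/ttyUSB0", 9600, timeout=1) as gps:
    while True:
        line = gps.readline().decode("ascii", errors="ignore").strip()
        if line.startswith("$GPGGA"):  # GGA: fix data, including altitude
            f = line.split(",")
            if f[2] and f[4] and f[9]:
                lat = nmea_to_deg(f[2], f[3])
                lon = nmea_to_deg(f[4], f[5])
                alt = float(f[9])  # altitude above mean sea level, metres
                print(f"lat={lat:.6f} lon={lon:.6f} alt={alt}m")
                break
```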