Artificial Neural Networks (ANNs) have helped to revolutionise the world of Computer Vision (CV), with modern interpretations of the ANN, inspired by the visual cortex, creating the Convolutional Neural Network (CNN) and the research movement of Deep Learning (DL). Another, more biologically inspired, movement is that of Neuromorphic Engineering, with its spiking neuron model and the Spiking Neural Network (SNN). Recently, research has merged large parts of these two fields, allowing Neuromorphic Engineering to gain momentum and creating a paradigm shift in the approach to CV. This makes an asynchronous, low latency and low computational power approach a reality when utilising the SNN. A novel solution to semantic segmentation, and a framework in which to utilise it, are developed. The Perception-Understanding-Action (PUA) framework aims to add contextual understanding through semantic segmentation with a low latency, low computation SNN, entitled the Spiking Segmentation Network (SpikeSEG). The framework aims to improve the low latency, reactive Perception-Action cycles used in many constrained robotics tasks: by adding understanding through a low latency approach, it aims to introduce no noticeable latency to the system, exploiting the asynchronous advantage available when using Neuromorphic Vision Sensors (NVS). The framework allows an end-to-end spiking system to be realised where latency and computational power are limiting factors. Beyond semantic segmentation, a novel method for instance segmentation is also proposed: the Hierarchical Unravelling of Linked Kernels with Similarity Matching through Active Spike Hashing (HULK-SMASH) algorithm. This solves the difficult problem of unsupervised class instance clustering, distinguishing between separate instances of classes both within a sequence and from sequence to sequence.
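The spiking neuron model underpinning an SNN can be illustrated with a minimal leaky integrate-and-fire (LIF) sketch; the function name and parameter values below are illustrative and not taken from the thesis:

```python
def lif_forward(input_current, threshold=1.0, decay=0.9):
    """Leaky integrate-and-fire neuron over a sequence of input currents.

    The membrane potential leaks each timestep, integrates the input,
    and emits a spike (resetting to zero) when it crosses the threshold.
    """
    v = 0.0
    spikes = []
    for i in input_current:
        v = decay * v + i          # leak, then integrate input
        if v >= threshold:
            spikes.append(1)       # fire a spike
            v = 0.0                # reset membrane potential
        else:
            spikes.append(0)
    return spikes

# A constant sub-threshold input accumulates until the neuron fires.
print(lif_forward([0.4] * 6))  # → [0, 0, 1, 0, 0, 1]
```

Because the neuron only emits events when the threshold is crossed, downstream computation is driven by sparse spikes rather than dense frames, which is the source of the asynchronous, low-power behaviour described above.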
The algorithm allows each instance within the classification layers to be traced during decoding back to pixel space, yielding a pixel-wise instance mapping of each class instance. It successfully identifies the same person within a neuromorphic vision face dataset, while also recovering the track of an instance after complete occlusion. Additionally, a novel solution to a non-typical imaging problem, temporal imaging, is presented. This method of 3D imaging makes use of minimal spatial data from a single-point sensor, but with high-resolution temporal data captured in a time-of-flight (ToF) manner. To produce images from this method, the novel Spiking Single-Point Imager (Spike-SPI) network is required to solve the inverse retrieval problem of creating a 3D depth map from only the temporal data, inferring spatial locations from the temporal sequences it has previously seen. The network makes use of encoder-decoder networks, CNNs and their training methods; the trained model is then converted to an SNN to allow asynchronous, lower latency and lower computational power processing. Spike-SPI outperforms the current state of the art in 3D depth estimation, losing no accuracy in the CNN-to-SNN conversion process while gaining the aforementioned benefits.
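The single-point ToF principle behind the temporal imaging problem can be sketched as follows; the helper names, bin width and histogram values are illustrative assumptions, not the thesis implementation:

```python
C = 299_792_458.0  # speed of light in m/s

def tof_depth(arrival_time_s):
    """Round-trip time of flight to one-way depth: d = c * t / 2."""
    return C * arrival_time_s / 2.0

def depth_from_histogram(counts, bin_width_s):
    """Take the peak bin of a photon-count histogram as the return time.

    Each bin holds photon counts for one time slice; the bin centre of
    the tallest bin gives the dominant return time for this point.
    """
    peak_bin = max(range(len(counts)), key=counts.__getitem__)
    return tof_depth((peak_bin + 0.5) * bin_width_s)

# A return peaking in bin 2 of a 10 ps-binned histogram maps to a depth
# of a few millimetres.
depth = depth_from_histogram([0, 1, 5, 1, 0], 10e-12)
```

Recovering a full depth map from such one-dimensional temporal traces alone, without the per-pixel spatial sampling assumed here, is the inverse retrieval problem that Spike-SPI learns to solve.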
|Date of Award||10 May 2021|
- University of Strathclyde
|Sponsors||University of Strathclyde|
|Supervisor||John Soraghan (Supervisor), Stephan Weiss (Supervisor) & Gaetano Di Caterina (Supervisor)|