Biomimetics
Artificial Intelligence
7 Apr 2022

Retinomorphic Vision Model for On-chip Feature Extraction

Computer vision plays a vital role in applications ranging from pose estimation and Earth observation, through SLAM and visual odometry, to navigation and landing [1–5]. Conventional charge-coupled device (CCD) sensors are not ideal for space applications due to their relatively high power consumption, fixed temporal resolution and substantial storage and/or transmission bandwidth requirements. This has sparked growing interest in novel vision algorithms and sensors that can mitigate these shortcomings and add further useful capabilities. Specifically, dynamic vision (DV) sensors, whose operation is inspired by certain aspects of biological retinas, offer numerous advantages, such as completely asynchronous operation at the pixel level, sparse output, extremely high temporal resolution and very low power consumption [6]. These properties make DV sensors attractive for applications such as estimating optical flow for vision-based landing [7].
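To make the event-based output concrete, the sketch below simulates a DV-style event stream from conventional frames: a pixel emits an event whenever its log-intensity has changed by more than a threshold since its last event. The function name, threshold value and one-event-per-crossing simplification are illustrative assumptions for this example, not the behaviour or API of any particular sensor.

```python
import numpy as np

def dv_events_from_frames(frames, timestamps, threshold=0.2):
    """Emit DV-style events from a sequence of intensity frames.

    A pixel fires an event whenever its log-intensity changes by more
    than `threshold` since the last event at that pixel, mimicking the
    asynchronous, sparse output of a dynamic vision sensor.
    Returns a list of (x, y, t, polarity) tuples.
    """
    eps = 1e-6                                 # avoid log(0)
    ref = np.log(frames[0] + eps)              # per-pixel reference log-intensity
    events = []
    for frame, t in zip(frames[1:], timestamps[1:]):
        log_i = np.log(frame + eps)
        diff = log_i - ref
        fired = np.abs(diff) >= threshold      # pixels whose change crossed the threshold
        ys, xs = np.nonzero(fired)
        for x, y in zip(xs, ys):
            events.append((x, y, t, 1 if diff[y, x] > 0 else -1))
        ref[fired] = log_i[fired]              # reset the reference only where events fired
    return events

# Example: a moving bright bar produces events only along its edges,
# while static pixels stay silent -- hence the sparse output.
frames = np.zeros((5, 64, 64))
for i in range(5):
    frames[i, :, 10 + 5 * i : 15 + 5 * i] = 1.0
print(len(dv_events_from_frames(frames, timestamps=np.arange(5) * 1e-3)))
```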

Conceptual representation of the mammalian retina [13].

However, DV sensors are based on a simplified model of the biological retina and therefore potentially miss out on the more sophisticated processing performed within the retina. Most of that processing is geared towards adaptation to lighting conditions and contrast optimisation. For instance, a layer of neurons known as horizontal cells modulates the output of the photoreceptors by subtracting the average brightness computed over a local region around each photoreceptor, thus preventing saturation [8]. Furthermore, retinal ganglion cells (RGCs) employ mechanisms that allow them both to amplify spatial contrast and to adapt to temporal changes in contrast [9]. By integrating the input of multiple photoreceptors, RGCs can function as specialised detectors for various features and motion patterns in the visual scene, including ones of interest for space applications, such as ventral (approaching) and lateral motion [10,11]. However, these and other types of processing are not offered by existing DV sensors, whose pixels merely signal changes in brightness.
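As an illustration of the horizontal-cell mechanism described above, the following minimal sketch subtracts a Gaussian-weighted local average from each photoreceptor's output, i.e., a centre-surround operation. The surround width and the function name are assumptions chosen for the example, not parameters of the project's actual model.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def horizontal_cell_opponency(photoreceptors, surround_sigma=3.0):
    """Centre-surround processing analogous to horizontal cells.

    Each photoreceptor's output is reduced by the local average
    brightness (a Gaussian-weighted surround), which removes the mean
    illumination level and prevents saturation under bright lighting.
    """
    surround = gaussian_filter(photoreceptors, sigma=surround_sigma)
    return photoreceptors - surround

# Example: a bright spot on a bright, uniform background. The background
# cancels out, leaving only the local contrast around the spot.
scene = np.full((32, 32), 0.8)
scene[14:18, 14:18] = 1.0
response = horizontal_cell_opponency(scene)
print(response.max(), abs(response[0, 0]))  # strong at the spot, near zero elsewhere
```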

Project overview

Overview of the SCAMP5 sensor [12].

This project aims to replicate the structure of, and the functions performed by, different retinal cell populations in order to extract higher-order information from the visual scene than is currently provided by DV sensors. Specifically, the goals of the project can be summarised as follows:

  • Develop a model and a simulation tool that reproduces the functional properties of neuronal layers in the retina.
  • Determine whether metrics of interest, such as divergence and optic flow, can be computed by this retinal model (see the divergence sketch after this list).
  • Implement the retinal model in hardware using an alternative general-purpose low-power programmable sensor (SCAMP) [12], paving the way for direct comparison with DV sensors in terms of power consumption and ability to extract useful features on-chip.
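To illustrate the second goal, here is a minimal sketch of how divergence could be estimated from a dense optic-flow field and how it relates to time-to-contact during a ventral approach. The finite-difference scheme and the names used are assumptions for this example, not the project's implementation.

```python
import numpy as np

def flow_divergence(u, v, spacing=1.0):
    """Mean divergence of a dense optic-flow field (u, v).

    The divergence du/dx + dv/dy is estimated with central differences.
    For a camera approaching a fronto-parallel surface, the flow radiates
    from the focus of expansion and time-to-contact is roughly
    tau = 2 / divergence.
    """
    du_dx = np.gradient(u, spacing, axis=1)
    dv_dy = np.gradient(v, spacing, axis=0)
    return np.mean(du_dx + dv_dy)

# Synthetic looming flow: expansion about the image centre at rate a per
# frame, so the true divergence is 2*a and time-to-contact is 1/a frames.
h, w, a = 64, 64, 0.05
ys, xs = np.mgrid[0:h, 0:w]
u = a * (xs - w / 2)
v = a * (ys - h / 2)
div = flow_divergence(u, v)
print(div, 2.0 / div)  # ~0.1 and ~20 frames to contact
```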

References

[1] Matthies, L. et al. (2007). Computer Vision on Mars. International Journal of Computer Vision, 75(1), 67–92. https://doi.org/10.1007/s11263-007-0046-z

[2] Wudenka, M. et al. (2021). Towards Robust Monocular Visual Odometry for Flying Robots on Planetary Missions. 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 8737–8744. https://doi.org/10.1109/IROS51168.2021.9636844

[3] Cheng, Y., Johnson, A., & Matthies, L. (2005). MER-DIMES: A planetary landing application of computer vision. 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’05), Vol. 1, 806–813. https://doi.org/10.1109/CVPR.2005.222

[4] Kueng, B. et al. (2016). Low-latency visual odometry using event-based feature tracks. 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 16–23. https://doi.org/10.1109/IROS.2016.7758089

[5] Cohen, G. et al. (2019). Event-based Sensing for Space Situational Awareness. The Journal of the Astronautical Sciences, 66(2), 125–141. https://doi.org/10.1007/s40295-018-00140-5

[6] Messikommer, N. et al. (2020). Event-Based Asynchronous Sparse Convolutional Networks. Computer Vision – ECCV 2020 (Vol. 12353, pp. 415–431). Springer International Publishing. https://doi.org/10.1007/978-3-030-58598-3_25

[7] Sikorski, O., Izzo, D., & Meoni, G. (2021). Event-based spacecraft landing using time-to-contact. 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 1941–1950. https://doi.org/10.1109/CVPRW53098.2021.00222

[8] Masland, R. H. (2011). Cell Populations of the Retina: The Proctor Lecture. Investigative Ophthalmology & Visual Science, 52(7), 4581–4591. https://doi.org/10.1167/iovs.10-7083

[9] Kim, K. J., & Rieke, F. (2001). Temporal Contrast Adaptation in the Input and Output Signals of Salamander Retinal Ganglion Cells. The Journal of Neuroscience, 21(1), 287–299. https://doi.org/10.1523/JNEUROSCI.21-01-00287.2001

[10] Gollisch, T., & Meister, M. (2010). Eye Smarter than Scientists Believed: Neural Computations in Circuits of the Retina. Neuron, 65(2), 150–164. https://doi.org/10.1016/j.neuron.2009.12.009

[11] Münch, T. A. et al. (2009). Approach sensitivity in the retina processed by a multifunctional neural circuit. Nature Neuroscience, 12(10), 1308–1316. https://doi.org/10.1038/nn.2389

[12] Carey, S. J. et al. (2013). A 100,000 fps Vision Sensor with Embedded 535GOPS/W 256x256 SIMD Processor Array. 2013 Symposium on VLSI Circuits. https://personalpages.manchester.ac.uk/staff/p.dudek/default.htm

[13] Masland, R. H. (2001). The fundamental plan of the retina. Nature Neuroscience, 4(9), 877–886. https://doi.org/10.1038/nn0901-877
