## ARIADNA Study 12-5101

### Probabilistic Computing for Efficient Robotic Vision in Space

*Type of activity: Standard study (25 k€)*

#### Background and motivation

**1. Introduction.**

*1.1 General goal*

In space robotics, computational efficiency is at a premium:
tasks such as autonomous landing or rover navigation require a
quick and efficient determination of actions within energy and
processing budgets that are much more stringent than in
Earth-based scenarios.

This study focuses on a novel integrated software and hardware approach for reducing the computational effort and energy expenditure of computer vision algorithms. It is based on a combination of local sampling on the software side and probabilistic computing on the hardware side.

Specifically, on the software side, the study will investigate random local sampling techniques that reduce the number of necessary computations at the cost of a slightly lower accuracy; this part will be carried out mainly by the ACT researchers. On the hardware side, the study will investigate probabilistic computing as a means of saving energy at the cost of occasional errors in the calculations.

This proposal is addressed at research groups with expertise in probabilistic computing.

*1.2 Software – efficient and robust algorithms.*

Visual attention models could form a successful strategy for
reducing the computational effort of computer vision
algorithms. The central idea of visual attention models is that
they devote most of the processing to the most relevant parts
of the image. The main challenge, of course, is to automatically
determine what is 'relevant'. Several models have been devised
that can determine relevant parts of an image, but many of them
extensively process the entire image to find regions of
interest, thereby still incurring a large computational
burden [1, 2, 3].

In order to achieve a higher computational efficiency, many attention models perform visual tasks on the basis of a restricted number of local samples [4,5,6,7]. In particular, these studies focus on informed sampling, in which the information from the current sample is used to select the next. This can lead to large gains in computational efficiency, but often gives rise to a challenging Partially Observable Markov Decision Process (POMDP). Such POMDPs are currently difficult to solve, and the mentioned studies either make strong assumptions about the task [4] or have to train a separate model for each task [6].

In this study the more straightforward strategy of random sampling is employed. It has been demonstrated that in some cases random sampling suffices to obtain speed-ups on the order of 100 times at an unnoticeable cost in accuracy [8,9,10,11]. Figure 1 below illustrates this for a pitch and roll estimation application for an Unmanned Air Vehicle that processes local image samples to find the horizon line. The y-axis represents the absolute error in each estimate, the x-axis the number of samples, with full sampling on the far right. It can be observed that most of the performance is gained within the first 1000 samples; beyond that, the performance does not noticeably improve. Extracting only 1000 samples results in a speed-up of 14 times for a small image of 160 x 120 pixels (a border of 10 pixels is used in these specific experiments).

**Figure 1:** Number of samples extracted from an image versus
average absolute errors in the estimates of the pitch and roll
angle. The solid lines are for the results in which a linear
algebraic solution is used for estimating the horizon (LA), the
dashed lines are for the results with a linear perceptron.
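The gist of such random sampling can be illustrated with a minimal Python sketch (a didactic example, not the UAV experiment itself): a line standing in for the horizon is fitted from a small random subset of noisy points, and the subsampled estimate stays close to the full-sampling one. All names, sizes, and noise levels here are illustrative assumptions.

```python
import random

def fit_line(points):
    """Least-squares fit of y = a*x + b through (x, y) points."""
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxx = sum(x * x for x, _ in points)
    sxy = sum(x * y for x, y in points)
    a = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    b = (sy - a * sx) / n
    return a, b

rng = random.Random(0)
# Synthetic "horizon": y = 0.3*x + 40, with per-pixel Gaussian noise.
points = [(x, 0.3 * x + 40 + rng.gauss(0, 1.0)) for x in range(160)]

full_a, _ = fit_line(points)                   # all 160 columns
sub_a, _ = fit_line(rng.sample(points, 30))    # less than a fifth of them
print(f"full slope={full_a:.3f}, subsampled slope={sub_a:.3f}")
```

Fitting from 30 random points instead of 160 cuts the work by a factor of five or so, while the slope estimate (and hence the recovered "attitude") barely changes.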

*1.3 Hardware – probabilistic computing.*

If one accepts that vision algorithms can gain in efficiency
by using only part of the information in an image, one can
extend this reasoning to the hardware involved in the
calculations to save even more time and energy. In typical
processor hardware, considerable amounts of energy are spent on
obtaining correct calculation results, e.g., for adding or
multiplying numbers. In probabilistic computing [12,13,14], the
energy supplied to the processing units is lowered, which
increases the probability that some operations go wrong.
Fortunately, it has been shown that the amount of energy
saved is significantly larger than the increase in error
probability traded in [12] - see the figure below. As a
consequence, in theory, large energy gains (on the order of 5
times) can be obtained at a minimal cost in calculation errors.

**Figure 2:** Energy expenditure versus the probability p that
calculations will be incorrect. Figure adapted from [12].
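The shape of this trade-off can be reproduced with a toy model, assuming CMOS switching energy scales with V² and that a bit is misread when Gaussian noise exceeds half the supply voltage. This is purely an illustrative assumption, not the device model of [12]; the noise level `noise_sigma` and the voltages are made up.

```python
import math

def bit_error_prob(v, noise_sigma=0.1):
    """Toy model: a bit is read incorrectly when zero-mean Gaussian
    noise exceeds half the supply voltage v (an illustrative
    assumption, not measured device physics)."""
    # Gaussian tail probability via the complementary error function.
    return 0.5 * math.erfc((v / 2) / (noise_sigma * math.sqrt(2)))

def relative_energy(v, v_nominal=1.0):
    """Switching energy relative to nominal, assuming E ~ C*V^2."""
    return (v / v_nominal) ** 2

for v in (1.0, 0.7, 0.5):
    print(f"V={v:.1f}  energy={relative_energy(v):.2f}  p_err={bit_error_prob(v):.2e}")
```

Even in this crude model, halving the voltage cuts the energy to a quarter while the per-bit error probability stays below the percent level, which is the qualitative behaviour Figure 2 depicts.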

Different applications of probabilistic computing have been
investigated. The study in [12] focused on the application of
probabilistic computing to the calculation of the Fourier
transform, showing that for human subjects the probabilistic
reconstruction is visually identical to the error-free one.
Another study involved the application of probabilistic
computing to Continuous Restricted Boltzmann Machines (CRBMs)
[13]. Such CRBMs normally require the generation of
pseudo-random numbers; in [13], probabilistic computing
provided these random numbers naturally.

This study proposes to perform research on probabilistic
computing in a different context, namely that of a vision
algorithm for robotics. The main idea behind the application is
that a slight loss in accuracy in the computations of the
vision algorithms can still lead to robust and successful
retrieval of information on the environment.

#### 2. Research approach.

The research approach is to study the application of probabilistic computing in the context of a specific vision algorithm. Section 2.1 explains the stereo-vision algorithm that will serve as a starting point for the research. Section 2.2 addresses the research activities for the probabilistic computing part.

*2.1. Specific vision algorithm: stereo-vision.*

The extraction of information on the three-dimensional
structure of the world is important to almost all robotic
applications. The current Mars Rovers mainly rely on
stereo-vision: the extraction of depth from two images made by
two cameras positioned at a known distance from each other.

In the last decade, there has been considerable progress in
stereo-vision algorithms. On the one hand, there have been more
successful applications of robotic stereo-vision, for example
[15,16]. On the other hand, many techniques have been
introduced to improve the accuracy of stereo-vision algorithms,
cf. [17,18].

The most basic stereo-vision algorithm takes a window from image 1 at position (x, y) and compares it with windows from image 2 at positions (x + Δx, y) for various values of Δx. The comparison typically involves the sum of squared differences between (grey-scale) pixel values:

$$C(x, y, \Delta x) = \sum_{(i,j) \in W} \big( I_1(x+i,\, y+j) - I_2(x+\Delta x+i,\, y+j) \big)^2 \qquad (1)$$

where i and j are coordinates within a window W, I1 contains the grey-values of the left image and I2 the grey-values of the right image. The Δx leading to the lowest C-value represents the disparity at (x, y) in image 1 and is a measure of the distance of the object seen at that image location. Most of the recently introduced, more accurate methods combine the so-obtained local disparities with global constraints. One common method [18] is to segment image 1 (on the basis of colour and/or texture) and to assume that each segment conforms to a certain model, for example a plane. The disparities in the segment then serve as data for a plane-fitting process, which smoothens out many of the local errors.
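The window comparison described above can be sketched in a few lines of Python (a didactic re-implementation, not the study's MATLAB code; the window half-size, disparity range, and synthetic image contents are illustrative):

```python
def ssd(I1, I2, x, y, dx, w=2):
    """Sum of squared differences between the window of half-size w
    around (x, y) in image 1 and the window shifted by dx in image 2."""
    c = 0
    for j in range(-w, w + 1):
        for i in range(-w, w + 1):
            d = I1[y + j][x + i] - I2[y + j][x + dx + i]
            c += d * d
    return c

def disparity(I1, I2, x, y, max_dx=6, w=2):
    """The dx with the lowest cost is the disparity at (x, y)."""
    return min(range(max_dx + 1), key=lambda dx: ssd(I1, I2, x, y, dx, w))

# Synthetic pair: image 2 is image 1 shifted right by 3 pixels,
# so the true disparity is 3 (away from the image border).
W_IMG, H_IMG = 30, 20
I1 = [[(x * 7 + y * 13) % 251 for x in range(W_IMG)] for y in range(H_IMG)]
I2 = [[I1[y][x - 3] if x >= 3 else 0 for x in range(W_IMG)] for y in range(H_IMG)]
print(disparity(I1, I2, 10, 10))  # 3
```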

In this project, it is proposed to speed up the stereo-vision calculation with the following method: instead of determining a disparity for every image location in image 1, only a subset of the locations is processed. In order to fit a plane through a segment, at least three disparities are necessary. To get a robust estimate, more samples are advisable, yet it is not necessary to determine the disparities for all pixels in the segment. Determining fewer disparities leads to fewer computations, but results in a less accurate estimate of the plane parameters. As in other cases of (random) sampling, we expect the trade-off between the number of samples and the accuracy of the estimates to be rather gentle, allowing significant speed-ups at a negligible cost in accuracy.
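The proposed sampling strategy can be sketched as follows, assuming an idealised segment whose disparities lie exactly on a plane (the segment size, sample count, and plane parameters are illustrative assumptions):

```python
import random

def fit_plane(samples):
    """Least-squares plane d = a*x + b*y + c from (x, y, d) samples,
    via the 3x3 normal equations, solved by Gaussian elimination."""
    M = [[0.0] * 4 for _ in range(3)]          # augmented [A^T A | A^T d]
    for x, y, d in samples:
        row = (x, y, 1.0)
        for r in range(3):
            for c in range(3):
                M[r][c] += row[r] * row[c]
            M[r][3] += row[r] * d
    # Forward elimination with partial pivoting.
    for k in range(3):
        p = max(range(k, 3), key=lambda r: abs(M[r][k]))
        M[k], M[p] = M[p], M[k]
        for r in range(k + 1, 3):
            f = M[r][k] / M[k][k]
            for c in range(k, 4):
                M[r][c] -= f * M[k][c]
    # Back substitution.
    sol = [0.0] * 3
    for k in (2, 1, 0):
        sol[k] = (M[k][3] - sum(M[k][c] * sol[c] for c in range(k + 1, 3))) / M[k][k]
    return sol  # [a, b, c]

rng = random.Random(1)
# A 40x40 segment whose disparities lie on the plane d = 0.05*x - 0.02*y + 6.
pixels = [(x, y) for x in range(40) for y in range(40)]
subset = rng.sample(pixels, 20)                # 20 disparities instead of 1600
a, b, c = fit_plane([(x, y, 0.05 * x - 0.02 * y + 6.0) for x, y in subset])
print(f"a={a:.4f}, b={b:.4f}, c={c:.4f}")
```

Here 20 sampled disparities recover the plane parameters of a 1600-pixel segment, an 80-fold reduction in the number of window comparisons; with noisy disparities the estimate degrades gently as the sample count drops.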

*2.2 Probabilistic processing architecture.*

The bulk of the stereo-vision processing consists of
calculating equation 1. The idea is to implement the related
calculations in a specifically designed probabilistic
processing architecture.

Proposals should include argued choices concerning the
following issues:

- Different designs are possible for the components that will
perform addition / subtraction / multiplication. One example is
that in an adder, all bits can be provided with the same
voltages, or higher voltages can be applied to the higher order
bits (as in the BIVOS approach [12]). Another example is that
instead of variable voltages and a traditional representation
of integers, a different representation of numbers can be
employed in combination with a single voltage for all bits.

- The proposal should clarify which methodology is proposed. In
particular, it should address the connection between the
probabilistic computing architecture and the stereo-vision
algorithm. The stereo-vision algorithm will be implemented in
MATLAB. In a first possible methodology, the work is split in
two independent parts. The hardware group then identifies for
each operation in equation 1 the probabilities and types of
errors that can occur at different voltages. These errors can
then be modelled in the MATLAB scripts. A second possible
methodology would be to couple the MATLAB stereo-vision script
with hardware simulator software: calculations would then be
done directly by the simulated hardware. A third possible
methodology is to have actual hardware in the loop. The first
methodology is the baseline requirement for the study. In any
case, the chosen methodology should provide for a way to
compare the amount of energy consumed in a probabilistic and a
non-probabilistic scenario.

- The design should address how the dedicated probabilistic
computing parts for addition / subtraction / multiplication fit
within a further non-probabilistic CPU / DSP. How would they
interface and does sending numbers to these dedicated
probabilistic processing components incur additional costs?

Considerations given to radiation effects would be a welcome though not essential addition to the study.
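The first (baseline) methodology, in which hardware error behaviour is modelled in the software scripts, could look as follows in outline. This Python sketch assumes a uniform per-bit flip probability; it is an illustrative modelling assumption, not a measured error profile of any real adder.

```python
import random

rng = random.Random(42)

def noisy_add(a, b, p_flip, bits=16):
    """Exact addition followed by independent bit flips with
    probability p_flip per result bit: a software stand-in for a
    low-voltage probabilistic adder."""
    s = (a + b) & ((1 << bits) - 1)   # wrap to the adder's word width
    for k in range(bits):
        if rng.random() < p_flip:
            s ^= 1 << k               # flip bit k
    return s

# With p_flip = 0 the adder is exact; with p_flip = 1 every bit flips.
print(noisy_add(3, 4, 0.0))            # 7
print(noisy_add(0, 0, 1.0, bits=8))    # 255
```

Dropping such a function in place of the additions and multiplications of equation 1 lets the MATLAB-side experiments sweep the error probability; a BIVOS-style variant [12] would simply use a per-bit `p_flip` that decreases for the higher-order bits.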

#### 3. Study Objectives

The main objectives of this study are:

(1) To research a probabilistic processing architecture that
would best fit the stereo-vision algorithm. Specifically, this
involves the optimization of a design of an adder, subtractor,
and multiplier. The designs should be optimized for operating
at voltages that are as low as possible in combination with
error probabilities and error magnitudes that remain within a
reasonable range.

(2) To evaluate the energy efficiency gained by employing the
probabilistic processing structure, and analyse its relation to
the performance (in terms of the probability and size of the
calculation errors and in terms of the 3D reconstruction).

#### 4. Collaboration with the Advanced Concepts Team of ESA

The study is addressed at a research group with expertise in probabilistic computing. The research will be conducted in close cooperation with ACT researchers.

The ACT will contribute to the research project the knowledge of stereo-vision algorithms, including the necessary code. This will constitute the basis for exploring the way in which probabilistic computing can be applied. It is expected that analysis from a probabilistic computing point of view could lead to adjustments of the stereo-vision algorithm. The end goal of the interaction is a stereo-vision algorithm that in combination with a probabilistic computing architecture should exploit the trade-off between energy and reconstruction result and lead to at least an order of magnitude of efficiency improvement with respect to standard approaches.

#### References

[1] L. Itti, C. Koch, E. Niebur, A model of saliency-based
visual attention for rapid scene analysis, IEEE Transactions on
Pattern Analysis and Machine Intelligence 20 (11) (1998) 1254 -
1259.

[2] U. Rajashekar, L. Cormack, A. Bovik, Image features that
draw fixations, in: 10th IEEE International Conference on
Image Processing (ICIP 2003), Barcelona, Spain, Vol. 3, IEEE
Computer Society, 2003, pp. 313-316.

[3] A. Torralba, A. Oliva, M. Castelhano, J. M. Henderson,
Contextual guidance of eye movements and attention in
real-world scenes: the role of global features in object
search, Psychological Review 113 (4) (2006) 766-786.

[4] J. Denzler, C. Brown, Information theoretic sensor data
selection for active object recognition and state estimation,
IEEE Transactions on Pattern Analysis and Machine Intelligence
24 (2) (2002) 145-157.

[5] N. Sprague, D. Ballard, Eye movements for reward
maximization, in: S. Thrun, L. Saul, B. Schölkopf (Eds.),
Advances in Neural Information Processing Systems 16, MIT
Press, Cambridge, MA, 2004.

[6] G. de Croon, E. Postma, H. van den Herik, A situated model
for sensory-motor coordination in gaze control, Pattern
Recognition Letters 27 (11) (2006) 1181-1190.

[7] S. Jodogne, J. Piater, Closed-loop learning of visual
control policies, Journal of Artificial Intelligence Research
28 (2007) 349-391.

[8] L. Xu, E. Oja, A new curve detection method: Randomized
hough transform, Pattern Recognition Letters 11 (1990) 331 -
338.

[9] J. Shotton, J. Winn, C. Rother, A. Criminisi, Textonboost:
Joint appearance, shape and context modeling for multi-class
object recognition and segmentation, in: ECCV 2006, 2006.

[10] C. Barnes, E. Shechtman, A. Finkelstein, D. Goldman,
Patchmatch: A randomized correspondence algorithm for
structural image editing, in: ACM Transactions on Graphics
(Proc. SIGGRAPH), 2009.

[11] G.C.H.E. de Croon, E. de Weerdt, C. de Wagter,
B.D.W. Remes, The appearance variation cue for obstacle
avoidance, in: IEEE International Conference on Robotics and
Biomimetics (ROBIO 2010), 2010.

[12] Probabilistic Arithmetic and Energy Efficient Embedded
Signal Processing (Received Best Paper Award.) Jason George, Bo
Marr, Bilge E. S. Akgul and Krishna V. Palem. Proceedings of
the Intl. Conference on Compilers, Architecture and Synthesis
for Embedded Systems (CASES), Seoul, Korea, October 23-25,
2006.

[13] N. H. Hamid, A. F. Murray, D. Laurenson, S. Roy, B. Cheng,
Probabilistic Computing with Future Deep Sub-Micrometer
Devices: A Modelling Approach, 2005, pp. 2510-2513.

[14] Error Immune Logic for Low-Power Probabilistic Computing,
Marr, Bo, George, Jason, Degnan, Brian, Anderson, David V.,
Hasler, Paul, VLSI Design, 2010, pages 1-10

[15] Stereo based navigation in unstructured environments,
Mark, W. van der, Groen, F.C.A., & Heuvel, J.C. van den
(2001). In IEEE Instrumentation and Measurements Conference
(pp. 2038-2042). Budapest, Hungary.

[16] Learning long-range vision for autonomous off-road
driving, R. Hadsell, P. Sermanet, J. Ben, A. Erkan, M.
Scoffier, K. Kavukcuoglu, U. Muller, and Y. LeCun.

[17] Advances in computational stereo, M.Z. Brown, D. Burschka,
and G.D. Hager. In IEEE Transactions on Pattern Analysis and
Machine Intelligence, vol. 25, no. 8, 2003.

[18] A region based stereo matching algorithm using cooperative
optimization, Zeng-Fu Wang, Zhi-Gang Zheng, 2008 IEEE
Conference on Computer Vision and Pattern Recognition,
pp.1-8.