Intelligent Control and Automation
Vol.08 No.01(2017), Article ID:74324,22 pages

Cognitive Supervisor for an Autonomous Swarm of Robots

Vladimir G. Ivancevic, Darryn J. Reid

Decision Sciences, Joint and Operations Analysis Division, Defence Science & Technology Group, Canberra, Australia

Copyright © 2017 by authors and Scientific Research Publishing Inc.

This work is licensed under the Creative Commons Attribution International License (CC BY 4.0).

Received: January 16, 2017; Accepted: February 21, 2017; Published: February 24, 2017


As a sequel to our recent work [1] , in which a control framework was developed for large-scale joint swarms of unmanned ground (UGV) and aerial (UAV) vehicles, the present paper proposes cognitive and meta-cognitive supervisor models for this kind of distributed robotic system. The cognitive supervisor model is a formalization of the recently Nobel-awarded research in brain science on mammalian and human path integration and navigation, performed by the hippocampus. This is formalized here as an adaptive Hamiltonian path integral, and efficiently simulated for implementation on robotic vehicles as a pair of coupled nonlinear Schrödinger equations. The meta-cognitive supervisor model is a modal logic of actions and plans that hinges on a weak causality relation that specifies when atoms may change their values without specifying that they must change. This relatively simple logic is decidable yet sufficiently expressive to support the level of inference needed in our application. The atoms and action primitives of the logic framework also provide a straightforward way of connecting the meta-cognitive supervisor with the cognitive supervisor, with other modules, and to the meta-cognitive supervisors of other robotic platforms in the swarm.


Keywords: Autonomous Robotic Swarm, Cognitive Supervisor, Hippocampus Path Integration and Navigation, Hamiltonian Path Integral, Modal Logic, Nonlinear Schrödinger Equation, Reasoning about Actions and Plans

1. Introduction

Recently, we proposed in [1] a rigorous model for prediction and control of a large-scale joint swarm of unmanned ground vehicles (UGVs) and unmanned aerial vehicles (UAVs), performing an autonomous land-air operation. In that paper, we also identified the need for a cognitive supervisor for this high-dimensional distributed multi-robotic system. Its primary task is to maintain a bird's-eye view of the situation across the joint land-air operation and, based on the GPS locations of both the target and all participating robots, to provide them with good 2D and 3D attractor fields so that they can reach the proximity of the target in the shortest possible time. The purpose of the present paper is to develop a rigorous model for this cognitive supervisor, based on recent discoveries in brain science that show us how humans navigate in 2D environments and how bats navigate in 3D environments, and to couple this with a meta-cognitive supervisor model that allows vehicles to reason about actions and construct simple plans.

The 2014 Nobel Prize in Physiology or Medicine was awarded jointly to John O’Keefe, May-Britt Moser and Edvard I. Moser “for their discoveries of cells that constitute a positioning system in the brain”, in other words, for the hippocampus path integration and navigation system.

Briefly, there is a part of the mammalian brain, called the hippocampal formation, which in humans is most developed in taxi drivers, grows in size with their experience and can also be trained (like a muscle) using fast-action video games. The hippocampal formation provides a cognitive map of a familiar environment, which can be used to identify one’s current location and to navigate from one place to another. This mapping system provides two independent strategies for locating places, one based on environmental landmarks and the other on a path integration system (see [2] [3] and the references therein), which uses information about distances traveled in particular directions. This brain navigation system exists in all mammals, while in humans it additionally provides the basis for the so-called episodic memory [4].

Two main components of the hippocampal formation are: (i) hippocampal place cells (discovered by O’Keefe), and (ii) grid cells from the entorhinal cortex (discovered by the Mosers). In particular, according to Edvard Moser, “All network models for grid cells involve continuous attractors ...”, which is similar to our attractor Hamiltonian dynamics of UGVs and UAVs, given by Equations (1)-(2) in the next section.

Inspired by this discovery in brain science, the present paper proposes a novel probabilistic spatio-temporal model for mammalian path integration and navigation, formulated as an adaptive Hamiltonian path integral. The model combines: (i) a cognitive map $p(t)$ performed by hippocampal place cells, (ii) an entorhinal map $q(t)$ performed by grid cells, (iii) a current of sensory (extra-hippocampal) stimuli $J(t)$, and (iv) Hebbian learning in hippocampal synaptic weights $w(t)$. This model represents an infinite-dimensional neural network, which can be simulated (using $10^6$ to $10^7$ neurons) on IBM’s TrueNorth chip.

We also propose to couple this cognitive supervisor to a meta-cognitive supervisor, supporting dynamic mission planning, based on an established propositional multimodal logic framework. This approach gives robotic vehicles the ability to construct and execute simple plans on the fly against goals, given local sensor information, state information communicated locally between vehicles, and aspects of the state of the robot itself. The coupling to the cognitive supervisor and to the outside world is through the truth values of logical atoms and through multimodal actions.

2. Affine Hamiltonian Control for a Joint (UGV + UAV) Swarm

The affine Hamiltonian control model with many degrees-of-freedom has been presented in the form of 2n-dimensional (2ND) Langevin-type attractor matrix equations with nearest-neighbor couplings, which represent two recurrent neural networks:


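$$
\begin{aligned}
\dot{q}_{jk} &= \varphi_{2D}\left(q_A^{2D} - q_{jk} - \omega_{jk}\, q_{jk}^{2}\, p_{jk}\right) - \sum_{j,k} \partial_{p_{jk}} H_{jk}\, u_{jk} + \eta_q(t), \\
\dot{p}_{jk} &= \varphi_{2D}\left(p_A^{2D} - p_{jk} - \omega_{jk}\, p_{jk}^{2}\, q_{jk}\right) + \sum_{j,k} \partial_{q_{jk}} H_{jk}\, u_{jk} + \eta_p(t).
\end{aligned}
\tag{1}
$$

$$
\begin{aligned}
\dot{q}_{jkl} &= \varphi_{3D}\left(q_A^{3D} - q_{jkl} - \omega_{jkl}\, q_{jkl}^{2}\, p_{jkl}\right) - \sum_{j,k,l} \partial_{p_{jkl}} H_{jkl}\, u_{jkl} + \eta_q(t), \\
\dot{p}_{jkl} &= \varphi_{3D}\left(p_A^{3D} - p_{jkl} - \omega_{jkl}\, p_{jkl}^{2}\, q_{jkl}\right) + \sum_{j,k,l} \partial_{q_{jkl}} H_{jkl}\, u_{jkl} + \eta_p(t).
\end{aligned}
\tag{2}
$$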

The following terms are used in Equations (1) and (2): $q_{jk} = q_{jk}(t)$ and $p_{jk} = p_{jk}(t)$ are time-evolving matrices defining the coordinates and momenta of the UGV swarm, respectively, with initial conditions $q_{jk}(0)$ and $p_{jk}(0)$. Similarly, $q_{jkl} = q_{jkl}(t)$ and $p_{jkl} = p_{jkl}(t)$ are time-evolving tensors defining the coordinates and momenta of the UAV swarm, respectively, with initial conditions $q_{jkl}(0)$ and $p_{jkl}(0)$. $q_A^{2D}$ and $p_A^{2D}$ are the 2D attractors for the UGV swarm, while $q_A^{3D}$ and $p_A^{3D}$ are the 3D attractors for the UAV swarm. $\varphi_{2D}$ and $\varphi_{3D}$ are the attractor field strengths for the UGV and UAV swarms, respectively; $\omega_{jk}$ and $\omega_{jkl}$ are the corresponding adaptive weights of both swarms, which can be trained by Hebbian learning; $(q_{jk}, p_{jk})$ and $(q_{jkl}, p_{jkl})$ are the initial formations of both swarms; $u_{jk}$ and $u_{jkl}$ are Lie-derivative controllers for both swarms; $H_{jk}$ and $H_{jkl}$ are affine Hamiltonians of both swarms; and $\eta_q(t)$ and $\eta_p(t)$ are zero-mean, delta-correlated, Gaussian white noises added to the $q$ and $p$ variables in both swarms. For more technical details on affine Hamiltonian (or, similarly, port-Hamiltonian) control of large-scale dynamical systems, see [1] and the references therein.

The purpose of the cognitive supervisor is to provide the 2D and 3D inputs to the recurrent neural nets (1)-(2), or specifically, the 2D attractors $(q_A^{2D}, p_A^{2D})$ for the UGV swarm and the 3D attractors $(q_A^{3D}, p_A^{3D})$ for the UAV swarm (see, e.g. [5]).
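To illustrate how a supervisor-provided attractor steers a formation, the following minimal sketch integrates only the attractor part of Equation (1) with an Euler-Maruyama step. It rests on several assumptions that are not in the paper: the attractor term is read as $\varphi(q_A - q - \omega q^2 p)$, the affine Hamiltonian control term is switched off ($u = 0$), the attractors are scalars shared by all vehicles, and the names `step` and `simulate` are hypothetical.

```python
import math
import random

def step(q, p, q_att, p_att, phi, omega, dt, sigma, rng):
    """One Euler-Maruyama step of the attractor part of Equation (1).

    Assumption: the controller term (sum of dH/dp * u) is omitted, i.e. u = 0.
    q, p and omega are lists of rows indexed (j, k); q_att, p_att are scalars.
    """
    q_new, p_new = [], []
    for q_row, p_row, w_row in zip(q, p, omega):
        q_out, p_out = [], []
        for qjk, pjk, wjk in zip(q_row, p_row, w_row):
            dq = phi * (q_att - qjk - wjk * qjk ** 2 * pjk)
            dp = phi * (p_att - pjk - wjk * pjk ** 2 * qjk)
            # Zero-mean Gaussian white noise, scaled for the time step.
            noise_q = sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
            noise_p = sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
            q_out.append(qjk + dq * dt + noise_q)
            p_out.append(pjk + dp * dt + noise_p)
        q_new.append(q_out)
        p_new.append(p_out)
    return q_new, p_new

def simulate(q, p, q_att, p_att, phi=1.0, omega=None,
             dt=0.01, steps=2000, sigma=0.0, seed=0):
    """Iterate `step`; omega defaults to zero weights (pure attraction)."""
    rng = random.Random(seed)
    if omega is None:
        omega = [[0.0 for _ in row] for row in q]
    for _ in range(steps):
        q, p = step(q, p, q_att, p_att, phi, omega, dt, sigma, rng)
    return q, p

# Hypothetical 2 x 2 UGV formation, noise-free, pulled toward (3.0, 0.5).
q0 = [[0.0, 1.0], [2.0, -1.0]]
p0 = [[0.0, 0.0], [0.0, 0.0]]
q_final, p_final = simulate(q0, p0, q_att=3.0, p_att=0.5)
```

With zero weights and no noise every vehicle relaxes exponentially onto the attractor; non-zero `omega` and `sigma` can be passed to explore the coupled, noisy case.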

3. Adaptive Hamiltonian Path Integral

In his recent Nobel lecture, John O’Keefe referred to his pioneering 1976-paper [6] [7] , describing the function of the hippocampal place cells performing Tolman’s cognitive mapping: “When an animal had located itself in an environment (using environmental stimuli) the hippocampus could calculate subsequent positions in that environment on the basis of how far and in what direction the animal had moved in the interim ...” This quotation was accompanied by a commutative diagram depicting the vector addition and a suggestion that an animal moves in a sequence of vectors.

This extract from O’Keefe’s lecture is the motivation for the present mathematical model. Basically, any two-dimensional (2D) vector is equivalent to a complex number $z = x + iy$, where $(x, y)$ are Cartesian coordinates and $i = \sqrt{-1}$. The same complex number $z$ can also be given in the polar form $z = r e^{i\theta}$, where $r$ is the radius vector and $\theta$ is the heading. The sequence of $N$ vectors is the sum of complex numbers:

$$
\text{particle-like animal motion} = \sum_{k=1}^{N} z_k = \sum_{k=1}^{N} \left(x_k + i y_k\right) = \sum_{k=1}^{N} r_k e^{i\theta_k}. \tag{3}
$$

In this way, we can describe a particle-like animal motion in the complex plane $\mathbb{C}$, from some initial point $A$ to the final point $B$, performed in $N$ steps, as an integral complex number (3). This basic idea describes an animal’s motion in purely static and deterministic terms; it can be generalized into a more realistic, probabilistic dynamics, as follows.
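The vector-sum picture of Equation (3) can be sketched in a few lines. The example below is only an illustration of deterministic dead reckoning with complex numbers; the function names and the sample trip are hypothetical.

```python
import cmath
import math

def integrate_path(steps):
    """Sum N displacement vectors given as (r_k, theta_k) pairs, as in Equation (3)."""
    return sum((cmath.rect(r, theta) for r, theta in steps), 0j)

def homing_vector(steps):
    """Distance and heading that lead straight back to the starting point A."""
    z = integrate_path(steps)
    return abs(z), cmath.phase(-z)

# Hypothetical outbound trip: 3 units east, then 4 units north.
trip = [(3.0, 0.0), (4.0, math.pi / 2)]
position = integrate_path(trip)
distance_home, heading_home = homing_vector(trip)
```

The accumulated complex number plays the role of the animal's current position relative to $A$; negating it gives the direct return vector without retracing the individual steps.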

Now, instead of the complex plane $\mathbb{C}$, consider a particle-like animal motion in the phase plane $(p, q)$, where $p = p(t, x)$ represents the action of the hippocampal place cells and $q = q(t, x)$ defines the action of the entorhinal grid cells. The animal moves from some initial point $A$, given by canonical coordinates $(q_0, p_0)$ at initial time $t_0$, to the final point $B$, given by canonical coordinates $(q_1, p_1)$ at final time $t_1$, via all possible paths, each path having an equal probability (so that the sum of all path probabilities equals 1). This most general 2D motion is properly defined by the transition amplitude:

$$
\langle B \,|\, A \rangle \equiv \langle q_1, p_1, t_1 \,|\, q_0, p_0, t_0 \rangle,
$$

whose absolute square represents the transition probability density function: $\mathrm{PDF} = |\langle B \,|\, A \rangle|^2 \equiv |\langle q_1, p_1, t_1 \,|\, q_0, p_0, t_0 \rangle|^2$.

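The transition amplitude $\langle B \,|\, A \rangle$ can be calculated via the following Hamiltonian path integral:

$$
\langle q_1, p_1, t_1 \,|\, q_0, p_0, t_0 \rangle = \int \mathcal{D}[q]\,\mathcal{D}[p]\, \exp\left\{ i \int_{t_0}^{t_1} \left[ p(\tau)\,\dot{q}(\tau) - H(p, q) \right] d\tau \right\}, \quad \text{with} \quad \int \mathcal{D}[q]\,\mathcal{D}[p] \equiv \int \frac{1}{2\pi} \prod_{\tau} dq(\tau)\, dp(\tau), \tag{4}
$$

where the integration is performed over the $p(\tau)$ and $q(\tau)$ values at every time $\tau$, with the time-step $d\tau \equiv t_j - t_{j-1}$ and the velocity $\dot{q}(\tau) \approx \dfrac{q(t_j) - q(t_{j-1})}{t_j - t_{j-1}}$.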

For technical details of the derivation of the Hamiltonian path integral (4), see e.g. [8] [9] and the references therein.

Next, the sources from various extra-hippocampal stimuli can be incorporated into the basic transition amplitude (4) by adding some form of a bio-electric current J , as:

$$
Z[J] = \int \mathcal{D}[q]\,\mathcal{D}[p]\, \exp\left\{ i \int_{t_0}^{t_1} \left[ p(\tau)\,\dot{q}(\tau) - H(p, q) + J(\tau)\, q(\tau) \right] d\tau \right\}, \tag{5}
$$

where $Z[J] = \langle q_1, p_1, t_1 \,|\, q_0, p_0, t_0 \rangle$ is the system’s partition function, dependent on the current $J$ and obeying the normalization condition $Z[J = 0] = 1$.

A generalization from the single-particle Hamiltonian path integral (4) to the probabilistic dynamics of an $N$-particle system is straightforward. The phase-space functional integral that defines the transition amplitude $\langle q_1^1 \ldots q_1^N, p_1^1 \ldots p_1^N, t_1 \,|\, q_0^1 \ldots q_0^N, p_0^1 \ldots p_0^N, t_0 \rangle$, from the initial $N$D point $(q_0^1 \ldots q_0^N)$ at time $t_0$ to the final $N$D point $(q_1^1 \ldots q_1^N)$ at time $t_1$, is given by:

$$
\langle q_1^1 \ldots q_1^N, p_1^1 \ldots p_1^N, t_1 \,|\, q_0^1 \ldots q_0^N, p_0^1 \ldots p_0^N, t_0 \rangle = \int \mathcal{D}[q]\,\mathcal{D}[p]\, \exp\left\{ i \int_{t_0}^{t_1} \left[ \sum_{k=1}^{N} p_k(\tau)\,\dot{q}_k(\tau) - H(p_k, q_k) \right] d\tau \right\}, \quad \text{with} \quad \int \mathcal{D}[q]\,\mathcal{D}[p] = \int \frac{1}{2\pi} \prod_{\tau} \prod_{k=1}^{N} dq_k\, dp_k,
$$

where we allow the full Hamiltonian of the system $H(p_k, q_k)$ to depend upon all the $N$ coordinates $q_k(\tau)$ and momenta $p_k(\tau)$ collectively.

Again, we can add various sources as incoming bio-currents $J_k$, as a straightforward generalization of the single-particle partition function (5) to the system of $N$ particles:

$$
Z_N[J_k] = \int \mathcal{D}[p]\,\mathcal{D}[q]\, \exp\left\{ i \int_{t_0}^{t_1} \left[ \sum_{k=1}^{N} p_k(\tau)\,\dot{q}_k(\tau) - H(p_k, q_k) + J_k(\tau)\, q_k(\tau) \right] d\tau \right\},
$$

where $Z_N[J_k] = \langle q_1^1 \ldots q_1^N, p_1^1 \ldots p_1^N, t_1 \,|\, q_0^1 \ldots q_0^N, p_0^1 \ldots p_0^N, t_0 \rangle$ is the system’s partition function, dependent on all the incoming currents $J_k$ and obeying the normalization condition $Z_N[J_k = 0] = 1$, for $k = 1, \ldots, N$.

Our final step is to transform the $N$-particle partition function $Z_N[J_k]$ into an infinite-dimensional recurrent neural network (of a generalized Hopfield type) by including the hippocampal synaptic weights $w_k(\tau)$ in it (compare with [12]), as:

$$
Z_N[w_k, J_k] = \int \mathcal{D}[w]\,\mathcal{D}[q]\,\mathcal{D}[p]\, \exp\left\{ i \int_{t_0}^{t_1} \left[ \sum_{k=1}^{N} p_k(\tau)\,\dot{q}_k(\tau) - H(p_k, q_k) + J_k(\tau)\, q_k(\tau) \right] d\tau \right\}, \quad \text{with} \quad \int \mathcal{D}[w]\,\mathcal{D}[q]\,\mathcal{D}[p] = \int \frac{1}{2\pi} \prod_{\tau} \prod_{k=1}^{N} dw_k\, dq_k\, dp_k, \tag{6}
$$

where the weights $w_k(\tau)$ are adapted (in discrete time) by Hebbian-type learning:

$$
w_k(\tau + 1) = w_k(\tau) + \frac{q}{\eta} \left[ w_k^{D}(\tau) - w_k^{A}(\tau) \right], \tag{7}
$$

where $q = q(\tau)$ and $\eta = \eta(\tau)$ represent local signal and noise amplitudes, respectively, while superscripts $D$ and $A$ denote desired and achieved system states, respectively.
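A minimal sketch of the discrete-time update (7) is given below. It assumes that the gain in front of the bracket is the signal-to-noise ratio $q/\eta$, and that the "achieved" state can be taken to be the current weight; both are illustrative assumptions, and the function name `hebbian_update` is hypothetical.

```python
def hebbian_update(w, w_desired, w_achieved, signal, noise):
    """One discrete-time step of rule (7) for a list of weights.

    Assumption: the innovation gain is the signal-to-noise ratio signal / noise.
    """
    if noise == 0:
        raise ValueError("noise amplitude must be non-zero")
    gain = signal / noise
    return [wk + gain * (wd - wa)
            for wk, wd, wa in zip(w, w_desired, w_achieved)]

# Hypothetical example: the achieved state is taken to be the current weight,
# so each step moves the weights a fraction `gain` of the way to the target.
weights = [0.1, 0.3, 0.5]
target = [0.7, 0.7, 0.7]
for _ in range(50):
    weights = hebbian_update(weights, target, weights, signal=0.2, noise=1.0)
```

Under these assumptions the weights converge geometrically to the desired values whenever the gain lies between 0 and 1; a larger gain (strong signal, weak noise) adapts faster.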

The system (6)-(7) defines the proposed adaptive Hamiltonian path integral model for generic mammalian path integration and navigation. Both the sequential (Ising spin) Hopfield network [13] with its Glauber dynamics and the graded-response Hopfield network [14] with its Fokker-Planck dynamics can be considered as special cases of this general Hamiltonian neural system.

Direct computer simulations of the adaptive Hamiltonian path integral system (6)-(7) can be performed on the IBM TrueNorth chip (see [15] and the references therein) as a Markov-chain Monte Carlo simulation over a grid of Hopfield nets (which are already implemented in the TrueNorth chip, using $10^6$ to $10^7$ artificial neurons).

In the next section we propose a more efficient approach to simulating the path integral system (6)-(7).

4. A Pair of Coupled Nonlinear Schrödinger Equations

In this section, instead of direct computer simulations on a supercomputer, we present an indirect approach to simulating the path integral system (6)-(7) on an ordinary PC, in which the system is represented as a pair of coupled nonlinear Schrödinger (NLS) equations. In his first paper [11], Feynman showed that his Lagrangian $q$-path integral was equivalent to the standard linear Schrödinger equation from quantum mechanics, given here for the case of a free particle (in natural physical units: $\hbar = 1$, $m = 1$):

$$
i\,\partial_t \psi(t, x) = -\frac{1}{2}\,\partial_{xx} \psi(t, x), \qquad \left(i = \sqrt{-1};\ \text{with } \partial_z \psi = \frac{\partial \psi}{\partial z}\right), \tag{8}
$$

which defines the complex-valued microscopic wave function $\psi = \psi(t, x)$, whose absolute square $|\psi(t, x)|^2$ defines the probability density function (PDF).
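As a quick sanity check of the free-particle equation, read as $i\,\partial_t\psi = -\tfrac{1}{2}\partial_{xx}\psi$, the sketch below verifies numerically that a plane wave $\psi = e^{i(kx - k^2 t/2)}$ satisfies it. The wave number, the sample point and the finite-difference step are arbitrary illustrative choices.

```python
import cmath

def plane_wave(t, x, k=1.5):
    """Free-particle solution psi = exp(i (k x - k^2 t / 2)) in units hbar = m = 1."""
    return cmath.exp(1j * (k * x - 0.5 * k * k * t))

def residual(psi, t, x, h=1e-3):
    """Finite-difference residual of i d_t psi + (1/2) d_xx psi at the point (t, x)."""
    d_t = (psi(t + h, x) - psi(t - h, x)) / (2 * h)
    d_xx = (psi(t, x + h) - 2 * psi(t, x) + psi(t, x - h)) / (h * h)
    return 1j * d_t + 0.5 * d_xx

r = residual(plane_wave, t=0.7, x=-0.4)
density = abs(plane_wave(0.7, -0.4)) ** 2
```

The residual vanishes up to discretization error, and the density $|\psi|^2$ of the plane wave is identically 1, which is why localized (normalizable) wave packets are needed before $|\psi|^2$ can be read as a PDF.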

In the last decade it was shown (see [16] and the references therein) that if the linear Schrödinger Equation (8) is put into an adaptive (iterative) feedback loop, it acquires a cubic nonlinearity with a potential field $V = V(x)$ and becomes the NLS equation:

$$
i\,\partial_t \psi = -\frac{1}{2}\,\partial_{xx} \psi + V\, |\psi|^2 \psi, \tag{9}
$$

which now defines the macroscopic wave function $\psi = \psi(t, x)$, whose absolute square $|\psi(t, x)|^2$ still defines the PDF [17]. A variety of analytical solutions for the NLS Equation (9) have been reported in [18] [19].

Finally, to represent the Hamiltonian $(q, p)$-path integral, we can use the $(q, p)$-pair of NLS equations, as follows.

4.1. Special Case: Analytical Soliton

We start with a simple $(q, p)$-NLS pair representation for the path integral system (6)-(7), which admits an analytical closed-form solution, given by the so-called Manakov system (with the constant potential $V$):

$$
i\,\partial_t q = -\frac{1}{2}\,\partial_{xx} q + V \left( |q|^2 + |p|^2 \right) q, \tag{10}
$$

$$
i\,\partial_t p = -\frac{1}{2}\,\partial_{xx} p + V \left( |q|^2 + |p|^2 \right) p, \tag{11}
$$

which was proven in [20], using the Lax pair representation [21], to be a completely integrable Hamiltonian system, by the existence of an infinite number of involutive integrals of motion.

The $(q, p)$-NLS pair (10)-(11) admits both “bright” and “dark” solitons as solutions, of which the simplest one is the so-called Manakov bright 2-soliton given by:

$$
\begin{bmatrix} q(t, x) \\ p(t, x) \end{bmatrix} = 2b \begin{bmatrix} c_1 \\ c_2 \end{bmatrix} \operatorname{sech}\left[ 2b\,(x + 4at) \right] e^{-2i\left( 2a^2 t + a x - 2b^2 t \right)}, \tag{12}
$$

where $a$ and $b$ are real-valued parameters and $|c_1|^2 + |c_2|^2 = 1$.
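The closed form (12) is easy to evaluate directly. The sketch below is an illustration only: it assumes the phase factor reads $e^{-2i(2a^2 t + ax - 2b^2 t)}$ (the sign of the phase does not affect the densities it computes), and the parameter values are hypothetical.

```python
import cmath
import math

def manakov_bright(t, x, a, b, c1, c2):
    """Evaluate the (q, p) pair of the bright soliton (12).

    Assumption: the phase is read as exp(-2i (2 a^2 t + a x - 2 b^2 t)); its sign
    does not affect the densities checked below.
    """
    envelope = 2.0 * b / math.cosh(2.0 * b * (x + 4.0 * a * t))
    phase = cmath.exp(-2j * (2.0 * a * a * t + a * x - 2.0 * b * b * t))
    return c1 * envelope * phase, c2 * envelope * phase

a, b = 0.25, 0.5        # hypothetical real parameters
c1, c2 = 0.6, 0.8j      # satisfies |c1|^2 + |c2|^2 = 1
# At t = 1 the envelope peak x = -4 a t sits at x = -1.
q, p = manakov_bright(1.0, -1.0, a, b, c1, c2)
total_density = abs(q) ** 2 + abs(p) ** 2
```

Because $|c_1|^2 + |c_2|^2 = 1$, the combined density is $4b^2\operatorname{sech}^2[2b(x + 4at)]$: a bell-shaped profile of height $4b^2$ that travels with velocity $-4a$ without changing shape.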

4.2. General Case: Numerical Simulation

Now that we have introduced the simple $(q, p)$-NLS pair, we can define our real representation for the path integral system (6)-(7), as the following more general $(q, p)$-NLS pair:

$$
i\,\partial_t q(t, x) = -\frac{1}{2}\, w_k\, I(t, x)\, \partial_{xx} q(t, x) + V(t, x)\, |q(t, x)|^2\, p(t, x), \tag{13}
$$

$$
i\,\partial_t p(t, x) = -\frac{1}{2}\, w_k\, J(t, x)\, \partial_{xx} p(t, x) + U(t, x)\, |p(t, x)|^2\, q(t, x), \tag{14}
$$

including the bell-shaped (sech) spatiotemporal potentials:

$$
V(t, x) = \frac{1}{2} \operatorname{sech}\left( a_1 t x^3 \right), \qquad U(t, x) = \frac{1}{2} \operatorname{sech}\left( a_2 t^3 x \right),
$$

and the soft-step shaped (tanh) spatiotemporal input currents:

$$
I(t, x) = a_3 \tanh\left( a_4 t^3 x \right), \qquad J(t, x) = a_5 \tanh\left( a_6 t x^3 \right),
$$

together with the common initial condition:

$$
\chi(x) = \frac{1}{2} \exp(i x) \operatorname{sech}\left( \frac{1}{2} x \right),
$$

the set of parameters:

$$
\left( i = \sqrt{-1},\ a_1 = 0.4,\ a_2 = 0.6,\ a_3 = 0.3,\ a_4 = 0.8,\ a_5 = 0.7,\ a_6 = 0.2 \right)
$$

and the set of adaptive synaptic weights $w_k$.
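For readers who want to inspect these ingredients outside Mathematica, the following sketch evaluates the potentials, input currents and initial condition with the parameter values listed above. The function names are hypothetical, and the overflow-safe `sech` helper is an implementation choice made here, since the arguments $a_1 t x^3$ and $a_2 t^3 x$ become very large on the simulated domain.

```python
import cmath
import math

# Parameters a1..a6 as listed in the text.
A1, A2, A3, A4, A5, A6 = 0.4, 0.6, 0.3, 0.8, 0.7, 0.2

def sech(z):
    """Hyperbolic secant, written so that large |z| underflows to 0 instead of overflowing."""
    e = math.exp(-abs(z))
    return 2.0 * e / (1.0 + e * e)

def potential_V(t, x):
    return 0.5 * sech(A1 * t * x ** 3)

def potential_U(t, x):
    return 0.5 * sech(A2 * t ** 3 * x)

def current_I(t, x):
    return A3 * math.tanh(A4 * t ** 3 * x)

def current_J(t, x):
    return A5 * math.tanh(A6 * t * x ** 3)

def initial_condition(x):
    """Common initial condition chi(x) = (1/2) exp(i x) sech(x / 2)."""
    return 0.5 * cmath.exp(1j * x) * sech(0.5 * x)
```

At $t = 0$ both potentials equal $1/2$ everywhere and both currents vanish, so the dispersive terms in (13)-(14) are switched on only as time advances.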

The $(q, p)$-NLS pair (13)-(14) has been numerically simulated in Mathematica®, producing 3D plots of the real and imaginary parts of the $(q, p)$-wave functions (see Figure 1) and density plots of attractor fields for robotic swarms (see Figure 2), using the following code:

Figure 1. Simulation of the $(q, p)$-NLS pair (13)-(14) in Mathematica®, showing the adaptation of the $(q, p)$-waves with the change of the synaptic weights $w_k \in [0, 1]$. Each 3D plot shows the following two surfaces (real and imaginary values of the wave function): (a-q) is the q-wave plot at $w_1 = 0.1$, (a-p) is the p-wave plot at $w_1 = 0.1$; (b-q) is the q-wave plot at $w_2 = 0.3$, (b-p) is the p-wave plot at $w_2 = 0.3$; (c-q) is the q-wave plot at $w_3 = 0.5$, (c-p) is the p-wave plot at $w_3 = 0.5$; (d-q) is the q-wave plot at $w_4 = 0.7$, (d-p) is the p-wave plot at $w_4 = 0.7$.

Figure 2. Density plots corresponding to the 3D plots in Figure 1, representing hypothetical attractor fields for robotic swarms.

Defining potentials:

V[t_, x_] := 1/2 Sech[0.4 t x^3]; U[t_, x_] := 1/2 Sech[0.6 t^3 x];
(* I is Mathematica's protected imaginary unit, so the currents I(t,x), J(t,x) are named Iin, Jin here *)
Iin[t_, x_] := 0.3 Tanh[0.8 t^3 x]; Jin[t_, x_] := 0.7 Tanh[0.2 t x^3];
IC[x_] := 1/2 Exp[I x] Sech[x/2];
Tfin = 50; L = 10; w = 0.9;

Defining NLS-equations:

(* the potentials are V and U as defined above; the source listing wrote them as Vq and Vp *)
NLS2 = {
  I D[q[t, x], t] == -(w/2) Iin[t, x] D[q[t, x], {x, 2}] + V[t, x] Abs[q[t, x]]^2 p[t, x],
  I D[p[t, x], t] == -(w/2) Jin[t, x] D[p[t, x], {x, 2}] + U[t, x] Abs[p[t, x]]^2 q[t, x]};

Numerical solution:

(* periodic boundary conditions on [-L, L] *)
sol = NDSolve[{NLS2, q[0, x] == IC[x], q[t, -L] == q[t, L], p[0, x] == IC[x], p[t, -L] == p[t, L]}, {q, p}, {t, 0, Tfin}, {x, -L, L}, Method -> {"MethodOfLines", "SpatialDiscretization" -> {"TensorProductGrid", "DifferenceOrder" -> "Pseudospectral"}}];

3D Plots:

Qplot = Plot3D[{Evaluate[Re[q[t, x] /. First[sol]]], Evaluate[Im[q[t, x] /. First[sol]]]}, {x, -L, L}, {t, 0, Tfin}, PlotRange -> All, ColorFunction -> (Hue[#] &), AxesLabel -> {"x", "t", "Re/Im[q]"}, ImageSize -> 400]
Pplot = Plot3D[{Evaluate[Re[p[t, x] /. First[sol]]], Evaluate[Im[p[t, x] /. First[sol]]]}, {x, -L, L}, {t, 0, Tfin}, PlotRange -> All, ColorFunction -> (Hue[#] &), AxesLabel -> {"x", "t", "Re/Im[p]"}, ImageSize -> 400]

The bidirectional associative memory given by the NLS-pair (13)-(14) effectively performs quantum neural computation, by giving a spatiotemporal generalization of the Hopfield, Grossberg and Kosko BAM family of recurrent neural networks (see [16] and the references therein). In addition, the shock-wave and solitary-wave nature of the coupled NLS equations may describe brain-like effects: propagation, reflection and collision of shock and solitary waves (see [24]).

5. Meta-Cognitive Supervisor

The meta-cognitive supervisor model is concerned with equipping the elements of the robotic swarm with a limited capability for higher reasoning about potential consequences of its actions, using logical constraints that attempt to rule out actions leading to states that we wish to avoid. That is not to say that such states will never occur, so we also require considerable flexibility in our formulation, in the sense that it should maintain the potential to continue to operate under adverse conditions including partial system failure. Yet we require a formalism for capturing this that is as simple as possible, decidable, and computationally feasible in practice on small platforms; in this sense, we regard the Situation Calculus as too rich for our present purpose, since it is inherently first-order and consequently undecidable.

Our starting point is the logic of actions and plans $\mathcal{LAP}$, which is broadly similar to the propositional dynamic logic $\mathcal{PDL}$ [25] [26], the difference being that the necessity operator in $\mathcal{LAP}$ is slightly less expressive (and hence less expensive) than the iteration operator in $\mathcal{PDL}$. This loss of expressivity does not currently appear to be a limiting factor in our application, while the compactness and strong completeness of $\mathcal{LAP}$ (which are not shared by $\mathcal{PDL}$) represent a distinct advantage in the demanding, highly dynamic context of our application and the limited resources of our envisaged platforms.

The logic of necessity $\Box$ in $\mathcal{LAP}$ is S4 [27], and it has a dual possibility operator $\Diamond$, which is used to express goals. The logic of each modal operator $[\upsilon]$, where $\upsilon \in V$ and $V$ is a countable set of verbs designating distinct actions, is the basic modal logic K [27]. We also include a null action in $V$. There is also a countable set $A$ of atoms, which designate relevant state conditions in the environment (including within the robotic system itself). The set of literals $\bar{A}$ consists of all atoms $a \in A$ and their negations $\neg a$. We symbolize the canonical tautology as $\top$, which evaluates to true in all interpretations, and the canonical contradiction as $\bot$, which evaluates to false in all interpretations.

Formulae are then defined in the usual manner: $\top$ and $\bot$ are formulae, all literals $a$ and $\neg a$ for $a \in A$ are formulae, and conjunctions $f \wedge g$, disjunctions $f \vee g$, and material implications $f \rightarrow g$ are formulae if $f$ and $g$ are formulae, and $\Box f$ and $[\upsilon] f$ are formulae if $f$ is a formula and $\upsilon \in V$ is a verb. Nothing else is a formula.

For our application, we utilize a message passing approach for distributed communication, so actions for local broadcast of meta-cognitive supervisor messages are represented logically in meta-cognition as verbs, and any messages received appear as atoms. Note also that this framework admits interaction with the cognitive supervisor, in both directions; specifically, some actions define initial conditions for the determination of the 2D $(q_A^{2D}, p_A^{2D})$ and 3D $(q_A^{3D}, p_A^{3D})$ attractors. Actions for attractor determination are relative to platform position rather than absolute. A formula that does not contain any of the modal operators $\Box$, $\Diamond$, or $[\upsilon]$ for $\upsilon \in V$ is described as classical.
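The formula grammar and the notion of a classical formula can be illustrated with a small abstract syntax tree. This is a sketch in Python rather than the authors' own data structures; all class and function names are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Top:
    pass

@dataclass(frozen=True)
class Bottom:
    pass

@dataclass(frozen=True)
class Atom:
    name: str

@dataclass(frozen=True)
class Not:
    arg: object

@dataclass(frozen=True)
class And:
    left: object
    right: object

@dataclass(frozen=True)
class Or:
    left: object
    right: object

@dataclass(frozen=True)
class Implies:
    left: object
    right: object

@dataclass(frozen=True)
class Box:          # necessity
    arg: object

@dataclass(frozen=True)
class Diamond:      # possibility, used for goals
    arg: object

@dataclass(frozen=True)
class After:        # [verb] f: f holds after executing the action `verb`
    verb: str
    arg: object

def is_classical(f):
    """A formula is classical if it contains no Box, Diamond or action modality."""
    if isinstance(f, (Top, Bottom, Atom)):
        return True
    if isinstance(f, Not):
        return is_classical(f.arg)
    if isinstance(f, (And, Or, Implies)):
        return is_classical(f.left) and is_classical(f.right)
    return False

context = And(Atom("Carrying"), Not(Atom("Delivered")))
effect_law = Box(Implies(Atom("Carrying"), After("release", Atom("Delivered"))))
goal = Diamond(Atom("Delivered"))
```

Here `context` is classical, whereas `effect_law` and `goal` are not, because they contain modal operators.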

Possibility and necessity are related as $\Diamond f \leftrightarrow \neg \Box \neg f$, and we write $\langle \upsilon \rangle f$ to mean $\neg [\upsilon] \neg f$. Goals are formulae having the form $\Diamond f$, where $f$ is classical. For every verb $\upsilon \in V$ and formula $f$, we have $\Box f \rightarrow [\upsilon] f$, which means that the formula $f$ is invariant; formulae of the form $\Box f$ constitute integrity constraints. When $f$ in $\Box f$ is classical, the constraint is a static constraint, while $\Box f$ where $f$ contains $[\upsilon]$ for some $\upsilon \in V$ is a dynamic constraint that describes some action law. We are especially concerned with effect constraints, which have the form $\Box(f \rightarrow [\upsilon] g)$ and describe the consequences of performing actions. For instance, $\Box(\mathit{Carrying} \rightarrow [\mathit{release}]\,\mathit{Delivered})$ states that if Carrying is true and the action release is executed, then Delivered will be made true.

Frame axioms are effect constraints of the form $\Box(f \rightarrow [\upsilon] f)$, which specify that action $\upsilon$ cannot cause the condition $\neg f$. For example, $\Box(\mathit{Carrying} \rightarrow [\mathit{travel}]\,\mathit{Carrying})$ specifies that Carrying will remain true after action travel is executed if it was true before. Formulae of the form $[\upsilon] f$ mean that after performing action $\upsilon$, condition $f$ will be true, so $[\mathit{release}]\,\neg \mathit{Carrying}$ says that Carrying is false after executing release, for instance. We can also relate alternate actions $\upsilon$ and $\psi$, depending on the truth or falsity of a current condition $f$, to yield an outcome $g$ using $((f \rightarrow [\upsilon] g) \wedge (\neg f \rightarrow [\psi] g))$, which we abbreviate as $[f\,?\,\upsilon : \psi]\, g$. A formula $\neg [\upsilon] \bot$ means that action $\upsilon$ is executable, while $[\upsilon] \bot$ means that $\upsilon$ is not executable.

The major weakness of using $\mathcal{LAP}$ directly lies in the large number of frame axioms required: because any action changes the truth value of relatively few formulae, these axioms are needed in bulk, which bogs down the inference procedure. Even though our current meta-cognitive definitions are small, these frame axioms would likely still be sufficiently numerous to cause problems for the limited computational resources we typically expect on the currently available small and cheap platforms our research is targeting. Consequently, our choice of logic of actions and plans is a variant of $\mathcal{LAP}$, namely $\mathcal{LAP}_{\mathcal{D}}$ [28], which employs a weak ternary causal dependence relation involving atoms, actions and conditions to overcome the frame problem. This logic also means that our meta-cognitive supervisor is able to derive indirect effects of actions, which we expect will be very important in terms of being able to reason about exposure to unacceptable failure.

The logic $\mathcal{LAP}_{\mathcal{D}}$ is a variant of the earlier $\mathcal{LAP}_{\leadsto}$ [29], which also solves the frame problem using a notion of weak causal dependence. The attraction for our problem of $\mathcal{LAP}_{\mathcal{D}}$ over $\mathcal{LAP}_{\leadsto}$ is that the former supports more compact domain descriptions without decreasing expressibility, increasing complexity or sacrificing decidability; using the latter would still force us to state conditional frame axioms explicitly, which have the form $\Box((f \wedge g) \rightarrow [\upsilon] g)$. In either system we still have to state indirect dependencies.

In contrast to the ternary dependence relation provided by our choice of $\mathcal{LAP}_{\mathcal{D}}$, the dependence relation in $\mathcal{LAP}_{\leadsto}$ is a binary relation $\leadsto$ between actions and literals; it is actually the complementary independence relation $\not\leadsto$ that is used to encode frame axioms, according to $\upsilon \not\leadsto a \Rightarrow \Box(\neg a \rightarrow [\upsilon] \neg a)$.

We use the logic $\mathcal{LAP}_{\mathcal{D}}$ because it includes an extra parameter in the dependence relation to capture the conditions under which actions may impact on atoms, thereby avoiding the need to state conditional frame axioms in defining the problem domain. We write the ternary contextual dependence relation as $f \mid \upsilon \leadsto a$, where $f$ is a classical formula, $\upsilon \in V$ is a verb and $a \in A$ is an atom, to mean that if $f$ is true then action $\upsilon$ may change the truth value of $a$. Note that weak contextual dependence does not mean that the action in the context causes a change in the truth value of the atom, only that the change might happen. For instance, $\mathit{Carrying} \wedge \neg \mathit{Delivered} \mid \mathit{release} \leadsto \mathit{Delivered}$ states that executing release when Carrying is true and Delivered is false may affect the value of Delivered; the conditional frame axiom $\Box((\neg \mathit{Carrying} \wedge \neg \mathit{Delivered}) \rightarrow [\mathit{release}]\,\neg \mathit{Delivered})$ is not needed in the domain description. In our case, the conditions $f$ in $f \mid \upsilon \leadsto a$ are always conjunctions of literals, because disjunctions can simply be split into separate dependence statements. Figure 3 contains a small example of a robot meta-cognition scenario in $\mathcal{LAP}_{\mathcal{D}}$.

The logic $\mathcal{LAP}_{\mathcal{D}}$ is decidable, with the satisfiability problem being EXPTIME-complete, which is the same decidability and complexity as the base system $\mathcal{LAP}$. A tableau method for $\mathcal{LAP}$ is simply a combination of the tableau rules for the logics S4 and K, while $\mathcal{LAP}_{\leadsto}$ and $\mathcal{LAP}_{\mathcal{D}}$ require additional rules to handle their dependency relations. To describe the tableau calculus, we use a notational variant of the definitions from [29] for the tableau rules for $\mathcal{LAP}_{\leadsto}$, with the different rules from [28] for the ternary weak dependence relation $\cdot \mid \cdot \leadsto \cdot$ in place of those for $\leadsto$.

A labeled formula is a pair $(n, f)$ where $f$ is a formula and $n$ is from a countable set of labels for possible worlds; we just use the non-negative integers $\mathbb{N}$. A Skeleton is a ternary relation $\Sigma \subseteq (V \cup \{\Box\}) \times \mathbb{N} \times \mathbb{N}$, which represents the accessibility relations between possible worlds under actions; we write $n \xrightarrow{\upsilon} m$ for $(\upsilon, n, m) \in \Sigma$. A Tree, which corresponds to an $\mathcal{LAP}_{\mathcal{D}}$-model, is a pair $(L, \Sigma)$ consisting of a set $L$ of labeled formulae and a skeleton $\Sigma$. A tableau for $f$ is then the limit of a sequence $T_k$, for $k = 0, 1, \ldots$, of sets of trees, where $T_0 = \{(\{(0, f)\}, \{\})\}$ and each $T_{k+1}$ is obtained from $T_k$ by the application of a tableau rule (see Figure 4). We also use additional rules for other connectives in practice, but they amount to simple combinations of the basic rules shown here.

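$$
\begin{aligned}
W = \{\, & \langle \mathit{travel} \rangle \top,\ \Box(\neg \mathit{Carrying} \rightarrow \langle \mathit{pickup} \rangle \top),\ \langle \mathit{release} \rangle \top, \\
& [\mathit{pickup}]\,\mathit{Carrying},\ [\mathit{release}]\,\neg \mathit{Carrying},\ \Box(\mathit{Carrying} \rightarrow [\mathit{release}]\,\mathit{Delivered}), \\
& \neg \mathit{Carrying} \mid \mathit{pickup} \leadsto \mathit{Carrying},\ \mathit{Carrying} \mid \mathit{release} \leadsto \mathit{Carrying}, \\
& \mathit{Carrying} \wedge \neg \mathit{Delivered} \mid \mathit{release} \leadsto \mathit{Delivered} \,\} \\
K = \{\, & \neg \mathit{Carrying},\ \neg \mathit{Delivered} \,\}
\end{aligned}
$$

Figure 3. A simple theory in $\mathcal{LAP}_{\mathcal{D}}$. Note the absence of frame axioms, which we would have in $\mathcal{LAP}$, and of conditional frame axioms, which we would still need in $\mathcal{LAP}_{\leadsto}$. We have $\models_{\mathcal{LAP}_{\mathcal{D}}} (S \wedge K) \rightarrow [\mathit{pickup}][\mathit{travel}][\mathit{release}](\neg \mathit{Carrying} \wedge \mathit{Delivered})$.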
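The role of the dependence statements in the Figure 3 scenario can be illustrated with a deliberately simplified forward progression of a fully known state. This is not the tableau procedure and not the authors' implementation: states are plain truth assignments, contexts are conjunctions of literals encoded as dictionaries, and an atom keeps its value under an action unless some dependence statement for that action has a context that holds. All names and encodings below are hypothetical.

```python
def holds(condition, state):
    """condition is a dict of required literal values, e.g. {"Carrying": True}."""
    return all(state.get(atom) == value for atom, value in condition.items())

# Hypothetical encoding of the Figure 3 theory.
EXECUTABLE = {"travel": {}, "pickup": {"Carrying": False}, "release": {}}
EFFECTS = {  # verb -> list of (condition, atom, new value)
    "pickup": [({}, "Carrying", True)],
    "release": [({}, "Carrying", False),
                ({"Carrying": True}, "Delivered", True)],
}
DEPENDS = {  # verb -> list of (context, atom whose value may change)
    "pickup": [({"Carrying": False}, "Carrying")],
    "release": [({"Carrying": True}, "Carrying"),
                ({"Carrying": True, "Delivered": False}, "Delivered")],
}

def progress(state, verb):
    """Successor state: atoms without an active dependence persist unchanged."""
    if not holds(EXECUTABLE[verb], state):
        raise ValueError(f"action {verb!r} is not executable in {state}")
    may_change = {atom for ctx, atom in DEPENDS.get(verb, []) if holds(ctx, state)}
    new_state = {}
    for atom, value in state.items():
        if atom not in may_change:
            new_state[atom] = value          # no frame axiom needed
            continue
        forced = {val for cond, a, val in EFFECTS.get(verb, [])
                  if a == atom and holds(cond, state)}
        # None marks an atom that may change but whose value is not determined.
        new_state[atom] = forced.pop() if len(forced) == 1 else None
    return new_state

def run_plan(state, plan):
    for verb in plan:
        state = progress(state, verb)
    return state

K = {"Carrying": False, "Delivered": False}
final_state = run_plan(K, ["pickup", "travel", "release"])
```

Note that `travel` has no dependence statements at all, so both atoms simply persist across it; this is the work that explicit frame axioms would otherwise have to do.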

$(\bot)$: if $(n, f) \in L$ and $(n, \neg f) \in L$ then add $(n, \bot)$ to $L$
$(\neg)$: if $(n, \neg\neg f) \in L$ then add $(n, f)$ to $L$
$(\wedge)$: if $(n, f \wedge g) \in L$ then add $(n, f)$ and $(n, g)$ to $L$
$(\vee)$: if $(n, \neg(f \wedge g)) \in L$ then add $(n, \neg f)$ to $L$ and add $(L \cup \{(n, \neg g)\}, \Sigma)$ to $T_{i+1}$
$(T)$: if $(n, \Box f) \in L$ then add $(n, f)$ to $L$
$(4)$: if $(n, \Box f) \in L$ and $n \xrightarrow{\upsilon} m$ for some $\upsilon \in V$ or $n \xrightarrow{\Box} m$ then add $(m, \Box f)$ to $L$
$(K_V)$: if $(n, [\upsilon] f) \in L$ and $n \xrightarrow{\upsilon} m$ then add $(m, f)$ to $L$
$(\langle \upsilon \rangle)$: if $(n, \neg[\upsilon] f) \in L$ then add $(m, \neg f)$ to $L$ and $n \xrightarrow{\upsilon} m$ to $\Sigma$, where $m$ is new
$(\Diamond)$: if $(n, \neg\Box f) \in L$ then add $(m, \neg f)$ to $L$ and $n \xrightarrow{\Box} m$ to $\Sigma$, where $m$ is new
$(R_P)$: if $(n, f) \in L$, where $f$ is a literal, $n \xrightarrow{\upsilon} m$, and $L \nvdash (n, g)$ for all $g \mid \upsilon \leadsto f$ then add $(m, f)$ to $L$
$(R_B)$: if $(m, f) \in L$, where $f$ is a literal, $n \xrightarrow{\upsilon} m$, and $L \nvdash (n, g)$ for all $g \mid \upsilon \leadsto f$ then add $(n, f)$ to $L$

Figure 4. Tableau rules for $\mathcal{LAP}_{\mathcal{D}}$, as a modification of the tableau rules for $\mathcal{LAP}_{\leadsto}$, with the final two rules replaced by new rules for handling the ternary dependence relation. The notation $L \nvdash (n, a)$ means that the literal $a$ cannot be verified in world $n$ from the formulae in $L$.

The rule $(R_P)$ states that every literal that depends on an action only in contexts that cannot be verified has to be propagated following the execution of that action. Rule $(R_B)$ is a back-propagation rule, which is required for completeness of the tableau method: it states that literals that are true, but whose truth value was not changed from the parent node in the tree, must also have been true in the parent node. See [29] and [28] for further details.
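To make the flavor of the calculus concrete, the sketch below implements only the classical, single-world fragment of such a tableau: closure on a contradiction, double negation, conjunction, and the branching rule for a negated conjunction. The modal rules and the dependence rules $(R_P)$ and $(R_B)$ are deliberately left out, and the tuple encoding of formulae and all function names are assumptions made for this illustration.

```python
def neg(f):
    return ("not", f)

def conj(f, g):
    return ("and", f, g)

def implies(f, g):
    """f -> g encoded as not (f and not g)."""
    return neg(conj(f, neg(g)))

def closed(branch):
    """A branch closes when it holds bottom, or both f and not f."""
    return any(f == ("bot",) or ("not", f) in branch for f in branch)

def expand(branch):
    """Return the open, fully expanded branches reachable from `branch`."""
    if closed(branch):
        return []
    for f in branch:
        if f[0] == "not" and f[1][0] == "not":        # double negation
            new = branch | {f[1][1]}
        elif f[0] == "and":                           # conjunction
            new = branch | {f[1], f[2]}
        elif f[0] == "not" and f[1][0] == "and":      # branching rule
            left = branch | {neg(f[1][1])}
            right = branch | {neg(f[1][2])}
            if left != branch and right != branch:
                return expand(left) + expand(right)
            continue                                  # already satisfied
        else:
            continue                                  # literal: nothing to do
        if new != branch:
            return expand(new)
    return [branch]

def satisfiable(formula):
    return bool(expand(frozenset([formula])))

carrying = ("atom", "Carrying")
delivered = ("atom", "Delivered")
# Carrying, Carrying -> Delivered and not Delivered cannot hold together.
inconsistent = conj(carrying, conj(implies(carrying, delivered), neg(delivered)))
```

A formula is reported satisfiable exactly when at least one fully expanded branch stays open; the full calculus adds new worlds and accessibility edges on top of this propositional core.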

Space precludes a full account of our test implementation; our system is in the pure functional language Haskell. We use a straightforward algebraic type for representing formulas, and labeled formulas are represented using a simple type synonym. The ordering of the clauses is important: formulas causing tableau branching are lower in the definition, which means that the ordering on formulas that the compiler generates by declaring the type to be an instance of “Ord” is used to prioritize reduction rule application to push branching towards the lower part of the tableau.

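data LAPFormula = F | T | Atom String | Not LAPFormula | And LAPFormula LAPFormula
                | Necessary LAPFormula | Possible LAPFormula
                | Cause String LAPFormula -- ... (further constructors elided)
                deriving (Eq, Ord)

-- A labeled formula pairs a possible-world index with a formula.
type Formula = (Int, LAPFormula)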
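As a minimal illustration of how a derived ordering follows constructor position, consider the following sketch. The cut-down type `MiniFormula`, its constructors and the function `reductionOrder` are hypothetical, and `Data.List.sort` merely stands in for the ordered tree used in our implementation.

```haskell
import Data.List (sort)

-- Hypothetical cut-down formula type. The only branching constructor, MOr,
-- is declared last, so the derived Ord instance ranks it above all others.
data MiniFormula
  = MF
  | MT
  | MAtom String
  | MNot MiniFormula
  | MAnd MiniFormula MiniFormula
  | MOr MiniFormula MiniFormula
  deriving (Eq, Ord, Show)

-- Sorting with the derived ordering lists non-branching formulas first.
reductionOrder :: [MiniFormula] -> [MiniFormula]
reductionOrder = sort
```

Derived `Ord` instances compare constructors in declaration order before comparing their arguments, which is what makes the position of the branching constructors matter. For labeled pairs, the standard tuple ordering compares the world index before the formula.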

Space precludes a complete account of our test implementation, so we offer an abbreviated description focussing on the core components. In addition to the tableau evaluation module, we also have parser modules and a number of supporting data structures, particularly for representing skeleton relations and dependency relations. The tableau is implemented in operational semantics style, using an algebraic data type to represent a set of abstract operations on tableau branches. As a sample: “Fail” represents an error condition, “Result” carries a return value, “Get” retrieves a formula for reduction, “Put” adds a labeled formula to the branch where it will be subject to further reduction, “Fresh” generates a new possible world index, “Claim” asserts accessibility between possible worlds under actions, “Close” closes the tableau branch by asserting a contradiction for a given world index, and “Split” forks the tableau into two tableaux.

data Tableau u = Fail
               | Result u
               | Get (Formula -> Tableau u)
               | Put Formula (Tableau u)
               | Fresh (Int -> Tableau u)
               | Claim (Int, String, Int) (Tableau u)
               | Close Int (Tableau u)
               | Split (Tableau u) (Tableau u)


There are also constructors for other operations such as those for adding labeled literals to a separate list where they will not be subject to further reduction steps, for searching the skeleton relation, and for searching the dependency relations (our implementation actually supports $\mathcal{LAP}$ as well as $\mathcal{LAP}_{\mathcal{D}}$). These operations all follow the same basic pattern as those shown, so we have omitted them for brevity.

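import Control.Monad (ap, liftM)

-- Functor and Applicative instances are required by GHC 7.10 and later.
instance Functor Tableau where
    fmap = liftM

instance Applicative Tableau where
    pure = Result
    (<*>) = ap

instance Monad Tableau where
    return = pure
    Fail >>= _ = Fail
    (Result u) >>= f = f u
    (Get g) >>= f = Get $ \x -> g x >>= f
    (Put x t) >>= f = Put x (t >>= f)
    (Fresh g) >>= f = Fresh $ \n -> g n >>= f
    (Claim c t) >>= f = Claim c (t >>= f)
    (Close n t) >>= f = Close n (t >>= f)
    (Split t t') >>= f = Split (t >>= f) (t' >>= f)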


It is easy to verify that this obeys the left unit, right unit and associativity monad laws. For convenience, we use some simple wrapper functions around the constructors, rather than using them directly. For instance:

get :: Tableau Formula

get = Get Result

put :: Formula -> Tableau ()

put x = (Put x . Result) ()

close :: Int -> Tableau ()

close n = (Close n . Result) ()

The instantiation of the abstract machine uses a data type that has separate structures for holding labeled formulas that will be subject to further reduction, those that will not be further used to fire rules, the skeleton relation of accessibility between possible worlds, and the ternary dependency relation. Here “BBTree” is an ordered tree type that implements the priority relation on formulas, so that non-branching formulas are preferred for reduction over those that cause branching. “Skeleton” is an indexed tree structure supporting the various kinds of searches needed on accessibility relation instances, and “Dependency” represents the ternary dependency relation and similarly allows the necessary searches on its instances.

data Branch = Branch {
      todo  :: BBTree Formula
    , lits  :: BBTree Formula
    , rho   :: Skeleton
    , index :: Int
    , deps  :: Dependency
    } deriving (Eq, Ord)
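To give a rough picture of the kind of search the skeleton relation has to support, the following sketch replaces the indexed tree with a plain list of triples. The names `Edge`, `successors` and `successorsBy` are hypothetical and are not part of our implementation; a list is assumed here only for brevity, at the cost of linear-time lookups.

```haskell
-- Hypothetical stand-in for the skeleton relation: a list of
-- (source world, verb, target world) triples instead of an indexed tree.
type Edge = (Int, String, Int)

-- All (verb, world) pairs directly accessible from world n.
successors :: [Edge] -> Int -> [(String, Int)]
successors edges n = [(v, m) | (n', v, m) <- edges, n' == n]

-- Worlds reachable from world n by one step of a specific verb.
successorsBy :: [Edge] -> String -> Int -> [Int]
successorsBy edges v n = [m | (v', m) <- successors edges n, v' == v]
```

An indexed structure becomes worthwhile once branches accumulate many worlds, since rules that propagate formulas along accessibility edges perform this lookup repeatedly.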

The central function is “runTab”, which defines how the abstract operations should be applied to branch structures. The rest of the cases (not shown) follow the same basic pattern.

runTab :: (Ord u) => Tableau u -> Branch -> BBTree (u, Branch)
runTab Fail _ = nil
runTab (Result y) b = singleton (y, b)
-- delmin is assumed to return the minimal element paired with the remaining tree
runTab (Get g) b = case delmin (todo b) of
    Nothing -> nil
    Just (x, xs) -> runTab (g x) (b { todo = xs })
runTab (Put f t) b = let b' = into b f
                     in runTab t b'
runTab (Fresh g) b = let n = 1 + index b
                         b' = b { index = n }
                     in runTab (g n) b'
runTab (Close n t) b = let b' = b { todo = nil, lits = singleton (n, F) }
                       in runTab t b'
runTab (Split t t') b = runTab t b `mplus` runTab t' b


The functions “nil” and “singleton” build trees with zero and one element, respectively. The case for “Put” uses an auxiliary function “into” that checks first to see if the branch already contains the negation of the formula to be inserted into the branch and, if so, inserts a formula containing contradiction “F” instead. This approach provides us with a very compact and natural definition for the tableau reduction rules, as illustrated below.

reduce :: Formula -> Tableau ()
reduce (n, F) = close n
reduce (n, T) = return ()
reduce (n, And x y) = put (n, x) >> put (n, y)
reduce (n, Or x y) = put (n, x) `mplus` put (n, y)
reduce (n, Not (Necessary x)) = fresh >>= \n' -> put (n', neg x)
                                              >> claim (n, "[]", n')
reduce (n, Necessary x) = put (n, x) >> fromN n
                          >>= mapM_ (\(_, m) -> put (m, Necessary x))


The second to last case implements rule $(\Diamond)$. The last case in the snippet implements rules $(T)$ and $(4)$. It uses a function “fromN” that returns a list of verbs and world indexes accessible from a given world index, and the standard library function “mapM_” to map each of these to actions that insert each resulting formula into the branch. A program for the abstract tableau machine for completely expanding a branch is also very simple.

expand :: Tableau ()
expand = isComplete >>= \c -> if c then return ()
                                   else get >>= reduce >> expand

Given an initial branch, say b, the tableau program can be applied to it with runTab expand b. We also have a number of convenience functions for producing an initial branch data structure from a list of formulas and dependency relation instances, and for extracting the results from the resulting tree of closed and saturated branches. Their implementation is straightforward though somewhat tedious.
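The overall design can be tried out on a much smaller scale. The sketch below is a self-contained propositional miniature of the abstract tableau machine, written for illustration only: it has no modal operators, no skeleton and no dependence relation, it uses plain lists where our implementation uses ordered trees, and its “Get” returns a `Maybe` value in place of a separate completeness test. The names `Prop`, `Tab`, `isLiteral`, `isClosed` and `satisfiable` are hypothetical.

```haskell
import Control.Monad (ap, liftM)
import Data.List (sort)

data Prop = F | T | Atom String | Not Prop | And Prop Prop | Or Prop Prop
  deriving (Eq, Ord, Show)

type Labeled = (Int, Prop)

-- Abstract branch operations (a subset of those described in the text).
data Tab u
  = Fail
  | Result u
  | Get (Maybe Labeled -> Tab u)
  | Put Labeled (Tab u)
  | Close Int (Tab u)
  | Split (Tab u) (Tab u)

instance Functor Tab where fmap = liftM
instance Applicative Tab where
  pure = Result
  (<*>) = ap
instance Monad Tab where
  Fail >>= _ = Fail
  Result u >>= f = f u
  Get g >>= f = Get (\x -> g x >>= f)
  Put x t >>= f = Put x (t >>= f)
  Close n t >>= f = Close n (t >>= f)
  Split t t' >>= f = Split (t >>= f) (t' >>= f)

-- Branch state: pending formulas and literals that need no further reduction.
data Branch = Branch { todo :: [Labeled], lits :: [Labeled] }
  deriving (Eq, Show)

neg :: Prop -> Prop
neg (Not p) = p
neg p = Not p

isLiteral :: Prop -> Bool
isLiteral (Atom _) = True
isLiteral (Not (Atom _)) = True
isLiteral _ = False

-- Insert a labeled formula, replacing it by F if its negation is present.
into :: Branch -> Labeled -> Branch
into b (n, f)
  | (n, neg f) `elem` (todo b ++ lits b) = b { todo = (n, F) : todo b }
  | isLiteral f = b { lits = (n, f) : lits b }
  | otherwise = b { todo = (n, f) : todo b }

-- Interpreter: one (result, final branch) pair per leaf of the tableau.
runTab :: Tab u -> Branch -> [(u, Branch)]
runTab Fail _ = []
runTab (Result y) b = [(y, b)]
runTab (Get g) b = case sort (todo b) of
  [] -> runTab (g Nothing) b
  (x : xs) -> runTab (g (Just x)) (b { todo = xs })
runTab (Put x t) b = runTab t (into b x)
runTab (Close n t) b = runTab t (b { todo = [], lits = [(n, F)] })
runTab (Split t t') b = runTab t b ++ runTab t' b

get :: Tab (Maybe Labeled)
get = Get Result

put :: Labeled -> Tab ()
put x = Put x (Result ())

reduce :: Labeled -> Tab ()
reduce (n, F) = Close n (Result ())
reduce (n, And x y) = put (n, x) >> put (n, y)
reduce (n, Or x y) = Split (put (n, x)) (put (n, y))
reduce (n, Not (Not x)) = put (n, x)
reduce (n, Not (And x y)) = Split (put (n, neg x)) (put (n, neg y))
reduce (n, Not (Or x y)) = put (n, neg x) >> put (n, neg y)
reduce (n, Not T) = put (n, F)
reduce _ = return () -- T, Not F and literals need no further work

-- Reduce pending formulas until none remain.
expand :: Tab ()
expand = get >>= \mx -> case mx of
  Nothing -> return ()
  Just x -> reduce x >> expand

isClosed :: Branch -> Bool
isClosed b = any ((== F) . snd) (lits b)

-- A formula is satisfiable if some fully expanded branch remains open.
satisfiable :: Prop -> Bool
satisfiable f = any (not . isClosed . snd) (runTab expand (into (Branch [] []) (0, f)))
```

Keeping the rules (`reduce`) separate from the interpreter (`runTab`) is the point of the design: the rules are written once against the abstract operations, while the branch representation can be changed, for instance from lists to ordered trees, without touching them.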

6. Conclusions

In this paper, we have presented cognitive and meta-cognitive supervisor models for joint swarms of robotic aerial and ground vehicles. Based on the research recognized by the 2014 Nobel Prize in Physiology or Medicine on path integration and navigation in the mammalian and human hippocampus (briefly reviewed in the appendix), this paper develops a Hamiltonian path integral cognitive supervisor model. This model emulates an $n$-dimensional recurrent neural network, yielding attractor fields for robotic swarms. While direct simulation of this Hamiltonian path integral can be done using IBM’s TrueNorth chip, for the purpose of its immediate evaluation on common hardware we have transformed it into a coupled pair of NLS equations and simulated these in Mathematica.

The central point of our meta-cognitive model is that we are not utilising inference in modal logic systems to determine low-level movement, but rather to simulate a kind of high-level awareness in light of descriptive goals, general conditions and broad actions, in order to make sense of sensor data and guide overall vehicle behaviour. Our representation of meta-cognitive communication using simple atoms and verbs reflects this choice. By effectively delegating details, the relatively simple models supported by the propositional multi-modal logic $\mathcal{LAP}_{\mathcal{D}}$ suffice, with decidable inference and, in practice, reasonable computation costs and fairly compact meta-cognitive behavioural definitions. We plan to replace our current use of a tableau method for meta-cognitive inference in the multimodal logic $\mathcal{LAP}_{\mathcal{D}}$ with a new path integral representation; that is, to eventually fully integrate meta-cognition and affine Hamiltonian control into what would amount to a single coherent generalized Hopfield-type recurrent neural network.


The authors are grateful to Dr Yi Yue and Dr Martin Oxenham, Decision Sciences, Joint and Operations Analysis Division, DST Group, Australia, for their constructive comments, which have improved the quality of this paper. This work is a part of the DSTG TAS SRI (Tyche) project.

Cite this paper

Ivancevic, V.G. and Reid, D.J. (2017) Cognitive Supervisor for an Autonomous Swarm of Robots. Intelligent Control and Automation, 8, 44-65.


  1. Ivancevic, V. and Yue, Y. (2016) Hamiltonian Dynamics and Control of a Joint Autonomous Land-Air Operation. Nonlinear Dynamics, 84, 1853-1865.

  2. McNaughton, B.L., Battaglia, F.P., Jensen, O., Moser, E.I. and Moser, M.-B. (2006) Path Integration and the Neural Basis of the Cognitive Map. Nature Reviews Neuroscience, 7, 663-678.

  3. Moser, E.I., Kropff, E. and Moser, M.-B. (2008) Place Cells, Grid Cells, and the Brain’s Spatial Representation System. Annual Review of Neuroscience, 31, 69-89.

  4. O’Keefe, J., Burgess, N., Donnett, J.G., Jeffery, K.J. and Maguire, E.A. (1998) Place Cells, Navigational Accuracy, and the Human Hippocampus. Philosophical Transactions of the Royal Society of London A, 353, 1333-1340.

  5. Monteiro, S., Vaz, M. and Bicho, E. (2004) Attractor Dynamics Generates Robot Formation, from Theory to Implementation. International Conference on Robotics and Automation, Vol. 3, New Orleans, 26 April-1 May 2004, 2582-2586.

  6. O’Keefe, J. (1976) Place Units in the Hippocampus of the Freely Moving Rat. Experimental Neurology, 51, 78-109.

  7. O’Keefe, J. and Nadel, L. (1978) The Hippocampus as a Cognitive Map. Clarendon Press, Oxford.

  8. Ivancevic, V. and Ivancevic, T. (2008) Complex Nonlinearity, Chaos, Phase Transitions, Topology Change and Path Integrals. Springer, New York.

  9. Ivancevic, V. and Reid, D. (2015) Complexity and Control, towards a Rigorous Behavioral Theory of Complex Dynamical Systems. World Scientific, Singapore.

  10. Feynman, R.P. (1951) An Operator Calculus Having Applications in Quantum Electrodynamics. Physical Review, 84, 108-128.

  11. Feynman, R.P. (1948) Space-Time Approach to Non-Relativistic Quantum Mechanics. Reviews of Modern Physics, 20, 267.

  12. Ivancevic, V., Reid, D. and Scholz, J. (2014) Action-Amplitude Approach to Controlled Entropic Self-Organization. Entropy, 16, 2699-2712.

  13. Hopfield, J.J. (1982) Neural Networks and Physical Systems with Emergent Collective Computational Abilities. Proceedings of the National Academy of Sciences, 79, 2554-2558.

  14. Hopfield, J.J. (1984) Neurons with Graded Response Have Collective Computational Properties Like Those of Two-State Neurons. Proceedings of the National Academy of Sciences, 81, 3088-3092.

  15. Service, R.F. (2014) The Brain Chip. Science, 345, 614-668.

  16. Ivancevic, V. and Ivancevic, T. (2009) Quantum Neural Computation. Springer, Berlin.

  17. Ivancevic, V. and Reid, D. (2012) Turbulence and Shock-Waves in Crowd Dynamics. Nonlinear Dynamics, 68, 285-304.

  18. Ivancevic, V. (2010) Adaptive-Wave Alternative for the Black-Scholes Option Pricing Model. Cognitive Computation, 2, 17-30.

  19. Ivancevic, V. (2011) Adaptive Wave Models for Sophisticated Option Pricing. Journal of Mathematical Finance, 1, 41-49.

  20. Manakov, S.V. (1974) On the Theory of Two-Dimensional Stationary Self-Focusing of Electromagnetic Waves. Soviet Physics, 38, 248-253. (In Russian)

  21. Lax, P. (1968) Integrals of Nonlinear Equations of Evolution and Solitary Waves. Communications on Pure and Applied Mathematics, 21, 467-490.

  22. Haelterman, M. and Sheppard, A.P. (1994) Bifurcation Phenomena and Multiple Soliton Bound States in Isotropic Kerr Media. Physical Review E, 49, 3376-3381.

  23. Yang, J. (1997) Classification of the Solitary Wave in Coupled Nonlinear Schrödinger Equations. Physica D, 108, 92-112.

  24. Hanm, S.-H. and Koh, I.G. (1999) Stability of Neural Networks and Solitons of Field Theory. Physical Review E, 60, 7608-7611.

  25. Fischer, M.J. and Ladner, R.E. (1979) Propositional Dynamic Logic of Regular Programs. Journal of Computer and System Sciences, 18, 194-211.

  26. Zhang, D. and Foo, N.Y. (2001) EPDL, a Logic for Causal Reasoning. Proceedings of the IJCAI 2001, Seattle, 4-10 August 2001, 131-138.

  27. Hughes, G.E. and Cresswell, M.J. (1996) A New Introduction to Modal Logic. Routledge, Abingdon-on-Thames.

  28. Castilho, M.A., Herzig, A. and Varzinczak, I. (2002) It Depends on the Context! A Decidable Logic of Actions and Plans Based on a Ternary Dependence Relation. NMR’02, Toulouse, 19-21 April 2002, 343-348.

  29. Castilho, M.A., Gasquet, O. and Herzig, A. (1999) Formalizing Action and Change in Modal Logic 1, the Frame Problem. Journal of Logic and Computation, 9, 701-735.

  30. Hebb, D.O. (1949) The Organization of Behavior. Wiley, New York.

  31. O’Keefe, J. and Dostrovsky, J. (1971) The Hippocampus as a Spatial Map, Preliminary Evidence from Unit Activity in the Freely Moving Rat. Brain Research, 34, 171-175.

  32. Mittelstaedt, M.L. and Mittelstaedt, H. (1980) Homing by Path Integration in a Mammal. Naturwissenschaften, 67, 566-567. (In German)

  33. Ranck, J.B. (1985) Electrical Activity of the Archicortex. Akademiai Kiado, Budapest, 217-220.

  34. Bostock, E., Muller, R.U. and Kubie, J.L. (1991) Experience-Dependent Modifications of Hippocampal Place Cell Firing. Hippocampus, 1, 193-205.

  35. McNaughton, B.L., Chen, L.L. and Markus, E.J. (1991) “Dead Reckoning”, Landmark Learning, and the Sense of Direction, a Neurophysiological and Computational Hypothesis. Journal of Cognitive Neuroscience, 3, 190-202.

  36. Wilson, M.A. and McNaughton, B.L. (1993) Dynamics of the Hippocampal Ensemble Code for Space. Science, 261, 1055-1058.

  37. O’Keefe, J. and Recce, M.L. (1993) Phase Relationship between Hippocampal Place Units and the EEG Theta Rhythm. Hippocampus, 3, 317-330.

  38. McNaughton, B.L., et al. (1996) Deciphering the Hippocampal Polyglot, the Hippocampus as a Path Integration System. Journal of Experimental Biology, 199, 173-185.

  39. Tsodyks, M. and Sejnowski, T. (1995) Associative Memory and Hippocampal Place Cells. International Journal of Neural Systems, 6, S81-S86.

  40. Zhang, K. (1996) Representation of Spatial Orientation by the Intrinsic Dynamics of the Head-Direction Cell Ensemble, a Theory. Journal of Neuroscience, 16, 2112-2126.

  41. Samsonovich, A. and McNaughton, B.L. (1997) Path Integration and Cognitive Mapping in a Continuous Attractor Neural Network Model. Journal of Neuroscience, 17, 5900-5920.

  42. O’Keefe, J. (1999) Do Hippocampal Pyramidal Cells Signal Non-Spatial as Well as Spatial Information? Hippocampus, 9, 352-364.

  43. Fyhn, M., Molden, S., Witter, M.P., Moser, E.I. and Moser, M.-B. (2004) Spatial Representation in the Entorhinal Cortex. Science, 305, 1258-1264.

  44. Hafting, T., Fyhn, M., Molden, S., Moser, M.-B. and Moser, E.I. (2005) Microstructure of a Spatial Map in the Entorhinal Cortex. Nature, 436, 801-806.

  45. Witter, M.P. and Moser, E.I. (2006) Spatial Representation and the Architecture of the Entorhinal Cortex. Trends in Neurosciences, 29, 671-678.

  46. O’Keefe, J. and Burgess, N. (2005) Dual Phase and Rate Coding in Hippocampal Place Cells, Theoretical Significance and Relationship to Entorhinal Grid Cells. Hippocampus, 15, 853-866.

  47. Solstad, T., Moser, E.I. and Einevoll, G.T. (2006) From Grid Cells to Place Cells, a Mathematical Model. Hippocampus, 16, 1026-1031.

  48. Burgess, N., Barry, C. and O’Keefe, J. (2007) An Oscillatory Interference Model of Grid Cell Firing. Hippocampus, 17, 801-812.

  49. Yartsev, M.M. and Ulanovsky, N. (2013) Representation of Three-Dimensional Space in the Hippocampus of Flying Bats. Science, 340, 367-372.

  50. Yartsev, M.M. (2013) Space Bats, Multidimensional Spatial Representation in the Bat. Science, 342, 573-574.

  51. Turing, A.M. (1952) The Chemical Basis of Morphogenesis. Philosophical Transactions of the Royal Society B, 237, 37-72.


Hippocampal Path Integration and Navigation in Mammals and Humans

We start with a brief history of hippocampal navigation, from O’Keefe’s pioneering work to the discovery of grid cells by the Mosers.

While most of neural network theory (including the concepts of associative synaptic plasticity, cell assemblies and phase sequences) is founded on Hebb’s seminal work [30], and the hippocampus as a spatial cognitive map was proposed in [7] [31], the pioneering paper on hippocampal navigation was [6], in which O’Keefe suggested a landmark-independent navigational system upstream of the hippocampus. A few years later, path integration in mammals was reported in [32], followed by a quantitative description of head direction-sensitive cells in the brain in [33], a report of remapping in hippocampal place cells in [34], and an early version of the head-direction path-integrator model in [35], which formed the conceptual basis of subsequent continuous attractor models for path integration.

A landmark paper [36] advanced the empirical understanding of hippocampal neurodynamics through the ability to record simultaneously from many neurons in a freely behaving animal. The phase relationship between hippocampal place units and the EEG theta rhythm was shown in [37], and the hippocampus as a path-integration system was proposed in [38].

A series of continuous attractor papers started with [39], followed by an attractor model of head-direction cells based on angular velocity integration in [40], and by the introduction of the concept of periodic boundaries and an early introduction of medial entorhinal grid cells in mammals in [41].

The next two papers by O’Keefe [4] [42] consider human hippocampal place cells, which signal both spatial and non-spatial information.

The pioneering study [43] reports that spatial position is represented accurately among ensembles of principal neurons in superficial layers of the medial entorhinal cortex (MEC), while the scale of representation increases along the MEC’s dorsoventral axis. It was followed by [44], which reports the discovery of grid cells, suggested as a foundation for a universal path integration-based neuronal map of the spatial environment. Spatial representation and the architecture of the entorhinal cortex were presented in [45].

Next, we give a brief overview of the current understanding of the hippocampal formation: place cells and grid cells.

The review paper [2] shows that the hippocampal formation is able to encode the relative spatial location of mammals and humans (without any reference to external cues) by the integration of translational and rotational self-motion, which is called path integration.

Both theoretical and empirical studies show that the synaptic matrix of the MEC grid cells of young mammals performs heavy self-organizing path-integration computations, similar to Turing’s symmetry-breaking operation⁵, while the scale at which space is represented increases systematically along the dorsoventral axis in both the hippocampus and the MEC. Spatially periodic inputs (at multiple scales) converging from the MEC grid cells result in non-periodic spatial firing of the hippocampal place cells.

The paper [3] reviews how place cells and grid cells form the entorhinal-hippocampal representations, initially observed in [46] and mathematically modeled in [47] [48], for quantitative spatio-temporal representation of places, routes, and associated experiences during behavior and in memory.

It has been observed that place cells perform both pattern completion and pattern separation, while hippocampal representations cannot always be discontinuous as in a sequential Hopfield network [13], but are rather similar to the graded-response Hopfield network [14].

Finally, while all the research mentioned so far deals with 2D hippocampal path integration and navigation, which is relevant for our UGVs, in recent years this research has been generalized to the 3D navigation of bats in [49] [50].


¹ For technical details on the Nobel-awarded work of John O’Keefe, May-Britt Moser and Edvard I. Moser, see the Appendix and the references therein.

² For simplicity, we are using the same Hamiltonian symbols, $q$ and $p$, for the cognitive representation of robotic coordinates and momenta, to emphasize the one-to-one correspondence between the physical robotic level and the mental supervisor level. However, while at the physical robotic level $q = q(t)$ and $p = p(t)$ are only temporal variables, at the cognitive level $q = q(t, x)$ and $p = p(t, x)$ represent spatiotemporal wave functions.

³ The path integral (4) was formulated by R. Feynman in [10]. It has been widely appreciated that the phase-space (i.e., Hamiltonian) path integral is more generally applicable, or more robust, than the original, Lagrangian version of the path integral, introduced in Feynman’s first paper [11]. For example, the original Lagrangian path integral is satisfactory for Lagrangians of the form $L(x) = \frac{1}{2} m \dot{x}^2 + A(x)\,\dot{x} - V(x)$, but it is unsuitable, e.g., for the case of a particle with the Lagrangian (in normal units) $L(x) = -m\sqrt{1 - \dot{x}^2}$. For such a system (as well as many more general expressions) the Hamiltonian path integral is more robust; e.g., the Hamiltonian path integral for the free particle, $\int \mathcal{D}[p]\,\mathcal{D}[q]\, \exp\left\{ i \int \left[ p\,\dot{q} - \sqrt{p^2 + m^2} \right] dt \right\}$, is readily evaluated.

⁴ The Manakov system has been used to describe the interaction between wave packets in dispersive conservative media, and also the interaction between orthogonally polarized components in nonlinear optical fibres (see, e.g., [22] [23] and the references therein).

⁵ Turing’s landmark paper [51] demonstrated that symmetry breaking can occur in a simple reaction-diffusion system, and that the resulting spatially periodic structures can account for pattern formation in nature.