STEAM: Spike Time Encoded Addressable Memory

September 22, 2017

By Dr. Vinnie Monaco and Manny Vindiola, ARL

Computing architectures are on the cusp of a fundamental shift toward concurrency, as sequential models struggle to process large amounts of data in real time [1]. Alongside this shift, neural-inspired computing architectures designed to operate in size, weight, and power (SWaP) constrained environments have begun to emerge. Such massively parallel architectures are designed to mimic the way the brain functions through distributed representations and event-driven computation. While the conventional von Neumann architecture suffers from a communication bottleneck imposed by memory that is physically separated from the central processing unit [2], neural-inspired architectures can achieve extremely high processing throughput and power efficiency by co-locating memory with neuron-like computation units. In addition, the atomicity and simplicity of neuron-like units open the door to exploiting novel materials for computation, moving beyond silicon and digital architectures. Such architectures could be utilized in a wide range of applications where low power and high processing throughput are needed, such as medical, robotic, space, and aeronautic systems. However, programming such architectures remains a challenge.

Neural processing units (NPUs) hold the promise of operating in SWaP-constrained environments while being robust to component failure, yet symbolic processing on such architectures remains an open question. An outstanding goal in both the computational and cognitive sciences is integrated symbolic/sub-symbolic computation on a neural architecture. Many applications have focused on the latter, utilizing the NPU as an accelerator for deep neural networks and brain models, despite NPUs being capable of universal computation. As a result, current NPUs must be co-located with a central processing unit, which performs the symbolic processing tasks required for basic device functionality, such as resource management, input/output control, and other tasks that require predefined behavior. These tasks are perhaps best left to symbolic programming techniques, which the CPU has a demonstrated ability to perform. But reliance on the CPU brings physically separated memory and the von Neumann bottleneck, forfeiting some of the benefits of event-driven computation. Utilizing the NPU for symbolic processing tasks and building a complete neuromorphic device requires a computation paradigm compatible with the asynchronous and event-driven nature of neural architectures.

Greater interaction between computational science and neuroscience promises to aid in the development of such architectures. Humans are clearly capable of both pattern recognition and symbolic processing, but it is not yet clear how to efficiently integrate the two on a single architecture. Spiking neural networks (SNNs), also capable of universal computation, enjoy the properties of both digital computation, through spike event rates, and analog computation, through the precise timing between spikes. SNNs have become a focal point for near-future neuromorphic architectures, as a variety of materials have been shown to emulate the behavior of the spiking neuron. At the nanoscale, magnetic tunnel junctions can mimic the behavior of a biological neuron and appear to be a promising candidate for low-power neuromorphic computation.
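
To make the rate/timing distinction concrete, the short Python sketch below encodes the same scalar value in two ways: as a spike count within a fixed window (rate coding) and as the gap between a pair of spikes (interval coding, in the spirit of STICK). The function names, window length, and rate ceiling are illustrative assumptions rather than part of any particular framework.

def rate_encode(x, window_ms=10.0, max_rate_hz=1000.0):
    """Digital-style coding: the value sets how many spikes occur in the window."""
    n_spikes = round(x * max_rate_hz * window_ms / 1000.0)
    if n_spikes == 0:
        return []
    step = window_ms / n_spikes
    return [i * step for i in range(n_spikes)]  # evenly spaced spike times (ms)

def interval_encode(x, t_min_ms=1.0, t_max_ms=10.0):
    """Analog-style coding: the value sets the gap between two spikes."""
    gap = t_min_ms + x * (t_max_ms - t_min_ms)
    return [0.0, gap]  # first spike marks "start"; the delay of the second carries x

def interval_decode(spike_times, t_min_ms=1.0, t_max_ms=10.0):
    """Recover the value from the inter-spike interval."""
    gap = spike_times[1] - spike_times[0]
    return (gap - t_min_ms) / (t_max_ms - t_min_ms)

print(rate_encode(0.6))                       # six spikes spread across the window
print(interval_encode(0.6))                   # [0.0, 6.4]: a 6.4 ms gap carries the value
print(interval_decode(interval_encode(0.6)))  # ~0.6, recovered up to float rounding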

More recently, graphene excitable lasers have demonstrated integrate-and-threshold behavior analogous to a spiking neuron, with response times on the order of picoseconds. These architectures open the door to exciting new applications, such as efficient constraint satisfaction, real-time radio-frequency processing, and high-speed integer factorization. Yet a hurdle remains in programming such devices to perform the sometimes mundane symbolic processing tasks, such as variable binding, reference passing, copying, sorting, and searching.

[Figures 1 and 2: videos]

We have demonstrated how addressable memory can be implemented in such a neural processing architecture. As a step toward integrated symbolic/sub-symbolic computation on a neural architecture, we developed Spike Time Encoded Addressable Memory, or STEAM, a framework for persistent, addressable memory in spiking neural networks. STEAM is built using principles from the STICK framework [4], which encodes values using the time intervals between spike events and relies on precise timing and synchrony to perform event-driven computation (fig. 1). Location-addressable memory is implemented through a number of network primitives that encapsulate flow control and persistence, enabling symbolic processing constructs such as variable binding and reference passing (fig. 2). Memory addressing schemes are abstracted away from the memory itself, allowing the same memory contents to be accessed through multiple data structures, such as arrays and linked lists (fig. 3). Finally, in an example application, we demonstrate how sorting can be performed in O(N) time and O(N) space, taking advantage of the inherent parallelism of spiking neural architectures. The STEAM framework can be implemented entirely with basic leaky integrate-and-fire (LIF) neurons, and we have done so on the TrueNorth neuromorphic processor.
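
To convey the intuition behind the timing-based sort, the sketch below assigns each value to a notional neuron that fires after a delay proportional to the value; the readout simply records values in spike-arrival order. It is a toy illustration of the principle under the assumption of non-negative integer values, not the STEAM or TrueNorth implementation.

def spike_sort(values):
    """Sort non-negative integers by scheduling one spike per value."""
    # Each value is mapped to a firing time: one tick per unit of value.
    fire_at = {}
    for v in values:
        fire_at.setdefault(v, []).append(v)
    # Sweep time; spikes that arrive earlier correspond to smaller values.
    sorted_values = []
    for t in range(max(values) + 1):
        sorted_values.extend(fire_at.get(t, []))
    return sorted_values

print(spike_sort([7, 2, 9, 4, 1, 4]))  # [1, 2, 4, 4, 7, 9]

In this serial simulation the time sweep makes the cost depend on the largest value (it is effectively a counting sort); on a spiking substrate all delays elapse simultaneously, so each neuron performs only constant work.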

Addressable memory in a spiking architecture would benefit two areas: memory-augmented machine learning networks and symbolic processing. This work focuses only on memory structure and addressing mechanisms in spiking neural networks, leaving training and compilation techniques that could utilize such memory structures for future work. Whereas symbolic memory in rate-encoded networks has often been differentiable and taken a form similar to that of [3], the representation of symbolic memory in an SNN is non-trivial. Since only basic LIF neurons are needed, the methods are applicable to a wide range of current and future neural architectures. This work does not aim to provide a biologically plausible or even biologically inspired memory, but rather a practical framework for addressable memory on neuromorphic architectures.

References

[1] Herb Sutter and James Larus: "Software and the concurrency revolution," Queue, pp. 54-62, 2005.

[2] John Backus: "Can programming be liberated from the von Neumann style? A functional style and its algebra of programs," Communications of the ACM, pp. 613-641, 1978.

[3] Edward Grefenstette, Karl Moritz Hermann, Mustafa Suleyman, and Phil Blunsom: "Learning to transduce with unbounded memory," Advances in Neural Information Processing Systems, pp. 1828-1836, 2015.

[4] Xavier Lagorce and Ryad Benosman: "STICK: Spike time interval computational kernel, a framework for general purpose computation using neurons, precise timing, delays, and synchrony," Neural Computation, 2015.


Last Update / Reviewed: September 22, 2017