Jul 26, 2015 - DiGiovanna 2009: Coadaptive Brain-Machine Interface via Reinforcement Learning

tags: Machine Learning, Deep Learning, Reinforcement Learning, BMI, Sanchez

Coadaptive Brain-Machine Interface via Reinforcement Learning

This is one of the very first papers to use a reinforcement learning (RL) framework for BMI decoding. The technique is semi-supervised in the sense that only a scalar (in this case, binary) reward signal is provided after each task. This is markedly different from the more traditional supervised learning (SL) approach to decoding, which uses kinematic variables as the desired signal to train a regression model.

The authors claim that the RLBMI architecture involves two coupled systems: while the user learns to use the BMI controller through neural adaptation, the BMI controller learns to adapt to the user via RL. While this sounds promising in theory, not much neural data is presented to demonstrate the neural-adaptation side of the architecture.

Computational architecture: Q-learning. From the controller's perspective, environment = the user's brain, state = neural activity, actions = prosthetic movements, reward = task completion.

Experiment setup used [image 1 from the paper]

Experiment protocol [image 2 from the paper]

The goal is for the rat to move the robot arm to the other end of the room, to the lever that is lit up. During brain control, both the rat and the controller are rewarded when the arm is maneuvered proximal to the target, so distance to goal is used as the reward metric. This is intention estimation, much like the closed-loop decoder adaptation (CLDA) approaches Carmena's group uses.

In this RLBMI architecture, value function estimation (VFE) is a non-trivial task. The value function Q is too big to be stored in a lookup table: while the total number of actions (robot movements) is only 27, the number of possible states (neural vector configurations) is intractable. Thus a fully connected neural network with a single hidden layer is used, updated by backpropagating the temporal difference (TD) error.
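
A minimal sketch of this scheme (not the authors' code - the state dimension, hidden size, learning rate, discount factor, and epsilon-greedy exploration are all illustrative assumptions):

```python
import numpy as np

# Illustrative dimensions: a neural feature vector maps to Q-values for the
# 27 discrete robot movements through one hidden layer.
N_STATE, N_HIDDEN, N_ACTIONS = 64, 32, 27
GAMMA, LR, EPSILON = 0.9, 1e-3, 0.1

rng = np.random.default_rng(0)
W1 = rng.normal(0, 0.1, (N_HIDDEN, N_STATE))    # random initialization, as in the paper
W2 = rng.normal(0, 0.1, (N_ACTIONS, N_HIDDEN))

def q_values(s):
    h = np.tanh(W1 @ s)                 # single hidden layer
    return W2 @ h, h                    # Q(s, a) for all actions, plus activations

def select_action(s):
    """Epsilon-greedy choice among the 27 robot movements."""
    if rng.random() < EPSILON:
        return int(rng.integers(N_ACTIONS))
    return int(np.argmax(q_values(s)[0]))

def td_update(s, a, r, s_next, done):
    """One Q-learning step: backpropagate the TD error for the taken action."""
    global W1                                     # rebound by the += below
    q, h = q_values(s)
    q_next, _ = q_values(s_next)
    target = r if done else r + GAMMA * np.max(q_next)
    td_error = target - q[a]
    # Gradient descent on 0.5 * td_error**2, for the chosen action only.
    W2[a] += LR * td_error * h
    dh = td_error * W2[a] * (1 - h ** 2)          # backprop through tanh
    W1 += LR * np.outer(dh, s)
    return td_error
```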

Weights were initialized randomly. The BMI's goal region starts out as a large radius around the target; as training continues, the radius shrinks until it contains only the target.

Neural data analysis shows the rats were biased toward using a subset of the available robot actions, which moved the arm to the targets along trajectories that were not the most direct for all targets. The hidden-layer feature representations should be analyzed to see how this came about, and how much of it is contributed by neural adaptation versus decoder adaptation.

Problems: deep RL is usually trained with large batches of offline simulation data to speed up learning of the value function. In this paper, the available data were reused in multi-epoch, offline VFE training. The authors suggest using model-based RL that includes an environmental model to estimate future states and rewards, but that sounds a lot like a Kalman filter with adaptive elements. Finally, rewards were programmed by the BMI designer, but ideally they should be translated from the user's brain activity, from either the cortex or perhaps the basal ganglia.

Images cited: DiGiovanna, J.; Mahmoudi, B.; Fortes, J.; Principe, J. C.; Sanchez, J. C., “Coadaptive Brain–Machine Interface via Reinforcement Learning,” IEEE Transactions on Biomedical Engineering, vol. 56, no. 1, pp. 54-64, Jan. 2009. doi: 10.1109/TBME.2008.926699

References to Read

  • R. S. Sutton and A. G. Barto. Reinforcement Learning: An Introduction, 1998, MIT Press
  • J. K. Chapin, K. A. Moxon, R. S. Markowitz and M. Nicolelis, “Real-time control of a robot arm using simultaneously recorded neurons in the motor cortex,” Nat. Neurosci., vol. 2, pp. 664-670, 1999

Jul 26, 2015 - Deep Learning - Review by LeCun, Bengio, and Hinton

tags: Deep Learning, Machine Learning, Review, Hinton, Bengio, LeCun

Nature review on Deep Learning by LeCun, Bengio, and Hinton

Representation learning is a set of methods that allows a machine to be fed with raw data and automatically discover the representations needed for detection or classification. Deep-learning methods are representation-learning methods with multiple levels of representation, obtained by composing simple but non-linear modules that each transform the representation at one level into a representation at a higher, slightly more abstract level. With the composition of enough such transformations, very complex functions can be learned.

A key advantage of deep learning is that it requires very little engineering by hand, so it can easily take advantage of increases in the amount of available computation and data. Feature extraction becomes easier. The number of nodes determines what kinds of input-space transformations are possible, allowing the network to classify data that could not be separated using lower-dimensional techniques.

Interesting historical fact: in the late 1990s, neural nets and backpropagation were largely forsaken by the community. It was widely thought that learning useful multistage feature extractors with little prior knowledge was infeasible. In particular, it was commonly thought that simple gradient descent would get trapped in poor local minima.

In practice, however, poor local minima are rarely a problem with large networks. Regardless of the initial conditions, the system nearly always reaches solutions of very similar quality. Theoretical and empirical results suggest that the landscape is packed with a combinatorially large number of saddle points where the gradient is zero, and the surface curves up in most dimensions and down in the remainder […] saddle points with only a few downward curving directions are present in very large numbers, but almost all of them have very similar values of the objective function. Hence, it does not much matter which of these saddle points the algorithm gets stuck at.

Convolutional neural networks (ConvNets) - four key ideas (a minimal sketch of the first three follows the list):

  • local connections: in array data, local groups of values are often highly correlated, forming distinctive local motifs that are easily detected; the local statistics of images and other signals are invariant to location.
  • shared weights: If a motif can appear in one part of the image, it could appear anywhere, hence the idea of units at different locations sharing the same weights and detecting the same pattern in different parts of the array.
  • pooling: merge semantically similar features into one. Many natural signals are compositional hierarchies, in which higher-level features are obtained by composing lower-level ones.
  • use of many layers.
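
A minimal numpy sketch of the first three ideas - one made-up 3x3 filter slides over a toy image (local connections + shared weights), followed by 2x2 max pooling:

```python
import numpy as np

def conv2d(image, kernel):
    """Valid convolution: the same weights detect the same motif everywhere."""
    H, W = image.shape
    k = kernel.shape[0]
    out = np.empty((H - k + 1, W - k + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + k, j:j + k] * kernel)  # local window only
    return out

def max_pool(fmap, size=2):
    """Merge similar nearby features: keep the strongest response per block."""
    H, W = fmap.shape
    H, W = H - H % size, W - W % size
    blocks = fmap[:H, :W].reshape(H // size, size, W // size, size)
    return blocks.max(axis=(1, 3))

image = np.random.rand(8, 8)                   # toy "image"
edge_filter = np.array([[1., 0., -1.]] * 3)    # crude vertical-edge motif detector
features = max_pool(np.maximum(conv2d(image, edge_filter), 0))  # ReLU, then pool
```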

Distributed representations: two different exponential advantages over classic learning algorithms that do not use distributed representations - both arise from the power of composition and depend on the underlying data-generating distribution having an appropriate componential structure.

  • Learning distributed representations enables generalization to new combinations of the values of learned features beyond those seen during training (could be very useful for BMI).
  • Composing layers of representation in a deep net brings the potential for another exponential advantage (not sure what this means).

Recurrent neural networks are for tasks that involve sequential inputs, and are most likely useful for BMI. They can be augmented with an explicit memory, e.g. long short-term memory (LSTM) networks that use special hidden units, the natural behavior of which is to remember inputs for a long time.
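
To make those "special hidden units" concrete, here is a minimal sketch of one LSTM step (the shapes and the packed weight-matrix layout are illustrative assumptions, not from the review):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h, c, W, b):
    """One time step. W packs the four gates: shape (4*hidden, input+hidden)."""
    z = W @ np.concatenate([x, h]) + b
    f, i, o, g = np.split(z, 4)
    f, i, o, g = sigmoid(f), sigmoid(i), sigmoid(o), np.tanh(g)
    c = f * c + i * g           # forget old content, write new content
    h = o * np.tanh(c)          # expose a gated view of the memory
    return h, c

n_in, n_hid = 16, 8             # e.g. binned spike counts -> hidden state
rng = np.random.default_rng(0)
W = rng.normal(0, 0.1, (4 * n_hid, n_in + n_hid))
b = np.zeros(4 * n_hid)
h, c = np.zeros(n_hid), np.zeros(n_hid)
for x in rng.normal(size=(100, n_in)):   # a toy sequence, e.g. neural activity over time
    h, c = lstm_step(x, h, c, W, b)
```

The cell state c is what lets the unit remember inputs over long stretches: nothing decays unless the forget gate decides it should.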

Much progress should come from systems that train end-to-end and combine ConvNets with RNNs that use reinforcement learning to decide where to look.

References to Read

  • Bottou, L. & Bousquet, O. The tradeoffs of large scale learning. In Proc. Advances in Neural Information Processing Systems 20 161–168 (2007).
  • Hinton, G. E. What kind of graphical model is the brain? In Proc. 19th International Joint Conference on Artificial intelligence 1765–1775 (2005).
  • Hinton, G. E., Osindero, S. & Teh, Y.-W. A fast learning algorithm for deep belief nets. Neural Comp. 18, 1527–1554 (2006). This paper introduced a novel and effective way of training very deep neural networks by pre-training one hidden layer at a time using the unsupervised learning procedure for restricted Boltzmann machines.
  • Cadieu, C. F. et al. Deep neural networks rival the representation of primate IT cortex for core visual object recognition. PLoS Comp. Biol. 10, e1003963 (2014).
  • Farabet, C. et al. Large-scale FPGA-based convolutional networks. In Scaling up Machine Learning: Parallel and Distributed Approaches (eds Bekkerman, R., Bilenko, M. & Langford, J.) 399–419 (Cambridge Univ. Press, 2011).
  • Weston, J., Chopra, S. & Bordes, A. Memory networks. http://arxiv.org/abs/1410.3916 (2014).

Jul 13, 2015 - SW Ch. 8: What provides limb stability?

tags: neuroscience reading, Shadmehr and Wise. The Computational Neurobiology of Reaching and Pointing, motor system, proprioception, motor movements, motor feedback

Mechanisms of limb stability:

  • Antagonist muscle architecture produces an equilibrium point. Polit and Bizzi's experiments with deafferented monkeys showed that passive properties can bring the limb to the target, but cannot resist perturbations very well.
  • Passive, spring-like properties of limb promotes stability.
  • CNS reflexes.
  • Neuropathy - patients cannot sense the location of their limbs without vision.
  • [Mussa-Ivaldi] shows stiffness of the arm remains roughly constant when expressed in terms of joint coordinates.

Jul 13, 2015 - SW Ch. 7: What generates force and feedback?

tags: neuroscience reading, Shadmehr and Wise. The Computational Neurobiology of Reaching and Pointing, motor system, proprioception, motor movements, motor feedback

What generates force and feedback?

Description of molecular muscle mechanism - actin and myosin coupling to shorten muscle fibers.

A muscle model with parallel and series springs is constructed to explain the active and passive forces generated by the muscle fiber. The force curve peaks around rest length and decreases as the muscle stretches or contracts. This explains why isometric force, i.e. force output when muscle length is not changing, is the greatest.

Covered how to convert forces applied by muscles on joints into torques around those joints. Assuming constant force, this is done by relating joint angles to bone and muscle lengths. Equating angular work with linear work, \[ \tau\Delta\theta = f\Delta\lambda \], we can derive the Jacobian \( \mathbf{J}=\frac{d\boldsymbol{\lambda}}{d\boldsymbol{\theta}} \), where \( \boldsymbol{\theta} \) can be a vector. Using this, \( \boldsymbol{\tau}=-\mathbf{J}^T f \) (the sign convention reflects that positive muscle tension acts to shorten the muscle, i.e. decrease \( \lambda \)).
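
A numeric sketch of this mapping (the muscle-length function and its moment arms are made-up numbers; the Jacobian is computed by finite differences):

```python
import numpy as np

def muscle_lengths(theta):
    """Hypothetical lengths (m) of two muscles as a function of two joint angles (rad)."""
    return np.array([
        0.30 - 0.04 * theta[0],                    # monoarticular shoulder muscle
        0.25 - 0.03 * theta[0] - 0.02 * theta[1],  # biarticular muscle
    ])

def jacobian(fn, theta, eps=1e-6):
    """Finite-difference J[i, j] = d lambda_i / d theta_j."""
    base = fn(theta)
    J = np.empty((base.size, theta.size))
    for j in range(theta.size):
        d = np.zeros_like(theta)
        d[j] = eps
        J[:, j] = (fn(theta + d) - base) / eps
    return J

theta = np.array([0.5, 1.0])     # joint angles (rad)
f = np.array([10.0, 5.0])        # muscle tensions (N)
tau = -jacobian(muscle_lengths, theta).T @ f   # joint torques, tau = -J^T f
```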

Muscle afferents include Golgi tendon organs and muscle-spindle afferents, which act as mechanical force sensors for the muscle. The muscle spindles are innervated by \( \gamma \)-neurons at the poles. The primary muscle-spindle afferents in the central nuclear bag correspond somewhat to the velocity of muscle stretch. Secondary muscle-spindle afferents in the poles correspond to muscle length. The \( \gamma \)-neurons innervate the poles to change spindle length (co-activated with the \( \alpha \)-neurons for the extrafusal muscles) as a type of target muscle activation. Achieving the target muscle length would result in zero change in the primary afferent firing rates.

The monosynaptic connection from the spindle afferents onto the \( \alpha \)-neurons is important in motor feedback.

This section is important for proprioception, motor movements, and motor feedback.

Jul 13, 2015 - KS Ch. 37: Voluntary Movement - The Primary Motor Cortex

tags: neuroscience reading, Kandel et al. Principles of Neuroscience, motor system, motor movements, motor cortex, neural decoding

Focus on the control of voluntary movements of the hand and arm in primates. Cortical networks that control voluntary movement, the role of the primary motor cortex in the generation of motor commands.

Control of voluntary movement involves more than generating a particular pattern of muscle activity - it involves sensory, perceptual, and cognitive processes that are not rigidly compartmentalized in neural structures.

Woolsey and Penfield: recognition of the motor cortex (rostral to the central sulcus) and the “cortical homunculus”. Areas for the arm and hand are concentrated in the “fundus”.

A naive understanding of voluntary movement holds that voluntary motor control is strictly serial processing, with only the neurons related to the last processing stage connecting to the spinal cord. This is not correct, as the brain does not even have a single, unified perceptual representation of the world.

Two main areas: the primary motor cortex and the premotor cortex, which lies directly rostral to the primary motor cortex. The medial part of the premotor cortex is the supplementary motor area. These three areas can be further functionally organized into more areas, especially the parts of the premotor cortex.

The supplementary motor area, dorsal premotor cortex (PMd), and ventral premotor cortex (PMv) have somatotopically organized reciprocal connections with the primary motor cortex and with each other. These areas and the primary motor cortex (M1) also receive somatotopically organized inputs from the primary somatosensory cortex and the rostral parietal cortex (sensory areas).

The pre-supplementary and pre-dorsal premotor areas do not project to the primary motor cortex or to the spinal cord. They receive higher-order cognitive information through the prefrontal cortex.

Several cortical regions project in multiple parallel tracts to subcortical areas of the brain as well as the spinal cord. Therefore the theory of the primary motor cortex as the “final common path” from the cortex to the spinal cord is incorrect, and multiple cortical regions contribute to voluntary movements.

Corticomotoneurons are corticospinal axons that extend into the ventral horn of the spinal cord and contact the spinal motor neurons directly. The axons of these neurons make up a bigger part of the corticospinal tract moving higher in primate phylogeny. This may explain why lesions of the primary motor cortex have a bigger effect on motor control in humans compared to lower mammals. Pyramidal tract neurons are the aggregate of upper motor neuron nerve fibers that travel from the cortex and terminate either in the brainstem or the spinal cord. Nerve fibers usually descend down the brain in columns.

Motor commands are population encodings (Georgopoulos studies) - “further studies have confirmed that similar population-coding mechanisms are used in all cortical motor areas”.
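
A minimal sketch of the population-vector idea from those studies (cosine tuning and all parameters are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
n_neurons = 100
preferred = rng.uniform(0, 2 * np.pi, n_neurons)   # each neuron's preferred direction

def rates(move_dir, baseline=10.0, depth=8.0):
    """Cosine tuning: firing is highest when movement matches the preferred direction."""
    return baseline + depth * np.cos(move_dir - preferred)

def population_vector(r):
    """Preferred-direction unit vectors weighted by mean-subtracted rates."""
    w = r - r.mean()
    return np.arctan2(np.sum(w * np.sin(preferred)),
                      np.sum(w * np.cos(preferred)))

true_dir = np.pi / 3
decoded = population_vector(rates(true_dir))   # close to true_dir
```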

The motor cortex encodes both the kinematics and kinetics of movement. Experiments in which a load is applied to either oppose or assist an arm motion found that population and single-neuron activity increased or decreased accordingly, corresponding to increased or decreased muscle activity, confirming kinetics encoding. Studies in which the activity of some corticomotoneurons does not always correlate with the contraction of their target muscles, but instead correlates with carefully controlled or powerful movements, hint that they may also encode kinematics. Signals about both the desired kinematics and required kinetics of movements may be generated simultaneously in different, or possibly even overlapping, populations of primary motor cortex neurons.

Hand and finger movements are directly controlled by the motor cortex. Specifically, cortical neurons controlling the hand and digits occupy the large central core of the primary motor cortex motor maps but also overlap extensively with populations of neurons controlling more proximal parts of the arm. We can imagine mapping the movements of the hand and digits into a component neuron space, where each neuron controls a combination of muscle activations. This contrasts with the highly ordered representation of tactile sensory inputs from different parts of the hand and digits in the somatosensory cortex.

The motor map is dynamic and adaptable, and can undergo functional reorganization. Learning a motor skill can induce reorganization, which can also decay when “out of practice”, possibly mediated by horizontal connections and local inhibitory circuits (John Donoghue). Bizzi 2001 demonstrated four classes of motor cortex neurons during motor skill adaptation and washout: kinematic neurons (tuning does not change), dynamic neurons (tuning changes during both adaptation and washout), and memory neurons (tuning changes either during adaptation only or during washout only).

Studies found that adaptive changes in motor cortex activity lag the improvement in motor performance by several trials during adaptation. This suggests that learning-related adjustments to motor commands are initially made elsewhere, with the cerebellum as one strong candidate. The primary motor cortex may thus be more strongly involved in the slower process of long-term retention and recall of motor skills rather than the initial phase of learning a new skill.

The primary motor cortex is part of a distributed network of cortical motor areas, each with its own role in voluntary motor control. The primary motor cortex should be regarded as a dynamic computational map whose internal organization and spinal connections convert central signals about motor intentions and sensory feedback about the current state of the limb into motor output commands, rather than as a static map of specific muscles or movements of body parts. The motor cortex also provides a substrate for adaptive alterations during the acquisition of motor skills and the recovery of function after lesions.

Jul 13, 2015 - KS Ch. 33: The organization and planning of movement

tags: neuroscience reading, Kandel et al. Principles of Neuroscience, motor system

Movement error/variability is proportional to velocity and force.

The brain’s choice of spatial coordinate system depends on the task. This can sometimes be determined by:

  • Plot the movement errors along the components of different suspected coordinate systems.
  • A likely coordinate system would result in uncorrelated errors along its principal axes/components.
  • This is similar to eigenvector analysis - decompose the movements into uncorrelated components; these components/eigenvectors then form the coordinate system used (see the sketch after the examples below).

Examples used include [Gordon, Ghilardi, and Ghez 1994] and [Soechting and Flanders 1989].
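
A sketch of that analysis on synthetic data: errors are generated uncorrelated in a rotated (say, hand-centered) frame, and diagonalizing the error covariance recovers that frame's axes:

```python
import numpy as np

rng = np.random.default_rng(0)
# Errors uncorrelated in a frame rotated 30 degrees from the lab axes,
# with more variance along movement extent than movement direction.
angle = np.pi / 6
R = np.array([[np.cos(angle), -np.sin(angle)],
              [np.sin(angle),  np.cos(angle)]])
errors_hand = rng.normal(0.0, [2.0, 0.5], size=(500, 2))  # extent vs. direction errors
errors_lab = errors_hand @ R.T                             # what we actually measure

cov = np.cov(errors_lab, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)   # principal axes of the error cloud
# eigvecs matches R's columns up to sign/order: the frame in which the errors
# decorrelate, suggesting the coordinate system the movements were planned in.
```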

Fitts’s law describes the speed-accuracy trade-off: movement time grows roughly logarithmically with the ratio of movement distance to target width, \( MT = a + b\log_2(2D/W) \) [Jeannerod 1988].

Stereotypical patterns are employed in many movements. A tendency to make straight-line movements characterizes a large class of movements, regardless of the motions of the joints required. Joint motions often vary while the hand trajectory remains more invariant, which also suggests planning with respect to the hand [Morasso 1981].

Two-thirds power law - the relationship between the speed of hand motion and the degree of curvature of the hand path is roughly constant: angular velocity varies as a continuous function of the curvature raised to the power of two-thirds [Lacquaniti, Terzuolo, and Viviani 1983].

Feedback control cannot generate a command in anticipation of an error: it is always driven by an error. Feedforward control is based only on a desired/expected state and can therefore act before any error arises. It is most likely used to initiate an action, followed by feedback error correction.

Feedback control suffers from sensory delay; feedforward control suffers from inaccurate estimates. Therefore, movement control uses a combination of sensory feedback and motor prediction, using a forward model to estimate the current state of the body.
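
A toy observer illustrating that combination (the scalar dynamics, delay, and gain are arbitrary assumptions): the forward model predicts the consequence of each command immediately, while delayed sensory feedback corrects the estimate by the prediction error made when that feedback was generated.

```python
import numpy as np

A, B = 0.95, 0.1    # toy scalar "limb": x <- A*x + B*u + noise
K = 0.3             # correction gain on the sensory prediction error
DELAY = 5           # sensory feedback arrives this many steps late

rng = np.random.default_rng(0)
x, x_hat = 0.0, 0.0
x_buf = [0.0] * DELAY      # queue of past true states (what the senses will report)
pred_buf = [0.0] * DELAY   # queue of past predictions (to compare feedback against)
for t in range(200):
    u = 1.0 - x_hat                            # command computed from the estimate
    x = A * x + B * u + rng.normal(0, 0.01)    # true (hidden) limb state
    x_hat = A * x_hat + B * u                  # forward model: predicted consequence
    x_buf.append(x)
    pred_buf.append(x_hat)
    y = x_buf.pop(0) + rng.normal(0, 0.05)     # delayed, noisy sensory feedback
    x_hat += K * (y - pred_buf.pop(0))         # correct by the old prediction error
```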

Sensory processing is different for action and perception - sensory information used to control actions is processed in neural pathways that are distinct from the afferent pathways that contribute to perception. Key points: visual information flows in two streams in the brain. The dorsal stream projects to the posterior parietal cortex and is involved in the use of vision for action. The ventral stream projects to the inferotemporal cortex and is involved in conscious visual perception.

Evidence for motor learning in a reaching experiment: in [Brashers-Krug, Shadmehr, and Bizzi 1996], a person's arm, holding an apparatus, reaches for targets. The apparatus applies a CCW force against the user, disturbing his otherwise straight trajectories. The user eventually adapts and forms relatively straight lines again. Two learning strategies are possible: 1) stiffening the arm to resist the force; 2) anticipating and learning a new internal model to compensate for the new forces. After turning the force off, we see overcompensation in the trajectories, indicating that strategy 2 was used.

Dynamic motor tasks are learned mainly through proprioception and less through vision; kinematic motor tasks can be guided more by vision. Proprioception is critical for planning hand trajectories and controlling dynamics, and is needed to update both the inverse models used to control movement and the forward models used to estimate body positions resulting from motor commands. Derived from experiments comparing controls with subjects who have lost proprioception [Ghez, Gordon, and Ghilardi 1995], [Sainburg et al. 1995].

Jul 12, 2015 - Generating tags in Jekyll

tags: github-pages, jekyll, webdev

This site is meant to be a collection of my readings - book chapters, web articles, academic papers, etc. As such, convenient tagging is crucial. While Jekyll offers a minimalistic and fast static-site generation and convenient hosting on github-pages, it has minimal tagging support.

Requirements:

  • Minimal effort initiating a new tag.
  • Automatic generation of a single tag page that lists posts associated with all available tags.
  • Automatic generation of separate pages associated with different tags.

There have been a few posts online about how to do this: Charlie Park, Christian Specht. The latter comes very close to what I needed, upon which I based my implementation.

Implementation:

The general approach is, whenever a new tag is created (e.g. entered into the “tags” field of a post’s YAML front-matter), we need to create a tags/my_new_tag.html file with the following content:

---
layout: tagpage
tag: my_new_tag
---

This means we need to create the layout _layouts/tagpage.html.
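
A minimal version of that layout might look like the following (site.posts, post.tags, and page.tag are standard Jekyll/Liquid variables; the markup and the assumption of a "default" parent layout are just a sketch):

---
layout: default
---
<h1>Posts tagged "{{ page.tag }}"</h1>
<ul>
  {% for post in site.posts %}
    {% if post.tags contains page.tag %}
      <li><a href="{{ post.url }}">{{ post.title }}</a></li>
    {% endif %}
  {% endfor %}
</ul>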

At this point, Jekyll will have generated a separate page for our newly created tag at /tags/my_new_tag.html.

To generate the single page that lists posts associated with all tags, we create alltags.html. Note the tags will be displayed in alphabetical order.
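
A sketch of such a page, alphabetizing tag names with Liquid's capture/split/sort idiom (markup illustrative; site.tags maps each tag name to its posts):

---
layout: default
title: All tags
---
{% capture names %}{% for tag in site.tags %}{{ tag[0] }}{% unless forloop.last %}|{% endunless %}{% endfor %}{% endcapture %}
{% assign sorted = names | split: '|' | sort %}
{% for name in sorted %}
  <h2 id="{{ name }}">{{ name }}</h2>
  <ul>
    {% for post in site.tags[name] %}
      <li><a href="{{ post.url }}">{{ post.title }}</a></li>
    {% endfor %}
  </ul>
{% endfor %}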

I also want to show, for each individual post, its associated tags. This is done by including _includes/tag_line.html in _layouts/post.html.

Jul 5, 2015 - My first post

tags: github, github-pages, jekyll

Let’s see how this works