A biologically motivated and more powerful alternative to locality-sensitive hashing
I believe this community will be interested in a new downloadable Java app that I've just made available on my web site (on this page <http://www.sparsey.com/CSA_explainer_app_page.html>), which describes an alternative, biologically motivated and plausible way to achieve the goals of locality-sensitive hashing (LSH). This alternative model, called Sparsey, achieves a more graded notion of similarity preservation than LSH and has many other advantages as well.

Sparsey has a natural correspondence to the brain's cortex, centering on the idea that all items of information are stored as sparse distributed codes (SDCs), a.k.a. cell assemblies, in superposition in mesoscale cortical modules, e.g., macrocolumns (though other structures, e.g., mushroom bodies, are also candidates).

Briefly, Sparsey preserves similarity from the input space to the SDC code space (where code similarity is measured as intersection size) as follows.

- The process of choosing an SDC takes the form of Q independent softmax choices, one in each of the Q winner-take-all (WTA) competitive modules (CMs) that comprise the SDC coding field.
- The familiarity (inverse novelty) of the input, denoted G, which lies in [0,1], is computed. This is an extremely simple computation.
- The amount of noise in those Q softmax choices is modulated as a function of G. Each softmax is taken over the distribution of input summations of the competing cells in a given CM, and G is used to modulate (i.e., sharpen vs. flatten) those distributions.
- When G is near 1 (perfect familiarity), the distributions are greatly sharpened, so the expected number of CMs in which the cell with the highest input summation wins, and thus the expected intersection of the resulting SDC with the closest-matching previously stored SDC, increases toward Q. When G is near 0 (complete novelty), the distributions are flattened, so that expected number of CMs, and thus that expected intersection, decreases toward chance.

In other words, this G-based modulation of the distributions, which can be viewed as varying the amount of noise in the choice process, achieves similarity preservation. (A minimal code sketch of this choice process is appended below.)

I encourage members of this community to explore the app to understand this simple and more powerful alternative to LSH. I welcome your feedback.

Sincerely,
Rod Rinkus

--
Gerard (Rod) Rinkus, PhD
President, rod at neurithmicsystems dot com
Neurithmic Systems LLC <http://sparsey.com>
275 Grove Street, Suite 2-400, Newton, MA 02466
617-997-6272
Visiting Scientist, Lisman Lab
Volen Center for Complex Systems, Brandeis University, Waltham, MA
grinkus at brandeis dot edu
http://people.brandeis.edu/~grinkus/
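To make the G-modulated choice process above concrete, here is a minimal Java sketch. It is not Sparsey's actual implementation: the specific formula for G (mean over CMs of the max normalized input summation) and the mapping from G to softmax sharpness (beta = betaMax * G) are simplifying assumptions, and all parameter names and values are illustrative.

import java.util.Arrays;
import java.util.Random;

// Minimal sketch of a G-modulated softmax choice over Q WTA competitive
// modules (CMs). The formula for G and the G-to-sharpness mapping are
// simplifying assumptions, not Sparsey's actual equations.
public class GModulatedSoftmax {
    static final Random RNG = new Random(42);

    // Familiarity G in [0,1]: here taken as the mean over CMs of the max
    // input summation (inputs assumed normalized to [0,1]).
    static double familiarity(double[][] u) {
        double sum = 0.0;
        for (double[] cm : u) {
            double max = 0.0;
            for (double v : cm) max = Math.max(max, v);
            sum += max;
        }
        return sum / u.length;
    }

    // One softmax draw within a single CM. Large beta sharpens the
    // distribution toward the argmax; beta near 0 flattens it to uniform.
    static int softmaxDraw(double[] cm, double beta) {
        double[] w = new double[cm.length];
        double z = 0.0;
        for (int i = 0; i < cm.length; i++) {
            w[i] = Math.exp(beta * cm[i]);
            z += w[i];
        }
        double r = RNG.nextDouble() * z, acc = 0.0;
        for (int i = 0; i < cm.length; i++) {
            acc += w[i];
            if (r <= acc) return i;
        }
        return cm.length - 1;
    }

    // Choose an SDC: one winner per CM, with choice noise set by G.
    static int[] chooseCode(double[][] u, double betaMax) {
        double beta = betaMax * familiarity(u); // G=1 -> sharp, G=0 -> uniform
        int[] code = new int[u.length];
        for (int q = 0; q < u.length; q++) code[q] = softmaxDraw(u[q], beta);
        return code;
    }

    public static void main(String[] args) {
        // Q = 4 CMs, K = 5 cells each; rows are per-CM input summations.
        double[][] familiar = { {0.10, 0.95, 0.20, 0.10, 0.30},
                                {0.90, 0.10, 0.20, 0.20, 0.10},
                                {0.10, 0.10, 0.98, 0.20, 0.10},
                                {0.20, 0.10, 0.10, 0.90, 0.20} };
        double[][] novel    = { {0.30, 0.35, 0.30, 0.32, 0.31},
                                {0.30, 0.30, 0.33, 0.30, 0.30},
                                {0.31, 0.30, 0.30, 0.30, 0.32},
                                {0.30, 0.30, 0.30, 0.31, 0.30} };
        System.out.println("G(familiar) = " + familiarity(familiar)
                + ", code = " + Arrays.toString(chooseCode(familiar, 30.0)));
        System.out.println("G(novel)    = " + familiarity(novel)
                + ", code = " + Arrays.toString(chooseCode(novel, 30.0)));
    }
}

Under these assumptions, the familiar input yields G near 1, so each CM's winner is almost surely its max-summation cell, while the novel input yields a low G and noisier, closer-to-uniform winners, which is the behavior that produces graded similarity preservation.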