I believe this community will be interested in a new downloadable Java app that
I've just made available on my web site (on this page), which describes an
alternative, biologically motivated/plausible way to achieve the goals of
locality-sensitive hashing (LSH). This alternative model, called Sparsey,
achieves a more graded notion of similarity preservation than LSH, and has many
other advantages as well.
Sparsey has a natural correspondence to the brain's cortex, centering on
the idea that all items of information are stored as sparse distributed codes
(SDCs), a.k.a., cell assemblies, in superposition in mesoscale cortical
modules, e.g., macrocolumns (though other structures, e.g., mushroom bodies,
are also candidates).
Briefly, Sparsey preserves similarity from input space to SDR code space
(measured as intersection size) as follows.
- The process of choosing an SDC takes the form of Q independent softmax
  choices, one in each of the Q WTA competitive modules (CMs) that comprise
  the SDR coding field.
- The familiarity (inverse novelty) of the input, denoted "G", which is in
  [0,1], is computed. This is an extremely simple computation.
- The amount of noise in those Q softmax choice processes is modulated as a
  function of G. Basically, the softmax is over the distribution of input
  summations of the competing cells (in a given CM), but we use G to modulate
  (i.e., sharpen vs. flatten) those distributions.
- When G is near 1 (perfect familiarity), the distributions are greatly
  sharpened, causing the expected number of CMs in which the cell with the
  highest input summation wins (and thus, the expected intersection of the
  resulting SDR with the closest matching previously stored SDR) to increase
  towards Q. When G is near 0 (completely novel), the distributions are
  flattened, causing the expected number of CMs in which the cell with max
  input summation wins (and thus, the expected intersection of the resulting
  SDR with the closest matching previously stored SDR) to decrease towards
  chance. In other words, this G-based modulation of the distributions, which
  can be viewed as varying the amount of noise in the choice process, achieves
  similarity preservation.
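
To make the steps above concrete, here is a minimal Java sketch. It is not code
taken from the app. The description above does not spell out how G is computed
or exactly how it reshapes the distributions, so two things are assumptions made
purely for illustration: G is taken to be the mean, over the Q CMs, of the
largest normalized input summation in each CM, and the sharpening is modeled as
a softmax inverse temperature beta = betaMax * G. All names
(SparseyChoiceSketch, familiarity, softmax, chooseCode, intersection, betaMax)
are hypothetical.

```java
import java.util.Random;

class SparseyChoiceSketch {

    /**
     * Familiarity G in [0,1]. ASSUMPTION: mean over CMs of the largest
     * normalized input summation; v[q][i] is cell i of CM q, each in [0,1].
     */
    static double familiarity(double[][] v) {
        double sum = 0.0;
        for (double[] cm : v) {
            double max = 0.0;
            for (double x : cm) max = Math.max(max, x);
            sum += max;
        }
        return sum / v.length;
    }

    /** Softmax over one CM's input summations; beta = 0 gives a flat (uniform) distribution. */
    static double[] softmax(double[] cm, double beta) {
        double max = Double.NEGATIVE_INFINITY;
        for (double x : cm) max = Math.max(max, x);
        double[] p = new double[cm.length];
        double z = 0.0;
        for (int i = 0; i < cm.length; i++) {
            p[i] = Math.exp(beta * (cm[i] - max)); // subtract max for numerical stability
            z += p[i];
        }
        for (int i = 0; i < cm.length; i++) p[i] /= z;
        return p;
    }

    /** Makes Q independent softmax draws, one per CM; returns the winner index of each CM. */
    static int[] chooseCode(double[][] v, double betaMax, Random rng) {
        double g = familiarity(v);
        double beta = betaMax * g; // ASSUMPTION: linear mapping from G to sharpness
        int[] code = new int[v.length];
        for (int q = 0; q < v.length; q++) {
            double[] p = softmax(v[q], beta);
            double r = rng.nextDouble();
            double cum = 0.0;
            int winner = p.length - 1; // fallback guards against rounding in the cumulative sum
            for (int i = 0; i < p.length; i++) {
                cum += p[i];
                if (r < cum) { winner = i; break; }
            }
            code[q] = winner;
        }
        return code;
    }

    /** Similarity in code space: number of CMs in which two codes share the same winner. */
    static int intersection(int[] a, int[] b) {
        int n = 0;
        for (int q = 0; q < a.length; q++) if (a[q] == b[q]) n++;
        return n;
    }
}
```

In this sketch, an input whose per-CM maximum is 1 in every CM gives G = 1, so
with a large betaMax the draw in each CM picks the max-input cell and the
intersection with the stored code approaches Q. An all-zero input gives G = 0,
so every cell in a CM is equally likely and the intersection falls to chance.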
I encourage members of this community to explore the app to understand this
simple and more powerful alternative to LSH. I welcome your feedback.
Sincerely,
Rod Rinkus
--
Gerard (Rod) Rinkus, PhD
President, Neurithmic Systems LLC
rod at neurithmicsystems dot com
275 Grove Street, Suite 2-400
Newton, MA 02466
617-997-6272
Visiting Scientist, Lisman Lab
Volen Center for Complex Systems
Brandeis University, Waltham, MA
grinkus at brandeis dot edu
http://people.brandeis.edu/~grinkus/