I believe this community will be interested in a new downloadable Java app that I've just made available on my web site (on this page), which describes an alternative, biologically motivated and plausible way to achieve the goals of locality-sensitive hashing (LSH). This alternative model, called Sparsey, achieves a more graded notion of similarity preservation than LSH and has many other advantages as well. Sparsey has a natural correspondence to the brain's cortex, centering on the idea that all items of information are stored as sparse distributed codes (SDCs), a.k.a. cell assemblies, in superposition in mesoscale cortical modules, e.g., macrocolumns (though other structures, e.g., mushroom bodies, are also candidates).

Briefly, Sparsey preserves similarity from the input space to the sparse distributed representation (SDR) code space, where similarity between codes is measured as intersection size, as follows.

The process of choosing an SDC takes the form of Q independent softmax choices, one in each of the Q winner-take-all (WTA) competitive modules (CMs) that make up the SDR coding field.
The familiarity (inverse novelty) of the input, denoted G, is computed; G lies in [0,1], and the computation is extremely simple.
The amount of noise in those Q softmax choice processes is modulated as a function of G. In each CM, the softmax is over the distribution of input summations of the competing cells, and G is used to sharpen or flatten those distributions.
When G is near 1 (perfect familiarity), the distributions are greatly sharpened, so the expected number of CMs in which the cell with the highest input summation wins increases towards Q, and with it the expected intersection of the resulting SDR with the closest-matching previously stored SDR. When G is near 0 (completely novel), the distributions are flattened, so that expected number, and with it the expected intersection, decreases towards chance. In other words, this G-based modulation of the distributions, which can be viewed as varying the amount of noise in the choice process, achieves similarity preservation.
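
For readers who prefer code, the steps above can be sketched roughly as follows. This is only an illustrative Python sketch, not the code in the Java app. The function names, the `beta_max` parameter, the linear mapping from G to softmax sharpness, and the way `familiarity` derives G (as the average over CMs of the maximum normalized input summation) are all assumptions made for the example.

```python
import math
import random

def familiarity(sums):
    """Assumed definition of G: mean over CMs of the max normalized input summation."""
    return sum(max(cm) for cm in sums) / len(sums)

def _weights(cm, g, beta_max):
    # Assumed mapping: sharpness beta grows linearly with G.
    # G = 0 gives a flat (uniform) distribution; G = 1 gives a near-argmax one.
    beta = beta_max * g
    m = max(cm)
    # Subtracting the max keeps exp() numerically stable.
    return [math.exp(beta * (v - m)) for v in cm]

def choose_code(sums, g, beta_max=50.0, rng=random):
    """Make one softmax choice per CM; sums[q][k] is the input summation of cell k in CM q."""
    code = []
    for cm in sums:
        w = _weights(cm, g, beta_max)
        code.append(rng.choices(range(len(cm)), weights=w, k=1)[0])
    return code

def expected_max_wins(sums, g, beta_max=50.0):
    """Expected number of CMs in which the max-summation cell wins."""
    total = 0.0
    for cm in sums:
        w = _weights(cm, g, beta_max)
        total += w[cm.index(max(cm))] / sum(w)
    return total

if __name__ == "__main__":
    # Hypothetical coding field: Q = 5 CMs with K = 4 cells each.
    sums = [[1.0, 0.2, 0.1, 0.0] for _ in range(5)]
    for g in (0.0, 0.5, 1.0):
        print(g, round(expected_max_wins(sums, g), 3), choose_code(sums, g))
```

In this toy setting, with Q = 5 CMs of K = 4 cells each, the expected number of CMs won by the max-summation cell is Q/K = 1.25 (chance) at G = 0 and approaches Q = 5 as G approaches 1, which is the graded behavior described above.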

I encourage members of this community to explore the app to understand this simple and more powerful alternative to LSH. I welcome your feedback.

Sincerely,

Rod Rinkus

Gerard (Rod) Rinkus, PhD
President, Neurithmic Systems LLC
rod at neurithmicsystems dot com
275 Grove Street, Suite 2-400
Newton, MA 02466
617-997-6272

Visiting Scientist, Lisman Lab
Volen Center for Complex Systems
Brandeis University, Waltham, MA
grinkus at brandeis dot edu
http://people.brandeis.edu/~grinkus/