Stephen,
You have been complaining about not getting enough credit since I first met you in the mid 1980s.
That is not true.
You refer to publishing before Hopfield's 1984 paper. You deliberately ignore his 1982 paper, which received 27946 citations:
Neural networks and physical systems with emergent collective computational abilities, JJ Hopfield
PNAS, April 15, 1982, 79 (8) 2554-2558
My post was originally on a web site with strict word limits. That is why I mentioned Hopfield’s 1984 article, but not his 1982 article.
Section 6 of my 1988 review article:
Grossberg, S. (1988). Nonlinear neural networks: Principles, mechanisms, and architectures. Neural Networks, 1, 17-61.
https://sites.bu.edu/steveg/files/2016/06/Gro1988NN.pdf
reviews Binary, Linear, and Continuous-Nonlinear classical neural network models by many authors.
This article has been cited 2410 times.
Section 6 shows that the 1982 equation that Hopfield used is a variant of the 1943 McCulloch-Pitts model.
The Liapunov function for it is just a discrete version of the Liapunov function for the Additive Model.
In contrast, the 1971 student paper you mentioned received 118 citations.
I assume that you mean:
Grossberg, S. (1971). Embedding fields: Underlying philosophy, mathematics, and applications to psychology, physiology, and anatomy. Journal of Cybernetics, 1, 28-50.
https://sites.bu.edu/steveg/files/2016/06/Gro1971JoC.pdf
This was a review article published in a journal that ceased to exist in 1980 (see below) and thus could not be searched well by Google, which began in 1998.
The article discussed Generalized Additive Models (see its Section 6).
I was able to get articles published proving global limit and oscillation theorems for Generalized Additive Models in the Proceedings of the National Academy of Sciences
between 1967 and 1971; e.g.,
Grossberg, S. (1971). Pavlovian pattern learning by nonlinear neural networks. Proceedings of the National Academy of Sciences, 68, 828-831.
https://sites.bu.edu/steveg/files/2016/06/Gro1971ProNatAcaSci.pdf
See Equations 1, 2, 3, and 4 etc.
Your 1983 paper with Michael Cohen in IEEE Transactions on Systems, Man, and Cybernetics was received on August 1, 1982, several months after the 1982 paper by Hopfield had appeared in print. You did get a good
number of citations on this paper, 3344, but not as good as 27946.
Michael Cohen and I tried to submit it to the Journal of Cybernetics in 1980.
But there was a glitch in the submission process because the Journal of Cybernetics unexpectedly stopped publishing that year, and our article got lost in the shuffle, unknown to us for a while.
We resubmitted in 1982, which is why it was published in 1983.
Our results pre-dated Hopfield 1982 and 1984 by two years.
About the difference in citations:
There is a difference between discovery and marketing. I was told that Hopfield knew about my earlier work, and about my work with Michael Cohen, before he went around the country lecturing about
work that we had previously published, without citation.
You mention a 1982 paper with Michael Cohen; I have not been able to find it.
I think that was a typo. Michael published the following article in 1992:
Cohen, M. A. (1992). The construction of arbitrary stable dynamics in nonlinear neural networks.
Neural Networks, 5, 83-103.
https://www.sciencedirect.com/science/article/abs/pii/S0893608005800085
Here is the Abstract:
“In this paper, two methods for constructing systems of ordinary differential equations realizing any fixed finite set of equilibria in any fixed finite dimension are introduced;
no spurious equilibria are possible for either method. By using the first method, one can construct a system with the fewest number of equilibria, given a fixed set of attractors. Using a strict
Lyapunov function [boldface mine] for each of these differential equations, a large class of systems with the same set of equilibria is constructed. A method of fitting these nonlinear systems to trajectories is proposed. In addition, a general method
which will produce an arbitrary number of periodic orbits of shapes of arbitrary complexity is also discussed. A more general second method is given to construct a differential equation which converges to a fixed given finite set of equilibria. This technique
is much more general in that it allows this set of equilibria to have any of a large class of indices which are consistent with the Morse Inequalities. It is clear that this class is not universal, because there is a large class of additional vector fields
with convergent dynamics which cannot be constructed by the above method. The easiest way to see this is to enumerate the set of Morse indices which can be obtained by the above method and compare this class with the class of Morse indices of arbitrary differential
equations with convergent dynamics. The former set of indices are a proper subclass of the latter, therefore, the above construction cannot be universal. In general, it is a difficult open problem to construct a specific example of a differential equation
with a given fixed set of equilibria, permissible Morse indices, and permissible connections between stable and unstable manifolds. A strict Lyapunov function is given for this second case as well. This strict
Lyapunov function [boldface mine] as above enables construction of a large class of examples consistent with these more complicated dynamics and indices. The determination of all the basins of attraction in the general case for these systems is also
difficult and open.”
I have never heard anybody referring to you as 'the father of AI'.
As an acolyte of Hopfield, you wouldn’t.
What I do remember is a visit to your group at BU many, many years ago - in the late 80s or early 90s. I was amazed at finding out that you closely supervised every word in every slide that anybody in your
group was allowed to show when invited to give a talk. Everybody was under a lot of pressure to 'stay on message', where the 'message' was your view on things. I had never encountered a scientific group run as a cult. It made an impression, but not a positive
one. So much for training over 100 people - by teaching them not to think by themselves.
You describe my devoted teaching as something ugly, even ad hominem.
When I was a student, no one told me how to write an article or how to prepare and give a talk or poster. I developed a method to enable my students to grow intellectually and
become independent scholars.
When I started working with a new PhD student or postdoc, we discussed what topics would potentially interest them. Then we read hundreds of psychological and neurobiological
articles together about that topic and began to discuss the results in them.
For students interested in engineering and AI, we studied all the methods that were available to solve problems in a targeted problem domain.
I encouraged my students to advance their own concepts about what might be the underlying mechanisms that generated an article’s data or benchmarks. Because we studied large
amounts of data, favorite concepts often hit a brick wall and had to be discarded.
We developed a modeling method that I like to call the Method of Minimal Anatomies. See Figure 2.37 in my 2021 Magnum Opus
https://www.amazon.com/Conscious-Mind-Resonant-Brain-Makes/dp/0190070552
This Method is just Occam’s Razor applied to neural networks.
As the Figure summarizes, it “Operationalizes the proper level of abstraction” and recognizes “that you cannot ‘derive a brain’ in one step”. Our models have been developed incrementally,
in a principled way, up to the present time.
Your statement that “the message was your [my] view on things” is the opposite of how we worked.
And how would you know? You were not there.
The proof of the pudding is that all my students got positions that they wanted after they earned their PhDs with me.
I was also told by multiple famous scientists that our department sent them their best postdoctoral fellows.
In fact, various major labs would contact me to let me know that they had an open position and asked if one of my students was about to graduate.
My students have gone on to successful careers after their postdocs and other first post-PhD positions.
Around half of them went into academe, and the other half into engineering, technology, and AI.
Groups of us have periodically gotten together for happy social events, either at international conferences or in the Boston area.
And they sent me information about their accomplishments and growing families for many years.
As for back-propagation, this is not what Hinton is cited for in the Nobel Prize citation. It is for the Boltzmann machine.
Yes, I know that, and I have also noted that my PhD student, James Williamson, developed statistical neural network models that did not have problems related to Deep Learning.
I have elsewhere noted that:
Restricted Boltzmann Machines (RBMs) also have analogs in the Adaptive Resonance Theory (ART) family of models, but without the problems that follow from using variants of Deep Learning.
Here are two of them, both developed by my PhD student, James R. Williamson:
Gaussian ARTMAP: A Neural Network for Fast Incremental Learning of Noisy Multidimensional Maps
https://www.sciencedirect.com/science/article/pii/0893608095001158
A Constructive, Incremental Learning Network for Mixture Modeling and Classification
http://techlab.bu.edu/files/resources/articles_tt/A%20constructive,%20incremental-learning%20network%20for%20mixture%20modeling%20and%20classification..pdf
Finally, I always wondered: If ART solves all problems, why are there ML/AI problems that remain to be solved?
I would never claim that. I have always taught that science is never done. See the Method of Minimal Anatomies.
The fact that you can even think that about my work shows, sad to say, how biased you have become.