Friday, June 16, 2017

Google Tensor and CNN Programming is God-awful, 30 years behind

I had an interesting exchange on a Geoff Hinton video, to which one person replied, "Are you from the future?"

Sadly, the rediscovery of neural networks has brought us the terrible models at Stanford, Google's horrific Tensor, and the generally terrible mathematical focus of OpenCL and CUDA (even ROCm and AMD's Instinct initiative), all of which seem to think of NNs as matrix operations.

It's no wonder it seems like I'm from the future. It is very frustrating that I will most likely have to write my own programming language to do what we want to do with Noonean Cybernetics.

---
I was doing a type of CNN called a Neural Cube in 1986, based on Fukushima's Neocognitron work. Kunihiko Fukushima did much of the earliest work, published in 1975, but his breakthrough came in 1980 - http://www.cs.princeton.edu/courses/archive/spr08/cos598B/Readings/Fukushima1980.pdf

"This network is given a nickname "neocognitron", because it is a further extension of the "cognitron", which also is a self-organizing multilayered neural network model proposed by the author before (Fukushima, 1975). Incidentally, the conventional cognitron also had an ability to recognize patterns, but its response was dependent upon the position of the stimulus patterns. That is, the same patterns which were presented at different positions were taken as different patterns by the conventional cognitron. In the neocognitron proposed here, however, the response of the network is little affected by the position of the stimulus patterns."

I recommend my students start with that paper. My good friend Gerald Edelman, who sadly passed two years ago, led a huge amount of work on Darwinistic CNNs through the 1980s and 1990s. He released Darwin III in 1988, just to give you a sense of time.

Karl Pribram's holonomic theory postulates that perception works like a hologram and that the dendritic processes function to take a "spectral" transformation of the "episodes of perception". This transformed "spectral" information is stored distributed over large numbers of neurons. When the episode is remembered, an inverse transformation occurs that is also a result of dendritic processes. It is the process of transformation that gives us conscious awareness. Pribram says that both time and spectral information are simultaneously stored in the brain.
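
As a rough illustration of the "spectral" transformation and its inverse described above, here is a minimal Python sketch. It uses a plain discrete Fourier transform as a stand-in, which is purely an assumption for illustration and not Pribram's actual model. The names `dft`, `inverse_dft`, and `episode` are hypothetical.

```python
import cmath


def dft(signal):
    """Forward transform: time samples -> spectral coefficients."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def inverse_dft(spectrum):
    """Inverse transform: spectral coefficients -> time samples."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


# Assumed toy "episode": a single pulse, localized at one point in time.
episode = [0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0]

# "Storing" the episode in spectral form spreads it over every coefficient.
spectrum = dft(episode)
magnitudes = [abs(c) for c in spectrum]

# "Remembering" applies the inverse transform and recovers the episode
# (up to floating-point rounding).
recalled = [value.real for value in inverse_dft(spectrum)]

print("spectral magnitudes:", [round(m, 6) for m in magnitudes])
print("recalled episode:   ", [round(x, 6) for x in recalled])
```

The point of the toy example is only that an event localized in time ends up distributed evenly across all spectral coefficients, and that the inverse transform brings it back. That is the hologram-like property the theory leans on.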
Pribram also draws attention to a limit on how precisely both spectral and time values can be concurrently determined in any measurement. This uncertainty describes a fundamental minimum, defined in 1946 by Gabor (the inventor of the hologram) as a quantum of information. Working this into our neural model of perception and recognition is the leap that has occurred in the 30 years since we stepped forward from the initial 1980s work.

Every last piece of commercial software, every topology, TensorFlow, all of that is absolute garbage. It's a bunch of junk. It's baby stuff. Useless except for Google-style match-the-picture. CNN work has transitioned to evolving neural cube technology over the 30 years since this early work.

Hinton is a great thinker and I value his abstract thinking most. I wish he would move into theoretical concepts rather than demonstrate his old designs. When I saw his TED talk I was incredulous, dug out my thesis papers from '88, and there it was, the exact same stuff in a Hinton article. Restricted Boltzmann machines are like seeing Edison's first prototype for an electric light bulb.

Many of us were restrained because the computing power we needed to commercialize the tech in the 1980s was just out of reach. That's one part of what Noonean Cybernetics is helping solve: a 100-teraflop desktop neural computer. With that there is no need to wait on university supercomputer resources. In fact, we did an analysis of UT Austin's Stampede doing neural work vs. one of ours. Based on what you could actually reserve of their large array (and even then you had limited access to that much resource!), our computing on the desktop was actually more powerful than that massive system, which cost 100 million dollars just to build the rooms and energy interconnects to install it into. Rather than costing 5 million dollars a year just in maintenance, the Noonean computer is just a personal computer. Embeddable low-watt systems are still going to be 8 years off, but that is our direction.

Holographic plane neural cube designs have much more to do with excitation of pathways and dynamic reorganization than static CNNs do. With these designs, based on the work of Fukushima, Pribram, and Edelman and on our own original approach, we are able to achieve much better recognition with much shorter training times: minutes, not days. Holographic neural frames train in a few minutes per feature. If you think about it, a hologram is much like Hinton's "capsules", as they are location equivariant.

As I said, take their Google computer onto a busy street in NYC and show me where all the triangles are. They have a fundamentally broken approach.

The hard stuff today is 3-D spatial awareness mapping, mapping recognition into long-term memories, and conscious systems all integrated into language, plus a fully unified system which does listening, speech, thought, and vision, and interacts as a human would. That is our focus today. It is another two decades away, but we will have glimpses and early successes by 2025.