Recently, Intel announced that they had developed an 80-core processor which would be available by the end of the decade. That got me thinking that we are fast approaching a point in processor power where a true artificial intelligence would be possible. Let me explain my reasoning.
One of the best books I've read about AI is Jeff Hawkins' "On Intelligence". The book's premise is that intelligence is the ability to predict what will happen next. If you hear a word spoken, or a note played, your mind is already busy predicting the next word, or note. If you see a scene with a ball thrown, a bird taking flight and a person walking, your mind predicts the person’s next step, where the bird will go and where the ball is headed. All of these associations are happening from moment to moment.
Now let's say we want to create a computer system to do the same thing. We would need stereoscopic camera vision, stereo microphone audio and a bunch of other sensors. All of the data has to be gathered, processed and stored in real time. The processing includes detecting distances and relative motions from the stereo vision, isolating images and sounds, and storing all of this data in a time sequence. Then, as soon as it is in short term storage, we have to extract cross reference points, such as what kind of bird we saw, what noise it made and what its next action might be. Then we need to look into our long term memory and extract this information from a multitude of cross indexes. Once we have referenced these memories, we predict that the next action is a squawk and another flap of the wings.
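To make the "look up memories, then predict" step a little more concrete, here is a minimal sketch in Python. It is only a toy: the SequenceMemory class, its method names and the event labels are all made up for illustration, and counting which event followed which is just one simple way to predict, not the method the book describes.

```python
from collections import Counter, defaultdict


class SequenceMemory:
    """Toy long term memory (hypothetical): counts which event followed which."""

    def __init__(self):
        self.followers = defaultdict(Counter)

    def observe(self, events):
        # Record every adjacent pair (previous event -> next event).
        for prev, nxt in zip(events, events[1:]):
            self.followers[prev][nxt] += 1

    def predict(self, current):
        # Return the most frequently seen follower, or None if never seen.
        counts = self.followers.get(current)
        if not counts:
            return None
        return counts.most_common(1)[0][0]


memory = SequenceMemory()
memory.observe(["bird lands", "squawk", "flap", "squawk", "flap", "bird flies off"])
prediction = memory.predict("squawk")
print(prediction)
```

A real system would index on far more than the previous event (time of day, the rest of the scene and so on), but the shape of the problem is the same: store sequences, then ask what usually came next.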
At some point we start to fill up short term memory and will need to save the information in long term memory along with all of the cross indexes such as time of day, the rest of the elements in the scene, what we were smelling and hearing, etc.
As you can see, with this amount of data gathering, processing, storing and retrieving there is no way it can happen in real time on just one processor, no matter how fast it is. This stuff has to happen in parallel. And some of these processors will have to be dedicated and specialized. DSPs will receive each audio channel and break it down into elements, compress it and time stamp it for storage. The data will be passed to other processors that look for words and voices to use as indexes to compare to what is stored in long term memory. More processors will scan those memories and yet other processors will compare items from the memories, and then predict the actions and reactions.
The ideal design for AI processors would be a multi core or multi processor system. Each core or processor would have its own built in flash and an OS with MPI messaging. Application tasks would be loaded into a core along with any data and then run. Some tasks, such as database searches, could be divided between several cores, so the same application task is loaded into each along with a different data set.
Tasks would remain in the core’s flash until the space is needed for a new task. A new task would flush memory so the core can tackle the new task cleanly.
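The idea of loading the same search task into several cores, each with a different data set, can be sketched in a few lines of Python. This is a rough illustration under some loud assumptions: threads from the standard library stand in for the cores (in standard Python they will not actually run CPU-bound work in parallel; a real design would use separate processes or MPI), and the function names and the sample "memories" are invented for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def search_partition(partition, target):
    """The 'application task': scan one slice of the data for the target."""
    return [record for record in partition if target in record]


def split(records, n_parts):
    """Deal the records into n_parts roughly equal slices."""
    return [records[i::n_parts] for i in range(n_parts)]


def parallel_search(records, target, n_workers=4):
    partitions = split(records, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # Same task, different data set for each worker (stand-in for a core).
        results = list(pool.map(search_partition, partitions, [target] * n_workers))
    # Merge the partial results from every worker into one sorted list.
    return sorted(hit for part in results for hit in part)


# Hypothetical long term memory entries, for illustration only.
memories = ["bird squawk", "ball thrown", "bird flaps wings", "person walking", "dog barks"]
hits = parallel_search(memories, "bird")
print(hits)
```

The point is the split and merge pattern: no worker needs to know about the others, so adding more cores just means cutting the data into more slices.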
Most of the actions in this AI system are comparing patterns from sensors to stored patterns in the various memories. And when you think about it, much of what we sense is just patterns. Words in speech are patterns made up of phonemes, pitch and amplitude. Visual objects such as birds are colors, outlines and other cues.
These are all pattern matching problems, similar to finding the owner of a particular set of fingerprints. You cannot match every line, curve and loop of a fingerprint against a giant database. Instead, examiners look at specific points of the fingerprint and compare them to the same points on other prints. If all points match, then it is considered a match.
The same method can be used to find the word spoken out of a database of words.
We just need to define the points of interest that work best.
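Here is a minimal Python sketch of matching on points of interest rather than on the whole pattern. Everything specific in it is an assumption made for illustration: the function names, the feature names (core, delta_count, ridge_count) and the sample database are hypothetical, and real fingerprint systems use far richer features than this.

```python
def match_score(probe, candidate):
    """Fraction of the probe's points of interest that the candidate shares."""
    if not probe:
        return 0.0
    shared = sum(1 for key, value in probe.items() if candidate.get(key) == value)
    return shared / len(probe)


def best_match(probe, database, threshold=1.0):
    """Return the name of the best-scoring entry, or None if it scores below threshold."""
    name, score = max(
        ((name, match_score(probe, points)) for name, points in database.items()),
        key=lambda pair: pair[1],
        default=(None, 0.0),
    )
    return name if score >= threshold else None


# Hypothetical database: each entry is reduced to a few points of interest.
database = {
    "alice": {"core": "whorl", "delta_count": 2, "ridge_count": 14},
    "bob": {"core": "loop", "delta_count": 1, "ridge_count": 11},
}
probe = {"core": "loop", "delta_count": 1, "ridge_count": 11}
print(best_match(probe, database))
```

The default threshold of 1.0 follows the rule above that all points must match. Lowering it would allow partial matches, which is probably what noisy sensor data would need.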
For example, to recognize speech, we would design a system that breaks each word into segments based on frequency changes and relative amplitude. This way the absolute amplitude from loudness or closeness can be subtracted and absolute frequency from voice pitch can be ignored.
Then it is simply a matter of matching these points of interest to find the word spoken and its meaning. This would allow the use of parallel processing to speed up the process.
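As a rough sketch of what "subtracting" loudness and ignoring voice pitch could look like, the Python below turns a word into points of interest that stay the same whether it is spoken low and quiet or high and loud. The assumptions are mine: each segment is given as a (frequency in Hz, amplitude) pair, frequency changes are rounded to whole semitones, and amplitudes are scaled by the loudest segment. The function name and sample numbers are made up.

```python
import math


def points_of_interest(segments):
    """Convert (frequency_hz, amplitude) segments into pitch- and loudness-independent features."""
    freqs = [f for f, _ in segments]
    amps = [a for _, a in segments]
    peak = max(amps)
    # Frequency change between neighbouring segments, in whole semitones.
    # Only the ratio matters, so the speaker's absolute pitch drops out.
    intervals = [round(12 * math.log2(b / a)) for a, b in zip(freqs, freqs[1:])]
    # Amplitude relative to the loudest segment, so overall loudness drops out.
    relative_amps = [round(a / peak, 1) for a in amps]
    return tuple(intervals), tuple(relative_amps)


# The same hypothetical word, spoken low and quiet, then an octave higher and louder.
low_quiet = [(100.0, 0.2), (150.0, 0.4), (125.0, 0.12)]
high_loud = [(200.0, 0.6), (300.0, 1.2), (250.0, 0.36)]
same_word = points_of_interest(low_quiet) == points_of_interest(high_loud)
print(same_word)
```

Both recordings reduce to the same tuple, so that tuple can be used as the key to look the word up in the database.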
Different pattern matching engines can all be working in parallel to look at sounds, vision, even scents and touch. All of this parallel processing would be the first step towards an intelligent robot.