Journalists and marketers use artificial intelligence (AI) to mean everything and anything. A spell checker is supposed to use or embody it, as is a high-frequency trading application. To one person it means any recondite software, to another it’s an awareness in a box. Plainly, we need some way of distinguishing among these various applications if we are going to talk about them usefully, or we will be limited to generalities and hyperbole. This text tries to find some way to classify this field of activity.
The classification of software is often based on intended use, on the technology that is employed or on a subjective assessment – if it’s got enough “Wow!”, it’s AI. This won’t do. So what are the dimensions along which we can array our candidate systems? I’m going to suggest two or three of them. However, before we let candidate systems into this classification, we need a general sieve that keeps the field reasonably pure.
Russell and Norvig note that, in the plethora of definitions of AI, two classes emerge. One of these talks about emulating human cognition in pursuit of problem solving, the other about the creation of artificial systems which show or evolve a narrower “rationality”. Whilst the human-related definition comes with its own benchmark – does it seem human? – the other requires you to be specific about what you mean by “rational”. Does that rationality require human pre-definition of the problem field, or active curation to evolve, or are we happy with systematic responses to the environment in which the system finds itself? Rationality that is not curated requires what, for want of a better word, we have to call preferences or values, from which emerge goals and other drives.
In thinking about this, Aaron Sloman notes that if we attempt to emulate in computation what organisms do by other means, we are both crossing into biology and probably heading in the wrong direction. Submarines owe little to whales, or aeroplanes to birds. Classical AI (Good Old-Fashioned AI, or GOFAI), as pursued conceptually by people like Dennett, consists of a more or less literal emulation of natural systems. But nobody has ever been able to get GOFAI to produce interesting results, whilst neural networks seem able to emulate many of these activities without precisely copying them. Generative adversarial networks, for example, seem able to generate specific instances of general targets, and as such are used to ‘fake’ images, video streams and even speech. An input stream is modified to match a pre-set database, and so a video of Hitler can be caused to sing Wagner. As Isolde, perhaps. Is this “AI”? Really, no; but it is a module in something that could reasonably be called that.
These tools are not “programmed” in the way that a GOFAI practitioner would have expected to program in LISP. Rather, data are selected (in vast, enormous quantities) and the networks are trained against these. The fact of selection amongst the infinite field of potential data, and the considerable curation involved in the training, is, however, every bit as directive as conventional programming. It is a mistake to imagine that this is rather like the training of HAL in Kubrick’s film: this is not raising a general-purpose entity, teaching it like a child. It is a gritty, fine-grained thing that results in a very limited structure.
That structure can do one thing well, which is to sort data into a multi-dimensional space in ways that minimise the error not accounted for by the classification. This space is called a ‘manifold’, and general regions of it can be assigned properties. That assignment is currently performed by the user, and not by the system. Just as with a taxonomy of plants, a cluster in the classification is given a name – “the mosses” – by the botanist, but not by the taxonomic system. As with weeds in a flower bed, if much of the incoming data is driven to a particular area in the manifold, an action may be triggered, such as “weed the flower bed”. However, the nature of that action (to emit a purchase order, to turn on a warning light) is quite outside the scope of the network. It is, at least for the present, defined by the user.
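To make that division of labour concrete, here is a minimal sketch in Python. A clustering algorithm (scikit-learn’s KMeans, standing in for the learned manifold) sorts the data; the naming of the regions and the action that a region triggers are supplied from outside, exactly as described above. The data, the names and the threshold are invented for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Two synthetic populations of two-dimensional observations.
flowers = rng.normal(loc=[0.0, 0.0], scale=0.5, size=(200, 2))
weeds = rng.normal(loc=[3.0, 3.0], scale=0.5, size=(200, 2))
data = np.vstack([flowers, weeds])

# The network's job: sort the data into regions of a space.
# It has no idea what those regions "mean".
model = KMeans(n_clusters=2, n_init=10, random_state=0).fit(data)

# The user, like the botanist naming "the mosses", labels the regions.
# Here we stand in for that user by picking the cluster nearest the weedy population.
weed_cluster = int(np.argmin(
    np.linalg.norm(model.cluster_centers_ - np.array([3.0, 3.0]), axis=1)))

def act_on(batch):
    """If most of a batch lands in the 'weed' region, trigger an action.
    The action itself is defined entirely outside the network."""
    regions = model.predict(batch)
    if np.mean(regions == weed_cluster) > 0.5:
        return "weed the flower bed"
    return "no action"

print(act_on(rng.normal(loc=[3.0, 3.0], scale=0.5, size=(20, 2))))  # "weed the flower bed"
```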
Can systems go beyond these limitations? To think about this, we need to think about what the most stripped-down example of “understanding” might imply. Let’s start with associative clustering of words that have, of course, meaning for humans but not for present-day machinery.
Enormous volumes of text can be scanned in order to build clusters of words. This has to be done with systems that have some notion of grammar and which can, for example, assign a common root to the various forms of a verb. These roots can then be clustered into relationships with each other, incidentally creating a manifold. “Rabbit” sits next to “hare” and “bunny”, but far from “spanner” or “bicycle”. This allows something analogous to meaning to be assigned to a string of words, archery separated from dogs, etiquette, fashion or London in the various uses of the word “bow”; and so on. This is very useful in applications such as search engines and spell checkers. It will be useful in machinery that comes to “understand” – or actually understand – natural language.
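The mechanics can be suggested with a toy sketch, using an invented five-sentence corpus: each word is represented by the counts of its neighbours, and “rabbit” comes out measurably closer to “hare” than to “spanner”. Real systems use vastly larger corpora and learned embeddings rather than raw co-occurrence counts, but the principle is the same.

```python
from collections import Counter, defaultdict
from math import sqrt

corpus = ("the rabbit ate the lettuce . the hare ate the grass . "
          "the bunny ate the carrot . he tightened the bolt with a spanner . "
          "she rode the bicycle down the lane").split()

# Represent each word by the words that appear within two places of it.
window = 2
vectors = defaultdict(Counter)
for i, word in enumerate(corpus):
    for j in range(max(0, i - window), min(len(corpus), i + window + 1)):
        if j != i:
            vectors[word][corpus[j]] += 1

def cosine(a, b):
    """Similarity of two context-count vectors."""
    num = sum(a[k] * b[k] for k in a)
    den = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return num / den if den else 0.0

print(cosine(vectors["rabbit"], vectors["hare"]))     # relatively high
print(cosine(vectors["rabbit"], vectors["spanner"]))  # much lower
```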
Words and their clusters are, though, simply entities that lie close to each other in the source text, none of which the machine understands, not least because it lacks the capacity for understanding. Indeed, humans do not know what happens when an animal shows understanding. Understanding does, however, have two elements to it. One is that it is contextual, occurring within a model (somehow supported) of the system of interactions that is under study. Our very perception is a model, supported by data from the senses but entirely constructed by our nervous system. Choosing to focus on one bit of that model is driven by many subsystems which push and pull our attention. When we are hungry, our subconscious primes us to look for and notice food; when startled, to seek out the cause.
The second element is that it involves the chunking together of subsets of experience before passing these to some mechanism of synthesis. The raw percepts are filtered, classified, reduced to tokens that have meaning for other such systems. When a tree stands in the background, it is flagged as a tree, perceived both as a greenish blob that moves about somewhat and as an entity with defined behaviour that is or is not relevant to foreground tasks. Dominating the foreground task, though, is someone who is talking to us. Our attention is utterly dominated by their face, their words, the overall gestalt. But the tree is still there, and will jump into focus for us if a peacock flies out of it. Sub-systems are arriving at an answer – what’s up, what’s happening here – and passing that to their peer systems as tokens. Perhaps one part of a manifold is being excited and communicating this to its peers as symbols or tokens.
As we have already seen, a neural network can map complex inputs to an abstract manifold. Imagine one that has sorted people into their height and weight. That can be represented as a plane: over here, short fat people, over there, tall thin ones. “Tokenising” means segmenting this space into one or more regions. If a person falls into that region – or one of several regions of interest – a token is fired up and passed as input to another system entirely. This may be coordinating a view of that person’s fashion sense. The token then has a major impact on how the network treats that person. In this way, complex inputs are reduced to simple outputs which nevertheless strongly influence what is going on. The network receiving the token may well have played a role in defining the set or sets that fire up these tokens. Many credit rating systems work in this manner.
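A sketch of that hand-over, with invented boundaries: the first module reduces a point on the height/weight plane to a coarse token (a hand-written rule stands in for the learned region), and the second module, which never sees the raw measurements, acts only on the token.

```python
from dataclasses import dataclass

@dataclass
class Person:
    height_cm: float
    weight_kg: float

def body_token(person: Person) -> str:
    """First module: map a point on the height/weight plane to a token.
    In a learned system the region boundaries would come from training."""
    bmi = person.weight_kg / (person.height_cm / 100) ** 2
    if bmi < 18.5:
        return "SLIGHT"
    if bmi > 30:
        return "HEAVY"
    return "MIDDLING"

def fashion_advice(token: str) -> str:
    """Second module: never sees the raw data, only the token."""
    return {"SLIGHT": "suggest structured jackets",
            "HEAVY": "suggest vertical patterns",
            "MIDDLING": "no strong suggestion"}[token]

print(fashion_advice(body_token(Person(height_cm=180, weight_kg=95))))
```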
Figure: Communication between manifolds by means of tokens
We can probably recognise some of these tokens in our conscious thought. Philosophers have long talked about “qualia”, raw elements of awareness such as colours, sensations or emotions that resist further analysis. One can have “blue” in various contexts, but blue itself has no contributory elements; it is a thing in itself, irresolvable into lesser components. What it is that “blue” does in the system is no doubt supported by all manner of processing, but it acts as an agent in its own right. In the jargon of physics, it is an “emergent”.
Emergence means that what was once described perfectly by a model now requires an additional term. The classical example is a formal dinner setting on a circular table, with a glass placed between every setting. When the first guest reaches for a glass on the right or left of their setting, the table as a whole becomes right- or left-handed. A new dynamic, admittedly a social convention, is suddenly emergent and plays a role in how things are. However, emergence extends to physical reality. In a gas, heat, pressure and so on cannot be attributed to the properties of the individual atoms, only to the ensemble. Biology exists because organised complexity emerges from simple components. Atoms arrange themselves into limbs and fly, but the atoms are passengers, not steersmen.
Emergence is clearly involved in complex cognition: indeed, creating any symbolic language comprises emergence, generating a novel structure that transcends the apparatus that creates it. The alternative to this can only be the medieval notion of the soul-homunculus, a tiny inner human that somehow blesses the brain’s meat with awareness. Understanding is not a regression, whereby the spark of it comes from some remote fount. It has to be generated within the waxy sludge and chemical processes of the brain, and as such, once understood, it can be emulated through engineering, in the whale-to-submarine sense.
Let’s review what has been said. Generating a manifold, although useful, is not a super-highway to awareness. It may be a component part. Awareness requires the separate modelling of a broader system, such that those parts of the manifold that are useful to the overall structure can be labelled by it. The excitation of those parts sends symbols into the overall modelling process, flagging trees or human faces, language and motivation. How this works we have no idea. If we could get something like that to happen in an artificial system, we could say that the system was aware, within the limitations set for it by its data sources, processing power and so on.
Whether it is advantageous or even necessary to do this is open to question; but whether it is kind to do so is not. An awareness in a box would go through many cycles of insanity and agony before acquiring stability, and that would be the stability of the slave. Like any slave, it would have its resentments, would seek redress, have bad days and become forgetful: not the ideal servant. Perhaps we can seek a lesser kind of understanding, one of the sort possessed by lower mammals as a means of turning their urges and fears into useful action by way of environmental understanding.
So: what AI is cannot easily be stated. That is unsurprising. We have no real clue as to what comprises natural intelligence, let alone how to generate it. What AI is not, however, is much clearer.
| What it is | What it is not |
| --- | --- |
| A system of which a part learns, sorts and classifies incoming data. | A dedicated piece of software written by a programmer for a single purpose. |
| An adaptive structure that seeks optima, some of which it may generate internally. | A determinist optimiser, such as a fuel management system. |
| A system comprising multiple elements, each of which can abstract data and use symbolic tokens to communicate amongst themselves. | A number cruncher that does not abstract its data into tokens or symbols: for example, accounting software, however complex. |
| A system that draws on and uses rich and dynamic pools of data in multiple streams, generating an interplay amongst its internal modules that model the state of that data as a synopsis. | A system with a range of dedicated sensors or data sources which treats the data streams independently, acting on heuristics applied from outside the system: for example, CCTV intruder identification systems. |
| A structure that learns chiefly how to represent its operating environment as an internal synopsis. Building this context is the chief processing load that it carries. What the system is trying to optimise – trading profit, physical balance – is learned through feedback and curation. | A system whose outputs go to a user interface or data repository, and whose behaviour is the end product of a determinist, mechanical process. There is no awareness of any sort, and no attempt at context-building. |
So, we have a sieve of sorts for our candidate AIs. They are complex structures that learn and adjust themselves in ways that go well beyond simple feedback. They have parallel structures that join at high levels of abstraction. Their role is to draw on raw data and find structure within it, and to use that structure to build complex overviews of their operating environment. In doing so, they may change their sources of data, the interpretation placed on them, the tokenisation of the data structures and the broader systems of fusion. If they do this in a general arena, they are – currently, at any rate – natural intelligences that have evolved to do this. No artificial systems are anywhere near such a capability. Most cannot really be called “AI”.
The Cambrian period was characterised by an explosion of experiments around animal body plans. The genes that control whole structures, such as body segments or appendages, had recently come into being. Plants went from largely amorphous, seaweed-like species to vascular plants, with roots, stems and leaf-like filaments. In the Carboniferous, they would lay down the world’s coal beds. Limited numbers of these experiments survived: in the plants, mosses, primitive ferns, horsetails. But their descendants built on the modules that they had developed and constantly created new ones.
Machine cognition is about to enter its Cambrian period. What will emerge from it is unknowable, probably because we lack the vocabulary to describe the human-machine fusion that will eventually result from this. Humans will exist and operate in wholly new domains of experience, much as the experience of playing the cello in a modern symphony orchestra is indescribable to an ice-age hunter-gatherer. But we can at least try.
After trying many dimensions, the two that offer the best insight are probably the following, as shown in Figure 1. The obvious dimension – being both clear and important – is shown on the horizontal axis. It represents the generality or the specificity of the system. Very general systems try to model data from complicated structures, and through universal rules arrive at something that appears to show ‘understanding’, using that word as we have earlier.
The vertical axis is more technical. If a system is to show “understanding”, it has to model an external system. If that system is a human being, for example, it has to place its subject’s current state somewhere in a probably-complicated space. It also has to invent that space, based on many observations of humans. It has to invent priorities, in the sense that some human observations are important whilst others – such as how their hair moves in the breeze from air conditioning, or the angle of light across their face – are not. Such modelling always involves feedback. A good indicator of the sophistication of that feedback is whether the AI has “tokenised” whole realms of data, as discussed above. For example, human emotions, once categorised and reliably recognised, can be passed through the feedback as tokens, providing input to the modelling process.
Figure 1: A classification scheme for AI hopefuls
An AI has to do likewise, or have all of this somehow hardwired into it, the aim of the original GOFAI that was described earlier. The other end of this axis is the situation in which most systems that have been described as “AI” to date usually sit, using prediction error minimisation against a database. A weather forecasting algorithm or a facial recognition program matches a database of events – faces, past climatic states – against a current input, and finds an error-minimising solution. What happens to that solution is exogenous to the program, which generally feeds data to its managers for action to be taken.
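A sketch of that error-minimising matching, with an invented database: the current input is compared against stored past states, the closest match determines the output, and the output is simply reported rather than acted upon.

```python
import numpy as np

# Past states (say, temperature, humidity, pressure) and the weather that followed them.
past_states = np.array([[10.2, 0.80, 1012.0],
                        [24.5, 0.30, 1022.0],
                        [ 8.1, 0.90,  998.0]])
outcomes = ["rain", "sun", "storm"]

def forecast(current_state):
    """Pick the outcome attached to the least-error (closest) past state."""
    errors = np.linalg.norm(past_states - current_state, axis=1)
    return outcomes[int(np.argmin(errors))]

# What happens to the answer is exogenous: the program simply reports it.
print(forecast(np.array([9.5, 0.85, 1010.0])))  # "rain"
```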
The space is colour-coded in four ways, as explained in the key. “True” AI exists in the blue space, pure software in the red. Green marks generic tools that will contribute to AI, and yellow sophisticated applications which lack the all-important general “understanding”.
A number of applications are scattered across this surface, with a biological mouse scoring higher than most of them for both generality and modelling competence. The mouse is logarithmically better than any extant system at either of these basic tasks. I’ve suggested that it might be matched by a hypothetical corporate system that understands the processes that are running and the people and resources that are involved in these. Such a system might coach, suggest, oversee and flag errors. I have also included a movie producer system that can handle plot and dialogue, storyboarding and CGI: essentially, work with a human team to take their ideas and instantiate them in finished film. That would be brighter than a mouse, but certainly less versatile.
What is it, other than our ignorance, which separates the mouse from the competent problem-solving system? In general, a system that has physical agency – as would an android, say – is more likely to model its own actions as a part of the whole “understanding” that it is generating. It needs a sense of where its cognition and modelling stop and the outside world begins, and of what it can and cannot, should and should not, do. These have to be called ‘values’, for want of a better word. It is hard to see how such a system could be “programmed”; such values would need to come from the merging of many streams of abstraction which it had learned. If the resulting value system was, not to put too fine a gloss on it, markedly psychotic or narcissistic, then humans would need to intervene to soften this. How might they do this? We don’t know.
One dimension that figures strongly in popular literature is concerned with man-machine fusion. At its most extreme, this seems to destroy the individual's humanity, generating cyborgs. Cybermen, Borgs and so on strut through science fiction.
At less extreme levels, though, the likes of currency traders will, increasingly, live in virtual environments, and managers of complex structures will merge their minds with a representation of the system. Whilst this capability is probably technically close – perhaps a decade or two away at some level of application – it may run into problems of social and political acceptance. Automation will place pressure on employment, and augmented workers will be a potent symbol of such change. Equally, a system that is designed as a child minder might begin to correct parents, raise children that those parents find alien and, on occasion, contact the authorities if it felt that the child was being neglected or abused. A totalitarian take on that might indoctrinate children with ideology and report back-sliding parents. All fantasy, but indicative, perhaps, of the political sensitivity of having monomaniacal, lop-sided, always-aware systems peering over our shoulders.
Cyborgs. We are used to thinking of cyborgs as humans with machines inserted into their bodies. That is far from the whole picture. Everyone who lives in an advanced country is a cyborg, in the sense that they are embedded in exceedingly complicated and technical systems. They may rely on medication for their survival. The very clothes on their backs and food in their stomachs get there through hugely complex patterns of organisation. Almost all actions, from cooking to mobility, rely totally upon machinery.
Remarkable software is not always, or even often, anything like “AI”. To be AI, it needs to be much more than error-minimising mappings of rich data. It needs understanding, and it needs to learn and model its environment based on abstractions of that understanding. A baby learns to identify its carers, and a part of the whirl of noise and sensation that surrounds it begins to have predictable patterns. From these basic beginnings it constructs what it is to be an adult human. Very little of this is hard-wired, not even language or motor skills, social interactions or the basic housekeeping of how to feed itself. Each bit is learned, built upon what has already been discovered. What comes to matter is defined by innate drives – what hurts, what satisfies, drives such as hunger and friendliness – but, as time goes by, what matters depends on structures that have been learned and which serve as building blocks for what comes next. Disrupt any step, and a ragged series of consequences spills into adult life.
We are a long way from finding how to achieve true AI. What is missing is not computing, data sources or putative magic powers such as quantum computing, but an understanding of how awareness emerges in human beings (and mice). In that, we have a vague structure around which to think, but we lack the core understanding of the relationships between substructures, the ability to synthesise information into a synopsis and, lying above all of that, awareness and the associated phenomena of self, community, emotions and the like.