A new paradigm?

11. A new paradigm?

Treating emotions as a characteristic of the architecture of an intelligent system has led to increased interest in this topic in recent years. While in 1997 only two papers on the topic of "emotion" were presented at the leading conferences on agents, 1998 already saw the first conference dedicated exclusively to "emotional agents".

A number of researchers have started to develop emotional autonomous agents based on the principles of Simon, Toda, and Sloman. Many of these approaches are still at the stage of theoretical exploration, but some have already been implemented in rudimentary form.

A fundamental factor shared by all agent-centered approaches is the view of emotions as control signals in the architecture of a system that must move independently in an uncertain environment. The function of emotions is to direct the attention of the system toward an external or internal aspect that is relevant to its substantial goals or concerns, and to assure that aspect processing priority.

11.1. The models of Velásquez

11.1.1. Cathexis

Velásquez (1997) developed a model based on the "Society of Mind" theory of Minsky (1985). He calls it Cathexis, a term which he defines as "concentration of emotional energy on an object or idea" (Velásquez, 1997, p. 10).

In his model, emotions consist of a variety of subsystems:

"Emotions, moods, and temperaments are modeled in Cathexis as a network of special emotional systems comparable to Minsky's "proto-specialist" agents (...) Each of these proto-specialists represents a specific emotion family...such as Fear or Disgust."

(Velásquez, 1997, p. 10)

Each of these proto-specialists has four kinds of sensors, which are responsible for the measurement of internal and external states: Neural sensors, sensorimotor sensors, motivational sensors, and cognitive sensors. In addition, each proto-specialist is characterized by two threshold values which Velásquez calls Alpha and Omega: Alpha is the threshold above which the activation of the respective proto-specialist begins; Omega is the saturation limit of a proto-specialist. Finally, each proto-specialist has a decay function which affects the duration of its activation.
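
The interplay of the two thresholds and the decay function can be sketched as follows. This is only an illustration, not Velásquez's implementation: the class name ProtoSpecialist, the multiplicative decay, and all numeric values are assumptions, and the four sensor groups are collapsed into a single summed input.

```python
from dataclasses import dataclass


@dataclass
class ProtoSpecialist:
    """Hypothetical sketch of one emotion proto-specialist."""
    name: str
    alpha: float           # Alpha: activation threshold
    omega: float           # Omega: saturation limit
    decay: float = 0.8     # assumed: fraction of intensity kept per cycle
    intensity: float = 0.0

    def update(self, sensor_input: float) -> float:
        # Decay the previous intensity, add the new input,
        # and clip the result to the range [0, Omega].
        raw = self.intensity * self.decay + sensor_input
        self.intensity = max(0.0, min(raw, self.omega))
        return self.intensity

    @property
    def active(self) -> bool:
        # The proto-specialist only counts as active above Alpha.
        return self.intensity > self.alpha


fear = ProtoSpecialist("Fear", alpha=0.3, omega=1.0)
fear.update(0.2)    # stays below Alpha, so Fear is not active yet
fear.update(0.5)    # decayed remainder plus new input exceeds Alpha
```

Without further input, repeated calls to update(0.0) let the intensity fall back below Alpha, which is how the decay function limits the duration of an activation in this sketch.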

Velásquez differentiates in his model between basic emotions and emotion blends/mixed emotions. For the definition of basic emotions he builds on Ekman and Izard and defines them as follows:

"In this model the term basic...is used to claim that there are a number of separate discrete emotions which differ from one another in important ways, and which have evolved to prepare us to deal with fundamental life tasks..."

(Velásquez, 1997, p.11)

The basic emotions in Cathexis are Anger, Fear, Distress/Sadness, Enjoyment/Happiness, Disgust and Surprise.

Emotion blends or mixed emotions, respectively, are emotional states which arise when several different proto-specialists representing the basic emotions are active without one of them dominating the others.

Finally, his model contains moods which differ from emotions only by the value of their excitation level.

Emotions in Cathexis are caused by cognitive and non-cognitive elicitors which originate from the same categories as the sensors of the system. The cognitive elicitors for the basic emotions are based on a modified version of Roseman's emotion model.

The intensity of an emotion in Velásquez's model is affected by several factors:

"Thus, in Cathexis, the intensity of an emotion is affected by several factors, including the previous level of arousal for that emotion..., the contributions of each of the emotion elicitors for that particular emotion, and the interaction with other emotions..."

(Velásquez, 1997, p. 12)

The behaviour repertoire of the system has three main elements: an expressive component, consisting of face, body, and voice, with which the system communicates its current emotional state; an experiential component, which learns from experience and affects the motivations and action tendencies of the system; and an action selection mechanism, which selects the alternative with the highest value from the calculated behaviour values of the different action alternatives.

The system regularly runs through so-called update cycles, which consist of the following steps:

"1. Both the internal variables (i.e. motivations) and the environment are sensed.

2. The values for all of the agent's motivations (both drives and emotions) are updated....

3. The values of all behaviors are updated based on the current sensory stimuli (external stimuli and internal motivations).

4. The behavior with the highest value becomes the active behavior. Its expressive component is used to modify the agent's expression, and its experiential component is evaluated in order to update all appropiate motivations."

(Velásquez, 1997, p. 13f.)
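
The four steps of the update cycle can be sketched in a few lines. The data layout (dictionaries of motivations and behaviours, a sense callable) and the example values are assumptions made for illustration, and the expressive component of step 4 is omitted.

```python
def update_cycle(motivations, behaviors, sense):
    """One pass through the four-step cycle (hypothetical data layout).

    motivations: dict name -> current value
    behaviors:   dict name -> (value_fn, effects); value_fn maps
                 (stimuli, motivations) to a number, effects is a dict of
                 motivation changes applied when the behavior runs
    sense:       callable returning (stimuli, motivation_changes)
    """
    # 1. Sense the environment and the internal variables.
    stimuli, changes = sense()
    # 2. Update all motivations (drives and emotions).
    for name, delta in changes.items():
        motivations[name] = motivations.get(name, 0.0) + delta
    # 3. Update the value of every behavior.
    values = {b: fn(stimuli, motivations) for b, (fn, _) in behaviors.items()}
    # 4. The highest-valued behavior becomes active; its experiential
    #    component feeds back into the motivations.
    active = max(values, key=values.get)
    for name, delta in behaviors[active][1].items():
        motivations[name] = motivations.get(name, 0.0) + delta
    return active


motivations = {"hunger": 0.6, "happiness": 0.1}
behaviors = {
    "eat": (lambda s, m: m["hunger"] * s["food"],
            {"hunger": -0.5, "happiness": 0.3}),
    "cry": (lambda s, m: m["hunger"] * (1 - s["food"]), {}),
}
active = update_cycle(motivations, behaviors,
                      lambda: ({"food": 1.0}, {"hunger": 0.1}))
```

With food present, "eat" receives the highest value and its effects lower hunger and raise happiness; with no food, "cry" would win instead.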

Velásquez has implemented Cathexis in a computer model which he calls "Simón the Toddler". The screen shows the face of a baby which is capable of different emotional modes of expression and rudimentary verbalizations. The user interacts with the system by, for example, changing the parameters of Simón's proto-specialists, varying the level of neurotransmitters, or interacting with it directly by feeding it, stroking it, etc.

At present, the model consists of 5 drive-proto-specialists (hunger, thirst, temperature regulation, fatigue, interest) and a repertoire of 15 behaviour alternatives, among them sleeping, eating, drinking, laughter, crying, kissing, and playing with toys. These are to be extended step by step in the course of the further development of the model.

11.1.2. Yuppy

Yuppy is a robot which represents an emotional pet. It is a further development of the model Simón the Toddler. Yuppy was first developed as a virtual simulation before it received a body. Velásquez calls it an example of a system with emotion-based control.

The model is constructed from a number of computational units which consist of three main components: an input, an assessment mechanism, and outputs. A substantial part of the assessment mechanism is formed by the Releasers. They filter sensory data and identify special conditions; depending on these, they send excitatory or inhibitory signals to the subsystems connected with them.

Velásquez follows Damasio and LeDoux and differentiates between Natural and Learned Releasers. Natural Releasers are firmly built into the system (hard-wired); Learned Releasers are learned and represent stimuli which are associated with the occurrence of Natural Releasers or can predict their occurrence. In the language of other models, the Natural Releasers correspond to the primary emotions, while Learned Releasers are identical with secondary emotions. The latter require more processing capacity and are more complex, since they are based, among other things, on personal emotional memories which must be activated.

Drives in Yuppy are motivational systems which propel the agent into action. Drive systems are clearly distinct from emotion systems.

Yuppy's emotion systems represent six groups of affective basic reactions: Anger, Fear, Distress/Sadness, Enjoyment/Happiness, Disgust and Surprise. Velásquez differentiates between cognitive and non-cognitive Releasers of emotions. He differentiates between four groups:

The neural group covers the effects of neurotransmitters, brain temperature and other neuroactive agents which can lead to an emotion and are affected by hormones, sleep, nutrition and environmental conditions.
The sensorimotor group covers sensorimotor processes such as face expressions, body posture and muscle potential which regulate not only existing emotions, but can also cause emotions.
The motivational group covers all motivations which lead to an emotion.
The cognitive group covers all kinds of cognitions which activate emotions, e.g. appraisal of events, comparisons, attributions, desires, beliefs, or memories.

Yuppy's perception system consists of two color CCD cameras as eyes; a stereo audio system with two microphones as ears; infrared sensors for detecting obstacles; an air pressure sensor for simulating touch; a pyrosensor which registers changes in the ambient temperature when humans enter the area; and a simple proprioceptive system.

Yuppy's drive system contains four drives: charging adjustment, temperature adjustment, fatigue, and curiosity. Each of these drives controls an internal variable assigned to it; these variables represent the charge of the battery, the temperature, the amount of energy, and the agent's level of interest, respectively.

Yuppy's emotion production system consists of emotional systems with Natural Releasers for the basic emotions. Velásquez divides the emotional systems into three groups:

Interactions with Drive Systems: Unsatisfied drives produce Distress and Anger; over-satisfied drives produce Distress, and drive satisfaction produces Happiness.

Interactions with the environment: All objects with pink colour produce Happiness in different amounts; yellow objects produce Disgust. Darkness produces Fear, and loud noises produce Surprise.

Interactions with People: Humans can stroke and punish Yuppy. This produces either joy or pain. Joy leads to Happiness; pain produces Distress and Anger.
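
The environmental Natural Releasers listed above lend themselves to a small sketch: each releaser tests a percept and, if its condition holds, sends a signed signal to an emotional system. The percept keys, the numeric thresholds for "dark" and "loud", and the unit weights are assumptions; the source only names the stimulus and the emotion it produces.

```python
# Hypothetical encoding of some of Yuppy's natural releasers.
# Each entry: (condition on the percept, target emotional system, signal).
# A positive signal is excitatory, a negative one would be inhibitory.
NATURAL_RELEASERS = [
    (lambda p: p.get("color") == "pink",        "Happiness", 1.0),
    (lambda p: p.get("color") == "yellow",      "Disgust",   1.0),
    (lambda p: p.get("brightness", 1.0) < 0.2,  "Fear",      1.0),  # assumed threshold
    (lambda p: p.get("loudness", 0.0) > 0.8,    "Surprise",  1.0),  # assumed threshold
]


def release(percept, releasers=NATURAL_RELEASERS):
    """Return the summed signal that each emotional system receives."""
    signals = {}
    for condition, emotion, weight in releasers:
        if condition(percept):
            signals[emotion] = signals.get(emotion, 0.0) + weight
    return signals


signals = release({"color": "yellow", "brightness": 0.1})
```

A Learned Releaser could be added to the same list at run time, which is one simple way to picture how new cognitive Releasers extend the hard-wired set.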

Yuppy's behaviour system consists of a distributed net of approximately 19 different kinds of behaviour which predominantly cover the satisfaction of its needs and the interaction with humans. Examples of such behaviour are "search for bone", "approach bone", "recharge battery" or "approach human".

Like the drive systems and the emotional systems, Yuppy's behaviour systems also have their own Releasers.

The user can control Yuppy's affective style by manipulating parameters such as threshold values, inhibitory or excitatory connections, etc. In addition, the user can present internal and external stimuli to the robot. Velásquez describes the result as follows:

"Using the model described before, both the simulated and physical Yuppys will exhibit emotional behaviors under different circumstances. For instance, when the robot's Curiosity drive is high, Yuppy wanders around, looking for the pink bone which people may carry. When it encounters one, the activity of the Happiness Emotional System increases and specific behaviors, such as "wag the tail" and "approach the bone" become active. On the other hand, as time passes by without finding any bone, the activity of its Distress Emotional System rises and appropriate responses, such as "droop the tail", get executed. Similarly, while wandering around, it may encounter dark places which will elicit fearful responses in which it backs up and changes direction."

(Velásquez, 1998, p. 5)

Furthermore, Yuppy is able to learn secondary emotions which are stored as new or modified cognitive Releasers. If, for example, a human holds a bone in his hand and makes Yuppy come and get it, he can stroke or discipline it afterwards. Depending upon this experience, Yuppy forms a positive or negative emotional memory regarding humans which then affects its subsequent behaviour.

11.2. The model of Foliot and Michel

Foliot and Michel (1998) define emotions as an "evaluation system operating automatically either at the perceptual level or at the cognition level, by measuring efficiency and significance" (Foliot and Michel, 1998, p. 5). For them, emotions are the basis of every cognition. With their model they aim to show "how emotion based structures could contribute to the emergence of cognition by creating suitable learning conditions" (Foliot and Michel, 1998, p. 1).

The model was implemented in a virtual Khepera robot. Khepera is a miniature robot that contains a number of sensors and can be extended with further components as required. The "Webots simulator" not only simulates a Khepera; programs developed with Webots can also be transferred directly to a real Khepera.

The environment of the virtual Khepera consists of a city with buildings, a river and green areas. Each of these elements possesses a specific colour. The robot must move through the city and learn to evade different kinds of obstacles.

Foliot and Michel represent emotions on two levels. The level of process can evaluate stimuli and elicit different emotions; the level of state can supply information about the system.

For Foliot and Michel, the basis of their first experiment was the assumption that an emotion is characterized by a reaction to a positive or negative signal. The model consists of four components:

A reflex structure, which leads to a motor movement in opposite direction to an obstacle which is discovered by an infrared sensor.
An association matrix between the motor behaviour and the input of the infrared sensors, whose initial value is set to zero.
A signal which, if the robot meets an obstacle, each time produces an association in the matrix.
A behaviour system with three alternatives: (a) movement in a straight line if no obstacle stands in the way and no learned association is active; (b) the release of a reflex behaviour on impact with an obstacle; (c) association of a motor configuration with a known sensory pattern.
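
A minimal sketch of how these four components might fit together is given below. It is not the authors' controller: the number of sensors and actions, the left/right reflex rule, the learning rate, and the activation threshold are all assumptions chosen to keep the example small.

```python
N_SENSORS, N_ACTIONS = 8, 3          # assumed sizes
STRAIGHT, LEFT, RIGHT = 0, 1, 2      # assumed action coding

# Association matrix between infrared input and motor behaviour,
# with all entries initially zero.
assoc = [[0.0] * N_ACTIONS for _ in range(N_SENSORS)]


def reflex(ir):
    """Turn away from the side with the stronger infrared reading
    (assumption: the first half of the sensors faces left)."""
    half = N_SENSORS // 2
    return RIGHT if sum(ir[:half]) > sum(ir[half:]) else LEFT


def step(ir, collided, threshold=0.5, rate=0.1):
    """Pick an action: (b) reflex, (c) learned association, (a) straight."""
    if collided:
        action = reflex(ir)
        # The collision signal strengthens the association between the
        # current sensor pattern and the reflex movement.
        for i, reading in enumerate(ir):
            assoc[i][action] += rate * reading
        return action
    # Response of each action to the current sensor pattern.
    drive = [sum(ir[i] * assoc[i][a] for i in range(N_SENSORS))
             for a in range(N_ACTIONS)]
    best = max(range(N_ACTIONS), key=drive.__getitem__)
    return best if drive[best] > threshold else STRAIGHT
```

After a few collisions with an obstacle on the same side, the learned association alone is strong enough to trigger the avoidance movement before any impact occurs.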

The experiment showed that the robot gradually collided less and less with obstacles but never managed to move completely without errors. In order to examine whether improving the training system by means of affective signals would yield better results, the authors performed a second experiment.

The second experiment was based on the emotion theory of Scherer. It consists of five components:

A linear evaluation system which corresponds to Scherer's SECs (stimulus evaluation checks) and in which the result of each evaluation stage is used for the following evaluation.
Two state systems, one of which represents the assessment of a situation, the other one the physical body.
Two cognitive processes, one of which is responsible for attention selection, the other one for the decision over the next movement. The state systems can affect these processes directly.
A data base with goals.
A sensorimotor, a schematic, and a conceptual level. The schematic level produces association schemata between significant patterns and actions.

Fig. 17: Controller model by Foliot and Michel (after Foliot and Michel, 1998, p. 4)

The model differentiates between cognitive and emotional processes. Each emotional process is defined by an assessment sequence which classifies stimuli according to the criteria novelty, pleasantness, goal significance, and coping. Each stage of this process uses the results of the preceding stages as input. The coping stage has two possible outcomes: "reaction possibility" and "no reaction possibility".
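
The sequential character of this assessment can be illustrated as follows. Only the order of the four checks and the two coping outcomes come from the text; the stimulus fields and the concrete rules inside each check are invented for the example.

```python
def appraise(stimulus, known_patterns, goals):
    """Run the four checks in order; later checks use earlier results.

    stimulus: dict with the hypothetical keys "pattern", "valence",
              "affects" and "escapable".
    """
    result = {}
    # 1. Novelty: has this pattern been seen before?
    result["novelty"] = stimulus["pattern"] not in known_patterns
    # 2. Pleasantness (assumed rule: familiar stimuli are damped).
    valence = stimulus["valence"]
    result["pleasantness"] = valence if result["novelty"] else 0.5 * valence
    # 3. Goal significance (assumed rule: only unpleasant stimuli that
    #    touch a current goal are significant).
    result["goal_significance"] = (
        result["pleasantness"] < 0 and stimulus["affects"] in goals
    )
    # 4. Coping: one of the two outcomes named in the text.
    result["coping"] = (
        "reaction possibility"
        if result["goal_significance"] and stimulus["escapable"]
        else "no reaction possibility"
    )
    return result


wall = {"pattern": "wall", "valence": -1.0,
        "affects": "forward movement", "escapable": True}
first = appraise(wall, known_patterns=set(), goals={"forward movement"})
```

Because each stage reads the dictionary filled by the stages before it, changing the outcome of an early check (for example, marking the pattern as known) changes everything downstream.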

The cognitive processes have one primary goal (forward movement) and four secondary goals (anti-clockwise rotation, clockwise rotation, left wall follow, right wall follow). Each goal is defined by a value in the body representation.

Learning happens in this model whenever the average state of the system contains a strong displeasure value:

"This produces a new scheme containing the newer stimulus as a sensory input. The process then waits to observe which goal is associated to this stimulus and [to] check whether this goal allows to come back to a normal state. If this normal state is reached within a small amount of time, the representation is associated to the scheme, otherwise, the scheme is destroyed."

(Foliot and Michel, 1998, p. 5)
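
The rule described in the quotation, create a scheme, wait, then keep or destroy it, can be sketched like this. The representation of a scheme as a dictionary, the displeasure trace, the waiting limit max_wait, and the threshold normal are assumptions.

```python
def learn_scheme(stimulus, displeasure_trace, goal, max_wait=5, normal=0.2):
    """Return a new scheme if the goal restores a normal state in time.

    displeasure_trace: displeasure values observed after the stimulus,
    one per time step (hypothetical representation).
    """
    # A strong displeasure value has produced a provisional scheme
    # with the new stimulus as sensory input.
    scheme = {"input": stimulus, "goal": None}
    for t, displeasure in enumerate(displeasure_trace):
        if t >= max_wait:
            break
        if displeasure <= normal:
            # Normal state reached quickly: associate the goal and keep it.
            scheme["goal"] = goal
            return scheme
    # Normal state not reached in time: the scheme is destroyed.
    return None


kept = learn_scheme("blue wall", [0.9, 0.6, 0.1], "clockwise rotation")
dropped = learn_scheme("blue wall", [0.9] * 10, "clockwise rotation")
```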

The central component of the model is the mechanism which produces schemata. The experiment showed that, during obstacle avoidance, this took place either on the sensorimotor level, if an obstacle was detected by the infrared sensors, or on the schematic level, if an obstacle was not detected. The schematic level corresponds to a temporary goal change which the authors interpret as the consequence of a danger signal or of an internal assessment process.

Concerning the learning process, the system exhibited two fundamental instabilities in its behavior: Either the robot persisted in its once selected goal or it changed its goals nonstop. Foliot and Michel conclude, nevertheless, that their approach is in principle correct but requires a more detailed definition of the individual components.

11.3. The model of Gadanho and Hallam

Gadanho and Hallam examined which role emotions play in an autonomous robot which adapts to its environment by reinforcement learning (Gadanho and Hallam, 1998). For this purpose they worked with a simulated Khepera robot.

They based their emotion model on the somatic marker hypothesis proposed by Damasio (1994). Damasio assumes that emotions cause special body feelings. These body feelings are the result of experiences with internal preference systems and external events and help to predict the results of certain scenarios. Somatic markers help humans to make fast decisions without requiring much processing capacity or time.

The model developed on this basis by Gadanho and Hallam comprises four fundamental emotions: Happiness, Sadness, Fear, and Anger. The intensity of each emotion is determined by the internal feelings of the robot. These feelings are: Hunger, Pain, Restlessness, Temperature, Eating, Smell, Warmth, and Proximity. Each emotion is defined by a set of constant feeling dependencies and a bias value. For example, the intensity of Sadness is high if Hunger and Restlessness are high and the robot is not eating.

In the model of Gadanho and Hallam, each emotion tries to affect the body state in such a way that the resulting body state resembles the one which causes that specific emotion. To achieve this, the emotion uses a simple hormonal system. A hormone is associated with each feeling. The intensity of a feeling is derived not directly from the value of the body perception which causes the feeling, but from the sum of the perception and the hormone value:

"The hormone values can be (positively or negatively) high enough to totally hide the real sensations from the robot's perception of its body. The hormone quantities produced by each emotion are directly related to its intensity and its dependencies on the associated feelings. The stronger the dependency on a certain feeling, the greater quantity of the associated hormone is produced by an emotion."

(Gadanho and Hallam, 1998, p. 2)

The hormone values can rise fast; however, they fade away slowly, so that the emotional state remains for some time, even if the emotion-releasing situation is already long past.
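
The following sketch shows one way to read this mechanism: a feeling is the sum of sensation and hormone, an emotion's intensity is its bias plus its weighted feelings, and hormones rise quickly but fade slowly. The dependency weights, the clipping to the range 0 to 1, and the two rates rise and fade are assumptions; only the sign pattern for Sadness follows the example given in the text.

```python
FEELINGS = ["hunger", "restlessness", "eating"]   # subset, for brevity

# Assumed weights: Sadness grows with Hunger and Restlessness
# and shrinks while the robot is eating.
SADNESS = {"bias": 0.0,
           "deps": {"hunger": 0.5, "restlessness": 0.5, "eating": -0.5}}


def feelings_from(sensations, hormones):
    # A feeling is the sum of the body sensation and its hormone value.
    return {f: sensations[f] + hormones[f] for f in FEELINGS}


def intensity(emotion, feelings):
    total = emotion["bias"] + sum(
        w * feelings[f] for f, w in emotion["deps"].items())
    return max(0.0, min(1.0, total))


def update_hormones(hormones, emotion, level, rise=0.5, fade=0.05):
    """Move each hormone toward the quantity the emotion produces:
    quickly when the quantity grows, slowly when it fades."""
    new = {}
    for f in FEELINGS:
        # Production scales with emotion intensity and dependency strength.
        target = level * emotion["deps"].get(f, 0.0)
        rate = rise if abs(target) > abs(hormones[f]) else fade
        new[f] = hormones[f] + rate * (target - hormones[f])
    return new


sensations = {"hunger": 0.8, "restlessness": 0.6, "eating": 0.0}
hormones = {f: 0.0 for f in FEELINGS}
level = intensity(SADNESS, feelings_from(sensations, hormones))
hormones = update_hormones(hormones, SADNESS, level)
```

If the emotion then disappears (level 0.0), the hormone values shrink by only a small fraction per step, so the biased feelings persist for a while.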

The robot equipped with this emotion system has the task of visiting sources of food scattered in its environment and taking up energy. The faster it moves, the more energy it uses. The sources of food consist of lights which the robot can detect. In order to draw energy from a source of food, the robot must push it. This releases energy for a short time, together with a smell which the robot can detect. In order to take up the energy, the robot must turn around and turn its back to the source of food. After a short time the source of food is empty and needs a certain period of time to regenerate. The robot must therefore visit other sources of food. If a source of food has no energy, its light goes out.

In the context of this task, the emotional dependencies of feelings look as follows:

The robot is happy if the present situation does not present any problems. It is particularly happy if it has used its motors a lot or is currently taking up energy.

The robot is sad if it has little energy and currently does not take up energy.

If the robot drives against a wall, the felt pain makes it fearful.

If the robot remains at one position for too long, it becomes restless. That makes it angry. The anger remains until it moves or changes its current actions.

The system learns by Reinforcement Learning. In order to shorten the learning process, the fundamental behaviours of the robot were programmed from the start, so that the system could concentrate on the learning of behaviour co-ordination. The three fundamental behaviours of the robot are the avoidance of obstacles, approaching sources of light as well as driving along a wall.

The system has a controller with two separate modules. The Associative Memory Module is a neural net which associates the feelings of the robot with the values it expects from each of its three behaviours. The algorithm used here is Q-Learning. The Behaviour Selection Module makes a stochastic selection, based on the information of the other module, which behaviour is to be executed next.
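
The division of labour between the two modules can be sketched as below. Two simplifications are swapped in and should not be mistaken for the authors' design: a lookup table replaces the neural net of the Associative Memory Module, and Boltzmann (softmax) sampling stands in for the unspecified stochastic selection. Learning rate, discount factor, and temperature are likewise assumed.

```python
import math
import random

BEHAVIOURS = ["avoid obstacles", "seek light", "follow wall"]


def select(q_values, temperature=0.2, rng=random):
    """Stochastic (Boltzmann) selection: the higher the expected value
    of a behaviour, the more likely it is to be chosen."""
    m = max(q_values)   # subtract the maximum for numerical stability
    weights = [math.exp((q - m) / temperature) for q in q_values]
    return rng.choices(range(len(q_values)), weights=weights)[0]


def q_update(q, state, action, reward, next_state, lr=0.1, gamma=0.9):
    """One-step Q-learning update on a dict-based table."""
    best_next = max(q.get((next_state, a), 0.0)
                    for a in range(len(BEHAVIOURS)))
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + lr * (reward + gamma * best_next - old)
    return q[(state, action)]


q = {}
q_update(q, state="hungry", action=1, reward=1.0, next_state="eating")
choice = select([q.get(("hungry", a), 0.0) for a in range(len(BEHAVIOURS))])
```

In the model itself the state would be the vector of the robot's feelings rather than a string label; the strings here only keep the table readable.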

Fig. 18: Adaptive controller (after Gadanho and Hallam, 1998, p. 3)

According to the authors, reward and punishment pose a special problem for an autonomous robot. The environment or the internal state of the robot changes from moment to moment. Analyzing all information and changing behaviour at every transition would not only cost enormous processing capacity; it would also deprive the robot of feedback on whether a selected behaviour leads to success only after a series of transitions. On the other hand, the robot must be able to change a dysfunctional behaviour quickly. This is where emotions come into play: their task is to determine these state transitions.

In order to test this hypothesis, the authors developed a controller with emotion-dependent event detection. An event is detected if one of three conditions occurs:

there is a change of the dominant emotion;

the value of the currently dominant emotion differs statistically significantly from the values which were recorded since the last state transition;

a limit of 10,000 steps is reached.
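
The three trigger conditions translate directly into a small predicate. The text does not say which statistical test is used for the second condition, so a simple z-score check with an assumed limit of two standard deviations stands in for it here; the function and parameter names are hypothetical.

```python
from statistics import mean, stdev


def event_detected(dominant, prev_dominant, history, value, steps,
                   z_limit=2.0, max_steps=10_000):
    """Return True if any of the three trigger conditions holds.

    history: values of the dominant emotion recorded since the last
             state transition
    value:   current value of the dominant emotion
    steps:   steps elapsed since the last state transition
    """
    # 1. The dominant emotion has changed.
    if dominant != prev_dominant:
        return True
    # 2. The current value deviates strongly from the recorded values
    #    (z-score as a stand-in for the unspecified significance test).
    if len(history) >= 2:
        spread = stdev(history)
        if spread > 0 and abs(value - mean(history)) / spread > z_limit:
            return True
    # 3. The step limit has been reached.
    return steps >= max_steps


history = [0.50, 0.52, 0.48, 0.50]
jump = event_detected("Fear", "Fear", history, value=0.9, steps=10)
```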

Fig. 19: Emotions and control (after Gadanho and Hallam, 1998, p. 4)

To test the effectiveness of this event-directed controller, the authors developed three further controllers:

Regular intervals - the adaptive controller is triggered every 35 steps.

Hand crafted - all behaviours are programmed firmly in the controller, the system cannot learn anything.

Random selection - the controller selects a new behaviour with each step.

Each of these four controllers went through an identical experimental setup. It consisted of thirty different trials with three million learning steps. In each trial, a fully charged robot was placed at a randomly selected initial position. For evaluation purposes, units of 50,000 steps each were evaluated and data collected on the following variables:

the average of the reinforcement received over all steps;

the average of the reinforcement during those steps in which the adaptive controller was triggered;

the average of the energy level of the robot;

the number of collisions;

the frequency with which the adaptive controller was triggered.

The results are as follows:

Controller | Reinforcement | Event reinforcement | Energy | Collisions (%) | Events (%)
Hand-crafted | 0.34 | -0.03 | 0.83 | 3.0 | 6.15
Event-driven | 0.24 | 0.04 | 0.63 | 0.6 | 0.52
Regular intervals | 0.24 | 0.20 | 0.62 | 1.7 | 2.86
Random selection | -0.38 | | 0.02 | 5.6 | 100

Tab. 9: Results of the experiments of Gadanho and Hallam (after Gadanho and Hallam, 1998, p. 5)

The results show, according to the authors, that the learning controllers have fulfilled their task. Their energy level is, on average, significantly lower, but reaches no critical value. Between the two learning controllers the main difference lies in the number of collisions - here the event-directed controller is better.

In all, the event-directed controller is not significantly better than its competitor, but it obtains its learning success with a significantly lower number of events and thereby saves substantially more time.

The authors come to the conclusion that the experiments have confirmed their hypothesis about the role of emotions in reinforcement learning.

11.4. The model of Staller and Petta

Staller and Petta developed the TABASCO architecture, an acronym for "Tractable Appraisal-Based Architecture for Situated Cognizers" (Staller and Petta, 1998). TABASCO is based to a large extent on the emotion theory of Scherer and has so far not been implemented in a simulation.

Staller and Petta understand emotions as processes which are related to the interaction of an agent with its environment. "In particular, TABASCO models the appraisal process, the generation of action tendencies, and coping." (Staller and Petta, 1998, p. 3)

The fundamental idea of TABASCO is that the levels of the emotion system postulated by Scherer (sensorimotor, schematic, and conceptual) are valid not only for appraisal but also for action generation. The two main components of the architecture, Perception and Appraisal and Action, are therefore each constructed as a hierarchy with three levels.

Fig. 20: TABASCO architecture (after Staller and Petta, 1998, p. 4)

The component Perception and Appraisal: The sensory layer consists of feature detectors for the detection of, for example, sudden, intensive stimuli or the quality of a stimulus (e.g. pleasantness). The schematic layer compares the input with patterns, particularly with social and self patterns. The conceptual layer can, based on propositional knowledge and beliefs, think abstractly and draw inferences.

The component Action: The motor layer contains motor commands. The schematic layer contains action tendencies and what Frijda calls "flexible programs" (Frijda, 1986, p. 83). The conceptual layer is responsible for coping.

The Appraisal Register, which goes back to a suggestion by Smith et al. (1996), mediates between these two components. It detects and combines the appraisal results of the three layers of the Perception and Appraisal component and, on the basis of the appraised state, affects the Action component.

The Action Monitoring component finally observes the planning and execution processes of the Action component and conveys the results to the Perception and Appraisal component, where they are integrated into the appraisal process.

Staller and Petta call their system a situated cognizer. With this term they want to underline the importance of both components for an autonomous system. They define cognizing (a term suggested first by Chomsky) as "having access to knowledge that is not necessarily accessible to consciousness" (Staller and Petta, 1998, p. 5).

11.5. The model of Botelho and Coelho

Botelho and Coelho define emotion in the context of their Salt & Pepper project as "a process that involves appraisal stages, generation of signals used to regulate the agent's behavior, and emotional responses" (Botelho and Coelho, 1997, p.4). With Salt & Pepper they want to define an architecture containing mechanisms which play the same role for autonomous agents as the mechanisms that make humans so successful.

The starting point of their considerations is the classification of emotions in a multidimensional matrix "that may be used with any set of emotion classification dimensions" (Botelho and Coelho, 1997, p. 4).

Dimension of classification | Examples | Process component
Role/function of emotion | Attention shift warning, performance evaluation, malfunctioning-component warning, motivation intensifier | Emotion-signal
Process by which emotion fulfills its role | Reflexive action, creation of motivators, setting plan selection criteria | Emotion-response
Urgency of the repairing process | Urgent (e.g. need to immediately attend the external environment), not urgent (e.g. need for long-term improvement of default criteria for plan selection) | Emotion-response
Source of appraisal | External environment, internal state, past events, current events | Appraisal stage
Type of appraisal | Affective appraisal, cognitive appraisal | Appraisal stage

Table 10: Dimensions of emotion classification (after Botelho and Coelho, 1997, p. 5)

The authors differentiate between affective and cognitive appraisal and postulate that it is, in principle, possible to differentiate clearly between these two components in any given architecture. They call the respective modules Affective Engine and Cognitive Engine.

The Affective Engine and the Cognitive Engine differ in three respects:

Kind of processed information: The Affective Engine processes information concerning the hypothetical or real satisfiability of the agent's motives, while the Cognitive Engine additionally processes problem-solving information, decision information, and declarative information about different aspects of the world.
Purpose of information processing: The principal purpose of information processing in the Affective Engine is the production of signals which help the Cognitive Engine to fulfill its tasks, for example the selection of the cognitive structures relevant to a situation, the control of attention, etc. The principal purposes of the Cognitive Engine are goal attainment, problem solving, and decision making. "A simple way to put it is to say the Cognitive Engine reasons at the object level, whereas the Affective Engine reasons at the meta-level." (Botelho and Coelho, 1997, p. 10).
Typical response time: The Affective Engine reacts much faster than the Cognitive Engine, because it needs only a fraction of the information and because its architecture likewise contributes to faster decisions.

The authors suggest a mechanism which makes it possible for the Affective Engine to react quickly: the reduction of explicit and long comparison chains to short, specific rules. They describe an example of such a process:

if someone risks dying, he or she will feel a lot of fear;

risks_dying(A) -> activate(fear, negative, 15)

if someone risks running out of food, he or she risks dying;

risks_running_out_of_food(A) -> risks_dying(A)

if someone risks running out of money, he or she risks running out of food;

risks_running_out_of_money(A) -> risks_running_out_of_food(A)

if someone loses some amount of money, he or she risks running out of money;

loses_money(A) -> risks_running_out_of_money(A)

loses_money(A) -> activate(fear, negative, 15)

(Botelho and Coelho, 1997, p. 11)

These explicit and implicit rules should be organized in a hierarchy in which the longer rules are used only if no suitable short rule is found.
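
The reduction of a long comparison chain to a short rule, and the preference for short rules at lookup time, can be sketched as follows. The rule names and the signal (fear, negative, 15) are taken from the example; the dictionary encoding and the helper functions derive, compile_shortcuts, and appraise are hypothetical.

```python
# Explicit rules: premise -> conclusion. A tuple conclusion is an
# emotion signal of the form (emotion, valence, intensity).
RULES = {
    "loses_money": "risks_running_out_of_money",
    "risks_running_out_of_money": "risks_running_out_of_food",
    "risks_running_out_of_food": "risks_dying",
    "risks_dying": ("fear", "negative", 15),
}


def derive(fact, rules=RULES):
    """Follow the explicit chain; return (signal, number of rules used)."""
    steps = 0
    while isinstance(rules.get(fact), str):
        fact, steps = rules[fact], steps + 1
    return rules.get(fact), steps + 1


def compile_shortcuts(rules=RULES):
    """Reduce every chain to a one-step rule, like the final
    loses_money rule in the example."""
    return {fact: derive(fact, rules)[0] for fact in rules}


def appraise(fact, shortcuts, rules=RULES):
    # Short rules are tried first; the long chain is only a fallback.
    if fact in shortcuts:
        return shortcuts[fact]
    return derive(fact, rules)[0]


shortcuts = compile_shortcuts()
signal = appraise("loses_money", shortcuts)
```

The explicit chain needs four rule applications to reach the fear signal from loses_money, while the compiled shortcut needs a single lookup, which illustrates why the Affective Engine can react faster.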

The Salt & Pepper architecture consists of three main components: the Affective Engine, the Cognitive Engine and an Interrupt Manager. The Affective Engine possesses Affective Sensors, an Affective Generator, and an Affective Monitor. The latter two initiate the process of emotion production together. All other modules of the system (except the Interrupt Manager) are assigned to the Cognitive Engine.

Fig. 21: Salt & Pepper architecture (after Botelho and Coelho, 1997, p. 12)

The long-term memory is an associative network. Each node of the network possesses an identification, an activation level, a set of associations with other nodes, and a number of symbolic structures which represent motives, plans, actions, and declarative knowledge. The more strongly a node is activated, the more easily it is noticed by a search process (accessibility).

The Input Buffer and the Affective Generator activate nodes in long-term memory. The Cognitive Monitor and the Affective Monitor suggest certain nodes for the attention of the agent. When such a suggestion is made, the Interrupt Manager decides whether the current cognitive process is to be interrupted and the content of the suggested node loaded into working memory for processing. When the contents of a node are processed in working memory, the node receives a certain level of activation and thus becomes more accessible.
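
A compact sketch of this memory model is shown below. The node fields follow the description above; the interrupt policy (a fixed activation threshold) and the size of the activation boost are assumptions, since the text does not state how the Interrupt Manager decides.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    ident: str
    activation: float = 0.0
    links: set = field(default_factory=set)   # associations with other nodes
    content: object = None                    # motives, plans, actions, facts


class Memory:
    def __init__(self, nodes):
        self.nodes = {n.ident: n for n in nodes}
        self.working = None                   # node currently being processed

    def most_accessible(self):
        # Higher activation means a search process finds the node first.
        return max(self.nodes.values(), key=lambda n: n.activation)

    def suggest(self, ident, interrupt_threshold=0.5, boost=0.2):
        """Interrupt Manager (assumed policy): interrupt the current
        process only if the suggested node is activated strongly enough."""
        node = self.nodes[ident]
        if node.activation < interrupt_threshold:
            return False
        self.working = node
        node.activation += boost              # processing raises accessibility
        return True


memory = Memory([Node("danger", 0.7), Node("lunch", 0.3)])
interrupted = memory.suggest("danger")
```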

Nodes which are based on certain experiences of the agent are called episodic nodes and form the episodic memory.

Emotions are described in this system by a set of parameters:

a label E which describes the emotion class and a list of arguments, for example the source of the appraisal;
a valence V, which can take on the values positive, negative, or neutral;
an intensity I;
an emotion program P, which represents a sequence of actions that is executed as soon as the appraisal stage has produced a label;
an emotional reaction R which is only executed if a node in long-term memory which matches the label of the emotion is selected and processed in working memory.

The emotion program differs from the emotional reaction in that it is executed by the Affective Generator without interrupting the current cognitive processing of the agent.

The Affective Generator undertakes a partial evaluation of the external and internal state of the agent, the so-called affective estimate. If the eliciting conditions of a certain emotion are fulfilled, the Affective Generator produces the label, the intensity and the valence of the emotion and executes the emotion program. The Affective Monitor then scans the long-term memory until it finds a node which corresponds to the label of the emotion and possesses the same valence. It activates this node with an activation level that is a function of the generated emotion intensity.
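
A minimal sketch of this sequence is given below. It is not the authors' implementation: the eliciting condition is taken from the loses_money example, the activation function is assumed to be linear (gain times intensity), and the names EmotionSignal, MemoryNode, affective_generator and affective_monitor are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EmotionSignal:
    label: str        # E: emotion class
    valence: str      # V: "positive", "negative" or "neutral"
    intensity: float  # I


@dataclass
class MemoryNode:
    label: str
    valence: str
    activation: float = 0.0


def affective_generator(state: dict, log: list) -> Optional[EmotionSignal]:
    """Partial evaluation of the agent's state (the affective estimate)."""
    if state.get("loses_money"):  # assumed eliciting condition
        signal = EmotionSignal("fear", "negative", 15)
        # The emotion program P runs here, without interrupting cognition.
        log.append("emotion program for " + signal.label)
        return signal
    return None


def affective_monitor(signal: EmotionSignal, memory: list,
                      gain: float = 0.1) -> Optional[MemoryNode]:
    """Activate the first node matching the label and valence of the signal."""
    for node in memory:
        if node.label == signal.label and node.valence == signal.valence:
            node.activation += gain * signal.intensity  # assumed linear function
            return node
    return None


memory = [MemoryNode("joy", "positive"), MemoryNode("fear", "negative")]
log = []
signal = affective_generator({"loses_money": True}, log)
activated = affective_monitor(signal, memory) if signal else None
```

The emotional reaction R is deliberately left out of the sketch, since it is only executed once the activated node has been selected and processed in working memory.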

The system contains mechanisms which make emotional learning possible. The authors differentiate between three classes of emotional learning:

Learning new appraisal rules and optimizing existing ones: By this they mean the learning of new circumstances which can trigger an emotion signal, the shortening of appraisal rules (see above) and the change of the characteristics of the generated emotion signal.
Extension of the repertoire of emotion signals.
Learning from emotional reactions: This includes the optimization of existing behavioural reactions, the learning of new reactions, and learning as a result of a reaction to an emotion signal.

The authors specify the conditions under which a system is able to accomplish these learning procedures (Botelho and Coelho, 1998).

Some elements of Salt & Pepper have been implemented so far and, according to the authors, have confirmed the theoretical assumptions (Botelho and Coelho, 1997).

11.6. The model of Canamero

Canamero (1997) also pursues an approach based on Minsky's "Society of Mind" (1985). The Abbotts, artificial organisms which have a motivational and emotional system, live in a two-dimensional world called Gridland.

An Abbott consists of a number of agents which, viewed individually, are quite "simple", but reach a new quality when they interact with one another. An Abbott possesses three kinds of sensors (somatic, tactile, visual); two kinds of recognizers which react to complex stimuli and can both learn and forget; eight so-called direction nemes which supply information about the spatial environment of the Abbott; two categories of maps (tactile and visual) which receive their information from the recognizers and direction nemes and represent it internally; three effectors (hand, foot, mouth); a behaviour repertoire (Attack, Drink, Eat, Play, Rest, Withdraw etc.) as well as a set of managers (e.g. finder, look-for, go-toward) which correspond to appetitive behaviour. Furthermore, the Abbotts possess a set of physiological variables, e.g. adrenalin, blood sugar, endorphins and body temperature.

The Abbotts move in a world which contains sources of food, obstacles and enemies. They come into this world as "newborns", equipped with a basic set of characteristics, and must then develop in their environment.

What is interesting in Canamero's model is that her creatures are equipped from the outset with motivations and emotions. They are called, after Minsky, proto-specialists, because they are primitive mechanisms responsible for action selection and control functions.

The theoretical basis for the motivations is a homoeostatic approach:

"In general, motivations can be seen as homeostatic processes which maintain a controlled physiological variable within a certain range."

(Canamero, 1997, p. 6)

The motivational agents of the Abbotts consist of

"a controlled variable, the set point and the normal variability range of which are defined by the corresponding sensor that tracks its real value; an incentive stimulus that can increase the motivation's activation level, but cannot trigger it; an error signal or drive; and a satiation criterion."

(Canamero, 1997, p. 6f.)

Thus the error signal "blood sugar too low" calls, for example, the motivation hunger, whose goal is to increase the blood sugar level. The activation of a motivation is proportional to the size of the error signal (that is, the deviation of a physiological value from the homoeostatic state); the intensity of the motivation is computed from the activation level. The motivation with the highest activation level tries to organize the behaviour of the Abbott in such a way that the associated drive is satisfied. If the motivation cannot find and call an appropriate behaviour, it activates the finder agent and passes the intensity value to it, so that the finder can pass it on to the other agents which it activates. The intensity affects a behaviour substantially: in escape behaviour, for example, it determines the strength of the motor activity; in other behaviours, their duration.
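
The homoeostatic mechanism can be sketched in a few lines. The sketch assumes that the error signal is the deviation beyond a symmetric normal range around the set point, and that the incentive stimulus acts multiplicatively, so that it can amplify an existing drive but cannot trigger the motivation by itself. Both choices, the numeric values and all names are illustrative assumptions, not details given by Canamero.

```python
from dataclasses import dataclass


@dataclass
class Motivation:
    name: str
    variable: str      # controlled physiological variable
    set_point: float
    tolerance: float   # half-width of the normal variability range (assumed symmetric)

    def error(self, physiology: dict) -> float:
        """Error signal (drive): deviation beyond the normal range."""
        deviation = abs(physiology[self.variable] - self.set_point)
        return max(0.0, deviation - self.tolerance)

    def activation(self, physiology: dict, incentive: float = 0.0) -> float:
        # Proportional to the error; with zero error the incentive has no effect.
        return self.error(physiology) * (1.0 + incentive)


def select_motivation(motivations, physiology, incentives=None):
    """Return the motivation with the highest activation level."""
    incentives = incentives or {}
    return max(
        motivations,
        key=lambda m: m.activation(physiology, incentives.get(m.name, 0.0)),
    )


hunger = Motivation("hunger", "blood_sugar", set_point=20.0, tolerance=5.0)
warmth = Motivation("warmth", "temperature", set_point=37.0, tolerance=1.0)
physiology = {"blood_sugar": 10.0, "temperature": 37.5}

# Even a strong incentive cannot trigger "warmth", whose variable is in range.
winner = select_motivation([hunger, warmth], physiology, {"warmth": 10.0})
```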

Activation level and intensity of a motivation can now be modified by emotions. In Canamero's system emotions are composed of

"an incentive stimulus; an intensity proportional to its level of activation; a list of hormones it releases when activated; a list of physiological symptoms; and a list of physiological variables it can affect."

(Canamero, 1997, p. 7)

Emotional states are activated and differentiated from each other by three kinds of elicitors:

External events, i.e. an object or the result of a behaviour, whereby the reaction to it can be either inherited or learned.
General stimulation patterns which cause different changes in the physiological variables and thus allow the same emotion to operate under different circumstances. As an example Canamero cites the anger agent, which is called when the level of a variable is continually too high. In this way, emotions contribute to the control of homoeostatic processes.
Special value patterns of physiological variables which permit a distinction between emotions which are caused by the same general mechanism. As an example Canamero cites fear (with high heartbeat frequency) and interest (with low heartbeat frequency).

Since an Abbott is a primitive system, it is always in one unambiguous emotional state. The three elicitors are arranged hierarchically in the order mentioned above. The selected emotion affects the action selection mechanism in two ways: it can lower or increase the intensity of the current motivation and thus also the intensity of the selected behaviour; in addition, it modifies the readings of the sensors which measure the variables that affect the emotion and thereby changes the perceived physical state (happiness agent -> release of endorphin -> less pain perception).
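
Both effects of an emotion can be illustrated with a small sketch. It assumes a multiplicative effect on motivation intensity and a linear damping of the pain reading by the endorphin level; neither functional form is specified in the text, and the names and numbers are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Emotion:
    name: str
    intensity_factor: float                       # >1 raises, <1 lowers motivation intensity
    hormones: dict = field(default_factory=dict)  # hormone name -> amount released


def modulate_intensity(motivation_intensity: float, emotion: Emotion) -> float:
    # First effect: the emotion scales the intensity of the current motivation.
    return motivation_intensity * emotion.intensity_factor


def perceived_pain(raw_pain: float, endorphin: float, damping: float = 0.5) -> float:
    # Second effect: the sensor reading is altered by the released hormone
    # (assumed linear damping, never below zero).
    return max(0.0, raw_pain - damping * endorphin)


happiness = Emotion("happiness", intensity_factor=1.2, hormones={"endorphin": 4.0})

intensity = modulate_intensity(10.0, happiness)
pain = perceived_pain(5.0, happiness.hormones["endorphin"])
```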

The action selection of an Abbott thus takes place in four stages:

The activation level of all agents is set to zero.
Internal variables and environmental data are read in and maps are formed.
Motivations are appraised and the effects of the emotional state are computed. The motivation with the highest activation is selected.
The active motivation selects the behaviour(s) which can best satisfy its drive.
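
The four stages can be sketched as a single function. This is a strongly simplified illustration: map formation is omitted, the emotional state is reduced to one assumed scaling factor, and every motivation is given exactly one behaviour. The function name and the data layout are hypothetical.

```python
def action_selection_step(physiology, set_points, behaviours, emotion_factor=1.0):
    """One cycle of action selection; returns (motivation, behaviour)."""
    # Stage 1: the activation level of all agents is set to zero.
    activation = {name: 0.0 for name in set_points}

    # Stage 2: internal variables are read in (environment and maps omitted).
    readings = dict(physiology)

    # Stage 3: motivations are appraised, the effect of the emotional state
    # is applied, and the motivation with the highest activation is selected.
    for name, (variable, set_point) in set_points.items():
        error = abs(readings[variable] - set_point)
        activation[name] = error * emotion_factor
    winner = max(activation, key=activation.get)

    # Stage 4: the active motivation selects the behaviour satisfying its drive.
    return winner, behaviours[winner]


set_points = {"hunger": ("blood_sugar", 20.0), "thirst": ("hydration", 10.0)}
behaviours = {"hunger": "Eat", "thirst": "Drink"}
physiology = {"blood_sugar": 12.0, "hydration": 9.0}

result = action_selection_step(physiology, set_points, behaviours)
```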

Canamero concedes that her Abbotts still operate on a very primitive level at present and need a number of additional agents in order to develop long-term learning and strategies. Emotions play a substantial role in her model:

"In particular, as far as learning is concerned, our model of emotions provides a means to have different reward and punishment mechanisms...Again, motivations and emotions constitute a key factor in determining what has to be remembered and why."

(Canamero, 1997, p. 8)

11.7. Summary and evaluation

As the preceding examples show, there is a variety of approaches to modelling emotions in the field of autonomous agents. Their connections to psychological theory vary considerably.

It is noticeable that most authors are quite eclectic with their theories. They fall back predominantly upon theories which are suitable for an operationalization. Frequently, only certain elements are picked out which are then extended by the authors' own components, often without making this explicitly clear.

In order to obtain fast results, only parts of the sketched models are implemented in real simulations or robots. Pragmatic solutions are used which necessarily reduce complex processes to a few variables. Moreover, these variables are frequently defined arbitrarily in order to be able to realize the model at all.

It is remarkable that the majority of the authors regard emotions as a substantial component of the control system of an agent and define them functionally in this regard. Emotions are no longer regarded as appendages of the cognitive system, but rather as an indispensable condition for the reliable functioning of cognition.