poodleOwned
As I said before, while the tension-and-release model might be applicable from a certain perspective, it is still a reward, if only in the sense that the release of tension is the "reward." I guess in many ways this is a sort of R-, if you must. Nothing dismays me more than hearing trainers have a good old slag at behaviourism. I am not much of a behaviourist at all, but it is like damning Newton (the bloke who did a fair bit of it before Einstein).
With the tools available at the time, the concept of systematising reward and punishment seemed fair enough. At a high level of abstraction, it makes sense and is backed up with reasonable data.
Hi PoodleOwned,
Thanks for your comments.
You're right that the neo-Freudian model I use -- i.e., of releasing internal tension -- would (or could) be categorized as -R -- the removal of an unpleasant stimulus. However, new research done by neuroscientists -- where they actually measure what's going on in the brains of their test subjects on a neuronal level -- suggests (and quite strongly) that animals don't learn through "the law of consequences," or through cause-and-effect (which are the foundations of behavioral science), but through paying attention to shifting patterns in their environments.
I explain a bit more about this in my most recent article for PsychologyToday.com: "Toward a Unified Dog Theory."
Here's a snippet or two from the article:
For instance, in his paper "Dopamine and Reward: Comment on Hernandez et al. (2006)," neuroscientist Randy Gallistel of Rutgers writes, "In the monkey, dopamine neurons do not fire in response to an expected reward, only in response to an unexpected or uncertain one, and, most distressingly of all, to the omission of an expected one." [emphasis mine]
So missing out on a reward is pleasurable? How could that be?
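One way I find it helpful to make the "expected vs. unexpected" distinction concrete is the prediction-error signal used in reinforcement-learning models. This is my own toy illustration, not Gallistel's math: here "delta" plays the role of the dopamine signal, and its *magnitude* tracks surprise, so a fully expected reward produces almost no signal, while an unexpected reward or the omission of an expected one both produce a large one.

```python
# Hedged sketch: a Rescorla-Wagner-style prediction error.
# The names and numbers are illustrative, not from the cited papers.

def prediction_error(expected, received):
    """Surprise signal: difference between received and expected reward."""
    return received - expected

# The monkey has learned to expect a reward of 1.0 on each trial:
fully_expected = prediction_error(expected=1.0, received=1.0)  # 0.0: no surprise
unexpected = prediction_error(expected=0.0, received=1.0)      # +1.0: big surprise
omitted = prediction_error(expected=1.0, received=0.0)         # -1.0: also a big surprise

print(fully_expected, unexpected, omitted)  # → 0.0 1.0 -1.0
```

On this reading, the omission of an expected reward isn't "pleasurable" so much as it is maximally informative: the signal is just as large (in magnitude) as for a surprise reward.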
In another article, "Deconstructing the Law of Effect," Gallistel poses the problem of learning from an information-theory perspective, contrasting Edward Thorndike's model, which operates as a feedback system, with a feedforward model based on Claude Shannon's information theory. It's well known that shaping animal behavior via operant or classical conditioning requires a certain amount of time and repetition. But in the feedforward model, learning can take place instantly, in real time.
Why the difference? And is it important?
I think so. Which is more adaptive: being able to learn a new behavior on the fly, in the heat of the moment, or waiting for more and more repetitions of the exact same experience to set a new behavior in place?
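To see why the two models behave so differently, here is a toy contrast of my own devising (not Thorndike's or Shannon's actual equations): a feedback learner nudges its estimate a little on every trial, so it needs many repetitions to converge, while a feedforward, change-detecting learner jumps to the new value as soon as it registers that the pattern has clearly changed.

```python
# Hedged sketch: incremental (feedback) vs. one-shot (feedforward) learning.
# The update rules, rate, and threshold are illustrative assumptions.

def feedback_update(estimate, observation, rate=0.1):
    """Trial-and-error: move a fraction of the way toward what happened."""
    return estimate + rate * (observation - estimate)

def feedforward_update(estimate, observation, threshold=0.5):
    """Change detection: if the world has clearly changed, relearn at once."""
    return observation if abs(observation - estimate) > threshold else estimate

estimate_fb, estimate_ff = 0.0, 0.0
for trial in range(20):  # the world now always pays out 1.0
    estimate_fb = feedback_update(estimate_fb, 1.0)
    estimate_ff = feedforward_update(estimate_ff, 1.0)

print(round(estimate_fb, 2))  # → 0.88: still short of 1.0 after 20 trials
print(estimate_ff)            # → 1.0: learned on the very first trial
```

The feedback learner is still approaching the right answer after twenty repetitions; the change detector got there in one.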
In Thorndike's model, the main focus is on targeting which events in a stream seem to create changes in behavior. But according to information theory, the intervals between events, when nothing is happening, also carry information, sometimes even more than is carried during the US (unconditioned stimulus). This would explain why the monkeys' brains were producing dopamine when the animals detected a big change in the pattern of reward, i.e., no reward at all!
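In Shannon's terms, an event's information content is -log2(p), so the rarer the event, the more bits it carries. A quick illustration with numbers I've chosen for the example (not from the studies): if reward has arrived on 95% of trials, a sudden omission carries far more information than yet another expected delivery.

```python
# Hedged sketch: Shannon information content ("surprisal") of an event.
# The 0.95 reward probability is an assumption chosen for illustration.
import math

def surprisal_bits(p):
    """Shannon information content, in bits, of an event with probability p."""
    return -math.log2(p)

p_reward = 0.95  # reward has almost always arrived on past trials
print(round(surprisal_bits(p_reward), 2))      # → 0.07 bits: expected reward
print(round(surprisal_bits(1 - p_reward), 2))  # → 4.32 bits: the omission
```

The "nothing happened" trial is, informationally, the loudest event in the stream.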
We're now discovering that the real purpose of dopamine is to help motivate us to gather new information about the outside world quickly and efficiently. In fact, dopamine is released during negative experiences as well as positive ones. (The puppy who gets his nose scratched by the cat doesn't need further lessons to reinforce the "no-chasing-the-cat" rule; he learns that instantaneously, with a single swipe of the cat's paw.)
This adds further importance to the idea that learning is not so much about pairing behaviors with their consequences as it is about paying close attention to salient changes in our environment: the bigger the changes, the more dopamine is released, and, therefore, the deeper the learning.
Randy Gallistel again: "...behavior is not the result of a learning process that selects behaviors on the basis of their consequences ... both the appearance of 'conditioned' responses and their relative strengths may depend simply on perceived patterns of reward without regard to the behavior that produced those rewards." ("The Rat Approximates an Ideal Detector of Changes in Rates of Reward: Implications for the Law of Effect," Journal of Experimental Psychology, 2001, 27, 354-372.)
Cheers!
LCK