NZSM Online

Feature

Talking in Real Time

There's good reason for the fast drone of the racing commentator -- it's a linguistic means of making efficient use of memory resources.

Koenraad Kuiper

One of the many very normal, but nevertheless complex and fascinating, things which human beings are able to do is to talk in real time. Given that talking is a complex business, a delay in one part of the speech-encoding process could bring the whole process grinding to a halt for a period while the tardy processing area caught up.

This doesn't happen in normal speech, which raises one of the interesting questions about human speech: how is it so well integrated that we are able to talk in real time?

In attempting to answer this, linguists this century have almost unanimously made a distinction between what speakers know when they know a language and what they do when they are using a language that they know. The most widely used terms for these two aspects of native speaker linguistic abilities are "competence" for linguistic knowledge and "performance" for the use of language.

Embedding

Two examples will readily show that these two aspects are significantly different and that each must be separately dealt with before a full picture can emerge. In the early 1960s, Chomsky and Miller showed that native speakers have linguistic competence such that they know that structures such as the following are grammatical (the sequences of words and phrases are linguistically permissible in English):

The rabbit saw the trees.

The rabbit [that the dog scented] saw the trees.

What is going on in such structures is that the first sentence contains a single clause (roughly speaking, a single proposition grammatically expressed), while the second contains two clauses, where the inner clause marked by square brackets is modifying the noun "rabbit", adding to what is said about the rabbit. The grammatical process which allows for this kind of embedded structure is recursive -- it could potentially go on ad infinitum, at least as far as the speaker's competence goes. Such recursive embedding can be illustrated in a sentence such as the following:

I met the man to whom my aunt gave the bottle which I had found at the tip which I go to in the car which I bought from my neighbour who moved here last year by truck which he did ...

The embedding in this sentence is right branching in that the embedded clause always attaches to the last noun and so forms a rightwardly extending chain, where each embedded clause attaches to an embedded clause which is itself embedded in a clause attached further up, and so on. Human beings have no problem understanding such structures.

However, if we look at the earlier sentence about the rabbit that the dog scented and keep embedding there:

The rabbit [that the dog [that the farmer owned] scented] saw the trees.

... such structures are near unintelligible.

The reason for this unintelligibility has to do with the nature of human memory. Centre-embedded structures, unlike right-branching structures, require human memory to hold the beginning of each embedded clause pending the decoding of the inner clauses. Linking the rabbit with the verb of which it is the subject therefore requires decoding both the dog clause and the farmer clause first. Human memory as it is structured cannot do this job in real time -- as the sentence is spoken -- although it can do it with pen and paper.
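
The difference in memory load can be made concrete with a small counting sketch in Python. It is a deliberate simplification, not a model of human parsing: each sentence is reduced to a sequence of hypothetical tags, "N" for a subject that still needs its verb and "V" for a verb that completes the most recently opened subject. The function name and the tagging scheme are assumptions introduced only for this illustration, and the right-branching sentence in the comment is a paraphrase made up for comparison.

```python
def peak_memory_load(tags):
    """Return the largest number of subjects held in memory at once.

    tags is a sequence of "N" (a subject noun that still needs its verb)
    and "V" (a verb that completes the most recently opened subject).
    """
    pending = 0  # subjects currently waiting for a verb
    peak = 0
    for tag in tags:
        if tag == "N":
            pending += 1
            peak = max(peak, pending)
        elif tag == "V":
            if pending == 0:
                raise ValueError("verb with no pending subject")
            pending -= 1
        else:
            raise ValueError(f"unknown tag: {tag!r}")
    return peak


# "The rabbit that the dog that the farmer owned scented saw the trees."
centre_embedded = ["N", "N", "N", "V", "V", "V"]

# "The farmer owned the dog that scented the rabbit that saw the trees."
right_branching = ["N", "V", "N", "V", "N", "V"]

print(peak_memory_load(centre_embedded))  # 3
print(peak_memory_load(right_branching))  # 1
```

On this crude count the centre-embedded sentence needs three subjects held open at once, while the right-branching paraphrase never needs more than one, however long the chain grows.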

Here is a clear contrast between competence, which is purely linguistic, and performance, which is dependent on other factors such as memory and processing capacities.

Sticking Close

Now a second example of performance factors in action. Look at the following sentence:

Jill told Maria that she would give her the tickets at six o'clock.

This sentence has two clauses. In one, Jill is telling Maria something and, in the other, Jill would give Maria some tickets. The phrase "at six o'clock" is grammatically ambiguous as to which of the two clauses it attaches to. It could be that the telling was at six o'clock or the giving was scheduled for six o'clock. As far as an English speaker's linguistic competence goes, there is nothing to tell either way.

However, in real-time processing, speakers will almost invariably opt for supposing that the six o'clock phrase is attached to the giving clause and not the telling one. Frazier terms this the "minimal attachment strategy". By attaching the phrase to the current clause the speaker is processing, the speaker does not have to recall the earlier clause and see if the phrase plausibly attaches to it (which in this case it does). The minimal attachment strategy is again a strategy relating to linguistic performance and independent of competence.
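
The contrast between what the grammar permits and what real-time processing prefers can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the clause list is a hand-made split of the example sentence, both function names are hypothetical, and treating every clause as a possible attachment site holds for this sentence rather than for English in general.

```python
def possible_attachments(clauses):
    """Competence: for this example, any clause may host the trailing phrase."""
    return list(range(len(clauses)))


def preferred_attachment(clauses):
    """Performance: attach to the clause currently being processed (the last one)."""
    if not clauses:
        raise ValueError("no clause to attach to")
    return len(clauses) - 1


clauses = ["Jill told Maria", "she would give her the tickets"]
phrase = "at six o'clock"

for index in possible_attachments(clauses):
    print("grammatical:", clauses[index], "+", phrase)

chosen = preferred_attachment(clauses)
print("preferred:  ", clauses[chosen], "+", phrase)
```

The preferred reading needs no look back at earlier clauses, which is the saving the strategy buys.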

Both these examples deal with receptive performance rather than productive performance. Clearly, speaking will also be influenced by performance factors. These factors include human memory and processing capacities. One way to investigate these factors is to perform psycholinguistic experiments on speakers in controlled conditions. One might also look at naturally occurring speech errors on the assumption that such errors are the result of performance processing going awry.

Speaking Under Pressure

Another way to examine performance is to study naturally occurring speech in contexts where speakers are under a measure of psychological pressure from the performance of tasks other than speaking, and then to see what effects the performance of those tasks has on the speakers' speech processing capacities. It would be advantageous, in examining the impact of memory and processing factors on speech production, if the speech task could be held relatively constant while the memory and processing pressures varied, with speech still being observed in a natural setting.

Two kinds of speech lend themselves well to this kind of study: auctioning and sports commentary. In both, speakers must speak fluently while at the same time performing other tasks which compete with speaking for memory and processing resources. The research method used for these studies is to spend a period as a participant observer finding out just what the speaker is doing at the time that he or she is speaking. This is an easy task since the role of the investigator is a passive one, either as an attender at the auction or a sports fan listening to a commentary. The period of observation is essential in order to achieve something like a native receptive knowledge of the variety of the language that the speaker is using, especially coming to understand what its technical vocabulary means, and what the speaker is doing intellectually and culturally.

Selling at auction is not a transparent process. To understand how auctions work, one must spend time becoming familiar with all the ways in which bidders bid and auctioneers cajole, as well as understanding the legal and social frameworks within which various kinds of auctions operate. Similarly, to understand what there is to understand in the commentaries of a given sport, one must understand the nature of the sport and the part the commentator has to play in the sport and how it is recreated for those who listen to the commentary.

The next step is to record a representative sample of the speech of a number of speakers who sell at auction or provide commentaries and to transcribe these recordings in sufficient detail to allow for the transcripts to be analysed for those linguistic features which will have the potential to reflect the kind of other performance pressures which the speakers are under.

What's Going On?

One must also attempt to assess those performance pressures in terms of what the speaker is doing besides speaking.

Looking first at sport commentaries, it is reasonably obvious that those who provide commentaries of fast sports such as ice hockey and horse racing have a lot of psychological processing and memory work to do just in keeping up with the play. Ice hockey is extremely fluid, with players coming off the bench very frequently and other players moving onto the ice. They skate very quickly and the strategic balance of the game alters rapidly. In order to provide a fluent and intelligible commentary and not to get behind, commentators are clearly put under considerable processing pressure.

Race callers also have many things to remember, such as the names of all the horses, how long the race is and the names of the jockeys. As in ice hockey, the situation in a race can change dramatically in a very short period.

Contrast this with cricket or baseball where there are long periods in which relatively little happens, thus placing little pressure on the speaker from tasks other than speaking.

The same contrasts can be found at auctions. In some auction markets such as real estate, the auctioneer has ample time to sell. Houses are not queueing in large numbers in the wings waiting to be sold. The details relating to the particular property are relatively easy to keep in mind and the range of actual bidders, as opposed to interested spectators, is often quite small. Compare this with the situation of an auctioneer of wool or tobacco, where a lot is sold about every five seconds, and you can see that there is considerably more pressure on the speaker in the latter cases just from the necessity to keep the processing going at such a rapid rate, not to mention any other factors.

Conserving Resources

To understand the options speakers have for coping with pressure on their memory and processing resources when they are speaking, we have to look at those options linguistically. Producing entirely novel speech made up for the occasion requires both memory and processing resources. Memory resources are required to access individual words from memory, and processing resources are required to string them together into grammatical sequences and utter them. Producing entirely non-novel speech requires memory resources to access some complete stored sequence (such as a poem), and then a minimum of processing resources in order to articulate the sequence in speech.
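
The difference between the two kinds of speech production can be put as a simple count. The Python sketch below assumes, purely for illustration, that a novel utterance costs one memory lookup per word plus one assembly step for each join between words, while a stored formula costs a single lookup and no assembly. The function name and the counting scheme are hypothetical, not measurements.

```python
def production_cost(utterance, stored_formulae):
    """Return (memory_lookups, assembly_steps) for producing one utterance."""
    if utterance in stored_formulae:
        # Retrieved whole from long-term memory: one lookup, nothing to assemble.
        return (1, 0)
    words = utterance.split()
    # Built from scratch: one lookup per word, one join between each pair.
    return (len(words), max(len(words) - 1, 0))


stored = {"400 metres left to run", "racing now"}

print(production_cost("400 metres left to run", stored))  # (1, 0)
print(production_cost("400 metres left to run", set()))   # (5, 4)
```

The same five words are far cheaper to produce when they are already stored as a unit, and the saving grows with the length of the phrase.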

It appears that access to human long-term memory is very rapid and that the storage capacity of long-term memory in a finite natural lifetime is such that memory is unfillable. (Imagine the error messages that would result if human long-term memory were a finite hard disk which ended up full at some point while its owner was still alive.) Short-term or working memory, by contrast, is relatively small and human processing capacities, while rapid, are also not infinite. This situation would lead to the prediction that humans would opt for increased use of storage and retrieval mechanisms rather than use up valuable short-term memory and processing capacity, particularly when essential processing not connected with speaking must be done concurrently with speaking.

What speech observations show is that in rapid sports, sportscasters use speech formulae a great deal. Such formulae are phrases which have particular jobs to do. In a race they might describe the distance a race still has to go, such as "400 metres left to run", or they might indicate that the race is about to start, such as "set now", or has just started, "racing now". In the commentaries of fast sports, most of what is said is said using formulae. Furthermore, in most fast sports the speaker speaks in a monotone of some sort. This too reduces processing load since a monotone is totally predictable and requires no articulation of the speaker's fundamental frequency, as is normal in other kinds of speech.

The same is basically true of middlingly rapid auctions. Speakers use formulae extensively and they tend to drone their intonation. In both cases their speech is remarkably free of the kind of hesitancy which is normal for speech. There are few, if any, pauses, voiced or silent, and the kind of speeding up and slowing down in articulation rate which is normal in conversational speech hardly occurs. This suggests that such speech is highly automated, requiring little active processing, since most of it is drawn in chunks from long-term memory as and where the situation requires.

Slow sports and slow auctions do not have these properties. Formulae are used, but by no means exclusively. Intonation is frequently animated -- after all there isn't much else happening in a cricket game much of the time. Normal hesitation phenomena and articulation rates appear while the speaker maintains normal fluency.
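
The contrast between formula-heavy and freer speech could in principle be measured from a transcript. The Python sketch below computes the share of words that fall inside known formulae, using a greedy longest-match scan. The three formulae are the racing examples quoted earlier; the short transcript is invented for illustration, and the function name and matching strategy are assumptions rather than the procedure used in the studies described here.

```python
def formula_coverage(words, formulae):
    """Fraction of words that fall inside a known formula (longest match first)."""
    # Split each formula into a tuple of words; skip empty ones; try longest first.
    patterns = sorted(
        (tuple(f.split()) for f in formulae if f.split()),
        key=len,
        reverse=True,
    )
    covered = 0
    i = 0
    while i < len(words):
        for pattern in patterns:
            if tuple(words[i:i + len(pattern)]) == pattern:
                covered += len(pattern)
                i += len(pattern)
                break
        else:
            i += 1  # this word is not the start of any formula
    return covered / len(words) if words else 0.0


formulae = ["400 metres left to run", "set now", "racing now"]

# Invented transcript fragment, for illustration only.
transcript = "set now racing now and they have 400 metres left to run".split()

print(formula_coverage(transcript, formulae))  # 0.75
```

A figure near 1 would correspond to the almost wholly formulaic speech of fast commentary, and a lower figure to the freer speech of slow sports and slow auctions.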

In cases of extraordinarily rapid auctions such as the tobacco auctions of the American South and the wool auctions in New Zealand, the memory and processing resources of auctioneers are placed under such pressure that speakers resort almost exclusively to tracking bids by saying numbers. What this indicates is that they are hard pressed to access and speak phrases, and are only able to utter single words.

What these case studies show is that normal speech in real time makes comparatively expensive use of memory and processing resources. We might imagine therefore that even when speakers are not under the kind of pressures that auctioneers and sports commentators are under, they would resort, where possible, to using formulae rather than making sentences up from scratch. That seems to be the case.

In any situation which is more or less routine, such as being a telephonist or receptionist or forecourt attendant at a petrol station, the range of what the speaker says is largely restricted to formulaic speech. Formulaic speech is not only easy on processing resources, but socially safe, since the speaker is unlikely to be misunderstood because the hearer will have heard the formulae before. They are, after all, socially conventional. Conversely, high levels of linguistic creativity, as well as being hard on the speaker, are hard on the hearer who is required to decode the novel utterance and work out what the speaker meant by it. For this reason, among others, reciting even apposite recently created poems in the forecourt of a petrol station is probably socially dangerous.

Koenraad Kuiper is in Canterbury University's Department of Linguistics.