A Model of Language Acquisition
Psycholinguists report that the child “internalizes” the grammar of his or her native language. Beginning with an innate schema of universal grammar (UG), the child hears the speech of adults and somehow extracts the rules that govern the particular language in question. That heard language is external to the child’s mind, but it becomes internalized as the language is gradually acquired. At some point the acquired language is externalized in the form of overt speech, as the child’s inner competence gets expressed by means of a sensorimotor system. We have internalization followed by externalization. But what kind of internalization is this—in what form is the outer language internalized? A natural answer is: memory. The child remembers what he or she has heard, suitably processed and generalized, and acquires the ability to speak by using these memories. Memories are internal, so that is the form the internalization takes: outer speech is internalized in the form of memories. The child possesses an innate internal UG combined with a memorized internal PG (particular grammar)—and also a lexicon of some sort, innate or acquired. Innate schema plus memory equals linguistic competence.
I want to enrich this picture somewhat. In addition to internal memories, I want to say that language acquisition involves inner speech: the child first learns how to speak inwardly, only subsequently expressing his or her linguistic mastery externally. So the internalization involves becoming a linguistic agent—a speaker. It is not just a matter of acquiring memories of what is heard, but also of acquiring an ability to engage in internal speech acts. Memory is presupposed in this, but it is not all that is going on internally. When outer speech develops, inner speech is hooked up to a sensorimotor system, typically hearing and oral action (but in the deaf it can be vision and manual action). The child does not go directly from hearing a language and remembering it to being an agent of external speech; she takes the intermediate step of acquiring internal speech, a type of purely mental action. So the internalization consists of more than stored memories; it is full-blown internal linguistic agency. The external speech of others is internalized in the form of internal speech in oneself.
The psychological structure here may be compared to mental imagery. A person perceives an external object, forms a mental image corresponding to that perception, and then acts on the basis of the image (say, by drawing a picture of the imaged object from memory). This involves an extra step beyond merely perceiving an object and acting on the perception: an additional psychological layer is introduced. It is apt to describe the process of image formation as a type of internalization: an external object is internalized in the form of an image (not just a perception). The internal image acts as a kind of replica of the external object. Likewise, someone may hear a piece of music and retain the tune in memory, rehearsing it in her mind silently: this is not just storing the tune in memory, but also actively engaging in musical performance internally. We can think of this as inner musical action analogous to outward musical action like singing or playing an instrument. Learning to sing or play an instrument will typically involve developing the ability to perform inwardly—inner musicianship, we might say. You don’t just hear the violin with your ears and then play it outwardly; you also hear it inwardly and rehearse inwardly. You have internalized the (sound of the) violin. If someone lacked the ability to perform inwardly, they would presumably lack something important to learning the instrument. We might say that “musical imagination” is an important (essential?) component of musical ability. Imagination is “subject to the will” (as Wittgenstein says) and musical imagery is as willed as other forms of imagery. You can whistle with your mouth or you can whistle in your head.
And there are other forms of internalization that proceed in much the same way—for example, internalizing a set of moral commands. It isn’t just that the child hears the moral commands of adults and commits them to memory, thus acquiring moral competence. He or she also incorporates these commands into an internal moral system—commonly known as conscience. Freud took the superego to consist of internalized parental commands—telling the child what to do and not do. This was taken to be essential to moral development: not just remembering what others have commanded but also commanding oneself—the “voice of conscience”. Whether Freud was right about the details doesn’t matter: what is important is that moral development involves the internalization of moral prescriptions—you tell yourself what to do (a form of inner speech). So: imagery, music, and morals incorporate this kind of strong internalization, as does language. They are not like merely memorizing the dates of battles or the capitals of countries, because they involve inner action analogous to outer action. In particular, language acquisition goes through a stage of acquiring a highly structured set of internal abilities generating inner speech acts. Conceivably it might stop at that point, never progressing to the next stage of acquiring an ability to communicate, leaving a language dedicated purely to thought. Language acquisition is not just a matter of stimulus-memory-response, but of stimulus-memory-inner action-response. To put it baldly, the child primarily acquires inner speech, which may or may not lead to outer speech.
This is an empirical hypothesis. I don’t know if it is true of actual human children. It certainly could be true of logically possible children, and it fits with the fact that children do acquire both inner and outer speech. Investigators would have to examine language development to see whether there is evidence that inner speech is acquired before outer speech. That might not be so easy to determine, given that inner speech is silent and invisible. But we could observe whether the child engages in self-directed monologue or shows signs of internal contemplation. Perhaps such investigations have already been undertaken: I am merely suggesting a plausible-sounding model that might or might not receive empirical confirmation. What I do think is that such a model would fall foul of traditional behaviorist prejudices and so might not be taken as seriously as it should; and also that it fits a general conception of learning that has many merits—the idea of learning as internalization in a strong sense. I gave several examples where such internalization operates and the case of language seems a natural addition to the list. The alternatives to the hypothesis are that inner speech develops in tandem with outer speech, but does not precede or enable outer speech; or that inner speech is the internalization of the child’s own outer speech. Obviously these are empirical questions, but the hypothesis I offer seems to me antecedently at least as plausible as the others: inner speech is the mechanism whereby outer speech develops, not merely something additional to it or the result of it. For it provides a psychologically natural way to construct linguistic competence: first master language internally without worrying about how it will be publicly expressed, and only then search for a way to link linguistic competence with the body—whether the mouth or the hands, the ears or the eyes.
For example, if you are mute but not deaf, you will naturally acquire a language by internalizing what you hear, but you will not externalize it by using your mouth. The ability to engage in communicative speech goes significantly beyond merely mastering grammar and vocabulary, which can be done purely inwardly. I imagine the child hearing outer speech, rehearsing it in his head, acquiring the ability to form internal linguistic strings, playing with these strings inwardly, and only later wondering how best to express his burgeoning thoughts to others.
This picture fits well with the idea of language as primarily a vehicle of thought, not communication. If language is mainly a medium of thought, its natural form of existence is as an internal symbolic system, silent and solitary; no need to recruit bodily organs that can produce externally observable signals to others. So the child first internalizes outer speech to aid it in cognitive processes—employing a language of thought—and then uses what has been so acquired to lever external communicative speech into existence. First we have symbolic thinking, then symbolic communication—the inner as the foundation of the outer. UG is already internal and intrinsically unconnected to communication; so PG can occupy the same psychological territory—an internal system dedicated initially to thought. Silent speech is the natural medium for thought, so it develops first; only subsequently do noise and gesture enter the picture to permit language to be used for speaking to others. The larynx is very much a Johnny-come-lately. The speech centers of the brain make contact with the larynx late in the game, and might not make contact with it at all. After all, if people had no use for communication, they would still need a language to express their thoughts inwardly: a language of thought has a point even if a language of communication does not. Granted that language enhances thought, silent speech is the way to go, the noisy kind being redundant if communication is not on the menu. You are going to need a language to enhance thought no matter what, so you might as well get that under your belt as soon as possible; how far you will need it for communication is a far more chancy affair and can be left as a secondary accomplishment.
Inner speech is certainly a reality of adult linguistic life. For solitary individuals it constitutes most of linguistic life, and even for the very social it rumbles as ceaseless background chatter. It also mingles with outer speech in myriad ways. An interesting question is whether inner speech regularly precedes outer speech: do we first say it inwardly and then give it outward utterance? We (our brains) certainly plan what to say before engaging the larynx, constructing in silence a pre-formed string of words (often only milliseconds before the utterance). This is a form of inner speech, the production of symbolic strings independently of external manifestation, and it precedes external speech; so we can say that adult outer speech is subsequent to inner speech, expressing what existed antecedently. In the child, language acquisition proceeds from inner to outer too, according to the hypothesis: outer spoken language externalizes a prior inner language. This is certainly contrary to the behaviorist assumptions of nearly all psychology (and philosophy) in the last hundred years, but being contrary to that tradition is surely a mark of truth in these more enlightened times. The whole point of the mind is that it cannot be observed; any theory of its achievements should respect that fact.
The idea that the primary reality of language is its appearance in outer speech is shared by nearly all approaches to language in the last hundred years, but it is belied by the simple fact that inner speech is common and arguably basic. Language is essentially larynx-independent, sub-vocal not vocal.