The analogies and connections between music and language are striking. This is most apparent in the case of song, but it applies quite generally. Music is made up of notes, phrases, bars, tunes, riffs, verses, movements, symphonies, operas, albums, etc. It has compositional structure. It proceeds from a finite base and generates a potential infinity of combinations. There are introductions and conclusions. Speech has melodic properties, pitch variation, pauses and crescendos. Music comprises rhythm as well as melody, some of it predominantly rhythmic. Poetry forms an intermediate link between the two. We use the vocal apparatus for both. We hear tunes in our head as well as words. Arrangements of words can suggest melodies to us, carrying a kind of latent melodic structure. Children learn music and language at roughly the same time, and have an innate aptitude for both. Both involve the internalization of rules and patterns (scales and grammar). There is a sensitive period for language learning and probably also for musical learning. We tend to be wedded to the forms of our own language and also to the musical forms to which we were early exposed. Language and music are universal to the human species and contain universal properties, though culture clothes each differently. They both have a social dimension and are used communicatively. No other species possesses both aptitudes, though there may be glimmers of them in other species. Both require a process of segmentation to divide up the incoming auditory stimulus into discrete units of sound. They clearly interact all the time, song being the obvious point of intersection. It would be wrong to identify them or claim some kind of priority of one over the other, but they coexist in mutually supportive ways. Songwriters make a profession of fusing them, and the brains of listeners respond to the fusion. 
Each has production and reception aspects: we make music and listen to it, as we produce utterances and comprehend them. Each admits of a competence-performance distinction. There is an abstract underlying structure to both as well as a sensorimotor expression: we hum and whistle tunes while tacitly cognizing and computing musical forms, and we utter words while unconsciously processing syntax. There are hidden dimensions as well as conscious manifestations. The major scale is not English grammar, but the two function similarly. Notes are discrete entities arranged into rule-governed patterns, as words are discrete entities arranged into grammatical sentences. Both can be said to express emotions and to be about something.

Given these points of analogy and overlap, it might be useful to consider them in light of two well-known theories of language, those of Chomsky and Wittgenstein. Can Chomsky’s theoretical apparatus be applied to music, and can Wittgenstein’s characteristic ideas be similarly applied? I will be brief about both questions. I have already mentioned innateness, universality, sensitive periods, and competence and performance with respect to music; and the application of these notions to musical ability is obviously plausible, if not platitudinous. Less obvious is the question of cognitive architecture: what comparisons can be made at this level? First, I would emphasize the distinction between underlying competence and sensorimotor vehicle: linguistic competence is generally expressed in vocal speech, but this is not essential, as witness sign language.[1] Is something similar true of musical ability? Isn’t it always tied to the auditory and vocal? Yes, but this does not appear to be a necessary truth: we can imagine beings that possess our musical competence but lack our auditory faculties. That is, the abstract mathematics of music, involving pitch variation and rhythm, might be grasped by them but expressed via a different sense, say vision. They might see musical forms: melodies in the shape of patterns of light. No doubt their aesthetic experience would be very different from ours, but their brains could be performing the same abstract computations as ours as they process the visual input. For them musical form would be expressed in the way dance expresses musical form for us. If dance is music made visual, then their light shows would also be music made visual. They might even enjoy visual muzak. The point is that deep cognitive competence is separate from sensorimotor expression in both cases.
Some aspects of what we call music derive from the abstract structure and some from the contingent sensorimotor system—just as Chomsky suggests that the same is true for language and vocal speech. Second, music is hierarchical: it consists of strings of notes that break down according to hierarchical principles. One musical phrase may be composed of other phrases that together form a structured whole. There are recurrent elements and backward-looking resolutions. There are “meaningless” combinations of notes just arbitrarily strung together. Chomsky talks about the Merge operation that generates new wholes recursively: the operation joins one linguistic unit with another, which can then combine with other units, and so on up.[2] And the same thing is true of musical forms: one note gets Merged with another, thus forming a new unit that can be Merged with another. Musical compositions are formed by a process of something like recursive combination. The mind has to grasp this structure in order to comprehend the musical piece. For example, we must keep track of the root note in order to appreciate a “return to the root”, and the “blue” note is defined by its place in a series of notes. We can break a symphony down into movements, themes, phrases, and individual notes—and similarly for a song composed of verses and chorus. So musical and linguistic comprehension involve a similar (but not identical) set of cognitive capacities, in particular an appreciation of hierarchical structure. Tree diagrams capture this structure in both cases, as they do for other hierarchical systems.

We can thus envisage a cognitive science of music similar to a cognitive science of language. Many of the same basic ideas will apply; the Chomskian apparatus will be applicable to both. There is an innate competence genetically represented, a natural and predetermined learning period, an outer expression and an inner architecture, an abstract structure capable of multiple sensorimotor externalizations, a computational procedure that generates infinitely many combinations from a finite base of primitive units, and a sharp competence-performance distinction. If we imagine an alternative intellectual history in which these ideas first gained a foothold in the study of music, then we can picture a theorist urging a similar approach in the study of language as against entrenched empiricist-behaviorist assumptions. As it is, we have Chomsky’s basic approach developed for the case of language first and then extended in the direction of music (or so I am suggesting). In my alternative history Chomsky’s counterpart writes books called Musical Structures, Music and Mind, and Aspects of the Theory of Music. He revolutionizes the field of musicology, construed as the study of musical psychology. It’s not all Markov processes, conditioned responses, blank slates, and Verbal Behavior (Chomsky’s counterpart writes a review of a book entitled Musical Behavior by the well-known Harvard psychologist B.F. Skynner).

In the case of Wittgenstein we can imagine a similar possible world in which his counterpart, call him Wettgenstein, recapitulates the actual man’s history but in the field of music. In this imaginary world the philosophy of music, not the philosophy of language, is regarded as the central area of philosophy (they have a very musical culture). Wettgenstein’s first work, Tractatus Mathematico-Musicus, is concerned with this subject and advances a highly abstract theory of the nature of music. It begins with the resounding words, “The world of music is a totality of tunes, not of notes”. He goes on to assert that tunes are complexes of simple notes. Auditory sequences mirror these complexes construed as objective features of the universe (“the music of the spheres”). A note is not a readily perceptible sound but a metaphysical posit. An auditory experience of music pictures these Platonic-Pythagorean entities by means of lines of projection. This is the hidden essence of musical form as we experience it. The mathematical properties of music are stressed. Heard music is held not to be capable of expressing the pictorial relation between itself and musical reality; this can only be shown by music. Music is really an abstract calculus tenuously related to what we actually hear. There is such a thing as an ideal musical notation capturing the form of the musical facts. Music is something sublime and otherworldly. So said Wettgenstein’s Tractatus, albeit obscurely. But in his later period he revised this theory of music: in his posthumous work Musical Investigations he rejected his earlier theory and replaced it with a new set of ideas (he was wary of calling this a theory). Here he spoke of musical games, musical practices and customs, music as use, of our musical form of life, of the rules of music, of the music of other tribes, of the concept of music as a family resemblance concept, and allied ideas.
He repudiated his earlier static pictorial theory and replaced it with a dynamic behavior-centered theory. He compared his new theory with an already existing theory of language that mobilized the same set of ideas: for it was accepted by everyone that language functioned in the way Wettgenstein now claimed that music does. For instance, everyone believed that the concept of language is a family resemblance concept, so Wettgenstein could call on this acceptance to motivate his new theory of music: that there is no one thing in common to everything we call music, just a series of overlaps and surface similarities in the heterogeneous musical family. There is nothing hidden to music, no deep structure, just a human social practice. It has a purpose, a role in our lives, but it isn’t any kind of sublime supernatural abstract structure. True, music can sometimes bewitch us as to its nature, but if we study it in its context of use we can see that there is nothing “queer” going on. Music is like language: a motley collection of old and new games with no underlying essence. It is a mistake to criticize one sort of music for not resembling another sort. Music is like a city composed of old and new buildings, just like language. There is nothing sacrosanct about the major scale! Other cultures use other musical scales and are none the worse off for it. Music is part of our natural history, like eating, drinking, etc. So the later Wettgenstein contended (still obscurely, however).

The actual Wittgenstein, despite being highly musical himself, says little about music in his Philosophical Investigations, preferring to draw his analogies from games; but we can envisage him invoking music in order to make his points about language. For music does appear to fit many of his suggestions: there are a great many types of music, music is woven into our daily lives, it is an activity that occurs over time, it is rule-governed, and it doesn’t lend itself to abstract theory. A theory (“description”) of music is really an anthropological exercise, not a mathematical one, as Wittgenstein claimed for language. I am not saying Wittgenstein is right to see things this way, only that it would be natural for him to include music in his thinking; it could be used to support many of his contentions about language. In fact, at one point he does bring in music to explain a point about meaning: in section 184 of the Investigations he discusses remembering a tune and asks, “What was it like to suddenly know it?” He says it can’t be that the tune occurred to the person “in its entirety at that moment!” Yet it must have been present in some sense; but in what sense, he asks. This is a version of a general problem he discusses particularly in relation to meaning: how I can know at a particular moment the meaning of a word if the meaning is the temporally extended use. He is raising a question in the philosophical psychology of music to illuminate a question about language and meaning. My point is that he could have invoked music in other contexts to advance his contentions, e.g. to explain his idea that language is a family resemblance concept, or to explore the concept of rule following.
In fact it has been cogently argued that game is not a family resemblance concept, since it can be precisely defined; but no such thing has ever been done for the concept music.[3] On the face of it, Wittgenstein’s general theoretical apparatus applies quite naturally to music, so it would have been a good model to use to develop such a view of language. No one would think that a Tractatus-like view of music was correct (pace early Wettgenstein)!

Music is not central to philosophy and psychology as they are currently practiced (it is consigned to aesthetics), but it has some promise of providing a useful area of comparison to other cases (as well as being interesting in its own right). It fits well with the perspectives developed by both Chomsky and Wittgenstein (different as they are). This is why it is helpful to consider what intellectual history would look like if music, not language, were the central focus of inquiry in both fields. As things stand we can only note the parallels and wonder how the subject would look if pushed further. I don’t see much prospect of a fundamental unification, but heuristically the comparison has its virtues.[4]


[1] See Berwick and Chomsky, Why Only Us (2016), p.74f.

[2] See Chomsky, What Kind of Creatures Are We? (2016), chapter 1.

[3] See Bernard Suits, The Grasshopper (1978), for a definition of games.

[4] We can imagine a book written by a Kripke counterpart about Wettgenstein’s Musical Investigations that claims to find a skeptical paradox at the heart of it about musical rule following. How do we know that a musician is following the major scale, say, given that his behavior so far is compatible with using some deviant scale in the future?

