# Structure and Perception in *Herma* by Iannis Xenakis

## Robert A. Wannamaker

KEYWORDS: Xenakis, Herma, set theory

ABSTRACT: This paper presents a detailed analytical discussion of
*Herma* (1961) for solo piano by Iannis Xenakis. The model of the work as a
set-theoretic demonstration, advanced in the writings of the composer, is
discussed. A statistical analysis reveals numerous inconsistencies between the
published score and this model. The aesthetic consequences of these are
explored, and an attempt is made to develop an alternative understanding of the
music as it is perceived by listeners in terms of temporal gestalt
formation.

*Received 7 November 2000*

Copyright © 2001 Society for Music Theory

**[1] Introduction**

[1.1] *Herma* (1961) for piano was Xenakis’s first composition for a
solo instrument. It was commissioned in 1961 by pianist/composer Yuji Takahashi,
whom Xenakis met on a trip to Japan in April of that year. The Greek title may
be translated as “bond,” but also as “foundation” or “embryo,” and perhaps
reflects an intuition on the composer’s part that this was to be a seminal work
insofar as it was his first departure from purely stochastic means of
composition.

[1.2] This technically very difficult piece makes unprecedented demands upon the performer, who must play complex rhythmic figures involving huge leaps with perfect evenness of articulation. In a good performance, the effort is repaid by the creation of a sense of seething, amorphous energy and a powerful forward momentum. The experience is unlike that associated with any prior piece in the piano literature. The ear strains to absorb it, but does not succumb to frustration or numbness. The frenzied sonic activity maintains a surprising sense of perceptual lucidity, never becoming muddy.

[1.3] The composer himself provides a detailed theoretical discussion of
*Herma*’s construction in his book, *Formalized Music*.^{(1)} Referring
to *Herma* as an example of “Symbolic Music,” Xenakis advances a model of
the piece which involves the exemplification of specific mathematical
relationships between certain pitch sets. The macro-structure of the composition
is clearly outlined as well. Details regarding the specific pitch and rhythmic
choices made are not included, however, and, while the composer makes
indications of what *ought* to be listened for in the piece, a typical
listening unavoidably raises certain questions which the composer does not
address regarding what *actually is* heard. In particular, the ability of
listeners to recognize the proposed model in the music, and the ability of the
model to account for important aspects of the music as heard, both beg
examination.

[1.4] This paper explores the relationship of the score to its set-theoretic
model, and the relationship of both to the music as it is perceived. Section 2
introduces the composer’s published model of the composition, expanding on his
treatment. Section 3 subjects the musical materials of
*Herma* to statistical analysis, which is appropriate given the composer’s stochastic means of pitch
selection. The statistical deployment of pitches is compared both to the
published model and to the perceptual organization of musical details as
revealed through the close reading of a representative score excerpt from the
viewpoint of gestalt perception. Detailed examination of the score for
consistency with the model is undertaken in Section 4 and certain significant
discrepancies are identified. The implications of these are considered in
Section 5 within a broader discussion of aesthetic issues raised by the
relationship of the model to listeners’ perceptions and by the writings of the
composer. An Appendix
provides a brief introduction to the necessary mathematical background.

**[2] *Herma* As a Set-theoretic Demonstration**

[2.1] The composer adopts as a universal pitch set^{(2)} *R*, the set of
all 88 keys on the standard piano keyboard. From *R* he selects three
subsets, *A*, *B*, and *C*, with non-empty intersections. The
central conceit of the piece is to present these pitch sets and others derived
from them by means of elementary set operations in a fashion which will
demonstrate that a given target set, *F*, can be expressed in terms of
*A*, *B*, and *C* using two different set-algebraic forms (one of
which is the disjunctive normal form discussed in the Appendix).

**Figure 1**. Graphical demonstration of equivalence for the two representations of the set *F* given by Equations 1 and 2 (after Xenakis, *Formalized Music*)

[2.2] **Figure 1** is redrawn following *Formalized Music* (p. 176)
with minor corrections.^{(3)} It illustrates diagrammatically how the set *F*
may be expressed in two different fashions:

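$$
\begin{aligned}
F &= ABC + A\overline{B}\,\overline{C} + \overline{A}B\overline{C} + \overline{A}\,\overline{B}C && (1) \\
  &= GC + \overline{G}\,\overline{C} && (2)
\end{aligned}
$$

where I have introduced the convenient shorthand

$$
G = AB + \overline{A}\,\overline{B}.
$$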

The figure is divided in half by a bold horizontal line, the upper portion (labeled “PLANE 1”) corresponding to Equation 1 and the lower (labeled “PLANE 2”) to Equation 2. Each “PLANE” is further subdivided into two rows of Venn diagrams, each row having a different dynamic marking to its right-hand side.

[2.3] The line of diagrams marked *fff* illustrates the four atomic sets
appearing in the disjunctive normal form of *F*, while the line labeled
*f* shows selected intermediate stages in the computation of these as
intersections of *A*, *B*, *C*, and their complements. Together
these two rows of diagrams are apparently intended to demonstrate how the four
relevant atomic sets are computed and united to yield *F*. The solid
arrow-tipped lines connecting the diagrams are evidently meant to suggest the
temporal flow of an imagined mathematical construction, consisting in essence of
the line marked *fff* with lemmata interjected from the line marked
*f*.

[2.4] The two lines which “PLANE 2” comprises similarly illustrate the
computation of Equation 2, demonstrating diagrammatically that the result is the
same as that specified by the disjunctive normal form. The development begins
with the “construction” of $AB + \overline{A}\,\overline{B}$ on the line marked *ppp*,
before branching to illustrate the parallel constructions of $GC$ (marked
*ff*) and $\overline{G}\,\overline{C}$ (marked *ppp*), whose union
is *F*. This illustrates the manner in which the figure preserves
information about the structures of Equations 1 and 2, insofar as the rules of
operator precedence (“order of operations”) required to compute the
right-hand-sides of these equations partially dictate the order in which the
Venn diagrams are arranged as read from left to right. Of course, the
arrangement depicted is not the only one which might be employed to this end,
since there is mathematical flexibility regarding the order in which the
intermediate quantities are computed, so long as the order of operations is
observed.^{(4)}
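The equivalence that the figure demonstrates diagrammatically can also be checked algebraically. The short derivation below is a sketch and is not taken from *Formalized Music*. It assumes the usual Boolean notation, in which juxtaposition denotes intersection, $+$ denotes union, and an overbar denotes complementation with respect to *R*, and it takes $G = AB + \overline{A}\,\overline{B}$ as the shorthand introduced with Equation 2.

$$
\begin{aligned}
\overline{G} &= \overline{AB + \overline{A}\,\overline{B}}
              = \overline{AB}\;\,\overline{\overline{A}\,\overline{B}}
              = (\overline{A} + \overline{B})(A + B)
              = A\overline{B} + \overline{A}B, \\
GC + \overline{G}\,\overline{C}
  &= (AB + \overline{A}\,\overline{B})\,C + (A\overline{B} + \overline{A}B)\,\overline{C} \\
  &= ABC + \overline{A}\,\overline{B}C + A\overline{B}\,\overline{C} + \overline{A}B\overline{C}.
\end{aligned}
$$

The first line uses De Morgan's laws and the fact that $A\overline{A}$ and $B\overline{B}$ are empty. The four terms of the last line are the atomic sets of Equation 1, which is why both planes of the figure arrive at the same set *F*.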

[2.5] The broken arrow-tipped lines in the diagram seem mostly to represent
supporting observations traversing the two PLANES. Thus two broken lines lead
from *ABC* and $\overline{A}\,\overline{B}C$ in PLANE 1 to their union,
$(AB + \overline{A}\,\overline{B})C$, in PLANE 2. Some of the sets
connected by broken lines, however, lack a simple algebraic relationship. For
instance, *A B + A B* is not immediately relatable to
*A B C*.

[2.6] The composer describes the above set-theoretic considerations as being
*outside-time* (p. 170), indicating that no reference to temporality is
made or required. He observes that, to find employment in music, entities
*outside-time* must somehow be combined with a *temporal* event schema
to yield events *in-time*.

**Figure 2**. Temporal flow chart for *Herma* indicating changes in musical parameters, perceived high-level temporal gestalts, and different pitch sets articulated over the course of the work

[2.7] In *Herma*, a pitch set is articulated *in-time* as a
sequence of pitches drawn from the set. (Indeed, the introduction of each new
pitch set is indicated directly in the score with the name of the set.) The top
line of **Figure 2** indicates the succession of different sets presented.
Thus *Herma* commences with a substantial passage composed using notes
selected from the set *R* (i.e., the entire piano keyboard). A section
using only the notes of the pitch set *A* follows, with successive sections
drawn from the contents of the sets $\overline{A}$, *B*, $\overline{B}$, *C*, and $\overline{C}$ in that order. Then the derived
pitch sets corresponding to each Venn diagram in Figure 1 are introduced with the dynamic markings given in the
figure and with new sets being introduced at the indicated dynamic and roughly
in the order shown (reading from left to right). Some sets are given multiple
expositions with subsequent presentations being marked *rappel* (“recall”)
in the score. Different pitch sets are sometimes sounded simultaneously, with
contrasting dynamics and temporal densities being employed to aid in their
discrimination. Sometimes, as in the *A* and *B* sections, the
*same* pitch set is articulated in two simultaneous layers with contrasting
dynamics and temporal densities, consisting of, in the composer’s terms, a
“linear” (i.e., temporally sparse and dynamically loud) layer and a “cloud”
(i.e., temporally dense and, usually, dynamically soft) layer, with the sustain
pedal depressed whenever a “cloud” is being sounded.

[2.8] Individual pitches are drawn at random from the given sets without
registral preference (except at one point in the *C* section: cf. paragraph
3.6). This
employment of stochastic procedures, retained throughout the composition, is
intended to avoid melodic or harmonic patterns that would serve to distract the
listener from the perception of the pitch sets and relationships between them
*as such*. In the words of the composer:

If we want to be free the sounds should follow without any melodic law, independently of one another. So we have to play them at random.

. . . How can I demonstrate the elements of the sets? By playing them. But in order to remain neutral I have to play them at random.^{(5)}

The elements of each class are presented stochastically, that is unrestrictedly, in order not to disturb the basic plan of operations and of logical relationship between classes.^{(6)}

[2.9] Each randomly selected pitch is associated with an attack randomly
placed in time. The mean attack density per unit time does not change
continuously with time, but assumes a sequence of constant values specified by
the composer, with the abrupt change to each new density being indicated at the
appropriate point in the score (e.g., as “0.8 s/s” or 0.8 sounds per second) and
in *Formalized Music* (p. 177). Note durations appear to be randomly
selected as well, although this is made definite neither by the score nor by the
composer’s writings about the piece. The opening *R* section exhibits a
wide variety of rhythmic figures, but thereafter values divisible by a
sixteenth-note or quintuplet sixteenth note (5:6) are employed almost
exclusively. The work is notated mostly in 12/8 time with the exception of three
brief occurrences of 6/8 in the second half of the work and the opening *R*
section, which begins in 4/4 and visits a wide variety of time signatures. These
markings notwithstanding, the composer’s preface to the score indicates that,
“The whole piece is to be played without accents, the bar-lines serving merely
as divisions in time.”
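The procedure described in paragraphs 2.8 and 2.9 can be sketched in a few lines of code. This is only an illustration under stated assumptions, not a reconstruction of the composer's working method: the text says that attacks are "randomly placed in time" at a given mean density, and the sketch assumes, hypothetically, that this means a Poisson process (exponentially distributed gaps between attacks). The function name `articulate`, the key-number encoding, and the toy pitch set are all invented for the example.

```python
import random

def articulate(pitch_set, density, duration, seed=None):
    """Return (onset_seconds, pitch) events for one exposition of a pitch set.

    Assumptions (not taken from the score): onsets form a Poisson process,
    i.e. gaps between attacks are exponentially distributed with mean
    1/density, and every pitch in the set is equally likely at each attack.
    """
    rng = random.Random(seed)
    pitches = sorted(pitch_set)          # fixed order keeps seeded runs repeatable
    events = []
    t = rng.expovariate(density)         # time of the first attack
    while t < duration:
        events.append((t, rng.choice(pitches)))
        t += rng.expovariate(density)    # wait for the next attack
    return events

# Hypothetical toy set of piano key numbers (1-88), not one of Xenakis's sets.
toy_set = {4, 5, 6, 30, 31, 52, 53, 54, 80}
events = articulate(toy_set, density=0.8, duration=60.0, seed=1)
```

Changing `density` at a section boundary while keeping it constant within the section would mirror the stepwise density markings described above.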

[2.10] All pitch sets are musically represented strictly by means of displaying their members, randomly selected. Set algebraic operations (i.e., intersection, union, complementation), relationships between sets (e.g., equality), and rules of logic (e.g., implication) are not represented as such. The composer contends that

If the observer, having heard *A* and *B*, hears a mixture of all the elements of *A* and *B*, he will deduce that a new class is being considered, and that a logical summation has been performed on the first two classes. This operation is the *union*. . . . We have not allowed special symbols for the statement of the classes; only the sonic enumeration of the generic elements was allowed. . . .

. . . we have not allowed special sonic symbols for the three operations . . . ; only the classes resulting from these operations are expressed, and the operations are consequently deduced mentally by the observer. In the same way the observer must deduce the relation of equality of the two classes, and the relation of implication based on the concept of inclusion (pp. 171–172).

[2.11] Here, then, we have a model of *Herma* as an exemplification
using pitch sets of the equivalence of two set-theoretic expressions. Pitch sets
are represented by sequences of their members, randomly selected, with the
succession of pitch sets corresponding to an exhibition of selected intermediate
quantities that would be produced in a step-by-step computation of the
expressions in question. The listener is left to deduce the relationships
between different pitch sets and the meaning of the succession (i.e., to
recognize that something is being exemplified, and what that particular
something is). The composer aptly compares the function of musical time here to
the expanse of a chalkboard:

The role of time is again defined in a new way. It serves primarily as a crucible, mold, or space in which are inscribed the classes whose relations one must decipher. Time is in some ways equivalent to the area of a sheet of paper or a blackboard (p. 173).

[2.12] From the mathematical model of *Herma* discussed above, we now
turn to questions regarding the perception of the piece by listeners.
Requirements for a listener to be capable of following the set-theoretic
exposition in question would have to include the following:

- complete and accurate characterization of each pitch set involved;
- an ability on the part of the listener to discriminate and compare the contents of different pitch sets, requiring accurate recollection of the contents of pitch sets previously presented.

The next two sections address in detail the constitutive materials of
*Herma* and the auditory perception thereof in light of the above criteria,
and will attempt to identify any potential obstacles to the perception of the
underlying set-theoretic model.

**[3] Musical Segments in *Herma***

[3.1] In describing the musical segmentation of *Herma*, I will use the
language of temporal gestalt formation as developed by James Tenney.^{(7)} A
detailed introduction to the subject of temporal gestalt formation in music is
beyond the scope of this article, and we must restrict ourselves to a few
relevant definitions. For a thorough discussion of the topic, the reader should
consult Tenney’s book. This terminology and conceptual framework are introduced
in order to identify correspondences between the set-theoretic model and the
musical segmentation as I hear it (and as I believe most listeners will hear
it).

[3.2] Following Tenney, we will adopt the term *holarchical* instead of
*hierarchical* in order to avoid any connotation of relative importance.
For our purposes, we will regard the lowest holarchical gestalt level as that of
the note and the highest as that of the piece. A gestalt at the holarchical
level immediately above that of the note is referred to as a *clang* and
constitutes “a sound or sound-configuration which is perceived as a primary
musical unit or aural gestalt.”^{(8)} A gestalt at the next highest level is
referred to as a *sequence* and consists of “a succession of clangs which
is set apart from other successions in some way, so that it has some degree of
unity and singularity, constituting a musical gestalt on a larger perceptual
level or temporal scale.”^{(9)} Finally, a temporal gestalt all of whose
component, next-lower-level temporal gestalts exhibit the same overall
statistical properties (mean, variance, etc.) in some particular sonic parameter
is said to be *ergodic* in that parameter.^{(10)}

[3.3] The second line in Figure 2 shows the temporal gestalt units associated with
*Herma* at the first and second holarchical levels below that of the piece
as I hear them. I will refer to these as *sections* and *subsections*,
respectively.

[3.4] Inspection of the figure shows that sectional and subsectional boundaries are correlated with changes in temporal density, dynamic and timbral envelope (i.e., use of the sustain pedal). The composer systematically deploys such changes to aid in discrimination of the pitch sets. During the first 250 seconds of the piece, these three parameters exhibit simultaneous changes which produce clearly demarcated sections and which are correlated with changes in pitch set. Throughout the work as a whole, gestalts at the sectional level are defined primarily by drastic changes in temporal density, with periods of silence or undisturbed sustained tones separating sections. These delineations are frequently supported by large changes in dynamic, changes in the use of the sustaining pedal, or both. Less drastic and coordinated changes in temporal density, dynamic, or sustain are responsible for temporal gestalt formation at the level of subsections.

[3.5] During the later portion of the work, when pitch sets change rapidly
and temporally overlap one another, successive pitch sets are not separated by
periods of inactivity and are distinguished only by changes in dynamic or
pedaling. As a result, the holarchical level at which changes of temporal
gestalt unit and changes of pitch set are correlated falls from that of the
section to that of the subsection. This occurs around 250 seconds into the
piece, at which point the first sets corresponding to intersections are
introduced. After this, sections, which are still demarcated by hiatuses, no
longer serve to define single pitch sets. This change itself, however, is not a
perceptually salient event. In summary, we observe that all subsection
boundaries coincide with pitch-set boundaries and *vice versa*, with the
sectional/subsectional schema serving to embed the temporalized pitch-sets
within a perceptual holarchy.

[3.6] With one significant exception, the active pitch range remains the
entire range of the piano keyboard throughout the composition, and the random
selection of pitches is apparently uniformly distributed over the entirety of
the active pitch set. The single brief exception occurs during the *C* section between 230 and 250
seconds into the piece. At this point, the range engaged narrows towards the
middle of the keyboard, culminating in a dramatic, fortissimo, chordal passage
preceding the introduction of the first pitch sets derived from intersections.
Otherwise, the pitch materials of the piece are ergodic at the holarchical level
of section. That is, sequences within a subsection are distributed uniformly
among all registers of the piano and this situation persists throughout the
piece with the exception noted. (Our discussion will later return to the
apparent aesthetic contradiction between a compositional approach based
primarily on the articulation of specific pitch distributions and a perceived
ergodicity in pitch materials.)

**Figure 3**. Pitch occurrence frequencies within specific pitch sets articulated in *Herma*. Bullets indicate notes which did not occur in the given section

[3.7] At lower holarchical levels, the situation is more complex. **Figure 3** shows the frequency of occurrence of each pitch on the
piano keyboard during the articulation of each pitch set. As in hexadecimal
notation, the decimal numbers 10, 11, 12*A*, *B* and *C*, thus ensuring a fairly chromatic musical
texture. We also note that each of these sets consists largely of clustered
“clumps” of pitches separated by considerable intervals, a tendency which is
inherited by their complements and intersections. Otherwise the sets exhibit no
simple structure. The composer describes them as “amorphous.”^{(11)}

[3.8] It should be clear that randomly sounding notes that are uniformly
distributed over such “clumpy” sets may sometimes produce an impression of
multiple polyphonic lines (i.e., *auditory streaming* according to
similarity in pitch^{(12)}). This effect is perceptible throughout much of the
piece, although it is usually complicated by other factors. Generally speaking,
events stream according to complex interactions between their proximity in
register (determined partially by the macro-structure of the pitch sets and
partially by chance), their proximity in time, and their similarity in
dynamic.

**Figure 4**. Score excerpt from *Herma* showing the last four measures of the exposition of set *A* indicating clangs (circled) and sequences (tied)

[3.9] **Figure 4** shows a typical excerpt from the score: the final four
measures of the exposition of set *A*, where “linear” and “cloud” versions
of the set are simultaneously articulated. Circles indicate *clangs* as I
typically hear them, with sequences of *clangs* connected by ties. The
initial G4 is segregated from subsequent events by its dynamic, but such
segregation is not consistent throughout the passage, as is immediately
illustrated by the perceived grouping of the next three notes despite
differences in dynamic. Here this is largely a product of the temporal proximity
of the attacks, but also arises in part due to the continuous use of sustain,
which blurs together temporally clustered events in the bass more effectively
than ones in the treble. The following sequence in the treble is segregated by
difference in pitch from the middle register tones, reflecting the
macro-structure of *A* (i.e., the distinct clumps of pitches in *A* in
the ranges F5–E6 and F4–G4, respectively). Indeed, each sequence identified in
Figure 4 falls into a single one of the following clumps
suggested by Figure 3 (except in the highest register, where the two highest-pitched
clumps are regularly traversed): F1–F2, B2–D3, G3–

[3.10] This should not, however, be regarded as evidence that the macro-structure of the pitch sets is the primary determining factor in low-level temporal gestalt formation. Many of the intervallic skips observed in this example between notes within a single clump are larger than the pitch intervals separating distinct clumps, so other factors clearly also contribute. As already indicated, these include similarity in dynamic and temporal proximity, but also subtler matters of accentuation and articulation which may vary from one performance to another. (For instance, the D2 in the third of the four measures shown in Figure 4 is inaudible in Takahashi’s performance.) Furthermore, the attitude of the listener makes an important contribution. In casual listening, for instance, one may attend primarily to the fortissimo notes, ignoring the pianissimo ones and thus perceive a much different set of gestalts.

[3.11] Perhaps the most important perceptual consideration regards the sheer complexity of the sonic information presented. In producing Figure 4 I proceeded in a rather traditional manner by attempting to identify distinct melodic voices one at a time, an approach which already presumes a very particular strategy for listening (i.e., one of attending only to activity in a certain register in an attempt to identify a coherent melodic voice). It frankly seems impossible to grasp the seething entirety confronted as a whole, and when I try to do so my ear flits from one sequence to another, ignoring certain events in favor of others. The events perceived as salient seem to be as much the product of how one listens (possibly favoring certain registers, dynamics, etc.) as of the auditory stimulus, and the sense of the music as being “too much to take in” is indispensable to its overall musical effect.

[3.12] To a large extent, seemingly, it is this perceptual complexity and instability which lends the work its considerable musical richness, producing a kaleidoscopic profusion of fractured pitch sequences in different registers between which the listener’s attention tends to skip. The uniform distribution of activity over the entire keyboard ensures that the density of sound in any given register remains relatively sparse. This produces a certain sense of lightness and lucidity, as opposed to the sense of density which would ensue if such a rapid succession of attacks were confined to a narrower pitch range.

**[4] Discrepancies Between Theory and Realization**

[4.1] A number of significant issues arise upon detailed comparison between
the set-theoretical conception of *Herma* and its realization as a musical
score. The first of these becomes apparent upon inspection of the *R* pitch
set tabulated in Figure 3. It is clear that, although *R* nominally contains
all 88 pitches playable at the standard piano keyboard, several are not actually
represented in the introduction to the piece where *R* is articulated. The
exposition of *R* in *Herma* comprises only 205
notes, but when pitches are drawn at random with replacement, a much larger number
of draws will usually need to be
made before all 88 different possible pitches have been obtained. Consider, for
instance, a situation in which 87 different pitches *have already been
drawn*. The final pitch remains one among 88, so on average a further 88
draws will be required before it is drawn. Indeed, a computer
simulation employing ten million trials of the pitch lottery has shown that
an average of 446 draws are required to obtain all 88
possible pitches (with a standard
deviation of 110 draws), and that the probability of obtaining all 88 pitches
in 205 or fewer draws is extremely small (about 3 in 100,000).
It seems reasonable
to suppose that not all 88 pitches are represented in the introduction of the
piece because the composer wished to curtail its duration without modifying his
method of pitch selection.
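The simulated figures can be cross-checked against closed-form expressions for this kind of lottery (often called the coupon collector's problem). The sketch below is not the simulation referred to above. It assumes that every draw is independent and uniform over the 88 keys, and the function names are chosen only for illustration. Under that assumption the exact mean comes out at roughly 445 draws with a standard deviation of roughly 110, in line with the simulated values.

```python
from fractions import Fraction
from math import comb, sqrt

def draws_to_collect_all(n):
    """Mean and standard deviation of the number of uniform draws with
    replacement needed to see all n items at least once."""
    # With k items still unseen, each draw finds a new one with probability
    # k/n, so the waiting time is geometric with mean n/k and variance
    # (1 - k/n) / (k/n)**2. The stages are independent, so both add up.
    mean = sum(Fraction(n, k) for k in range(1, n + 1))
    var = sum((1 - Fraction(k, n)) / Fraction(k, n) ** 2 for k in range(1, n + 1))
    return float(mean), sqrt(float(var))

def prob_all_seen(n, m):
    """Probability that m uniform draws with replacement cover all n items
    (inclusion-exclusion over the groups of items that might be missed)."""
    hits = sum((-1) ** k * comb(n, k) * (n - k) ** m for k in range(n + 1))
    return float(Fraction(hits, n ** m))   # exact integers until the last step

mean, sd = draws_to_collect_all(88)
p_205 = prob_all_seen(88, 205)
print(f"mean = {mean:.1f}, sd = {sd:.1f}, P(all 88 within 205 draws) = {p_205:.1e}")
```

Exact integer arithmetic is used for the probability because the inclusion-exclusion sum alternates in sign and its individual terms are far larger than the final result.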

[4.2] Nonetheless, the fact that certain pitches in any given set may not be
acoustically articulated seems problematic. If the object of the composition is
to demonstrate certain set-theoretic facts, then a complete representation of
the sets of interest would seem to be a necessary prerequisite. An obvious
solution to the problem of sectional duration would be to not return pitches
drawn to the lottery. Then if one begins with a lottery containing *n*
representatives of each pitch, one is assured that after 88*n* draws each
pitch will have sounded precisely *n* times. In any event, no such
procedure was adopted in the work at hand.
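The alternative lottery just described is easy to state in code. The following is a minimal sketch of that hypothetical procedure, which the text notes was not adopted in *Herma*. Key numbers 1 to 88 stand in for the pitches of *R*, and the function name is invented for the example.

```python
import random
from collections import Counter

def lottery_without_replacement(pitches, n, seed=None):
    """Shuffle an urn holding n tickets per pitch and draw every ticket once."""
    rng = random.Random(seed)
    urn = [p for p in pitches for _ in range(n)]   # n copies of each pitch
    rng.shuffle(urn)                               # random order, nothing returned
    return urn

keyboard = range(1, 89)          # key numbers 1-88, standing in for the set R
sequence = lottery_without_replacement(keyboard, n=2, seed=0)
counts = Counter(sequence)       # every pitch appears exactly n times
```

The length of the resulting section is fixed at 88*n* notes, so the composer could have chosen the duration in advance while still guaranteeing a complete exposition of the set.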

[4.3] By definition of set complementation (see Appendix)
we know that notes appearing in the *A* section of *Herma* ought
theoretically to be absent from the $\overline{A}$ section. Inspection of Figure
3 indicates that, in fact, there are twelve instances of identical pitches
occurring in both sections. It is unlikely that these anomalies are the result
of artistic license on the part of the composer, for two reasons. The first of
these is the unambiguous commitment to the idea of a set theoretic demonstration
cast in pitch which Xenakis makes in *Formalized Music*, a commitment which
would be thoroughly betrayed by compromising the integrity of the pitch sets
once they are defined. The second reason is that no aesthetic motivation for the
introduction of the specific discrepancies is evident either in the score or
upon listening. The perplexing notes do not seem to be specifically demanded in
any obvious way by their musical context.^{(13)}

[4.4] We must consider the possibility that notes found in common between a
given set and its complement are errors arising either at the time of
composition or in the typesetting of the score. The latter seems unlikely on the
following grounds. *Herma* was written in 1961 and premiered by
Takahashi a few months thereafter. The Boosey and Hawkes score (see note
6), the
only score of the piece which has been published, is copyrighted 1967 and the
data of Figure
3 is derived therefrom. However, the Denon recording of *Herma* by
Takahashi was made in 1972.^{(14)} The latter recording features at least some of
the pitch discrepancies mentioned above. (Not all have been verified by
listening, but at least B2, B3, D4, *A* and *A*.)
Takahashi was in frequent contact with Xenakis throughout the 1960s,^{(15)} and so
the composer would have had ample opportunity to indicate corrections to the
score if there were any to be made. Furthermore, since Takahashi had performed
the work prior to its publication, he would likely have been working not from
the Boosey and Hawkes score when he recorded it, but rather from a copy of
Xenakis’s manuscript. This suggests that the anomalies in the published score
may have survived from the original manuscript.

[4.5] Comparing *A*, *B*, and *C* with their complements, 33
separate instances are observed of pitches appearing both in a set and in its
complement. Proceeding for the sake of argument on the assumption that such a
characterization is appropriate, we can attempt to identify which pitch
occurrences are “erroneous.” Most of the discrepancies are of the sort in which
a pitch occurs several times in a given set and only once in its complement. In
such cases, the single occurrence may be assumed to represent the “error.” In
other cases, a pitch may be relegated to one set or another after the
examination of sets occurring later in the score. For example, the appearance of
C8 in *A B C* means that it must belong to
*B *rather than to *B*.
Not all contradictions are so easily resolved, however. For instance, the pitch
B1 occurs twice in *A* and twice in $\overline{A}$. The only subsequent sets in
which B1 is repeatedly observed are *G* and *G* *C*. Neither observation definitively
places B1 in either *A* or $\overline{A}$ and, worse, the sets in question
are themselves disjoint! In the end, we surmise that B1 belongs in *A* on the weak evidence that other
pitches in *A* sound several times during the exposition of that set (i.e.,
significantly more than twice). The question of whether B1 belongs in *B*
or $\overline{B}$ presents similar
difficulties. We place it in *B* based on the fact that it appears
in neither the exposition of *A
B C* nor in that of
*F*.
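The first heuristic used above, in which a single stray occurrence is treated as the "error" when the same pitch sounds several times on the other side, can be written as a small decision rule. This is a sketch of that rule only. It deliberately leaves every other case open, since those cases are settled above by appeal to later sets rather than by counts. The pitch labels `X` and `Y` and their counts are hypothetical, while the two-and-two split for B1 is the one reported in the text.

```python
def assign_pitch(count_in_set, count_in_complement):
    """Guess which of a set and its complement a doubly-occurring pitch belongs to.

    Rule sketched here: a single stray occurrence opposite several
    occurrences is treated as the "error"; anything else is left open.
    """
    if count_in_complement == 1 and count_in_set > 1:
        return "set"
    if count_in_set == 1 and count_in_complement > 1:
        return "complement"
    return "unresolved"

# Hypothetical counts for illustration, except B1, whose two-and-two split
# between A and its complement is reported in the text.
observed = {"X": (5, 1), "Y": (1, 4), "B1": (2, 2)}
guesses = {pitch: assign_pitch(*counts) for pitch, counts in observed.items()}
```

Pitches returned as `"unresolved"` are the ones that require the kind of case-by-case argument given for B1.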

[4.6] Eliminating problematic pitches from *A*, *B*, and *C*
allows the generation of what I will call “best-guess” sets. These can be used
to algebraically compute the contents of pitch sets appearing later in the work.
The “best-guess” sets and sets derived from them using algebraic computation
software are shown in Figure 3 as shaded areas. These are to be compared with the
numeric entries representing the observed frequency of occurrence of each pitch
in the score. (Occurrences in the sections marked *rappel* have been
included.)
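The comparison just described amounts to computing each derived set from *A*, *B*, and *C* and then checking the pitches actually scored against it. The sketch below illustrates the idea with ordinary set operations. It is not the algebraic software used for Figure 3, the twelve-element universe and the sets in it are hypothetical toy data, and the dictionary keys `nGnC` and the like are names invented here. The two forms of *F* are taken as in Equations 1 and 2, with *G* the union of *AB* and the intersection of the complements of *A* and *B*.

```python
def derived_sets(A, B, C, R):
    """Compute sets appearing in the two expressions for F from A, B, C within R."""
    nA, nB, nC = R - A, R - B, R - C          # complements relative to R
    G = (A & B) | (nA & nB)
    nG = R - G
    F_dnf = (A & B & C) | (A & nB & nC) | (nA & B & nC) | (nA & nB & C)
    F_alt = (G & C) | (nG & nC)
    assert F_dnf == F_alt                      # the two forms must agree
    return {"G": G, "GC": G & C, "nGnC": nG & nC, "F": F_dnf}

def compare(predicted, observed_counts):
    """Split observed pitches into agreements, intrusions, and omissions."""
    heard = {p for p, c in observed_counts.items() if c > 0}
    return {
        "agree": heard & predicted,            # scored and predicted
        "not_predicted": heard - predicted,    # scored but outside the computed set
        "not_heard": predicted - heard,        # predicted but never scored
    }

# Hypothetical toy universe of 12 keys; these are not the sets of Herma.
R = set(range(12))
A, B, C = {0, 1, 2, 3, 4, 5}, {3, 4, 5, 6, 7, 8}, {0, 3, 6, 9, 10}
sets = derived_sets(A, B, C, R)
report = compare(sets["F"], {3: 4, 9: 2, 1: 1, 7: 3, 6: 1})
```

In this toy case the report flags key 6 as scored but not predicted, and keys 2, 8, and 10 as predicted but never scored, which are the two kinds of disagreement discussed in the following paragraphs.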

[4.7] The agreement between observed and computed sets is excellent, with a
few exceptions. Some expositions of sets (such as those of *B* *C* and *G*) do not exhibit all of the
expected pitches, perhaps only because they are too brief. Indeed, *G* is expected to contain 27
different pitches, but its exposition comprises only 12 notes.

[4.8] In two cases a set is presented at length in the score but shows little
agreement with the predicted set. In the case of *G* *C* fewer than half of the scored
notes fall within the predicted set. Curiously, the notes that fall outside the
computed set occur almost exclusively within the first exposition of *G C* while the notes heard in the four
occurrences of this set marked *rappel* agree almost perfectly with the
prediction. It is almost as if the first exposition is not of *G C* at all, but is of some other set
which is mislabeled in the score and in *Formalized Music*. Apparently this
is not the case, however, for the following reason. *A* *B* *C*. Thus, if the “unknown set” can
be expressed in terms of intersections, unions and/or complementations of
*A*, *B*, and *C*, then it must contain *A* *B* *C* as a subset (see the Appendix
regarding disjunctive normal form). There are *very many* other members of
*A* *B* *C*, however, which are not presented
in the exposition in question, so it would seem that the “unknown set” heard in
the first exposition marked *G C* cannot be expressed in
terms of the sets *A*, *B*, and *C*.

[4.9] The other set that agrees poorly with the predictions is *A* *C*. In this case, the contents of
the initial presentation of the set and the single presentation marked
*rappel* agree with each other, although it would also appear that the set
observed cannot be expressed in terms of *A*, *B*, and *C*.

[4.10] Where agreement between prediction and observation is generally good,
a few discrepancies usually remain. Particularly troublesome is the pitch *A*, five times in *B*, and ten times in *C*. This would imply that it is a
member of *A* *B* *C* and therefore ought not to occur
in *F* (see Figures 8 and 9 in the Appendix).
Nonetheless, it is observed to occur nine times there. One possible explanation
is that the composer deliberately introduced this pitch so that *F* would
contain at least one representative of each pitch class. Indeed, *F* if the offending *A B C* and *G C*. Alternatively,
the fact that *C* might prompt us to include it in
*C* in spite of its repeated observation in *C* which would then account for its
presence in *A B C*, *G C* and *F*.

[4.11] Whatever the case may be, we must conclude that the mapping of the
proposed set-theoretic model of *Herma* onto its score does not satisfy the
first criterion for the audible intelligibility of the model given at the
end of Section 2. First, the sets of interest are not completely represented in
the score. It has already been remarked that this deficiency is manifest in
*R*. The problem recurs, however, with respect to various pitch sets
throughout the score. In each case there exist pitches that are represented in
the score neither within the set in question nor within its complement.

[4.12] Furthermore, it cannot be said that the sets of interest are accurately represented from a mathematical perspective. The appearance of a pitch both within a given set and within its complement baldly contradicts the definition of complementarity. Inconsistencies appear to persist throughout the score based on the set-algebraic computations discussed above, in which significant disagreement between computed sets and their representations in the score is repeatedly observed.

[4.13] The second intelligibility criterion from Section 2, requiring that
listeners be able to discriminate and compare the pitch sets of interest, also
raises concerns. Psychological studies investigating the ability of subjects to
name musical tones with different pitches have found that they can reliably do
so only when the number of tones is less than 5–6 (although subjects with
absolute pitch perform better).^{(16)} *A*, *B*, and *C* each have
more than twenty members and their complements, of course, have considerably
more. Even allowing for absolute pitch to assist in the identification of
different pitches, the capacity of short-term memory for items differentiated in
a single dimension is usually taken to be 7 ± 2 items, so that a listener
should not be able to retain such large harmonically amorphous pitch sets in
short-term memory.

[4.14] It might be hoped that the clumpy macro-structure of the pitch sets could be of use in distinguishing them insofar as the emergence of voices occupying slightly different registers might be used to identify particular sets and transitions between them. In addition to the fact that set macro-structure is only one of several factors influencing stream formation (cf. Section 3), other impediments to such identifications would exist. Due to the substantial number of different sets presented, the time required for a comprehensive sampling of their members to sound, and the considerable overlap between their contents apparent in Figure 3, it would be necessary to remember the detailed contents of several streams over a significant period of time in order to perceive the “clumps” with sufficient completeness and certainty to distinguish between sets. This would again overtax the usual limits of short-term memory.

[4.15] Theoretically, it might be possible with sufficient listening to
commit each note of the piece to long-term memory, but this would represent an
extremely unusual listening situation. It must be noted that Takahashi, after
practicing the piece for months, wrote to the composer a letter containing the
following passage: “Theoretical conceptions of such great interest. Would like
to have technical explanations. Let me know about Symbolic Music.”^{(17)} If a
practiced performer with a score in hand cannot grasp the work’s construction,
is it reasonable to assume that a listener should be able to do so?

**[5] A Critical Summary Regarding Herma**

[5.1] Ultimately, the numerous inconsistencies between the score and its set-theoretic model which are discussed above remain, perhaps, a minor issue. In my opinion, they detract little from the effectiveness of the piece. This observation in itself, however, highlights certain deeper paradoxes associated with the compositional process.

[5.2] The dramatic surface of *Herma* conceals a deep-seated
contradiction between the technical approach of the composer and its musical
products. Xenakis is clearly serious about his conception of the work as a
temporal blackboard upon which is inscribed a set-theoretic argument
demonstrating the equivalence of two different expressions for the target set
*F*. Nonetheless, this “argument” is entirely impenetrable to the listener.
It is unrealistic to suppose that even musically astute listeners can apprehend
and retain in memory large, harmonically amorphous pitch sets which are
subjected to purely stochastic expositions. From a perceptual standpoint, the
music is ergodic in pitch at the highest holarchical gestalt levels. Certainly
no sense is ever produced of an argument cast in pitch proceeding towards a
conclusion. If it were otherwise, one supposes that the set-theoretic
inconsistencies discussed above would not be performed and recorded by musicians
of the highest competence.

[5.3] It seems clear that no sense of forward motion or cathartic resolution is communicated by the pitch content of the piece. Rather, the momentum of the music is entirely a product of its rhythmic and dynamic intensity and the tremendous overt technical virtuosity which must be displayed by its performer. Similarly, the richness of the musical fabric is a product not of its specific pitch content, but rather of the complex and unstable perceptual streaming produced by the “clumpy” macro-structure of the pitch sets and the use of simultaneous contrasting dynamics and temporal densities (cf. Section 3). This result is ironic insofar as dynamics, rhythm, and timbre are all compositionally subordinated to specific pitch content, purportedly serving as devices to temporally demarcate the expositions of different pitch sets. Thus a fundamental contradiction exists between the compositional approach and the perceived result.

[5.4] This would be less remarkable if Xenakis had not previously been
harshly critical of certain serialist composers on just such grounds. In his
well-known article “The Crisis of Serial Music,”^{(18)} the composer upbraided
serialists for

- the fundamental contradiction between their compositional methods (involving the manipulation of musical lines) and the results thereof (a perception of mass or surface, with no audible sense of “line”);
- subsuming all other musical parameters to that of pitch.

Both criticisms would seem to apply equally to *Herma*, insofar as the set-theoretic construction is not audibly manifested despite the marshalling of dynamics, temporal density, and timbral envelope (i.e., sustain) to aid in the discrimination of the different pitch sets.

[5.5] Within Xenakis’s compositional output as a whole, it may be that
*Herma* is best viewed as a seminal/transitional work, as suggested by its
title. Its straightforward set-theoretic construction is unique in Xenakis’s
oeuvre. Immediately subsequent compositions returned to purely stochastic
considerations (*ST/10*, *Atrees*, and *ST/48* of 1962) or group
theory (*Akrata*, *Nomos Alpha*, and *Nomos Gamma*, composed
between 1964 and 1968). The latter pieces also mark the appearance of Xenakis’s
theory of scales (or *sieves*, in his terminology; see *Formalized
Music*, 194–200), which probably represents
the most direct descendant of the set-theoretic techniques adopted in
*Herma*. This assertion is corroborated by Xenakis’s own comments:

In *Herma* I chose sub-sets from the chromatic scale; that is, I chose some of the points on the straight line. After that I put to myself the following question: How can one carry out this process on a more general level which would comprehend all the scales used in the past and all those that may come into use in the future? The sieve theory gives the answer. This answer may not be complete, but it’s certainly effective and many-sided.^{(19)}

Sieves are constructed with the aid of standard set-algebraic operations
applied to pitch sets, but these primitive sets display definite regularities
(unlike the “amorphous” *A*, *B*, and *C* sets of *Herma*).
Thus the products of these operations can be made more easily apprehensible and
discriminable to listeners.

[5.6] In my opinion, none of these considerations detract from the fine
musical qualities to be found in *Herma*, among which the following must be
counted. First, its texture seems almost amorphous and yet it maintains
tremendous forward momentum (although it supplies no reassuringly obvious goal).
The impression created is perhaps reminiscent of certain natural (volcanic or
meteorological) processes. In this light, the inscrutability of its internal
logic seems less troubling insofar as nature also operates by laws which are
often concealed, although the composer’s writings seem to suggest that this is
not the light in which he would prefer to have the piece viewed
(*Formalized Music*, 170–177).

[5.7] Furthermore, *Herma* refreshingly manages to eschew almost every
established cliché of piano writing, although it cannot be said that it is
entirely without precedent. Xenakis studied under Messiaen during the 1950s and
it is reasonable to assume that *Herma* is informed by Messiaen’s “Mode de
valeurs et d’intensités” from *Quatre études de rythme* (1949–50). The
latter piece represents the first example of “total organization” composed in
Europe, and as such is the product of a quite different compositional approach
from that employed in *Herma*. Nonetheless, as Xenakis himself has observed
regarding total serial organization (see note 18), the
perceived results can be similar to those obtained using random procedures and
the qualitative similarities between the two pieces in question are easy enough
to hear. There is no question, however, that *Herma* goes far beyond
previous models in unreservedly embracing the notion of a seething mass of sound
as a viable alternative to linear writing. Hearing a good performance such as
Takahashi’s seems like nothing so much as getting caught in a storm.

[5.8] In the end it may seem that *Herma* is successful for listeners on
grounds other than those which facilitated its construction by the composer.
What is the proper object of analysis in such a case? As in much music, there
may exist subtly or markedly different analyses germane to the composer, the
performer and listener. (Perhaps even to each listener.)^{(20)}

**[A] Appendix: Elementary Set Theory**

**Figure 6**. (a) The intersection, *AB*, of two sets, *A* and *B* (b) The union, *A+B*, of two sets *A* and *B*


[A.1] A *set* or *class* is a collection of objects, called
*members* or *elements* of the set. Let *R* denote a *universal
set* consisting of all elements of interest. A set *A* is called a
*subset* of *R* if every element in *A* is also a member of
*R*. Given such a subset *A* we may define its *complement*,
$\overline{A}$, as the (possibly empty)
set of elements which are in *R* but not in *A*. Such relationships
among sets are often illustrated by means of Venn diagrams of the sort shown in
**Figure 5**, in which the rectangular area represents the universal set, *R*,
with subsets thereof represented by areas demarcated within the rectangle. In Figure 5 a set *A* is represented by the area within a
circle, while the area outside of the circle but within *R* represents
$\overline{A}$.

**Figure 5**. A universal set, *R*, comprising a set, *A*, and its complement, $\overline{A}$


[A.2] The *intersection*, *AB*, of two sets, *A* and *B*,
is that set which consists of all elements which belong to both *A* and
*B*. Similarly, one may define the *union*, *A+B*, of two sets,
*A* and *B*, as that set which consists of all elements which belong
to either *A* or *B*. The intersection and union of two sets are
illustrated by the filled regions in **Figures 6(a)** and **6(b)**, respectively.

[A.3] Two sets which have no elements in common are said to be
*disjoint*, and their intersection is the *empty set*, denoted
$\emptyset$. Two such disjoint sets are
illustrated in **Figure 7**.


[A.4] The set operations defined above can be applied to three or more sets.
**Figure 8** illustrates how the universal set, *R*, may be
*partitioned* into eight disjoint sets equal to intersections between three
sets *A*, *B*, *C* and their complements.
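The partition into eight disjoint sets can be checked mechanically. The following is a minimal sketch, assuming a universe of 88 note numbers and three arbitrary placeholder subsets (none of which are the sets used in *Herma*); the function name `atoms` is hypothetical. Each of the eight combinations selects either a set or its complement for *A*, *B*, and *C* and intersects the three choices.

```python
from itertools import product

R = set(range(21, 109))        # assumed universe: 88 note numbers
A = set(range(21, 109, 2))     # arbitrary placeholder subsets of R
B = set(range(21, 109, 3))
C = set(range(21, 109, 5))

def atoms(universe, *sets):
    """Map each tuple of membership flags to the corresponding intersection.
    A flag of True selects the set itself, False selects its complement."""
    result = {}
    for flags in product((True, False), repeat=len(sets)):
        atom = set(universe)
        for s, keep in zip(sets, flags):
            atom &= s if keep else (universe - s)
        result[flags] = atom
    return result

parts = atoms(R, A, B, C)

# The eight pieces are pairwise disjoint and together cover R exactly.
assert sum(len(p) for p in parts.values()) == len(R)
assert set().union(*parts.values()) == R
print(len(parts))
```

Some of the eight pieces may be empty for particular choices of *A*, *B*, and *C*; the partition property holds regardless.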

[A.5] Observe that any set formed from unions, intersections, and/or
complementations of *A*, *B*, and *C* can be expressed as a union
of the disjoint sets which are illustrated. (These disjoint sets are sometimes
referred to as *atoms*.) For instance, the complex looking set, *F*,
of **Figure 9** may be expressed as

$$F = ABC + A\,\overline{B}\,\overline{C} + \overline{A}\,B\,\overline{C} + \overline{A}\,\overline{B}\,C.$$

Any set thus written is said to be expressed in *disjunctive normal
form*. In general, there are other ways in which a given set may be
expressed, and it is this fact that provides Xenakis with a conceptual starting
point in the composition of *Herma*.
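That one set can be written in more than one way is easy to confirm numerically. The sketch below assumes that the four atoms making up *F* are those with an even number of complemented factors ($ABC$, $A\overline{B}\,\overline{C}$, $\overline{A}B\overline{C}$, $\overline{A}\,\overline{B}C$), and that the alternative expression is the factored form built from $AB + \overline{A}\,\overline{B}$ and its complement (the form for which note 4 counts ten operations). The function names, the 88-element universe, and the random test sets are all assumptions made for illustration.

```python
import random

def dnf_form(A, B, C, R):
    """Union of the four atoms with an even number of complemented factors."""
    nA, nB, nC = R - A, R - B, R - C
    return (A & B & C) | (A & nB & nC) | (nA & B & nC) | (nA & nB & C)

def factored_form(A, B, C, R):
    """(AB + A'B')C + complement(AB + A'B')C', reusing the intermediate set."""
    same = (A & B) | ((R - A) & (R - B))   # elements in both A and B, or in neither
    return (same & C) | ((R - same) & (R - C))

R = set(range(21, 109))      # assumed universe of 88 note numbers
rng = random.Random(0)       # fixed seed so the trials are repeatable
pool = sorted(R)

for _ in range(100):
    # Draw three random subsets of R of random sizes.
    A, B, C = (set(rng.sample(pool, rng.randint(0, len(pool)))) for _ in range(3))
    assert dnf_form(A, B, C, R) == factored_form(A, B, C, R)

print("both forms agree on all trials")
```

Random trials do not constitute a proof, but since every element of *R* falls into exactly one of the eight atoms, agreement on sets that populate all eight atoms is strong evidence of the identity.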

Robert A. Wannamaker

Music Department

York University

4700 Keele Street

Toronto, ON M3J 1P3

CANADA

rob@audiolab.uwaterloo.ca

### Footnotes

1. Iannis Xenakis, *Formalized Music*, rev. ed. (Stuyvesant, N.Y.: Pendragon Press, 1992). Page number references to this book are given in the text henceforth.


2. Throughout this article, we concern
ourselves with pitches, as opposed to pitch classes, unless otherwise
indicated.


3. Two minor errors in PLANE 2 of the
diagram as given by Xenakis have been corrected in Figure 1.
First, the region corresponding to *ABC* in the leftmost figure on
the *ff* line has been shaded, where it was blank in the original. Second,
a missing overline has been added to the character *C* in the rightmost
figures on each line (both *ppp* and *ff*). Also, I have deduced that
the broken line between *A B C* and $(AB + \overline{A}\,\overline{B})C$ should have its arrowhead pointing at the latter rather than the former.


4. Xenakis observes that the form of Equation 2 is more computationally
efficient than the disjunctive normal form of Equation 1 in the sense that it
involves fewer total union, intersection, and complementation operations. The
difference is ten operations versus fourteen, assuming that intermediate results
are available for multiple re-uses once computed. In particular, observe that
once $AB + \overline{A}\,\overline{B}$ has been computed, it can be used
to compute *G* by means of a
single complementation, as opposed to computing *G* from scratch using *A*,
*B*, and *C*. This fact has little practical implication for the
composition, however, since not all of the different operations involved are
sequentially illustrated. PLANES 1 and 2 each contain fewer than the respective
fourteen and ten diagrams, with the *G* *C* diagram, furthermore, being
repeated. Xenakis indicates that seventeen operations, rather than
fourteen, are required to compute the right-hand side of Equation 1
(*Formalized Music*, 173). This would be the case if the complements
$\overline{A}$, $\overline{B}$, and $\overline{C}$ could not be multiply re-used in
computations. Such re-use of intermediate results is, of course, necessary to
achieve the computational efficiency associated with the alternative form
specified by Equation 2.


5. Bálint A. Varga, *Conversations with Iannis Xenakis* (London: Faber & Faber, 1996).


6. Iannis Xenakis, *Herma* (London: Boosey &
Hawkes, 1967).


7. James Tenney, *META / HODOS: A Phenomenology of 20th-century Musical Materials and An Approach to the Study of Form; and META Meta / Hodos*, 2nd edition (Oakland: Frog Peak Music, 1992).


8. Ibid., 87.


9. Ibid., 94.


10. Ibid., 113.


11. Varga, *Conversations*, 85.


12. Albert S. Bregman, *Auditory Scene Analysis: The Perceptual Organization of Sound* (Cambridge, Mass.: MIT Press, 1990).


13. One might be tempted to explain the apparent inconsistencies by invoking
the notion of “fuzzy sets,” first introduced into the scholarly literature in
1965 by engineer and mathematician Lotfi A. Zadeh as a generalization of the
classical set concept (Lotfi A. Zadeh, “Fuzzy Sets,” *Information and
Control* 8 (1965): 338–353). Is it possible that Xenakis anticipated this
concept in 1961?

A “fuzzy set” is one whose elements can possess degrees of confidence of
membership between zero (in which case the element certainly *is not* in
the set) and one (in which case the element certainly *is* in the set). In
a classical (or “crisp”) set, on the other hand, the degree of membership of any
element is either *precisely* one or *precisely* zero.

If the degree of membership of an element *x* in a given fuzzy set
*A* is, for example, 0.8, then the degree of membership of *x* in
$\overline{A}$ is 1 - 0.8 = 0.2. Thus,
one would expect to observe *x* in a randomly drawn sample set whether it
was drawn from *A* or $\overline{A}$ (albeit with greater frequency in
a sample set drawn from *A*).

Nonetheless, I maintain that a fuzzy set model is not appropriate for
*Herma*, for the following reasons. First, nowhere in his detailed
theoretical discussion of *Herma* in *Formalized Music* does the
composer introduce any concept relatable to fuzzy sets. Indeed, his description
of complementation clearly assumes classical sets: “If class *A* has been
symbolized or played to [the listener] and he is made to hear all the sounds of
*R* except those of *A*, he will deduce that the complement of
*A* with respect to *R* has been chosen” (page 171).

Furthermore, it should be noted that it is impossible in principle for a
listener or analyst to determine with certainty the precise degree of membership
of any given pitch in a fuzzy pitch set from a finite random sampling of the set
in question. That is, if it were given that fuzzy sets were employed in
*Herma*, it would not be possible to definitively characterize them either
upon hearing the work or upon inspection of the score. The composer
unequivocally advances a model of the work as a set-theoretic argument cast in
pitch. The introduction of fuzzy sets would not only represent an unmotivated
complication of the exposition, but would prevent the terms in that exposition
from being determinable with perfect confidence.


14. Iannis Xenakis, *Herma*, Yuji Takahashi, piano. Denon 33CO–1052 [1972]. Compact disc.


15. Nouritza Matossian, *Xenakis* (New York: Taplinger, 1986), 147.


16. Brian J. C. Moore, *An Introduction to the Psychology of Hearing*, 4th Ed. (San Diego: Academic Press, 1997), 246.


17. Matossian, *Xenakis*, 151.


18. Iannis Xenakis, “La crise de la musique
serielle,” *Die Gravesaner Blätter* 1 (1955): 2–4. Cf. Matossian, *Xenakis*, 85–86.


19. Varga, *Conversations*, 96.


20. I would like to thank Profs. David Lidov and James Tenney for
their encouragement and guidance during the writing of this article. As well, I would
like to thank two anonymous reviewers for their comments and MTO Editor
Eric Isaacson both for his helpful suggestions and for performing some computer
simulations which instigated those reported in Paragraph 4.1.



### Copyright Statement

#### Copyright © 2001 by the Society for Music Theory. All rights reserved.

[1] Copyrights for individual items published in *Music Theory Online* (*MTO*)
are held by their authors. Items appearing in *MTO* may be saved and stored in electronic or paper form, and may be shared among individuals for purposes of
scholarly research or discussion, but may *not* be republished in any form, electronic or print, without prior, written permission from the author(s), and advance
notification of the editors of *MTO.*

[2] Any redistributed form of items published in *MTO* must include the following information in a form appropriate to the medium in which the items are
to appear:

This item appeared in *Music Theory Online* in [VOLUME #, ISSUE #] on [DAY/MONTH/YEAR]. It was authored by [FULL NAME, EMAIL ADDRESS], with whose written permission it is reprinted here.

[3] Libraries may archive issues of *MTO* in electronic or paper form for public access so long as each issue is stored in its entirety, and no access fee
is charged. Exceptions to these requirements must be approved in writing by the editors of *MTO,* who will act in accordance with the decisions of the Society
for Music Theory.

This document and all portions thereof are protected by U.S. and international copyright laws. Material contained herein may be copied and/or distributed for research purposes only.

Prepared by Brent Yorgason and Tahirih Motazedian, Editorial Assistants