Littlefield Technologies Simulation Solution Essays On Music

Littlefield Technologies Simulation Game 2 strategy

Just went through this last semester. We ended up in first place even though we made a few minor mistakes. First a few links that helped us:

Here is what we did:

Pre-Game Activities: The team met the Tuesday before class to examine the data and discuss strategies. It was apparent that both Stations 1 and 3 were operating at full capacity, frequently hitting 100% utilization. Station 3 seemed more strained since it had higher queues (Mean=506, STD=498) than Station 1 (Mean=187, STD=175).

Since the average job lead time exceeded 2 days during days 43 through 46, inclusive, we thought it would be unprofitable to attempt to move to the $1,000 contracts. We discussed the options of altering the lot sizes, but decided that the extra setup time would only create more bottlenecks downstream.

Stage 1: As a result of our analysis, the team’s initial actions included:
1. Leave the contracts at $750.
2. Change the reorder point to 3000 (possibly risking running out of stock).
3. Change the reorder quantity to 3600 kits.
4. Purchase a second machine for Station 3 as soon as our cash balance reached $137,000 ($100K + 37K).

This strategy proved successful and after the second machine for Station 3 was purchased on Day 56 and the queue cleared, we were able to switch to the $1,000 contracts. We occasionally lost a few dollars for being a little late, but we always made more than we would have under the $750 contracts.

Stage 2: The next goal was to save enough cash to purchase a machine for Station 1 so that we could switch to the $1,250 contracts. During the cash building stage, we made the inventory order quantity as high as we could afford, which was 6,900 kits at a purchase price of $70,000. When the 6,900 kits were delivered, we switched the order quantity back to 3,600 so that we could purchase a Station 1 machine as soon as our cash balance reached $127,000 ($90K + 37K).
After 21 factory days, we were able to purchase the fourth machine for Station 1 and immediately moved to the $1,250 contracts.

The average lead time declined to under a half a day during factory days 69 through 76. There was a substantial decline in arriving orders during the same time period. The team noticed the drop in lead time and regrets not having moved to the $1,250 contracts sooner. We lost $22,750 of potential revenue for not moving on the information sooner. On the other hand, orders are random and an early move could have backfired on us.

Stage 3: During our preliminary meeting, the team discussed the possibility of purchasing a fifth machine for Station 1. We decided to wait and see if the loss of potential earnings was sufficient to justify a $90K purchase. We knew that if we were going to buy a fifth machine we should do it as soon as possible to maximize the return on investment. We calculated the loss of potential revenue as ($1,250 – actual average revenue per job) * jobs completed. Our initial estimates showed a potential revenue loss of $266 per day, but within a few factory days the rate of potential loss rose to $419 per day.
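The loss calculation described above can be sketched in a few lines. This is only an illustration: the function name and the sample inputs (an average of $1,215 per job over 12 jobs) are made up for the example and are not figures from our game.

```python
def potential_revenue_loss(max_price, avg_revenue_per_job, jobs_completed):
    """Revenue left on the table: the per-job shortfall times the number of jobs."""
    return (max_price - avg_revenue_per_job) * jobs_completed

# Hypothetical day: 12 jobs completed at an average of $1,215 on the $1,250 contract.
daily_loss = potential_revenue_loss(1250, 1215, 12)
print(f"Potential revenue lost today: ${daily_loss}")
```

Note that the subtraction happens before the multiplication, so the shortfall is per job and then scaled by the number of jobs completed.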

There is another consideration in the decision to purchase a fifth machine for Station 1. The title of the Littlefield Technologies game 2 is Customer Responsiveness. The title implies that we should be concerned about the consistency with which we deliver on our service level agreements (SLAs). The potential loss of $419 per day barely covers the $90,000 machine purchase; however, we were missing our SLAs 13 out of 15 days and the percent of potential revenues lost due to missing SLAs was 3%. We decided to purchase the fifth machine on Day 94 primarily to improve our customer responsiveness.

This strategy did not perform as well as we had hoped. While our potential revenues lost declined to 1%, we were still missing our SLAs six out of seven days.

Stage 4. During Stage 4, we explored job splitting as a solution to the SLA problem. First, we split jobs into two batches of 30 kits each. This strategy worked so well that we wondered why we hadn’t explored job splitting during Stage 2 or 3. We met our SLAs 12 out of 16 days and our percent of potential revenues lost declined to 0.4%. We calculated the setup times as a proportion of a machine to be 0.007, 0.003, and 0.002 for S1, S2, and S3, respectively.

2S1 + P = 0.194458 =>   S1 + P = 0.187256  =>   S1 = 0.007202
2S2 + P = 0.082479 =>   S2 + P = 0.079424  =>   S2 = 0.003055
2S3 + P = 0.064835 =>   S3 + P = 0.062434  =>   S3 = 0.002401
Where the right hand side is calculated as Sum(%Utilization * #Machines)/#Jobs Completed
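The subtraction behind these setup estimates can be reproduced as follows. This sketch assumes that the first column (2S + P) is the machine time per job measured under the 2-way split and the second column (S + P) is the same measure with no split; the dictionary names are ours, chosen for illustration.

```python
# Machine time per job, Sum(%Utilization * #Machines) / #Jobs Completed.
# Assumption: first set measured with 2-way splits, second with unsplit jobs.
two_setups_plus_processing = {"S1": 0.194458, "S2": 0.082479, "S3": 0.064835}
one_setup_plus_processing = {"S1": 0.187256, "S2": 0.079424, "S3": 0.062434}

# (2S + P) - (S + P) = S, the setup time per job at each station.
setup_time = {
    station: round(two_setups_plus_processing[station] - one_setup_plus_processing[station], 6)
    for station in two_setups_plus_processing
}

for station, s in setup_time.items():
    print(f"{station}: setup = {s:.6f} machine-days per job")
```

Subtracting the two measurements cancels the processing time P, which is why only the setup term is left.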

We thought that if setup time was so insignificant, maybe the other job splits would be equally good or better. Accordingly, we tried the 3-way job split for eight days, but we were not impressed with the results. On one of the days, our average revenues dropped below $1,200, which we hadn’t seen since purchasing the fifth machine for Station 1.

We thought that maybe it was because of the mismatch between machines and splits. So we tried the 5-way split thinking each job would be split equally among the five machines. This turned out to be a HUGE mistake! After only one factory day it was apparent the 5-way split was a bad thing and we switched back to the 2-way split. Even so, it took an additional four days for the system to recover from the backlogs and we lost $46,693 in potential revenues. (Moral of the story – 2-way splits are great as soon as the queue clears with the purchase of machines. Forget the other splits.)

A one-way ANOVA demonstrated that the differences between the job splits were statistically significant at the alpha=.01 level. Group 1 was no splits. Group 2 was a two-way split. Group 3 was a three-way split. Group 5 was a five-way split. Data for all groups were collected after all machine purchases, which explains the small number of observations for Group 1.

We chose to stay with the 2-way split not only because it had the highest average revenues, but also because the 2-way split had the lowest variance. With the 2-way split we were meeting our service level agreements more consistently resulting in higher customer satisfaction and higher profits per job.

Stage 5. With our factory humming, our attention turned to inventory purchases. We calculated the reorder quantity using the equation:

Q* = SQRT(2DS/H) = SQRT(2 * 12 * 365 * 1,000 / 66.31) = 363 batches
Where D = annual demand = 12 * 365
S = fixed cost per order = $1,000, and
H = the handling costs = $60 x (1 + .10/365)^365 = 66.31
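The reorder-quantity arithmetic above can be checked with a short script. It is a sketch using only the numbers stated in the formula; the variable and function names are ours.

```python
import math

def economic_order_quantity(annual_demand, order_cost, holding_cost):
    """Classic EOQ formula: Q* = sqrt(2DS/H)."""
    return math.sqrt(2 * annual_demand * order_cost / holding_cost)

D = 12 * 365                       # annual demand, as used in the formula above
S = 1000                           # fixed cost per order, in dollars
H = 60 * (1 + 0.10 / 365) ** 365   # $60 grown by 10% interest compounded daily

q_star = economic_order_quantity(D, S, H)
print(f"H = {H:.2f}, Q* = {q_star:.1f}")
```

The exponent of 365 on the interest term is what turns the $60 into roughly $66.31 over a year.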

The calculated reorder quantity was surprisingly close to the value obtained from running our numbers through the Inventory example from Chapter 7 of our text (363 vs. 382).

The text also mentioned that small variations in reorder quantity do not matter much and so people usually round to a convenient number. Thus, we set our re-order quantity to 400.

Stage 6. Previously we had been stockpiling inventory by purchasing more as soon as money was available to purchase, but we realized that we may be missing out on nontrivial interest payments. So we re-set the reorder point to 3600, which provides a four day inventory plus a safety net.

Stage 7. The Exit Strategy – We do not have control of the factory during the last 100 days of its life. We know from the instructions for the game that the demand is expected to stay consistent although orders are random. We do not feel it is wise to leave a large reorder quantity while the factory is out of our control because we might have a sudden increase in jobs during the last few days that sparks a $241,000 inventory purchase, most of which will go to waste. So before we lose control, we will buy (100 * 11.8 * 60) kits and then set the reorder quantity to 60 (or 3,600 kits). We hope this exit strategy works.
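The size of the final purchase works out as below. The sketch assumes that 11.8 is the average number of jobs arriving per day and that each job consumes 60 kits; those readings of the numbers, and the variable names, are ours.

```python
days_without_control = 100   # final stretch when the factory runs unattended
avg_jobs_per_day = 11.8      # assumption: average daily job arrivals
kits_per_job = 60            # assumption: kits consumed by one job

# round() guards against floating-point residue from the 11.8 factor.
final_purchase_kits = round(days_without_control * avg_jobs_per_day * kits_per_job)
print(f"Kits to buy before losing control: {final_purchase_kits:,}")
```

Because orders are random, this covers only the expected demand; any shortfall would be picked up by the small reorder quantity left in place.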

The exit strategy did work, although if we had purchased another 1,200 kits in Stage 7, we could have set the reorder quantity to 0 and reorder point to 0. This would have saved us another $24,000.

Evolution of Tonal Organization in Music Optimizes Neural Mechanisms in Symbolic Encoding of Perceptual Reality. Part-2: Ancient to Seventeenth Century

Aleksey Nikolsky*

Braavo! Enterprises, Los Angeles, CA, USA

Edited by: Leonid Perlovsky, Harvard University and Air Force Research Laboratory, USA

Reviewed by: Stephan Thomas Vitas, Formerly affiliated with District of Columbia Psychological Association, USA; Leon Crickmore, Department of Education and Science, UK

*Correspondence: Aleksey Nikolsky aleksey@braavo.org

This article was submitted to Cognition, a section of the journal Frontiers in Psychology

Received 2015 Dec 16; Accepted 2016 Feb 3.

Copyright © 2016 Nikolsky.

This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.


This paper reveals the way in which musical pitch works as a peculiar form of cognition that reflects upon the organization of the surrounding world as perceived by majority of music users within a socio-cultural formation. Part-1 of this paper described the origin of tonal organization from verbal speech, its progress from indefinite to definite pitch, and the emergence of two main harmonic orders: heptatonic and pentatonic, each characterized by its own method of handling tension at both domains, of tonal and social organization. Part-2, here, completes the line of historic development from Antiquity to seventeenth century. Vast archeological data is used to identify the perception of music structures that tells apart the temple/palace music of urban civilizations and the folk music of village cultures. The “mega-pitch-set” (MPS) organization is found to constitute the principal contribution of a math-based music theory to a new diatonic order. All ramifications for psychology of music are discussed in detail. “Non-octave hypermode” is identified as a peculiar homogenous type of MPS, typical for plainchant. The origin of chromaticism is thoroughly examined as an earmark of “art-music” that opposes earlier forms of folk music. The role of aesthetic emotions in formation of chromatic alteration is defined. The development of chromatic system is traced throughout history, highlighting its modern implementation in “hemiolic modes.” The connection between tonal organization in music and spatial organization in pictorial art is established in the Baroque culture, and then tracked back to prehistoric times. Both are shown to present a form of abstraction of environmental topographic schemes, and music is proposed as the primary medium for its cultivation through the concept of pitch. The comparison of stages of tonal organization and typologies of musical texture is used to define the overall course of tonal evolution. 
Tonal organization of pitch reflects the culture of thinking, adopted as a standard to optimize individual perception of reality within a social group in a way optimal for one's success, thereby setting the conventions of intellectual and emotional intelligence.

Keywords: ancient Babylonian and Greek music, diatonic/chromatic music, modulation and alteration, musical texture and pictorial perspective, musical key and pictorial perspective, environmental topography and tonal organization, pitch zone, aesthetic emotion


Part-1 of this paper presented the framework for study of tonal organization1 in any kind of music. Based on the available data from archeology, anthropology, ethnomusicology and psychoacoustics, the known forms of tonal organization were lined out in a timeline, where the cognitive constraints of perception of different musical typologies were used as criteria for deciding which form of organization came first. The pattern of acquisition of music skills during infancy was used to hypothesize the succession of stages in separation of music from speech and descent of definite pitch organization from indefinite one. The existing types of indefinite-in-pitch music were analyzed to identify khasmatonal and ekmelic modes as specialized methods of processing indefinite pitch. Mechanisms of their evolution into oligotonal definite-pitch mode were defined. The principle of triadic induction was shown to determine the growth of oligotonal into mesotonal, and mesotonal into multitonal schemes. The resulting hemitonic heptatony and anhemitonic pentatony presented two alternative methods of organizing vertical and horizontal harmony—each offering a dedicated style of handling tonal tension—reflecting a more general style of worldview, based on the parallels between tonal tension and social tension. Commitment to heptatony or pentatony as the principal means of tonal organization within a culture, then, appears to generally correspond to the preferred lifestyle in a social group. This correspondence could be the product of abstraction of individual lifestyle preferences into the tonal schemata of a musical mode, and further mediation of the multitude of such modes within a social group - until the statistically prevailing mode would establish the model of tonal organization.

Part-2 continues drawing the lineage until the rise of Western tonality, identifying yet another venue of musical representation of perceptual reality—vertical and horizontal tonal structures encoding the perceived relation of multiple objects in one's surrounding. The spatial organization of depicted images appears to share the same principles as the tonal organization of music in the same culture, probably originating in its environmental topography. Spatial-to-tonal correspondence is the strongest in Western tonality, but is noticeable at earlier stages defined in Part-2: diatonic and chromatic mega-pitch-set (MPS) systems, and non-octave hypermode.

Genesis of modal family and the role of tetrachord

What separates prehistoric and historic forms of music is the emergence of math-based music theory and notation. Notation encourages production of complex compositions in observation of theoretic rules, and restrains discrepancies in reproduction of the same tune. Oral transmission of folk music, by contrast, employs variation as the leading music-making principle. Any information technology designed to enhance the retention of symbolic information should be regarded as stimulation for the emergence of abstract thought (Couch, 1989).

Musical implementation of abstraction was the inference of modal family from a single mode. The model of it is documented in cuneiform notation of the Hurrian Hymns and related texts from Ugarit ca. 1400 BC. They reveal that Assyrian/Babylonian music was heptatonic, based on 7 modes2 named after a particular series of 5ths that were used to generate each of the modes (Kilmer and Tinney, 1996).

  • 1. Audio: An arrangement of the Hurrian Hymn No. 6, Anne Kilmer's transcription (1974). Kilmer's dyadic interpretation (Kilmer, 1974) was criticized for a number of inconsistencies with the data from the recovered music theory texts (West, 1994).

  • 2. Audio: the alternative transcription by West (1993), which also was criticized (Crocker, 1997).

  • 3. Audio: the alternative transcription by Dumbrill (1998). Despite huge differences between the transcriptions of this single surviving sample of Babylonian music, together with the retrieved music theory texts, it provides substantial information on general principles of tonal organization.

Our “Mixolydian G” formed the base of Babylonian system3. Prioritization of Mixolydian mode is known in numerous Eurasian folk music systems (Belaiev, 1963). Mesopotamian music theory must have adopted it from folk tradition and “rasterized” it mathematically, adopting the tetrachord as the formative tool in modal genesis.

Eurasian instrument-makers have traditionally conceptualized ambitus through equivalence of 4th, which according to Beliayev (1990, p. 248) manifests “the first stage of maturity” in tonal organization—supporting professionalization of folk musical culture. Modal integrity of 4th was epitomized in the Pythagorean cult of Tetraktys (“quaternary”), originating in primitive cultures of the Bronze Age (Burkert, 1972, pp. 188–191), likely in Babylon (Barbera, 1984)4.

Tetraktys was the earliest rational conceptualization of spatial and tonal organization in a single scheme (Figure 1) of an equilateral triangle filled up with symmetrical rows of 1 to 4 dots. Each of these numbers encoded a geometric concept: 1—point, 2—line, 3—surface, and 4—tetrahedron—every one of which contained the one before it (Riedweg, 2005).

Figure 1

Tetraktys (the “fourness”): the geometric representation of harmonicity of “4th.” (A) Musical aspect of tetraktys. On the left, each row is assigned a value that expresses the ratio of the string required for production...

Together, they represented world's order, where point symbolized unity, line—limit, surface—harmony, and tetrahedron—cosmos. This convention displayed amazing vitality throughout the ages, nourishing philosophy of Christian, Arabic, and Jewish traditions (McCartin, 2010). Since Ancients considered numbers “sounding”—following the paradigm of proportional shortening of the string length to attain different pitches, the idea of inclusiveness of a number was understood musically as well (Barbera, 1977): 4 and its integers expressed perfect consonant intervals (2:1 = octave, 3:2 = fifth, and 4:3 = fourth). Musical Tetraktys then served as the container of harmony: intervals that could be expressed in numbers not greater than 4 were considered “symphonia” (accord), whereas all other intervals, including 3rd and 6th, were considered “diaphonia” (discord) (Kholopov, 2006, p. 64). Tetraktys determined the assortment of pitches usable in music, which was also understood in cosmogonic terms. Plato, in Timaeus, described the derivation of music tuning as creation of the World-Soul—and the model for his calculations probably came from earlier Mesopotamian sources (Crickmore, 2009a).
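The Tetraktys rule described above lends itself to a small numerical illustration. The sketch below is ours: it converts a ratio to cents with the standard formula 1200·log2(p/q), and it tests whether the reduced ratio uses only the numbers 1 to 4. The 5:4 ratio for the major 3rd is the just-intonation value, assumed here purely as an example of a ratio that falls outside the rule.

```python
import math
from fractions import Fraction

def cents(p, q):
    """Size of the interval with ratio p:q, in cents (1200 cents per octave)."""
    return 1200 * math.log2(p / q)

def is_symphonia(p, q):
    """True if the reduced ratio is expressed in numbers not greater than 4."""
    ratio = Fraction(p, q)
    return max(ratio.numerator, ratio.denominator) <= 4

intervals = {
    "octave": (2, 1),
    "fifth": (3, 2),
    "fourth": (4, 3),
    "major third (assumed 5:4)": (5, 4),
}
for name, (p, q) in intervals.items():
    print(f"{name}: {p}:{q} = {cents(p, q):.1f} cents, symphonia={is_symphonia(p, q)}")
```

A fifth and a fourth add up to exactly one octave in cents, which mirrors the arithmetic (3:2) × (4:3) = 2:1.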

Byzantine, Arabic, Persian, Indian, Syriac, Armenian, Georgian, and Western European music theories—as well as many Eurasian folk traditions—share the tetrachord base. Sachs (1962, p. 163) notes near-omnipresence of 4th in world's music: absent only in Polynesia and Micronesia. Cultural proliferation and longevity of 4th indicate cognitive reasons for its prominence.

Ancient Greek theory may offer a hint for possible explanation. Greeks conceived intervals in terms of stepwise singing male voices: thence, 4th was a sum of 3 steps and an aggregate of the constituent intervals: 2nd and 3rd, each of which carried its own psychological features. Greeks must have been aware of the displacing tendency of a 2nd5. Therefore, perception of melodic 2nd involved competition between two tones—a kind of duality—as opposed to monadic unison. Going over the step produced a leap of a 3rd that contrasted a 2nd by leaving a trace and thereby introducing a new dimension of vertical harmony. Whenever enclosed in a melodic 3rd, two 2nds did not create friction, making the 3rd indeed the expression of “harmony”. Increasing the leap by another step produced a 4th which contained a 2nd and a 3rd—two intervals of different valence, one implying duality (disharmonious) and the other, accordance (harmonious). Such 4th encapsulated all basic tonal relations: sameness, otherness, and their synthesis—concordance.

This intervallic numerology could date back to Aurignacian culture, where the first cosmogonic concepts were forged in terms of solar/lunar, day/night, male/female dialectics, usually expressed in 2:1, 3:1, and 2:3 proportions (Frolov, 2003). Babylonian philosophy coordinated these proportions and expressed them mathematically as well as musically.

The integrating capacity of a 4th manifested itself in tetrachordal organization. This is still evident in maqam where the II and III tones inside a tetrachord can shift in their tuning values as needed, whereas the marginal tones remain permanently locked (Zannos, 1990). The inclusive power of such tetrachordal 4th is quite obvious to musicians, especially on a string instrument. And string instruments played a formative role in crystallization of the earliest epic Indo-European tradition. Four pitches that corresponded to melodic stresses of the ancient Sanskrit and Greek bridged Rig Veda to Homer - illuminating presence of 4 PCs by the look of a 4-string lyre, used to accompany epic singing (West, 1981). The idea of a tetrachord would then simply be the abstraction of a tunable lyre's string with marginal tones locked by the 3:4 proportion.

Optimization of interval-tracking could explain the preference for tetrachordal organization: most individuals cannot track more than four simultaneously moving entities (Drew and Vogel, 2008). This becomes an issue in heptatonic music-making, where 4th often stands as a “collection” of 4 tones, each of which requires attention and memory. Storing auditory images for each pitch—as incremental representation by a lookahead feature of the brain's error-detection circuitry—occurs while singing a familiar melody in one's mind (Janata, 2012). Facility of quick arithmetic estimations (1+1+1+1 etc.), necessary for vocal coordination, could make 4th into an optimal melodic size “chunk.”

A 4th also provides the best compromise between melodic and harmonic consonance (see Part-1): vertical 4th fuses well, while horizontal 4th might not segregate in very fast tempo (Huron, 2001)6. Its closest competitor in size is 3rd, but 4th has a serious taxonomic advantage of being a perfect interval: for trained musicians, narrowing or widening a 4th by about 12 cents reduces its recognizability, making the listeners hear it as another interval, whereas for major 3rd the tolerance zone is 50% wider, about 18 cents, and for minor 3rd—25 cents (Burns and Ward, 1978). Furthermore, the tuning zone for 4th, enclosed between 3rd and 5th (that are harmonically contrasting to 4th) is substantially narrower than for the range occupied by 3rds (enclosed between the dissonant 4th and 2nd). According to Moran and Pratt (1926), 4th enjoys the lowest deviation rate amongst all intervals, with 13.5 cents average. This makes 4th a better tuning reference than 3rd, which in history of acoustics has been notorious for exorbitance of tuning standards and preferences (Barbour, 2004). Listeners' resolution of interval size is the highest for 4th (43 cents)—exceeding the 5th (50 cents)—and presenting an asymmetric bias towards the 3rd: major 3rd, extended 37% toward 4th, is still heard as a 3rd, whereas 4th allows for only 18% extension toward 3rd (Shackford, 1962).

Preference for 4th might have a developmental origin—there is evidence that mother-to-infant vocalizations during the first 2 years of life tend to tonally tune to the harmonic row of the same fundamental (85% of communication), where 4th is the only interval used outside the row, as an “infrafix” below the fundamental (Van Puyvelde et al., 2010). Such frame of reference would favor unison and 4th as the smallest size perfect consonances, comfortable for vocalization—and tetrachord exactly sets the unison and 4th as the structural axis for melody-making.

For Eurasian instrumental music, expansion of ambitus usually occurs by addition of an extra tetrachord above the initial one. Mixolydian tetrachord presents a choice par excellence because of its ease of tuning on the string and on the pipe (tone+tone+semitone)7, and uniformity of its conjunct reproduction: G-A-B-C+C-D-E-F (Figure ​2A).

Figure 2

Genesis of Mixolydian polymodal family. Blue color marks stability, while yellow—instability. Brackets illustrate hierarchic grouping of degrees. Arrows indicate the tendency of tones to alternate as gravitational centers. Figures in circles rate...

The second favorite is the Phrygian tetrachord (semitone+tone+tone)8, which is simply an intervallic inversion of Mixolydian. Disjunct addition of this tetrachord forms a Phrygian mode: E-F-G-A+B-C-D-E, which was probably a later development9.
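The conjunct and disjunct constructions described above can be sketched with simple semitone arithmetic. The representation is our own simplification: pitches are counted in semitones, only natural note names are used, and a disjunct join is modeled as a whole tone between the two tetrachords.

```python
NATURALS = {0: "C", 2: "D", 4: "E", 5: "F", 7: "G", 9: "A", 11: "B"}
PITCH = {name: pc for pc, name in NATURALS.items()}

MIXOLYDIAN = [2, 2, 1]  # tone + tone + semitone
PHRYGIAN = [1, 2, 2]    # semitone + tone + tone

def tetrachord(start, steps):
    """Four pitches (in semitones) built upward from start by the given steps."""
    pitches = [start]
    for step in steps:
        pitches.append(pitches[-1] + step)
    return pitches

def join(start_name, steps, disjunct=False):
    """Two identical tetrachords: sharing a tone (conjunct) or a whole tone apart (disjunct)."""
    lower = tetrachord(PITCH[start_name], steps)
    gap = 2 if disjunct else 0
    upper = tetrachord(lower[-1] + gap, steps)
    return [[NATURALS[p % 12] for p in t] for t in (lower, upper)]

print(join("G", MIXOLYDIAN))               # conjunct Mixolydian
print(join("E", PHRYGIAN, disjunct=True))  # disjunct Phrygian
```

The conjunct join spans only a 7th (G to F), while the disjunct join reaches the octave (E to E).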

Each of these tetrachords constitutes a characteristic 4-tone intonation. Perceptually, the progression of two whole-tones, followed by a semitone, designates a melodic vector: Mixolydian tetrachord suggests ascent, whereas Phrygian—descent. Both are found amongst the world's most widespread modes (Gill and Purves, 2009). Singing them creates illusion of “resolution” toward the upper or lower 4th. A semitone is known to project directional ascending/descending melodic motion (Roederer, 2008, p. 184). Delviniotis et al. (2008) discovered that performers habitually increase the first interval length, and proportionally decrease the last in ascending scales, while inversing this treatment in descending scales—which would emphasize the vector of melodic inertia.

Melodic tetrachord highlights the gravitational relations and suggests spatial concomitants.

  • Pentatonic conjunction of trichords “rounds the corners” by avoiding sharp-sounding minor 2nds and projecting concordance;

  • Heptatonic conjunction of Phrygian or Mixolydian tetrachords amplifies ascending or descending directionality of the resultant scale, connoting insistence and purposefulness.

Chronologically, induction of Mixolydian mode must have preceded octave equivalence. The Mixolydian conjunct mode is non-octave in its design (Beliayev, 1990, p. 281), and is characterized by alternation in gravity between the base tones of both tetrachords. The IV degree here tends to change from stable to unstable state, depending on the tetrachord in which it is melodically engaged. Like pentatony, this mode lacks gravitational hierarchy, but features greater tension, since its unstable tones tend to shift closer to a stable degree in expressive tuning, usually employed by performers (Morrison and Jánina, 2002).

According to Beliayev (1990, p. 288), conjunct Mixolydian produces a mode equivalent to Ionian via tetrachord rotation. This pseudo-Ionian remains non-octave, since its disjunct tetrachords make the upper C unstable whenever mutable10 G temporarily becomes “tonic.”

In practice of earlier oligotonal and mesotonal music, singers commonly used transposition-by-interval11.

  • 4. Audio: Udasan Yryata, healing incantation. Occasionally, transposition was applied to a particular portion of a song as deliberate expressive means. Transposition of the oligotonal PS: from F-Ab-Bb to G-Bb-C (between the 2nd and 3rd strophes).

Heptatonic modes promoted transposition-by-degree12, introducing the pitch-class set (PCS) concept. The underlying idea of diatonicity originates from cultivation of string plucking instruments (Belaiev, 1963)—which were cardinal for Mesopotamian civilization (Lawergren, 1997). The visibility of strings, each easily equated to a pitch class (PC), and correspondence between the string length and interval size makes a mode obvious to players. Facility of producing few tones simultaneously prototypes the observation of vertical intervals that emerge between vocals and instrumental accompaniment.

  • 5. Audio: Maddoh, Pamir. The accompaniment on rubab (6-string lute) provides an example of vertical 2nds occasionally produced by plucking the adjacent strings.

This is exactly what Nippur music-instruction tablets specify: notation of vocal part with lyrics set against the pitches of the lyre (Colburn, 2009)—for the first time graphically exhibiting the dimension of musical texture.

Formulation of the mega-pitch-set

The next development occurred when triad induction (see Part-1) led musicians to re-conceptualize the lower tetrachord plus a tone above it as a pentachord (258), forging a concept of melodic intonation of 5th as a modal unit13, and introducing a new hierarchic layer I-III-V into a mode. Pseudo-Ionian non-octave mode then transfigured into Ionian octave mode, with the mutability I-V instead of I-IV. The new axis of I-V pioneered the “authentic” functionality, in light of which the older I-IV axis could be viewed as “plagal.” The novelty of the authentic relationship was that it typically supported a melodic development that would build a climax point and emphasize the prevalence of “tonic” at the end. Krohn et al. (2007) confirmed that the largest N1 component in the ERP corresponded to hearing the V degree of the major key14.

With pentachordal scheme in place, musicians begin reproducing a succession of the same tones from the II rather than the I degree—turning II into the new I degree—and filling up the upper end with an extra tone. Such transposition-by-degree creates a “sister” 7-tone mode, with identical pentachord hierarchy that shares the PS (C-D-E-F-G-A-B-C and D-E-F-G-A-B-C-D), uniting both modes into a single system. There is experimental evidence that listeners categorize such modes by ear despite their identical PCs (Rohrmeier and Widdess, 2012).
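Transposition-by-degree as described above amounts to rotating the same set of tones so that a different degree becomes the first. The sketch below is a simplified illustration with names of our choosing; it shows that the two sister modes share their pitch classes while their step patterns differ.

```python
PITCH = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}

def sister_mode(tones, degree):
    """Restart the same seven tones from the given degree and close on the octave."""
    i = degree - 1
    rotated = tones[i:] + tones[:i]
    return rotated + [rotated[0]]

def step_pattern(mode):
    """Semitone distance between each pair of neighbouring tones."""
    return [(PITCH[b] - PITCH[a]) % 12 for a, b in zip(mode, mode[1:])]

tones = ["C", "D", "E", "F", "G", "A", "B"]
from_first = sister_mode(tones, 1)    # C-D-E-F-G-A-B-C
from_second = sister_mode(tones, 2)   # D-E-F-G-A-B-C-D

print(from_first, step_pattern(from_first))
print(from_second, step_pattern(from_second))
```

The identical tone set with a shifted step pattern is what lets listeners tell the modes apart even though no new pitch class has been introduced.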

It is not an accident that the three closest Mixolydian transpositions (Dorian, Phrygian, and Lydian)15 top the interval set (IS) harmonicity list of the world's most popular heptatonic modes (Gill and Purves, 2009). Also sister-modes “harmonize” the music repertoire by making all songs share the same interval classes (ICs). This “pan-harmonization” separates the partially octave-equivalent multitonal mode of “village” music from the completely octave-equivalent mode of the “palace” modal system. Their difference is manifested in the presence of mega-pitch-set (MPS): a set of tones, legitimized as the building material for any musical composition by music theory.

The larger is the set, the greater is the harmonization, and therefore the greater is the stretch of gravity, causing overall reduction in tension. The earliest Sumerian harps had 11-15 strings, which by the eighteenth century increased to 29 strings (Lawergren, 1997). The ambitus of music performed on such harps greatly exceeded that of the typical folk heptatonic music, easing tension—appropriately for meditation in temple, and eulogy in palace. The “easing up effect” distinguishes MPS from earlier folk heptatonic forms.

The MPS mode loses some of the sovereignty of a stand-alone mode: it is no longer a container of characteristic intonations popular within a particular kind of music. The MPS mode has to share its degrees with other modes, evident when one mode immediately follows another mode (as in verse/chorus or song-dance)16. Perceptual “sameness” of degrees encourages the performer to strip the MPS mode of those intonations whose expressive tuning violates the tuning of a sister-mode. Eventually, all modes within a family turn out being “averaged.” This can be seen in comparing Figure 3A from Part-1 to Figure 4C here: the hierarchy of stable degrees is the same, but the hierarchy of unstable degrees flattens in the MPS. There are only two gradations here: VII vs. II-IV-VI. In the folk heptatonic mode there were 4 gradations: least unstable IV, more unstable II, yet more subordinate VI, and leading VII. This hierarchy ends up reduced by one level by the demand to preserve the pitch values for all the member degrees across all sister modes17.

Figure 3

Different types of non-octave hypermode.27 Yellow color represents unstable, while blue—stable degrees. Brackets illustrate hierarchic grouping of the degrees. Arrows mark the tendency of the tones to alternate as gravitational centers. Figures...

Figure 4

Chromatic system according to Cleonides (Aristoxenian school), c.1st century BC. This system contains 11 subsets based on rotation of fixed diatonic tetrachords. Blue color marks the permanent degrees that were associated with stability. Yellow color...

The earliest reliable sample of Ancient composition is the Epitaph of Seikilos. Sustained in a diatonic mode, it was likely composed in observance of the music theory of the day (Mathiesen, 1999, p. 150), and it is exemplary of MPS melody.

  • 6. Audio: Epitaph of Seikilos, 1st century AD. Ancient Greek Phrygian diatonic tonos (coincides with modern Dorian E). Unstable degrees are somewhat averaged and moderated in their attraction to stable degrees, as compared to the stand-alone folk heptatonic mode in the example below.

Its most obvious trait is non-formulaic structure. Diversity of Epitaph's intonations outweighs the only pattern present in the entire composition (line-endings 3-4). Abundance of directional shifts and over-degree-skips obscures anchoring.

  • 7. Audio: Thracian Air. Modern Dorian E (Hypodorian) mode. Well-marked tonicity makes resolution of the unstable degrees clear. The phrase sampled in this example is continuously repeated throughout the recording of the entire song.

Nearly all surviving Ancient Greek music, even choral music, features an improvisatory style. In this, it contrasts with the overall formulaic aptitude prevalent in European folklore (Zemtsovsky, 1987), suggesting an opposition of folk and palace/temple music in Antiquity (see Appendix 1 in Supplementary Material).

If a beauty-in-averageness effect (Winkielman et al., 2006) can make a folk tune, averaged by modifications of multiple musicians, appear attractive and “natural,” an authored tune can make an “artificial” and idiosyncratic impression. Likeability here is traded for originality. Certainly, the authored tune can also be orally disseminated. But practice of performance under supervision of a musical administrator in Ancient Mesopotamia (Michalowski, 2006) was not likely to provide enough freedom in variation for averaging effect to occur. Administrated music tends to turn into hard “rule.” And later Hellenic civilization made exact public reproduction of someone else's composition socially unprestigious.

Individual practice of following melodic rules sets in place hierarchic processing of pitch, where the “invented” contours are filled up with standardized intervallic detail, establishing the modern tonal standard of pitch pattern processing (Stewart et al., 2008). Just like performers, listeners here need to know the tonal schemata before they face a particular music work. Melody processing in such music is driven by instant automatic response to the tonal progression conceived or auditioned, relating it to long-term memory (Brattico et al., 2006) for pitch-set class (PSC) and interval-set class (ISC). Pre-attentive response indicates that modal rules are optimized and hard-wired in the MPS, as opposed to earlier modal systems:

  • In pre-MPS heptatony the standardized contours were filled with idiosyncratic intervallic detail;

  • In MPS heptatony the idiosyncratic contours are filled with standardized intervallic detail.

Emergence of the concept of “key” in music theory reflects this advance. The term “key” is often used synonymously with “tonality,” which is inaccurate. Ancient Greek music used keys that did not constitute tonality. The modern notion of key implies presence of a fixed PCS subordinated to a single tone. In practice, key was brought to life by the necessity to retune string instruments before playing in a different mode (Kholopov, 2006, p. 73). Tuning always proceeds from a certain tone to which other tones are adjusted. Hence, one pitch is singled out from a PS and the entire PS is inferred from it. This is not exactly about stability, but rather priority materialized in audiation. Such “key” is not found in folk cultures (Kvitka, 1973, p. 25).

Tuning practice encourages a single key to incorporate multiple modes in order to minimize retuning. This is where the complex interplay between “key” and “mode” begins (Solomon, 2000, p. 75). Convenience of immediate switching from one popular mode to another outweighs the importance of key's integrity, legalizing certain alterations18. These alterations become “modal,” characterizing a certain mode. Their very presence testifies to the presence of key.

Over time, key earned its own ground, different from mode: Greeks distinguished between “modulation according to the scale,” and “modulation according to the key” (Hagel, 2009, p. 5). Their keys could be transposed like our keys, and were associated with “key signatures” (West, 1992, p. 179), but they incorporated neither the notion of tonic triad nor major/minor inclinations, nor did they imply vertical harmonic functionality (p. 185).

Rigidity of key rules secured the processing speed, enabling the handling of larger stocks of data. Shulgi's introduction of “rigid music” set the foundation for the evolution of complexity in Western music, allowing music structures to convey more information about the perceptual reality as perceived by the creator of music.

An MPS mode becomes a member in the assortment of modes, whose knowledge is obligatory for a professional musician. He is supposed to choose the right mode appropriate to the occasion. Specialization of modes is promoted by ensemble performance and genre application19, and relies on professionalization20. Beliayev (1990, p. 296) underlines that professionalization of tradition necessarily involves development of multimodality. By the eighteenth century BC, there was already an internationally recognized system of accreditation of musicians—in courts and temples of Near East—with clearly defined ranks, and frequent relocation and integration of musicians through conquest and gift-exchange (Franklin, 2007). Already a 2800 B.C. relief shows two lyres playing together (Krispijn, 2010). In order for harpists to stay “in tune” with one another's strings, they had to share the same understanding of a mode/key. Middle Assyrian tablet VAT 10101 (West, 1994, p. 170) presents a census of Akkadian love-songs, classified by modes (tunings)21.

It took about 2000 years for the heptatonic MPS tradition to settle before the Greeks established the status quo for the entire region. Already for Ancient Romans there was no alternative to Greek music: there are almost no traces left of original Etruscan music (Powley, 1996), which was overwhelmed by Greek influence (Landels, 2002, p. 182). Reliance on the ultimate music-making scale became an organic part of this influence (Winnington-Ingram, 2015, p. 50). This was a direct outcome of conceptualizing the ISC, and teaching the ear to center on different tones of the same PS.

The circle of 5ths was the instrument that equalized the MPS. Ernest Clements (1935) reserved the term “quintal” to refer to what I call MPS scales, as opposed to folk heptatony. The diatonic Mediterranean MPS is cross-culturally implemented in the system of 8 modes, produced by modal transposition from each of the degrees of the principal heptatonic mode, including its intervallic reproduction an octave higher, with the tonic placed in another tetrachord. Werner (1948) investigated such octoechos systems, tracking them to Mesopotamia at the beginning of the 1st millennium BC. He concluded that division into 8 modes was a melodic concomitant of the mathematically realized harmonic octave affinity, assigning a dedicated mode to every degree within an octave, and that it originated not in musical but in cosmological and calendric numerology.

Pentatonic mode passed through a similar transformation in constructing the pentatonic system (Cook, 1995)22. The origin of the idea of pentatonic MPS must date back to the ninth century BC, when the tones of PCS obtained their standard pitch names (Kuttner, 1965). Around the fifth century music theory had in place the principles for reproduction of the “legitimate” pentatonic modes across the available tonal space. The similarity of Chinese hexagon circle of 5ths with Chaldean music theory is striking—most likely determined by their astronomic correspondences (Daniélou, 1995, p. 37). The Yang-Yin dialectics defined the anchor points on odd degrees (Yang) vs. “unstable” even degrees (Yin)23. Sixty pitches were standardized by the use of precisely manufactured bells—used as reference tones for tuning instruments (Falkenhausen, 1992). Music theory devised a nominal 12-tone system by inferring the whole-tone scale and then dividing the whole tones in halves24. Its main purpose was to absolutize the pitch values for use in all possible transpositions of the legitimate pentatonic modes (Bagley, 2005)—in effect, a mega-pentatonic PS.

The idea of a concert pitch standard is a logical consequence of cultivation of MPS: the idea of maintaining the sameness of tones across the sister modes suggests adoption of some standard of reference. Especially in ensemble performance and ecclesiastic application, a particular pitch could be assigned to a specific supernatural power, justifying its standardization. It is unlikely that the Chinese MPS was the only absolute one. Greeks tuned their lyres to the aulos, and designed their notation around fixed names of pitches (West, 1992, p. 273). Greek citharas incorporated tuned resonators which would ensure fixed pitches (Hagel, 2009, p. 69). Remarkably, the modern pitch standard (A4) closely corresponds to the reference tones that defined Ancient Greek MPS (A2-A4). It would be extremely interesting to find out if the phenomenon of perfect pitch existed in antiquity, or if it is a byproduct of modern tonality (Steblin, 1987).

Pan-harmonization of music system can be seen as means of resolution of cognitive conflict. It is not by chance that civilizations of Mesopotamia and Egypt, China, and India, all embraced cosmogonic music theory about the same time they developed script systems. Rise of literacy25 and analytical method of thinking26 were promoting awareness of complexity, contradictions, and imperfection of the state of things in the cultural environment. Analytic approach to text paved the road for rationalization of notions inherited from the traditional folk culture (Civil, 1994). Mesopotamian education system trained to grasp and put in use the meaning of texts (Michalowski, 2012).

Conflict of interests was a common motive in Sumerian and Akkadian literature, with plenty of vivid illustrations of invective and reproaching rhetoric (Foster, 1996, p. 220). In Akkadian literature, first person's speech often emphasized the state of cognitive dissonance. A very popular epistolary genre often presented complaints of unfair treatment (Vulliet, 2011). Even more conflicting were the genres of diatribe, where two persons competed in the verbal attack of each other, disputing before some deity (Hallo, 2010, p. 120). Verbal skills played a deciding role in forging and polishing counter-distinctive manner of thought, thereby amplifying awareness of cognitive dissonance. At its pinnacle was the rise of judicial rhetoric, which exposed conflicts of interest between different individuals, and rewarded better argumentation (Hallo, 2010, p. 126). Trials were held in public, and declaration of each of the parties was pivotal in influencing the court's weighing of the conflicting statements (Wilcke, 2007, p. 44).

Babylonian culture saw a marked increase in individualism (Foster, 2011), confrontation, and disorder in the period nicknamed the “Dark Age,” which hit the entire Mediterranean region around the twelfth century BC (Drews, 1995). The assumption of the state ideology that serving the king's interests serves everyone's interests failed to motivate the subjects to defend the state against external intrusions or internal plots. Cognitive dissonance should be put on the list of contributing factors in the inability of the Bronze Age palatial cultures to sustain resilience toward environmental and international stresses. The “barbaric” tribes, with a more homogenous social structure and “cognitively consonant” music, would have had an advantage over Mesopotamian civilized societies, subdivided and weakened by contradictory interests of their social groups and their musics.

Codification of the MPS system should be viewed within this context of growing cognitive dissonance. Rational harmonization of the entire compass of all available music tones was not a deliberate political move in reaction to social pressures, but an elemental biological response. Inspired by correlative cosmologies, mathematically-based theories of music harmony catered to neurobiological need of the brain to reduce informational stress by employing a new strategy of organizing data and establishing ways for synthesis of new quality out of it (Farmer et al., 2000).

Non-octave hypermode

Ancient Greek Systema Metabolon set the theoretical foundation for yet another distinct method of tonal organization—found in Medieval Western Europe, Byzantium, Russia, Armenia, Georgia, Azerbaijan, and Bulgaria. The title hypermode (Pashinian, 1973) captures its principle of stitching multiple tetrachords or trichords into a single system, spanning well over an octave. The tonal integrity is achieved by taking small elementary subsets, deficient to determine the makeup of the entire melody, and uniformly conjoining them according to the “chain principle” (Sachs, 1960): addition of a twin-subset whenever melody runs over the margin of a subset. The expanded set is treated compositionally as a single entity—especially pronounced in a polyphonic setting.

At the heart of hypermode is the fixed registral contrast between marginal tetrachords/trichords. The PCs of each subset are permanently mounted in the overall ambitus, disallowing alterations. This music makes a fairly diatonic impression between adjacent subsets, while evoking “friction” between the remote ones, expressed in “false relation” of the octave-inequivalent tones. Equivalence of 4th (or 5th) binds the mode.

  • 8. Audio: Ne oryol li s lebedem kupalisia, lyrical Cossack song, Southern Russia. B-C#-D-E-F#-G-A-B-C, false relation C#-C induces subtle increase in tension in the high register—in contrast to relaxation at low register.

The melody sustained in hypermode exhibits a peculiar “elastic” effect: as long as the phrases stay in the same registral position, they appear “casual,” but ascending induces tension, whereas descending induces relaxation. The entire melody contracts/expands like an elastic band through cycles of tension/relaxation. The greater the number of subsets, the greater the “elasticity.”

  • 9. Audio: Mussorgsky—The Great Gate of Kiev, the 2nd theme. The 12-tone hypermode: G#-A#-B-C#-D#-E-F#-G#-A-B-C#-D, with 2 false relations D#-D and A#-A within 4 trichord subsets.
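The “chain principle” and the resulting false relations can be checked mechanically. The following is a minimal sketch, not the author's method: it assumes that every subset repeats the same step pattern, that each subset begins a perfect 4th (5 semitones) above the previous one, and that a false relation is any pair of tones a diminished octave (11 semitones) apart. The function names and the sharp-only note spelling are illustrative choices.

```python
# Sketch of the chain principle under the assumptions stated above.
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def build_hypermode(start, subset_steps, n_subsets, link=5):
    """Chain identical subsets; each one starts `link` semitones above the last.

    `start` is a pitch number in semitones (C = 0), `subset_steps` holds the
    steps inside one subset, e.g. (2, 1) for a tone-semitone trichord.
    """
    pitches = []
    for k in range(n_subsets):
        pitch = start + k * link
        pitches.append(pitch)
        for step in subset_steps:
            pitch += step
            pitches.append(pitch)
    return pitches


def false_relations(pitches):
    """Return pairs of pitches lying a diminished octave (11 semitones) apart."""
    return [
        (low, high)
        for i, low in enumerate(pitches)
        for high in pitches[i + 1:]
        if high - low == 11
    ]


def names(pitches):
    """Spell pitch numbers with sharps only (an illustrative simplification)."""
    return [NOTE_NAMES[p % 12] for p in pitches]


# Example 9: four tone-semitone trichords chained from G# (pitch number 8).
mussorgsky = build_hypermode(start=8, subset_steps=(2, 1), n_subsets=4)
print("-".join(names(mussorgsky)))
print([tuple(names(pair)) for pair in false_relations(mussorgsky)])

# Example 8: three tone-semitone trichords chained from B (pitch number 11).
cossack = build_hypermode(start=11, subset_steps=(2, 1), n_subsets=3)
print("-".join(names(cossack)))
print([tuple(names(pair)) for pair in false_relations(cossack)])
```

Under these assumptions the sketch reproduces the 12-tone set of example 9 with its two false relations (A#-A and D#-D), and the 9-tone set of example 8 with its single false relation C#-C.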

In a few equintervallic diatonic subsets, elasticity is minimal: e.g., the Byzantine hexáechos28 (Figure 3A) is very close to diatonic MPS.

  • 10. Audio: The Little Entrance “Come, Let Us Worship” [Priidite, poklonimsia], 2-part Znamennyi chant, based on Byzantine hypermodal system. Minimal tonal tension from ascending motion through 3 equintervallic trichords.

Larger size subsets, such as tetrachordal and pentachordal, common in Georgian traditional music, increase functionality of PCs, inducing substantially greater instability—which is handled by more elaborate hierarchic organization.

  • 11. Audio: Kakhuri nana. Lullaby. Georgian tetrachordal hypermode (Figure ​3B) with the characteristic diminished octave G#/G (Gogotishvili, 2010). The unstable functionality prevails over the stable one.

Non-octave hypermodes presented a window for expression of strictly controlled amounts of tension (see Appendix 2 in Supplementary Material for details) that were compartmentalized in different registers. The resulting opposition to the “natural” (for speech and animal vocal communication) association of high register with submissiveness and of low register with aggression (Ohala, 2006) marks the contribution of hypermode to the establishment of specialized musical tonal semantics, in contrast to verbal tonal semantics.

Yet another historic landmark was the divergence of hypermode from the chromatic system by providing a diatonic-based alternative to chromatic expandability by alteration/modulation (see below). A noticeable affiliation of hypermodal organization with Christian plainchant, which subsequently shaped the folk music of many Eastern Orthodox nations and ethnicities, expressed rejection of the cultural heritage of the Greco-Roman philosophy of music and an attempt to restore the older Sumero-Babylonian cosmology on new theological ground (see Appendix 3).

Alteration and modulation

Unlike the hypermode, the diatonic MPS did not require degrees to sustain their pitch values throughout the music work. The need to temporarily increase tension was handled by alteration and modulation. The term “alteration” refers to raising or lowering of a degree in a PCS, involving modification of the IS. When this happens, listeners familiar with this PCS are surprised by its deviation from the norm. The impulse to restore the familiar IS is responsible for the momentary increase in tension associated with the alteration, when the listener experiences intense expectation for it to comply with the norm (Margulis, 2005).

Alteration is a form of cognitive dissonance. Formulation of Systema Metabolon (“the modulating system”) concurred with the formation of the discipline of dialectics in Ancient Greece (Losev, 2000, pp. 601–35), and with the growth of public interest in it (i.e., rhetorics, sophisms; Laertius, 1958, p. 137). As people realized the limitations of words in reference to real objects, the dialectic method of defining opposites began to make an imprint at first on the manner of conducting scholarly research and legal matters, then on the discipline of rhetoric in general, and finally on tonal organization. The primary function of music to harmonize was understood through opposition of tension and relaxation, “united by disunion” (Plato, 2012, p. 13).

Neither unfixed ekmelic degrees, nor expressively tuned multitonal degrees of pre-MPS musics involved cognitive dissonance. Rather, they constituted exaggeration of intonation in pitch—what Cazden (1971) termed “modal inflection.” The principal difference is that “chromatic” alteration implies production of two colors and cognitive conflict, whereas “inflection” implies saturating a single color and no cognitive conflict.

  • 12. Audio: Shelkovoya travushka, Nekrasov Cossacks. The IV degree here exists in three flavors (normal, sharpened, and flattened), marking the opening of each strophe with a tonal “blot” (see the frequency analysis in the Demonstration-4 in Part-I).

Modal inflections are modally normative: justified by the permanence of melodic rule.

  • 13. Audio: Alilo, Georgian ritual Christmas song. The melodic rule: in the middle voice, every time B goes to C#, it sharpens, but every time it descends to A# in the opening of every strophe—it stays natural.

Alteration does not possess such permanence and logic. By its nature, it is accidental. Alteration splits the normative degree into a few versions within the same composition, calling for further “resolution”: two versions cannot both be “right,” one ought to be “wrong,” and therefore “corrected.”29

Alteration is relatively rare in oral traditions30 and is reserved for technically advanced professional music with fully fledged music theory31.

  • 14. Audio: Maddoh, Pamir. Improvisation on a ghazal by Hafiz. The stanza starts with the altered degree C#, creating a dissonance in relation to the accompaniment—and then resolves into B, restoring the initial non-altered mode: E-F#-G#-A-A#-B-C-D#.

Alteration should not be mistaken for progression of natural degrees in folk “microtonal” modes, where seemingly “chromatic” degrees are normative (Petrovi, 1994). Such modes can contain their own micro-alterations.

  • 15. Audio: Falak-I Badakshani, Pamir. Microtonal alterations of four “natural chromatic” degrees within the ambitus of F#4-A4, providing extra tension for a genre of funeral lamentation (Levin, 2007).

See Presentation-1: Alterations/Micro-alterations.

“Modulation,” the transition from one musical mode to another without a break, differs from alteration by violating gravity rather than PCS. Modulation has been theorized exclusively within the framework of Western music. Similar devices are known in other advanced music systems (Indian, Arabic, Chinese), although without receiving much attention in their music theory. Modulation in folk music presents a novel and controversial object of study.

The most common form of gravitational shift in folk music is intra-modal mutability.

  • 16. Audio: Li Weri, a Senufo funeral, Côte d'Ivoire. Intra-modal mutability in pentatonic mode from C to Eb and finally to F.

Zemtsovsky (1998) calls this “pentatonic enharmonism”: the ability of PCs to be included in different trichords, where the same PC acts as an anchor in one trichord but remains unstable in another. Similar “enharmonism” is possible in hemitonic modes, usually involving membership of the same PC in two different tetrachords.

  • 17. Audio: Nozanin-Shod-I Uforash, call-and-response sozonda (wedding), Bukhara. Intra-modulation a step up, from Eb to F, in a heptatonic mode.

Mutability of multitonal mode (see Part-1) restricts intra-modulation to only 2-3 anchor-tones, making gravitational shifts predictable and regular.

  • 18. Audio: Ocarina solo, Bulgaria. Each sentence (provided sample) starts in A, in major inclination, but ends in F#, in minor inclination. Such A/F# alternation shapes the form of the entire composition, only by the end of it committing to a prolonged F#.

The MPS generalizes diatonic “enharmonism”: if folk mutability shifted gravity for a single tone, MPS modulation shifts the entire set—rebuilding it from any of the degrees.

Helladic music probably featured simple diatonic modulations (Franklin, 2002). Its original pitch set constituted an Olympic trichord E-F-A (West, 1992, p. 164). As time progressed, the set size grew, ultimately reaching an octave species and allowing for inter-tetrachordal enharmonism. Regardless of their size, all MPSs are treated in the same way: music users remember the normative sets, and upon detecting modulation, hypothesize a new set from what they already know (Raman and Dowling, 2012).

Entire PSs are alternated, even if, technically speaking, the PS degrees retain the same pitch values (as in C-Ionian/A-Aeolian). In reality, their pitches are not exactly retained, since each PCS imposes its own expressive tuning: certain degrees are slightly sharpened or flattened, depending on their function in the PCS (Sundberg et al., 1995). The same tone B will be intoned sharper in Ionian C, and flatter in Aeolian A (Tchesnokov, 1961, p. 58). Although this adjustment is not as drastic as a single tone mutation in a folk multitonal mode, it nevertheless does occur32. In the polymodal system, the music user remembers modes by their IS, including their characteristic expressive tuning (Brattico et al., 2006). Absence of expressive tuning is perceived as faulty performance (Sundberg, 1982). Every time music modulates from mode to mode, the melodic ISC switches, causing reassignment of expressive tuning values all at once, as in switching from one tuning table to another. This is what the phenomenon of “harmonic modulation” practically entails.

Listeners take expressive tuning as a prompt in detecting the most stable (immutable tuning) and unstable (most mutable) degrees. They estimate modulation in terms of gradations in tension determined by the intervallic value of the modulation, that is, the interval between the old and new tonics. Thus, modulations to the subdominant (C-F) are perceived as “tenser” than modulations to the dominant (C-G) (Korsakova-Kreyn and Dowling, 2012). It seems that the listener's affective response to modulation is determined by the way in which the entire PS and IS of the “arrival” mode appears to the listener in relation to the “departure” mode. Thus, modulation from minor dominant to minor tonic appears different from modulation from minor subdominant to minor tonic. Transition from one PS/IS to another is probably processed as a single percept akin to the standard progression of chords33. The emotional reaction to modulation proves to be one of the most exciting stimuli in music listening experience (Korsakova-Kreyn and Dowling, 2014). We shall see later how this emotionality is important for the emergence of the chromatic system.

Modulation usually involves alteration—their combination pioneered in Ancient Greece.

  • 19. Audio: Mesomedes—Hymn to the Muse, second century AD, brief modulation from Lydian to chromatic Hypolydian mode by the end of the hymn (Hagel, 2009, p. 287).

Hellenic listeners identified melodies by intervallic differences (Lippman, 1964, p. 160), which involved IS, IC, ICS, and ISC34. The interval-tracking habit was responsible for non-formulaic composition, as opposed to the contour-tracking habit of earlier folk musicians.

Both Babylonian and Assyrian traditions fit a single song into a single mode (Franklin, 2013, p. 218). In Classical Greek music, by contrast, a song often contained a nexus of tetrachords, each bearing its own modal organization (West, 1992, p. 226).

Professionalized folk cultures can come close to what might appear as a chromatic modulation, either by emulating MPS music or by forming a composite mode, a compound of 2 or more stand-alone modes (Belaiev, 1963).

  • 20. Audio: Duma about Marussia of Bohuslav, Ukraine. Modulation from E to B that appears to be influenced by the Western classical modulation from tonic to dominant.

  • 21. Audio: Toshto Marii Kushtymo Sem, Marian dance. Here, Pentatony that characterizes the music of Volga Finns is enriched by the composite mode C-D-Eb-E-F-G-Ab-A-C, which was most probably generated by adding together the C-D-E-G-A and incomplete C-Eb-F-G-Bb (without Bb) pentatonic modes.

When folk musicians learn a diatonic PCS, they begin to transpose it by degree. Eventually, they come to connect two tunes, each associated with its own mode, into a medley. Then, one mode becomes transposed so that it would start on the same I degree as another. As the performer gets used to this juxtaposition, he can combine intonations from both modes within the same song. Even pentatonic modes acquire quasi-chromaticism in this way. Thus, two pentatonic modes built from the same tone (i.e., C-D-E-G-A and C-Eb-F-G-Bb) produce quasi-altered III degree (C-D-Eb-E-F-G-A-Bb). The complete combination of all pentatonic modes results in a 9-tone composite mode C-D-Eb-E-F-G-Ab-A-Bb35.
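The arithmetic behind these composite modes can be verified with a short sketch. It assumes that “all pentatonic modes” means the five rotations of the anhemitonic step pattern C-D-E-G-A, each rebuilt from the same tonic C, and that a composite mode is simply the union of their pitch classes. The function names and the flat-only spelling are illustrative, not drawn from the sources cited.

```python
# Sketch: composite modes as unions of pentatonic rotations on one tonic.
NOTE_NAMES = ["C", "Db", "D", "Eb", "E", "F", "Gb", "G", "Ab", "A", "Bb", "B"]

# Steps in semitones for C-D-E-G-A, closing back on the octave C.
PENTATONIC_STEPS = [2, 2, 3, 2, 3]


def rotations(steps):
    """All cyclic rotations of a step pattern (one per degree of the mode)."""
    return [steps[i:] + steps[:i] for i in range(len(steps))]


def mode_from(tonic, steps):
    """Pitch classes of a mode built upward from `tonic` (C = 0)."""
    pitch_classes = [tonic % 12]
    for step in steps[:-1]:  # the last step only closes the octave
        pitch_classes.append((pitch_classes[-1] + step) % 12)
    return pitch_classes


def composite(modes, tonic=0):
    """Union of several modes, ordered upward from the shared tonic."""
    union = set().union(*modes)
    return sorted(union, key=lambda pc: (pc - tonic) % 12)


def spell(pitch_classes):
    return "-".join(NOTE_NAMES[pc] for pc in pitch_classes)


modes_on_c = [mode_from(0, steps) for steps in rotations(PENTATONIC_STEPS)]

# Two modes from the same tone: C-D-E-G-A and C-Eb-F-G-Bb.
two_mode = composite([modes_on_c[0], modes_on_c[4]])
print(spell(two_mode))

# All five rotations combined.
full = composite(modes_on_c)
print(spell(full), len(full))
```

On these assumptions the two-mode union yields the 8-tone set with the doubled III degree (Eb and E), and the union of all five rotations yields the 9-tone composite mode named above.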

However, “chromatic” tones in composite modes are never used in scalar fashion (Belaiev, 1963). Even when a folk musical instrument includes the entire chromatic scale, as in Chinese shen or pipa (Riemann, 1899, p. 5), it hardly ever plays chromatic successions. Tunes remain pentatonic or diatonic, while the “chromatic” tones are reserved solely for passing from one mode to another (von Hornbostel, 1975, p. 41).

Chromatic polymodal system

The more frequent the alteration, the more likely it is to cause habituation, lose its affinity with cognitive dissonance, and acquire a more “consonant” status. This is what must have happened in the Hellenic culture. According to Ancient Greek sources, the altered tone's function was to “shade” the diatonic degrees: notable was the reference to “sweetness” of chromatic alterations (Hagel, 2009, p. 154)36. Pleasantness of alteration was responsible for the quick popularization of the lute in Greece from the fourth century BC: unlike the lyre, the lute allowed chromaticism to be produced comfortably (Higgins and Winnington-Ingram, 1965). Fashion for alterations could have had a “normalization effect” on chromaticism, so that its cultivation would have “domesticated” the cognitively dissonant aspect of it (Katsanevaki, 2011).

  • 22. Audio: First Delphic Hymn to Apollo, second century BC. Essentially, this composition presents sparing use of chromatic alterations that shade the Phrygian tetrachord (West, 1992, p. 288).

  • 23. Audio: Katolophyromai fragment from Orestes by Euripides, from papyrus, 3rd century BC. Most of the melodic content of this lamentation in chromatic Lydian mode is made of altered degrees.

Chromatic alteration became affiliated with aesthetic emotion after the practice of connecting certain modes with certain affects was established through the temple culture of Sumerian and Egyptian cults, some time around the 2nd millennium BC (Farmer, 1965) (see Appendix 3 in Supplementary Material).

Earlier agricultural civilizations heavily depended on the calendar, which boosted the development of astronomy and math, but carried no mystic and esoteric implications to entitle numerology to a governing status delegated to the elite (Frolov, 1992, p. 152). Babylonian music theory was first to link the arithmetic definitions of musical tones to cosmology. Cosmology empowered music with the status of natural law, equating music's influence with the sun or the moon. Just as excess or shortage of sunlight can cause problems, so presence or absence of certain modal qualities in music was believed to be beneficial or hazardous for a person. This doctrine is known as “ethos” and existed in numerous Ancient civilizations (Kaufmann, 1976; Rowell, 1981; Deva, 1995; Katz, 1996; Thrasher, 2008).

In the 6th century BC, Sakadas of Argos started combining different ethea in a single composition by employing intra-modulations between different verses of his song. Then, Aristoxenus' Perfect System rationalized the means for the composer to generate his individual map of tonal tension suitable for a particular composition.

  • 24. Audio: Second Delphic Hymn, second century BC. The music is built on the Lydian tetrachord, alternating between Hypolydian and chromatic Lydian modes—which seems to be reserved as means of a peculiar compositional arrangement, unlike the modal stereotypicity of folk music.

Rising standard of authorship incorporated modal creativity. Greek civilization championed cultivation of melopoeia, art of composing music, put forth by Plato (Kholopov, 2006, p. 74). From the fifth century BC until the Dark Ages, authorship guided expression in the arts. Distinguished authors' names were perpetuated, encouraging other artists to either follow their steps or to compete with them. Growing popularity of chromatic style in the fifth century Athens reflected the antithesis of diatonic conventionality vs. chromatic originality. For the next half-millennium, enharmonic and chromatic genera made the diatonic genus look too predictable and unimpressive (Franklin, 2002).

Chromatic modulations were restricted to melodic junctions between the adjacent tetrachords: alteration could only follow the consonant “bounding” tones at the tetrachord's end (Hagel, 2009, p. 10). Thereby, diatonic system provided the skeleton for all modulations and alterations—very much like in a modern key. However, not all musicians followed the rules (Franklin, 2002).

Crexus, Timotheus, and Philoxenus were condemned for increasing the number of strings on the lyre, and excessive elaboration—blamed for using “polyharmonia” to appeal to the mob's ideas of plurality and liberation (LeVen, 2014, p. 81). This accusation should be understood in the context of dithyramb contests and theatrical plays becoming exceedingly popular to the extent of introduction of entrance fees for the first time in Greek history (Csapo, 2000). Theater musicians made lavish profit and enjoyed enormous popularity—this, together with the growing market (18 theatrical festivals per year, fourth century BC) unleashed fierce economic competition (Csapo, 2011). New Music was definitely based on the direct approval/disapproval of live audiences. The immediate reason for the split of public opinion, and voices for its condemnation was its break of conventional ties between mode and genre, and its inter-strophic modulation—which could be rather abrupt, even a semitone apart (Hagel, 2009, p. 44).

  • 25. Audio: Lamentation from Iphigenia Aulidensis by Euripides, third century BC. Modulation from Hyperaeolian to Hyperphrygian mode by common tone.

Chromatic music represented a new philosophy of consumerism of aesthetic emotions, in opposition to the Platonic philosophy that reserved diatonic music for the propaganda of “right” emotions (Stamou, 2002). Chromatic music grew out of the older enharmonic music that was cultivated in the Dionysiac dithyramb, and became related to the theater and the symposium (drinking party), both of which involved aesthetic appreciation. Chromaticism as a “sweetening” of intervals by tonal shading served to evoke states ranging from “pleasant” to “lugubrious” (Franklin, 2005): essentially, aesthetic emotions.

Athenian chromaticism replaced admiration of cosmogonic consonance with admiration for the realistic impersonation of humanistic character traits, interwoven into dramatic development. Aristoxenus' chromatic system was instrumental in this change: it rejected the older Pythagorean numerology as “dogmatic” and based a new music theory on psychoacoustic principles put to the service of the composer (Barker, 1978).

Another important issue was the topological reference frame: Babylonian/Pythagorean diatonic theory was entirely arithmetic, defined by prescriptive numerical proportions, whereas Aristoxenian chromatic theory was geometric, descriptive of actual distances on the monochord's string. Remarkable is the commonality of the Aristoxenian and Euclidean approaches to infinitely small magnitudes, setting a conceptual and terminological correspondence between musical and physical spaces (Barbera, 1977). Chromatic tetrachords reflected the contemporary advance in irrational numbers, presenting a breakthrough from Pythagorean ratios (Scriba, 2015, p. 44). Babylonian mathematics had a strong arithmetic-algebraic character: tables and lists of reciprocals and roots provided the “right” answer for a particular use, where “the geometrical form of the problem was usually only a way of presenting an algebraic question” (Struik, 1987, p. 28). In contrast, Greek geometry sought methods for inferring the relations between objects based on empirical proof.

Moreover, Euclid introduced a strong personalized aspect into such calculations, where angles and distances were estimated from the viewpoint of a particular spectator (and not “in general”), resulting in discrepancies between “optic” and “perspectival” evaluations (Andersen, 2008, p. 725). Unlike Babylonian geometry, Euclidean geometry was influenced by scenography (p. 728), acquiring strong spatial connotations (treating geometric lines as representations of what can actually be seen around), in contrast to the Babylonian “aprioristic” line of thinking (providing ready numbers for a particular application).

Chromatic music was a tonal system engineered to present emotional theater: to convey detailed emotional information prompted by the text and/or dramatic action. The chromatic MPS broke away from the diatonic MPS by becoming a store of modulation/alteration possibilities for the composer. To minimize the inconvenience of retuning the lyre, which remained the reference instrument for theory, musicians had to find as many common tones between different modes as possible. The seven principal modes, when built from the open string E, mark the E-A-B core of immutable tones, thereby forming the axis for categorization and hierarchical organization (Gombosi, 1951). Of E-A-B, the central A3 seemed to execute the function of the ultimate tonic (West, 1992, p. 219).

Just like ekmelic and mesotonal modes, chromatic modes were crystallized by the permanence of tuning: the least frequently retuned tones acquired the status of stability, while the most alterable tones ended up at the bottom of the tonal hierarchy. The synékheia (continuity) law postulated that all chromatic modifications be derived from the diatonic MPS for better melodic coherence (Franklin, 2005). Aristoxenus was clear on using the entire MPS as a reference for chromatic alterations (Hagel, 2009, p. 44).

The MPS structure in Figure 4 represents the chromatic/enharmonic key of A (Strunk and Treitler, 1998, p. 37), expanded over all the available sonic space: what was called the Systema Ametabolon (West, 1992, p. 223). Aristoxenus described 13 chromatic “keys” which altogether regulated the organization of the chromatic/enharmonic genera, built from each of the 12 semitones between Hypodorian F2 and Hyperphrygian F3 (Hagel, 2009, p. 48). The description of the chromatic system might sound extremely complex, but in practice the overall number of PSCs in the MPS was not exorbitant. There was little distinction between the chromatic and enharmonic genera. Greek notation did not distinguish between them at all (West, 1992, p. 255), and performance practice left the exact choice to the discretion of the performer. In reality, musicians had to deal with no more than 14 different types of tetrachords: 2 types for each of the 7 principal keys.
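The counts in the paragraph above (13 keys over 12 semitone steps, and 14 tetrachord types) can be checked with a small sketch. It assumes modern equal-tempered note names as stand-ins for the Greek pitches, which the source does not specify; `semitone_steps` and the constants are hypothetical names introduced here only for illustration.

```python
# Modern pitch-class names; mapping them onto ancient Greek pitches
# is an illustrative assumption, not a claim about actual tuning.
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def semitone_steps(start_name, start_octave, n_semitones):
    """Return labels from the start pitch up to n_semitones above it, inclusive."""
    start_index = NOTE_NAMES.index(start_name)
    labels = []
    for step in range(n_semitones + 1):
        index = start_index + step
        # The octave number increases each time the sequence passes C.
        octave = start_octave + index // 12
        labels.append(f"{NOTE_NAMES[index % 12]}{octave}")
    return labels


# Twelve semitone steps from F2 to F3 delimit thirteen pitch positions,
# one per "key" in the description above.
key_pitches = semitone_steps("F", 2, 12)
print(len(key_pitches))                  # 13
print(key_pitches[0], key_pitches[-1])   # F2 F3

# Two tetrachord types for each of the seven principal keys.
TYPES_PER_KEY = 2
PRINCIPAL_KEYS = 7
tetrachord_types = TYPES_PER_KEY * PRINCIPAL_KEYS
print(tetrachord_types)                  # 14
```

The off-by-one between 12 semitones and 13 keys arises simply because both endpoints of the octave are counted.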

The entire Systema Ametabolon clearly stresses the A/E gravity, with tonic/dominant functionality. The epicenter of chromatic mutability falls in the upper middle of the MPS (Figure 4). This is the register where melodies show the greatest modal complexity. The peculiarity of the Greek system is that all alterations are descending. The descending functionality of Ancient Greek music probably originated from the Archaic trichord E-F-A (West, 1981), with its characteristic “directing” semitone placed at the bottom. This trichord became a melodic frame, where extra tones could be placed between E and A, forming the two oldest heptatonic genera, diatonic and enharmonic, circa the seventh century BC, credited to Olympus (Barker, 2007, p. 99). The chromatic genus evolved later, as a simplification of the enharmonic genus, and gained in popularity until the start of the AD era: surviving musical fragments from the Roman period are almost wholly diatonic, and both Gaudentius and Macrobius reported that the chromatic and enharmonic genera were obsolete by the fifth century AD (West, 1992, p. 165).

Chromatic music was ousted in the West, but not in the East of the Roman Empire. The Greek chromatic MPS impacted all the territories between Greece and India, which were conquered by Alexander during the heyday of chromatic music. The “gapped” structure with a chromatic/enharmonic pyknon (a pinch of three close pitches) penetrated local folk cultures and created a special intervallic class, which Kholopov (1988, p. 38) named “hemiolic” (“hemiolia” being the 1½:1 ratio). A hemiolic mode differs from a diatonic one by its chromaticism: the recoloration (chroma) of ICs due to their inequality, most prominent in microtonal varieties of hemiolic modes, e.g., maqam Hijaz-Kar-Kurdi C-Db-E¾b-F-G-A¾b-B¾b-C (Racy, 2004, p. 108, see Appendix 4 in Supplementary Material).