Music 253


Old front page for Music 253: http://www.ccarh.org/courses/253

What is Musical Information?

Musical information (also called musical informatics) is a body of information used to specify the content of a musical work. There is no single method of representing musical content. Many digital systems of musical information have evolved since the 1950s, when the earliest efforts to generate music by computer were made. In the present day several branches of musical informatics exist. These support applications concerned mainly with sound, mainly with graphical notation, or mainly with analysis.

Musical representation generally refers to a broader body of knowledge with a longer history, spanning both digital and non-digital methods of describing the nature and content of musical material. The syllables do-re-mi (identifying the first three notes of an ascending scale) can be said to represent the beginning of a scale. Unlike graphical notation, which indicates exact pitch, this representation scheme is moveable. It pertains to the first three notes of any ascending scale, irrespective of its pitch.

Basics

Parameters of Musical Information

Pitch

Two kinds of information--pitch and duration--are pre-eminent, for without pitch there is no sound, but pitch without duration has no substance. Trained musicians develop a very refined sense of pitch. Systems for representing pitch span a wide range of levels of specificity. Simple discrimination between ascending and descending pitch movements meets the needs of many young children, while elaborate systems of microtonality exist in some cultures.

There are many graduated continua--diatonic, chromatic, and enharmonic scales--for describing pitch. Absolute measurements such as frequency can be used to describe pitch, but for the purposes of notation and analysis other nomenclature is used to relate a given pitch to its particular musical context.

Basic elements of musical notation.

Duration

Duration has contrasting features: how long a single note lasts is entirely relative to the rhythmic context in which it exists. Prior to the development of the Musical Instrument Digital Interface (MIDI) in the 1980s, the metronome was the only widely used tool to calibrate the pace of music (its tempo). MIDI provides a method of calibration that facilitates capturing very slight differences of execution in order to "record" performance in a temporally precise way. In most schemes of music representation values are far less precise. Recent psychological studies have demonstrated that while human expectations of pitch are precise, a single piece of music accommodates widely discrepant executions of rhythm. People can be conditioned to perform music in a rote manner, with little variation from one performance to another, but deviation from a regular beat is normal.
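The relativity of duration can be made concrete with a little arithmetic. The following is a minimal sketch, assuming a tempo stated as quarter-note beats per minute (as on a metronome) and note values written as fractions of a whole note. The function name is illustrative only.

```python
from fractions import Fraction

def seconds_for(note_value: Fraction, quarter_bpm: float) -> float:
    """Return the sounding length in seconds of a notated value.

    note_value is a fraction of a whole note (1/4 = quarter note).
    quarter_bpm is the metronome marking in quarter notes per minute.
    """
    quarters = note_value / Fraction(1, 4)   # number of quarter-note beats
    return float(quarters) * 60.0 / quarter_bpm

# The same notated quarter note lasts twice as long at half the tempo.
fast = seconds_for(Fraction(1, 4), 120)
slow = seconds_for(Fraction(1, 4), 60)
```

The notated value stays the same in both calls; only the tempo changes the elapsed time.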

Other Dimensions of Musical Information

Many other dimensions of musical information exist. Gestural information registers the things a performer may do to execute a work. These could include articulation marks for string instruments; finger numbers and pedal marks for piano playing; breath marks for singers and wind players; heel-toe indicators for organists, and so forth. Some attributes of music that are commonly discussed, such as accent, are implied by notation but are not actually present in the fabric of the musical work. They occur only in its execution.

Domains of Musical Information

Applications of musical information (or data) are said to exist in various domains. These domains partially overlap in the information they contain, but each contains domain-specific features that cannot be said to belong to the work as a whole. The three most important domains are those of graphics (notation), sound, and that content not dependent on either (usually called the "logical" work).

Schematic view of the three main domains of musical information.


The Graphical (Notational) Domain

The notation domain requires extensive information about placement on a page or screen. Musical notation works very differently from text for two main reasons: (1) every note simultaneously represents an arbitrary number of parameters of sound and (2) most notated scores are polyphonic.

Attributes of a single musical note. Each attribute suggests the need for a separate encoded parameter.

In contrast to scanned images of musical scores, encoded scores facilitate the analysis and re-display of these many parameters, singly or in any combination. Extracted parameters can also be recombined to create new works (a topic treated at Stanford in a separate course, Music 254/CS 275B).

The Sound Domain

The sound domain requires information about musical timbre. In the notation domain a dynamic change may be represented by a single symbol, but in the sound domain a dynamic change will affect the parameters of a series of notes. In relation to notated music, sound makes phenomenal what is in the written context merely verbal (tempo, dynamics) or symbolic (pitch, duration). In its emphasis on the distinctions between verbal prescriptions and actual (sounding) phenomena, semiotics offers a ready-made intellectual framework for understanding music encoding in relation to musical sound.

The Logical Domain

The logical domain of music representation is somewhat elusive in that while it retains parameters required to fully represent a musical work, it shuns information that is exclusively concerned with sight or sound. Pitch and duration remain fundamental to musical identity, but whether eighth notes are beamed together or have separate stems is exclusively a concern of printing. If, however, beamed groups of notes are considered (in a particular context) to play an important role in some aspect of visual cognition, they may be considered conditionally relevant. The parameters and attributes encoded in a "logical" file are therefore negotiable and, from one case to the next, may seem ambiguous or contradictory.

Systems of Musical Information

Domain differences favor domain-specific approaches: it is more practical to develop and employ code that is task- or repertory-specific than to disentangle large numbers of variables addressing multiple domains in a "complete" representation system.

Notation-oriented schemes

Notation programs favor the graphical domain because notated music is only comprehensible if the spatial position of every object is correct and if the visual relationships between objects are precise. Common Western Notation (CMN) is full of graphical conventions that aid vision and comprehension but have little to do with sound. Beamed groups of notes facilitate rapid comprehension. Barlines help users maintain a steady meter.

Sound-oriented schemes

Time is a fundamental variable in sound-oriented approaches to symbolic representation. The steady beat implied for four adjacent quarter notes rarely translates into four time-intervals of a single precise measurement. Repp has shown that deviation from a metronomic beat is the human norm. Our ears resist a completely steady beat, although synchronization with other parts is essential to performance by groups.

Other important qualities of sounding music are timbre, dynamics, and tempo. Timbre defines the quality of the sound produced. Many factors underlie differences in timbre. Dynamics refer to levels of volume. Tempo refers to the speed of the music. Relative changes in dynamics and in tempo are both important in engaging a listener's attention.

Data Acquisition (Input)

For notation

Many systems for the acquisition of musical data exist. The two most popular today are (1) capture from a musical keyboard (via MIDI) and (2) optical recognition (scanning). Both provide incomplete data for notational applications: keyboard capture misses all the features specific to notation (i.e., not shared by sound); optical recognition is an imperfect science that can miss arbitrary symbols. Commercial programs for optical recognition tend to focus on providing MIDI data rather than a full complement of notational features. The details of what may be missed, or misinterpreted by the software, vary with the program. In either of the above cases serious users must modify files by hand.

Hand-correction of automatically acquired materials can be very costly in terms of time. For some repertories complete encoding by hand may save time and produce a more professional result, but outcomes vary significantly with software, user familiarity with the program, and the quantity of detail beyond pitch and duration that is required.

Statistical measures give little guidance on the efficiency of various input methods. A claimed 90% accuracy rate means little if barlines are in the wrong place, clef changes have been ignored, text underlay is required, and so forth.

Virtual keyboard used by the Guido Notation System. Pitch can be entered from it. Duration is specified from the menu of note values above the keyboard. The resulting code, which can also be written by hand, is shown in the upper window.

Visual prompts for data entry to a virtual keyboard may help novices come to grips with the complexity of musical data. Musicians are accustomed to grasping from conventional notation several features--pitch, duration, articulation--from one reading pass because a musical note combines information about these features in one symbol. However, computers cannot reliably decode these synthetic lumps of information.

For sound

MIDI is the preferred way to capture sound data to a file that can be processed for diverse purposes. MIDI provides a convenient way to store sound data for further use and for sharing. It is frequently used to generate notation and is useful for rough sketches of a work. Its capabilities for refined notation are limited, particularly by its inarticulate representation of pitch.
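The limitation for notation can be seen in how differently spelled pitches collapse onto one MIDI key number. The following is a minimal sketch, assuming the common convention that middle C (C4) is key number 60. The helper name and the table are hypothetical and are not part of any MIDI library.

```python
# Semitone offsets of the seven natural note names above C.
NATURAL = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}

def midi_key(name: str, octave: int) -> int:
    """Map a spelled pitch such as 'C#' or 'Db' to a MIDI key number.

    Assumes C4 = 60. Each '#' raises and each 'b' lowers by one semitone.
    """
    letter, accidentals = name[0].upper(), name[1:]
    shift = accidentals.count("#") - accidentals.count("b")
    return 12 * (octave + 1) + NATURAL[letter] + shift

# Three different notated pitches share one key number, so the
# spelling cannot be recovered from the key number alone.
same = {midi_key("C#", 4), midi_key("Db", 4), midi_key("B##", 3)}
```

Because the mapping is many-to-one, software that derives notation from MIDI has to guess the spelling from context such as the key.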

For logical information

Logical information is usually encoded by hand or acquired by data translation from another format. For simple repertories and short pieces logical scores may be acquired from MIDI data, but if the intended use is analytical, verification of content is so important that manual data entry may be more efficient.

Data Output

All schemes of music representation are based on the assumption that users will wish to access and extract data. When output does not match a user's expectations it can be difficult to attribute a cause. Errors can result both from flawed or incomplete data and from inconsistencies or lapses in the processing software. However, all schemes of music representation are necessarily incomplete, because music itself is constantly in flux.

From notation software

The extraction of data sufficient to provide conventional music notation requires a considerable number of parameters. Hand correction is almost always necessary for scores of professional quality.

Lapses are greatest for repertories that do not entirely fall within the bounds of Common Western Notation (European-style music composed between c. 1650 and 1950) or which have an exceptionally large number of symbols per page (much of the music of the later nineteenth century, such as that of Verdi and Tchaikovsky). Even some popular repertories are difficult to reproduce (usually because of unconventional requirements) with complete accuracy.

From sound-generating software

The ear is a quicker judge than the eye but also a less forgiving one. Errors in sound files are usually conspicuous. Editing the files can be difficult, though this depends mainly on the graphical user interface provided by the sound software.

From logical-data files

Logical-data files are difficult to verify for accuracy and completeness unless they can be channeled either to sound or notational output of some kind. This means that the results of analytical routines run on them can be flawed by the data itself, though the algorithms used in processing must be examined as well.

Schemes of Representation

Notation Codes

Schemes for representing music for notation and display range from lists of codes for the production of musical symbols such as CMN to extensible languages in which symbols are interrelated and may be prioritized in processing. Some notation codes are essentially printing languages in their own right but may lose any sense of musical logic in their assembly.

Some users judge notation codes by the aesthetic qualities of the output. Professional musicians tend to prefer notation that is properly spaced and in which the proportional sizes of objects clarify meaning and facilitate easy reading. For this reason notation software is heavily laden with spacing algorithms and provides the user with many utilities for altering automatic placement and size.

Musical notation is not an enclosed universe. In the West it has been in development for a millennium. In many parts of the Third World cultural values discourage the use of written music other than in clandestine circles of students of one particular musical practice. Contemporary composers constantly stretch the limits of Conventional Western Notation, guaranteeing that graphical notation will continue to evolve.

A fundamental variable of approaches to notation is that of the conceptual organization of a musical work, coupled with the order in which elements are presented. Scores can be imagined as consisting of parts, measures, notes, pages, sections, movements, and so forth. Only one of these can be taken as fundamental; other elements must be related to it.

Early representation systems encoded works page by page. More recent systems lay out a score

Typesetting Codes: CMN, MusiXTeX, and Lilypond

The close orientation towards plotters and printers is evident in many notation codes, such as CMN, an open-source description language. An online manual for this Lisp-based representation, predominantly by Bill Schottstaedt, is available at Stanford.

Also online is a manual describing MusiXTeX, a music-printing system based on TeX, a typesetting language widely used in scientific publishing. MusiXTeX is one of several dialects of music printing in the TeX universe.

The evolution of typesetting from metal through photo-offset to modern digital preparation and printing necessarily changed approaches to music printing. Lilypond, a text-based engraving program, sits somewhat in the tradition of these earlier approaches.

Monophonic Codes: EsAC and Plaine and Easie Code

Two coding systems that have been unusually durable are the Essen Associative Code (EsAC) and Plaine and Easie. Each is tied to a long-term ongoing project. Both are designed for monophonic music (music for one voice only). Both have been translated into other formats, thus realizing greater potential than the originators imagined.

EsAC

The purpose of EsAC was to encode European folksongs in a compact format. EsAC has a fascinating pre-history, which is rooted in the widespread collections of folksong made in the nineteenth and early twentieth centuries, when German was spoken across a wide swath of the Continent.

A (typescript) code was developed to "notate" each song according to a single set of rules. Some metadata (titles, locales) and musical parameters (meter, key) were noted. The script was necessarily based on ASCII characters. [The underlying code was used for typescript "transcriptions" in unedited collections in Central and Eastern Europe (Austria, Croatia, Poland, Slovenia, Serbia) and elsewhere.]

Adaptation of this material to mainframe computers was begun in the 1970s by the Deutsches Volksliedarchiv (DVA) in Freiburg im Breisgau (Germany). A pioneering work of musical analysis across the different repertories encoded was completed by Wolfram Steinbeck (1982). This inspired the late Helmut Schaffrath, an active member of the International Council for Traditional Music, to adapt some of the DVA materials to a minicomputer. Several of his students at the Essen Hochschule für Musik (later renamed the Folkwang Universität der Künste) wrote software to implement analysis routines commonly applied to folksongs. The subset became the Essen Folksong Collection; the code, correspondingly, became EsAC. Schaffrath's work, which took a separate direction from that of the DVA, was carried out between 1982 and his death (1994). The work was later continued by Ewa Dahlig-Turek at the EsAC data website.

The EsAC encoding system is easily understood within the context of its original purposes: it is designed for music which is monophonic. Although all the works have lyrics, the encoding system was not designed to accommodate lyrics. (Efforts to add them have disclosed certain limitations of the system.) The range of a human voice rarely exceeds three octaves, but given that the ideal tessitura varies from one person to another, the concept of a fixed key is of negligible importance. Pitch is therefore encoded according to a relative system accommodating a three-octave span [Table 2].

The EsAC three-octave "gamut". Pitches are represented by the integers 1..7. The central octave is unsigned. Pitches in the lower octave are preceded by a minus sign (-); pitches in the upper octave by a plus sign (+).

A smallest duration, given in the header (preliminary) information, is taken as the default value. The increments of longer note values are indicated by underline characters. Sample files are shown in this handout of EsAC samples. Further details are given in Beyond MIDI, Ch. 24.
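The pitch and duration conventions described above can be illustrated with a small decoder. This is a simplified sketch and not a complete EsAC parser: it assumes that each underline character doubles the length of a note relative to the smallest duration, and it ignores rests, accidentals, dots, and barlines. The function name and the sample string are invented for illustration.

```python
import re

# Optional octave sign, a scale degree 1..7, then zero or more underlines.
TOKEN = re.compile(r"([+-]?)([1-7])(_*)")

def decode_esac(melody: str):
    """Decode a simplified EsAC-style string into (octave, degree, units).

    octave is -1, 0 or +1 relative to the central octave; degree is the
    scale degree 1..7; units counts multiples of the smallest duration.
    Assumption: each underline doubles the length of the note.
    """
    notes = []
    for sign, degree, underlines in TOKEN.findall(melody):
        octave = {"-": -1, "": 0, "+": 1}[sign]
        notes.append((octave, int(degree), 2 ** len(underlines)))
    return notes

decoded = decode_esac("-5_ 1 2 3__ +1")
```

Since pitches are scale degrees rather than absolute pitches, the same decoded list describes the melody in any key.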

Plaine and Easie

The purpose of Plaine and Easie (P&E) code was to prepare virtual catalogues of musical manuscripts. It was designed to give an exact, complete description of every indication on a musical score. Although only incipits (beginning passages) are encoded, a P&E encoding might include a tempo and dynamics; grace notes, appoggiaturas, and other ornaments; marks for staccato and legato, and so forth. It will be based on the original, unedited music including clef signs rarely seen today and word spellings that may be obsolescent.

P&E has always been associated with the international RISM music-bibliographical project, and particularly with the cataloguing of music manuscripts from the seventeenth and eighteenth centuries. At this writing (August 2012) the online RISM database of manuscripts contains c. 800,000 entries, which come from more than 60 countries. The project was begun in 1952. Much of the material available online today was initially transcribed on paper. The database structure in which the musical data exists has more than a hundred text fields and is searchable in many ways. Links to the digitized manuscripts directly from the RISM listing are uploaded monthly.

Polyphonic Codes: DARMS and SCORE

Systems for producing full music scores that originated in the Sixties and Seventies (prior to the development of desktop systems) necessarily devised representation schemes that were compact. Data was stored mainly on large decks of punched cards with no efficient way to edit data; input error rates were high. Processing time was long. What might seem today to be compromises with efficiency should not obscure the fact that restraints forced developers to be ingenious. Those with the patience to parse early systems will often glimpse valuable insights in the residues of systems such as DARMS.

DARMS

The Digital Alternate Representation of Musical Scores (DARMS) was unveiled at a summer school in 1966. Among those involved in the development of DARMS was Stefan Bauer-Mengelberg. Raymond Erickson wrote an extensive manual for the DARMS seminar (State University of New York at Binghamton) of 1976. Jef Raskin was the first person to produce printed notation from DARMS code (Pennsylvania State University, 1966).

The encoding scheme was intended to be usable by encoders who did not read musical notation. With this aim in mind, pitch was represented by a note's vertical position relative to a clef, such that a note on the lowest line of a five-line staff would always be "1". Duration was indicated by a letter code based on American usage (Q=quarter, W=whole, etc.). An extensive list of codes for articulation, dynamics, and so forth was developed.
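The idea of encoding position rather than pitch can be sketched as follows. This is an illustration and not DARMS software: it assumes that position 1 is the lowest staff line and that each increment moves one diatonic step upward (line, space, line, and so on). The clef labels and the function name are hypothetical, and accidentals are ignored.

```python
STEPS = "CDEFGAB"

# Pitch on the lowest staff line (position 1) for each clef,
# given as (letter, octave). The clef labels are illustrative.
BOTTOM_LINE = {"treble": ("E", 4), "bass": ("G", 2), "alto": ("F", 3)}

def position_to_pitch(position: int, clef: str) -> str:
    """Translate a staff position into a letter name and octave.

    Assumes position 1 is the lowest line and each increment is one
    diatonic step upward. Octave numbers change at C.
    """
    letter, octave = BOTTOM_LINE[clef]
    index = octave * 7 + STEPS.index(letter) + (position - 1)
    return f"{STEPS[index % 7]}{index // 7}"

# The same position code names different pitches under different clefs.
treble_middle = position_to_pitch(5, "treble")
bass_middle = position_to_pitch(5, "bass")
```

The encoder only needs to count lines and spaces; resolving the actual pitch is left to software that knows the clef in force.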

DARMS encoding using linear decomposition to express temporal relationships between voices.

DARMS had one feature that addressed a problem common to keyboard music and short scores: to represent a rhythmically independent second voice on a staff already containing one voice, DARMS used what was called linear decomposition. Linear decomposition (initiated by the sign "&") could have successive starting points, so that in an extended passage of this sort, it could be reinitiated at each beat, at each bar, or once throughout the passage. For processing software this variability amounted to a lack of predictability. However, the problem is a pervasive one in encoding systems, and DARMS was respected for its (hypothetical) ability to accommodate such material.

DARMS evolved into several dialects and spawned many extensions for special purposes. Among them the Note Processor (by Stephen Dydo) offered the only single-user commercially available software product based on the code. Tom Hall developed a production-oriented system based on DARMS for A-R Editions, Inc. (Madison, WI). In academic circles, Lynn Trowbridge developed DARMS extensions for mensural notation. Frans Wiering did the same for lute tablature.

SCORE

While DARMS spawned a generation of experimentation and innovation, SCORE has largely been the work of a single individual, Leland Smith, professor emeritus of music at Stanford University. A few auxiliary programs are described on the home page of San Andreas Press, the vendor of SCORE. SCORE is generally regarded as the notation program best suited to fully professional results. Noted for its esthetic qualities, SCORE features an openly documented encoding system.

SCORE was originally developed at Stanford University's artificial intelligence lab as a mainframe computer application that could produce music of arbitrary complexity. The research began in 1967 but the earliest printing was produced in 1971 via a plotter. In the 1980s SCORE was moved to DOS-based computers, and late in the 1990s it was revised for the Windows environment. At this writing (2012) the SCORE program remains in FORTRAN.

Two features of Seventies plotters account for some fundamental features of SCORE: (1) plotters related the objects they were to print to X, Y coordinates (to specify vertical and horizontal placement) and (2) plotters were well suited to producing vector graphics.

With respect to music, the first principle accounts for SCORE's parametric layout, which places every object in relation to an exact vertical/horizontal point. (All printing programs need to give an account of placement, but in most commercial software packages these details are hidden from the users or seen only in tiny glimpses on screens for refined editing.) The vector graphics used by SCORE anticipated the later development of PostScript fonts, including Sonata (1987), the earliest such independent font for music.

Data entry in SCORE is achieved in a five-pass system. The first two passes provide pitch and duration. The succeeding passes (articulation, ornamentation, and dynamics; beams; slurs) are optional; their use depends on what additional characters (if any) are present in the music to be typeset. The power of SCORE lies in (a) its extensive symbol library and free-drawing facility and (b) its conversion of input codes to its editable parametric format. Eighteen parametric fields give the user access to extremely detailed control of the final graphical content and its placement.

XML-Oriented Codes: MuseSCORE

Sound-Related Codes

MIDI

Music V and CSound

Conducting Cues in Encoding

Synchronic Approaches to Music Representation

MuseData

Humdrum (kern)

Schemes for Data Interchange

Advanced Topics in Music Representation

Data Interchange between Domains and Schemes

The interchange of musical data is significantly more complex than that of text. Musical data is doubly multi-dimensional. That is, pitch and duration parameters describe a two-dimensional space.

Additionally, most music is polyphonic: it consists of more than one voice sounding simultaneously. So taking all fundamental parameters into account, a musical score is an array of two (or more) dimensional arrays.
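The nested structure can be pictured with ordinary lists. The following is a minimal sketch, assuming each voice is a sequence of (pitch, duration) pairs and the score is a list of such voices. The musical content and the function names are invented for illustration.

```python
from fractions import Fraction as F

# Each voice is a sequence of (pitch, duration) pairs; the score is a
# list of voices. Durations are fractions of a whole note.
score = [
    [("C5", F(1, 2)), ("D5", F(1, 4)), ("E5", F(1, 4))],   # upper voice
    [("C4", F(1, 4)), ("G3", F(1, 4)), ("C4", F(1, 2))],   # lower voice
]

def onsets(voice):
    """Pair every note with its start time, accumulated from durations."""
    time, result = F(0), []
    for pitch, duration in voice:
        result.append((time, pitch))
        time += duration
    return result

def sounding_at(score, moment):
    """Return the pitch sounding in each voice at a given moment."""
    chord = []
    for voice in score:
        time = F(0)
        for pitch, duration in voice:
            if time <= moment < time + duration:
                chord.append(pitch)
            time += duration
    return chord
```

Reading along one voice gives a melody; reading across voices at one moment, as `sounding_at` does, gives the vertical sonority. An interchange format has to preserve both views.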

There is no default format for musical data (analogous, let us say, to ASCII for text in the Roman alphabet). This absence owes largely to the great range of musical styles and methods of production that exist throughout the world. No one scheme is favorable to all situations. Data interchange inevitably involves making sacrifices. In the world of text applications Unicode facilitates interchange between Roman and non-Roman character sets (Cyrillic, Arabic et al.) for alphabets that are phonetic.

MusicXML

Within the domain of musical notation MusicXML is currently the most widely used scheme for data interchange, particularly for commercial software programs such as Finale and Sibelius. A particular strength of MusicXML is its ability to convert between part-based and score-based codes, owing to its having been modeled bilaterally on MuseData, for which the raw input is part-based, and the Humdrum kern format, which is score-based.
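The difference between part-based and score-based organization is essentially a difference in nesting order. The following sketch uses plain Python containers rather than MusicXML itself, and it is not the method of any particular converter. It assumes every part has the same number of measures, and all names and data are invented for illustration.

```python
def partwise_to_timewise(parts):
    """Regroup {part: [measure, ...]} into [{part: measure}, ...].

    Assumes every part has the same number of measures.
    """
    names = list(parts)
    length = len(parts[names[0]])
    return [{name: parts[name][i] for name in names} for i in range(length)]

def timewise_to_partwise(measures):
    """Inverse regrouping: [{part: measure}, ...] to {part: [measure, ...]}."""
    parts = {}
    for measure in measures:
        for name, content in measure.items():
            parts.setdefault(name, []).append(content)
    return parts

# Invented two-measure example; measure contents are opaque strings here.
parts = {"violin": ["v-m1", "v-m2"], "cello": ["c-m1", "c-m2"]}
timewise = partwise_to_timewise(parts)
```

The regrouping loses nothing when the parts are aligned measure by measure, which is why a round trip through both functions returns the original structure.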

Audio interchange

Many schemes for audio interchange exist in the world of sound applications. Among them are these:

  • AIFF has been closely associated since 1988 with Apple computers.
  • RIFF has been closely associated since 1991 with Microsoft software.

Both schemes are based on the concept of "chunked" data. In sound files chunks segregate data records according to their purpose. Some records contain header information or metadata; some contain machine-specific code; most define the content of the work.

  • MP3 files further a practical interest in processing audio files, which can be exorbitant in size. Compression techniques speed processing but may sacrifice some of the precision in incremental usage.
  • MPEG, another compression scheme, is concerned with the synchronization of audio and video compression. A large number of extensions adapt it for other uses.
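The chunked layout shared by AIFF and RIFF can be illustrated by walking the chunks of a small RIFF file. This is a minimal sketch that reads only the top-level chunk headers. The tiny WAVE file is built in memory for the example, and the function names are illustrative. (AIFF follows the same idea but stores its lengths in big-endian order, which this sketch does not handle.)

```python
import struct

def riff_chunks(data: bytes):
    """Return the form type and a list of (chunk_id, size) pairs.

    A RIFF file begins with 'RIFF', a 4-byte little-endian length and a
    4-byte form type; chunks follow, each with an id, a length and a
    body padded to an even number of bytes.
    """
    magic, _total, form = struct.unpack_from("<4sI4s", data, 0)
    if magic != b"RIFF":
        raise ValueError("not a RIFF file")
    chunks, offset = [], 12
    while offset + 8 <= len(data):
        chunk_id, size = struct.unpack_from("<4sI", data, offset)
        chunks.append((chunk_id.decode("ascii"), size))
        offset += 8 + size + (size & 1)      # skip body and pad byte
    return form.decode("ascii"), chunks

def make_chunk(chunk_id: bytes, body: bytes) -> bytes:
    """Build one chunk (used here only to create a test file in memory)."""
    pad = b"\x00" if len(body) % 2 else b""
    return chunk_id + struct.pack("<I", len(body)) + body + pad

# A hand-built WAVE file: a 16-byte format chunk and 4 bytes of samples.
fmt = struct.pack("<HHIIHH", 1, 1, 8000, 16000, 2, 16)
body = b"WAVE" + make_chunk(b"fmt ", fmt) + make_chunk(b"data", b"\x00\x00\x01\x00")
wave = b"RIFF" + struct.pack("<I", len(body)) + body

form, chunks = riff_chunks(wave)
```

Here the `fmt ` chunk carries header information and the `data` chunk carries the content, matching the division of purposes described above.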

The Music Encoding Initiative (MEI)

The Music Encoding Initiative seeks to provide non-commercial standards for the generalized encoding of materials containing music such that the content of the original materials is preserved as literally as possible, while the preparation of new renderings is not inhibited. As a predecessor of both HTML and XML, Standard Generalized Markup Language can be viewed as the primogenitor of projects such as the Text Encoding Initiative (TEI) and MEI. MEI is an XML-based approach suited to the markup of both musical and partially musical materials (i.e., texts with interpolations of music).

Cross-domain interchange

Attempts have been made to develop cross-domain interchange schemes, but to date none have been successful. The default "cross-domain" application is MIDI, to which we devote attention below.


Applications of Musical Information

References

1. Musical symbol list (incomplete): http://en.wikipedia.org/wiki/List_of_musical_symbols

2. Repp, Bruno. "Variations on a Theme by Chopin: Relations between Perception and Production of Timing in Music," Journal of Experimental Psychology: Human Perception and Performance, 24/3 (1998), 791-811: http://www.brainmusic.org/EducationalActivitiesFolder/Repp_Chopin1998.pdf

3. Steinbeck, Wolfram. Struktur und Ähnlichkeit. Methoden automatisierter Melodienanalyse (= Kieler Schriften zur Musikwissenschaft 25). Habilitationsschrift. Kassel, 1982.