Difference between revisions of "MuseData File Structure"

From CCARH Wiki
Revision as of 19:28, 1 June 2023


Musical scores and parts lend themselves to hierarchical descriptions. The architecture of MuseData reflects a view from the “library” level—one that allows for multiple sources, editors, and publishers. Most MuseData editions rely on a single source and a single editor. Most editions are found online only. However, the provision for multiple editions is maintained. In almost all cases, the organizational details found here were prescribed and maintained by Walter Hewlett.

Score types
At the same time, musicians and audiences are accustomed to subdivisions that differentiate movements and performing parts. These subdivisions are obvious in PDF download areas of the holdings. MuseData respects several kinds of scores—full scores (default), conducting scores (may have additional cues), continuo scores (basso continuo plus a vocal or a cello part plus cues for entries drawn from ensemble parts), and choral scores (all choral numbers with basso continuo).

Parts
Parts are usually self-explanatory. During Fran’s era, performers made as many photocopies as required. Today, performers are adopting electronic means of acquiring parts at different rates. One complication may be the combination of a very short part (e.g. a third violin) with an adjacent one (e.g. a second violin). Part behavior varies slightly with the composer. Handel presents many conundrums. In many of his autograph manuscripts, wind instrumentalists not indicated at the start of a movement may be instructed to stop playing in the middle of it (e.g. by an ohne Oboe [without oboe] cue in a violin part)! Fran carefully annotated these situations. While many of her annotations are not visible in online scores, they may be encoded in pre-edition materials and cited in critical notes.

Copyright and permissions
Newly commissioned editions, as in the MuseData opera and oratorio materials, are under copyright and require acknowledgment of the editors named on the title page of the designated work. Reposted scores must acknowledge the Center for Computer Assisted Research in the Humanities at Stanford University as the original owner.

Database architecture
Walter Hewlett’s original idea was to devise not merely an encoding system or the software to facilitate it but also an elaborate digital architecture in which to house it. He was not so much concerned with encoding every work by any one composer as with building a virtual apparatus that could accommodate every problem type that might occur among the chosen repertories with which we worked.

The most ubiquitous of these problem types was a wide-ranging list of source materials: collected editions in modern typography, early printed sources, and manuscript materials. This was a first approximation. Beneath the surface lay diverse publishers, revised editions, and manuscript variants. (Computer handling delivers its own overlays of software versions, operating systems, printing systems, and so forth. Because all of our encodings are in ASCII code (the plain vanilla of digital documents), we believe that users should not encounter obstacles here.)

Subdivisions by genre
To take the Bach Gesellschaft edition as an example, the subdivisions by genre follow the publication plan: canons, cantatas, chamber pieces, chorale harmonizations, keyboard (chiefly harpsichord) pieces, orchestral works, organ works, and vocal pieces. Each additional composer is likely to bring the need for additional genre categories. Some examples are the concerto grosso (Corelli), the solo concerto (Vivaldi), opera (Handel), oratorio (also Handel), string trio (Haydn), symphony (Haydn, Beethoven), string quartet (Beethoven), and myriad incremental deviations among and between them. Inevitably, the intermingling of genre, composer, and source creates a multidimensional grid.

Subdivisions by code type
The MuseData system employs a progression of degrees of completion. Sound data (pitch and duration) is captured first (this was originally called Stage 1 data). The data is corrected and enriched (e.g. with stem directions, beam groups) in Stage 2. Further refinements (addition of lyrics, ornamentation, etc.) constitute Stage 2 Plus.
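The progression described above can be pictured as an ordered set of stages. The following is a minimal sketch, not MuseData's own software: the Stage enum, its member names, and the next_stage helper are hypothetical, chosen only to illustrate that each stage builds on the one before it.

```python
from enum import IntEnum
from typing import Optional


class Stage(IntEnum):
    """Hypothetical labels for the encoding stages named in the text."""
    STAGE_1 = 1       # sound data: pitch and duration
    STAGE_2 = 2       # corrected and enriched: stem directions, beam groups
    STAGE_2_PLUS = 3  # further refinements: lyrics, ornamentation


def next_stage(stage: Stage) -> Optional[Stage]:
    """Return the stage that follows `stage`, or None after the last one."""
    members = list(Stage)  # members come back in definition order
    index = members.index(stage)
    return members[index + 1] if index + 1 < len(members) else None


if __name__ == "__main__":
    # Walk through the stages in order, starting from the first.
    stage: Optional[Stage] = Stage.STAGE_1
    while stage is not None:
        print(stage.name)
        stage = next_stage(stage)
```

Using an integer-backed enum keeps the stages comparable, so a check such as "has this file reached at least Stage 2?" becomes a simple comparison.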

The next stages include creation of individual pages, their assembly into a draft score, and refinements to it (e.g., markup to indicate placement of accolades, placement of bar numbers and rehearsal letters, differentiation of tempo designations from movement names). At the page level the input code is replaced by a parametric format appropriate for layout. An important element of the parametric format is its capacity to identify super-objects. A super-object has a relationship to several nearby objects, and that relationship must be recorded explicitly. Common examples include beams in relation to all the notes to be attached to them. Slurs have a similar association with the notes under or over them. The advantage of identifying such groups is that the typesetting software can place them as a group while still being able to manipulate each member of the group.
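The idea of a super-object can be illustrated with a small data-structure sketch. This is not the MuseData parametric format itself: the Note and Beam classes, their fields, and the coordinate units are all hypothetical. The sketch shows only the property named above, that a group can be placed as a unit while each member remains individually adjustable.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Note:
    """A single placed object with page coordinates (hypothetical units)."""
    pitch: str
    x: float
    y: float


@dataclass
class Beam:
    """A super-object: it holds references to the notes attached to it."""
    notes: List[Note] = field(default_factory=list)

    def move(self, dx: float, dy: float) -> None:
        """Shift the whole group by moving every member note."""
        for note in self.notes:
            note.x += dx
            note.y += dy


if __name__ == "__main__":
    a, b = Note("C5", 10.0, 40.0), Note("D5", 18.0, 38.0)
    beam = Beam([a, b])
    beam.move(5.0, 0.0)  # place the group as a unit
    b.y -= 1.0           # adjust one member on its own
    print(a, b)
```

Because the beam stores references rather than copies, a change made to one note is visible through the beam, and a move applied to the beam reaches every note.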

Subdivisions by musical source
Once music is encoded, it is in principle easy to compare diverse musical texts for specific works. CCARH has experimented with encoding multiple versions several times but does not do so as a regular practice. In the case of Corelli, a large number of sources were consulted for Opp. 3 (trio sonatas) and 5 (solo sonatas). The enormous popularity of Corelli’s music lasted throughout the eighteenth century. More than a hundred editions of some of his opuses survive. Few of these progeny are mentioned on our website. These came about in response to specific cases but do not necessarily contribute to a better understanding of the works. Some discrepancies simply reflect changing styles of notation.

In the case of Handel and Vivaldi, we occasionally mention variant readings or variant instrumentation in the critical notes. Variant instrumentation may generate an alternative reading. We rarely do complete encoding of multiple sources, but we did so for three sources of Vivaldi’s twelve concertos Op. 3 (L’Estro armonico). It was Vivaldi’s first publication to gain widespread fame. Yet the differences are less striking than one might imagine, partly because alternative sources may be incomplete. This is strikingly true of Corelli’s violin sonatas Op. 5: although many violinists offered their personal realizations of the solo part for slow movements, they did not transcribe novel readings of ritornellos.

Handel: Semele (derived from Arnold edition)
Handel/chry [Chrysander]/opera {orch, torio}
Paths through the system
Distrib/kern; midi1; midip; mused; score
Editions/parts
Doc/ documentation on file origins
Frags/vocal [see Radamisto]

Hierarchy of file types
The MuseData structure allows for data distribution to multiple kinds of software through translation to other representation systems.

The codes to which MuseData files have been translated are SCORE (for enhancing works with SCORE’s extensive library of refinements and extended symbol sets), kern (the Humdrum code for representing common Western notation), and MIDI. A less complete translation to MEI also exists, but MEI datasets exist mainly for short passages or, at most, single movements of long works and cannot be fully tested at this time. Data translations are found in the MuseData distrib library. For more recent data-conversion capabilities please see the Verovio Humdrum Viewer (https://verovio.humdrum.org/), which hosts an increasing number of music codes via its drag-and-drop capability.

In directories near completion, the title of the work is followed by subdirectories (1) distrib and (2) outputs. Beyond those rubrics, (3) pages and (4) stage-2 files may be present. For scores generated by the software implementation at musedata.org, data from (3) pages is the basis for the score produced on-the-fly.
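The layout just described can be checked mechanically. The sketch below is an illustration under stated assumptions, not a CCARH tool: the function names are hypothetical, the on-disk spelling "stage2" for the stage-2 directory is an assumption, and treating the presence of both distrib and outputs as a sign that a work is near completion is only a reading of the paragraph above.

```python
import tempfile
from pathlib import Path
from typing import Dict

# Subdirectory names taken from the text; "stage2" as the on-disk
# spelling of the stage-2 directory is an assumption.
EXPECTED = ("distrib", "outputs")
OPTIONAL = ("pages", "stage2")


def survey_work(work_dir: Path) -> Dict[str, bool]:
    """Report which of the named subdirectories exist under work_dir."""
    return {name: (work_dir / name).is_dir() for name in EXPECTED + OPTIONAL}


def near_completion(work_dir: Path) -> bool:
    """True when both distrib and outputs are present (illustrative rule)."""
    found = survey_work(work_dir)
    return all(found[name] for name in EXPECTED)


if __name__ == "__main__":
    # Build a throwaway directory tree so the sketch runs anywhere.
    with tempfile.TemporaryDirectory() as tmp:
        work = Path(tmp) / "example-work"
        for name in ("distrib", "outputs", "pages"):
            (work / name).mkdir(parents=True)
        print(survey_work(work))
        print(near_completion(work))
```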

Outputs may include clusters of files intended to produce parts, scores or parts to be printed at specific font sizes (e.g. score18), and critical notes (e.g. charts of overtures, arias, recitatives, ensemble pieces, ritornelli in an opera).