Difference between revisions of "Humdrum Extras"
Revision as of 20:39, 7 December 2012
Humdrum Extras is a set of command-line programs and a C++ parser library for processing Humdrum files. The programs can be compiled for Linux, Apple OS X, or Windows (primarily within cygwin, but also in Visual C++). The Humdrum Extras library can be used to parse Humdrum files independently of the example programs provided with the package.
Example Programs
The primary intent of the Humdrum Extras package is user-based processing of Humdrum files as an auxiliary to the Humdrum Toolkit. Since the programs are compiled from C++, they process data much faster than programs written in interpreted languages such as AWK, which is the main development language for the Humdrum Toolkit.
Documentation for example programs can be found on the web at extras.humdrum.org/man. The source code for these programs is found in the download file, within the src-programs directory, or they can be viewed online.
Programming Examples
Basic data access
humecho.cpp
Here is a very simple C++ program called humecho.cpp that uses the Humdrum file parser in the Humdrum Extras library:
#include "humdrum.h"
#include <iostream>
int main(int argc, char** argv) {
   HumdrumFile hfile;
   if (argc > 1) hfile.read(argv[1]);
   else hfile.read(std::cin);
   std::cout << hfile;
   return 0;
}
This program takes one Humdrum file as an argument (or reads standard input) and echoes the contents of the Humdrum file to standard output. To compile this program using the Humdrum Extras makefiles, place humecho.cpp in the directory humextra/src-programs, and then type "make humecho". The humecho program can read its input in several ways, including downloading from the web or using humdrum:// URIs (or the hum:// and h:// abbreviations):
cat file.krn | bin/humecho | less              # standard input
bin/humecho file.krn | less                    # command-line argument
bin/humecho h://wtc/wtc1f01.krn | less         # humdrum:// URI
bin/humecho http://y.z.com/file.krn | less     # URL
humecho2.cpp (Accessing individual lines)
The humecho program shows how to access the datafile in its entirety. The following source code for humecho2.cpp demonstrates how to access lines in the file individually. A HumdrumFile class essentially consists of an array of HumdrumRecord classes, and HumdrumRecord classes essentially are character strings which print tab-delimited with cout:
#include "humdrum.h"
int main(int argc, char** argv) {
   HumdrumFile hfile;
   if (argc > 1) hfile.read(argv[1]);
   else hfile.read(std::cin);
   for (int i=0; i<hfile.getNumLines(); i++) {
      std::cout << hfile[i] << std::endl;
   }
   return 0;
}
hfile.getNumLines() returns the number of text lines in the Humdrum file stored in the hfile variable. So the for loop iterates through each line in the file and prints it to standard output.
humecho3.cpp (Accessing spine data)
An even more verbose version of humecho is given below. The humecho3 program implements the << operator as a second for-loop. Each HumdrumRecord representing a line of music can be thought of as an array of strings, with each string being one token in the Humdrum File structure.
#include "humdrum.h"
int main(int argc, char** argv) {
   HumdrumFile hfile;
   if (argc > 1) hfile.read(argv[1]);
   else hfile.read(std::cin);
   for (int i=0; i<hfile.getNumLines(); i++) {
      // every record has at least one field, so index 0 is always valid
      std::cout << hfile[i][0];
      // the remaining fields are each preceded by a tab
      for (int j=1; j<hfile[i].getFieldCount(); j++) {
         std::cout << "\t" << hfile[i][j];
      }
      // end the line once, after all fields have been printed
      std::cout << std::endl;
   }
   return 0;
}
HumdrumRecords always contain at least one field, so the code "cout << hfile[i][0];" will not cause an invalid array access in any situation. Both [] operators used on the hfile variable (first to access a HumdrumRecord, and the second for a const char*) are checked for a valid range, and the program will exit with an error if an out-of-range value is requested.
The code hfile[i].getFieldCount() returns the number of "fields" on the line. This is a non-standard term for Humdrum files, used because "spines" and "tokens" can have somewhat ambiguous meanings. The field count is a count of the spines, but if the spines split, the count includes the subspines as well. Global comments and reference records are always element 0 in a HumdrumRecord line. Empty lines, which are technically not allowed in Humdrum files, are also accessed as an empty string at element 0.
Note that hfile[i][j] is a const char* and not a char*. If you want to change the contents of a field, you would have to use hfile[i].changeField(j, "new string").
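The field layout described above can be sketched without the library. The following is only a rough standalone illustration, not the Humdrum Extras implementation: the function name splitFields is hypothetical, and the rule of treating any line that starts with "!!" as a single field is a simplifying assumption based on the description of global comments and reference records.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of Humdrum Extras): split one line of a
// Humdrum file into fields, following the description in the text.
std::vector<std::string> splitFields(const std::string& line) {
    // Assumption: global comments and reference records (lines starting
    // with "!!") are kept whole as element 0, even if they contain tabs.
    if (line.compare(0, 2, "!!") == 0) {
        return std::vector<std::string>(1, line);
    }
    std::vector<std::string> fields;
    std::string::size_type start = 0;
    while (true) {
        std::string::size_type tab = line.find('\t', start);
        if (tab == std::string::npos) {
            // last (or only) field; an empty line yields one empty string
            fields.push_back(line.substr(start));
            break;
        }
        fields.push_back(line.substr(start, tab - start));
        start = tab + 1;
    }
    return fields;
}
```

With this sketch, a data line with three tab-separated tokens gives three fields, while an empty line gives a single empty field at element 0.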
HumdrumRecord line types
Each HumdrumRecord is a certain enumerated type.
E_humrec_empty | empty line (technically invalid, but allowed in Humdrum Extras parsing) |
E_humrec_bibliography | of the form “!!!key: value” |
E_humrec_global_comment | starts with “!!” |
E_humrec_local_comment | local comment (!) |
E_humrec_data_measure | line starting with “=” |
E_humrec_interpretation | line starting with “*” |
E_humrec_data | data lines other than measure |
Use the HumdrumRecord::getType() function to access the type of a line. But for better code readability, the following helper HumdrumRecord functions interface with these enumerations:
.isData() | true if data (other than barline). |
.isMeasure() | true if barline (line starts with “=”). |
.isInterpretation() | true if line starts with “*”. |
.isBibliographic() | true if in the form of “!!!key: value”. |
.isGlobalComment() | true if line starts with “!!” and not bib. |
.isLocalComment() | true if line starts with one “!”. |
.isEmpty() | true if nothing on line. |
In addition, there are a few composite tests for line types:
.isComment() | isBibliographic() or isGlobalComment() or isLocalComment() |
.isTandem() | Interpretation lines which contain no spine manipulators (*+, *-, *^, *v, *x) or exclusive interpretations (starting with **). |
.isNull() | isData() and all fields are "." (null token). |
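The prefix rules in the tables above can be illustrated with a small standalone classifier. This is only a sketch under stated assumptions: the enum LineKind and the function classifyLine are hypothetical names, the code does not use or reproduce the Humdrum Extras parser, and it looks only at the start of the line (and at a colon for the "!!!key: value" form).

```cpp
#include <string>

// Hypothetical enumeration mirroring the line types listed in the text.
enum LineKind {
    kEmpty, kBibliographic, kGlobalComment, kLocalComment,
    kMeasure, kInterpretation, kData
};

// Hypothetical helper: classify a line by its leading characters.
LineKind classifyLine(const std::string& line) {
    if (line.empty()) return kEmpty;
    // "!!!key: value": three exclamation marks and a colon somewhere after
    if (line.compare(0, 3, "!!!") == 0 &&
        line.find(':', 3) != std::string::npos) {
        return kBibliographic;
    }
    // starts with "!!" and is not a bibliographic record
    if (line.compare(0, 2, "!!") == 0) return kGlobalComment;
    if (line[0] == '!') return kLocalComment;
    if (line[0] == '=') return kMeasure;
    if (line[0] == '*') return kInterpretation;
    // anything else is treated as ordinary data
    return kData;
}
```

The order of the tests matters: the more specific prefixes ("!!!" and "!!") have to be checked before the single "!" of a local comment.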
"rid -GLI" (Remove all lines except for data lines)
The Humdrum Toolkit program rid with the -GLI options can be implemented using the following C++ code:
#include "humdrum.h"
int main(int argc, char** argv) {
   HumdrumFile hfile(argv[1]);
   for (int i=0; i<hfile.getNumLines(); i++) {
      if (!(hfile[i].isData() || hfile[i].isMeasure())) continue;
      std::cout << hfile[i] << std::endl;
   }
   return 0;
}
The above code will only print lines which are data or barlines. The official Humdrum file specification does not technically distinguish between barlines and data, but in practice and from a logical point of view they must be separated. So when using the Humdrum Extras C++ parser for Humdrum files, a line of data should not contain a mixture of data (or null tokens) and barlines.
"rid -GLId" (Remove comments, interpretations and null data)
#include "humdrum.h"
int main(int argc, char** argv) {
   HumdrumFile hfile(argv[1]);
   for (int i=0; i<hfile.getNumLines(); i++) {
      if (!(hfile[i].isData() || hfile[i].isMeasure())) continue;
      if (hfile[i].isNull()) continue;
      std::cout << hfile[i] << std::endl;
   }
   return 0;
}
The HumdrumRecord::isNull() function returns true if all fields in the record are equal to the string "." (called a null record in Humdrum terminology; this is unrelated to a NULL pointer in C).
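The null-record condition can be sketched on a plain list of field strings. This is a standalone illustration only: the function name allFieldsNull is hypothetical and the code is not taken from the library, which also requires the line to be a data line.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: true if every field is the null token ".".
// An empty list of fields is not treated as a null record.
bool allFieldsNull(const std::vector<std::string>& fields) {
    if (fields.empty()) return false;
    for (std::size_t i = 0; i < fields.size(); i++) {
        if (fields[i] != ".") return false;
    }
    return true;
}
```

A line such as ".", ".", "." passes the test, while a line with even one non-null token, such as ".", "4c", does not.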
User-specified Options
"myrid -M -C -I" (Handling command-line options)
The Humdrum Extras code contains a helper class called Options which can be used to manage command-line options. The following example program implements the options -M (suppress measure lines), -C (suppress comments), -I (suppress interpretations) in a C++ implementation of the Humdrum Toolkit rid program.
The Options class can be used to define multiple aliases for the same option, such as a short abbreviation and a long form. The options are formulated on the command line according to POSIX rules for options: single-letter options are preceded by a single dash, and multiple-letter options are preceded by two dashes. Single-letter options that do not require their own argument can be bundled together into a list of options preceded by a single dash. Here are various program usages for the code below:
myrid -M file.krn | Remove measure lines when echoing file.krn to standard output. |
myrid -M -I -C file.krn | Remove measure lines, interpretations and comments (global, local and reference). |
myrid -MIC file.krn | Same as above. Shorthand for bundling multiple single-letter boolean options. |
myrid --no-measures file.krn | Long form of "myrid -M". |
myrid --options | Secret built-in option for the Options class which will force a list of defined options to be printed to standard output. |
myrid -A file.krn | The option list will also be displayed when an undefined or misspelled option is used. Use "--" to disable options processing for unusual cases such as a filename starting with a dash. |
myrid -MM file.krn | Duplicate options are ignored, so only the last -M is used. Note that this is not the option "MM" which would be formulated as "myrid --MM". |
myrid -M file.krn -IC | Options can occur in any order, and can come before or after any command arguments which are not options. |
myrid -M -- -file.krn -C | Process the poorly named file "-file.krn" and the even more poorly named file called "-C" (which is not an option if it comes after the -- marker). |
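The command-line rules shown in the table (bundled single-letter options, long options with two dashes, and the "--" marker) can be sketched as follows. This is a simplified standalone illustration, not the Options class: the names ParsedArgs and parseArgs are hypothetical, only boolean options are handled, undefined options are not reported, and duplicates are simply recorded rather than ignored.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical result type: option names (without dashes) and the
// remaining non-option arguments, in the order they were given.
struct ParsedArgs {
    std::vector<std::string> options;
    std::vector<std::string> arguments;
};

// Hypothetical helper: words are the command-line words after the
// command name.
ParsedArgs parseArgs(const std::vector<std::string>& words) {
    ParsedArgs out;
    bool optionsEnabled = true;
    for (std::size_t i = 0; i < words.size(); i++) {
        const std::string& w = words[i];
        if (optionsEnabled && w == "--") {
            // everything after "--" is an argument, even "-C"
            optionsEnabled = false;
        } else if (optionsEnabled && w.size() > 2 &&
                   w.compare(0, 2, "--") == 0) {
            // long option such as --no-measures
            out.options.push_back(w.substr(2));
        } else if (optionsEnabled && w.size() > 1 && w[0] == '-') {
            // bundled short options: -MIC becomes M, I, C
            for (std::size_t j = 1; j < w.size(); j++) {
                out.options.push_back(std::string(1, w[j]));
            }
        } else {
            out.arguments.push_back(w);
        }
    }
    return out;
}
```

Under these assumptions, the words "-MIC file.krn" yield the options M, I and C plus one argument, and "-M -- -file.krn -C" yields the single option M plus the two arguments "-file.krn" and "-C".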
Note in the following source code, an extra include directive does not need to be added, since the declaration of the Options class is included in humdrum.h. If you want to use the Options class independent of the HumdrumFile parser, you can instead include the file "Options.h".
#include "humdrum.h"
int main(int argc, char** argv) {
   Options opts;
   opts.define("M|no-measures:b", "remove measures");
   opts.define("C|no-comments:b", "remove comments");
   opts.define("I|no-interpretations:b", "remove interpretations");
   opts.process(argc, argv);
   // each flag is true when lines of that type should be kept
   int measuresQ = !opts.getBoolean("no-measures");
   int commentsQ = !opts.getBoolean("no-comments");
   int interpQ = !opts.getBoolean("no-interpretations");
   HumdrumFile hfile(opts.getArg(1));
   for (int i=0; i<hfile.getNumLines(); i++) {
      if (hfile[i].isMeasure() && !measuresQ) continue;
      if (hfile[i].isComment() && !commentsQ) continue;
      if (hfile[i].isInterpretation() && !interpQ) continue;
      std::cout << hfile[i] << std::endl;
   }
   return 0;
}
The code "HumdrumFile hfile(opts.getArg(1));" reads data from the first argument on the command line. Note that arguments are indexed from 1 rather than 0. This is arguably not ideal, but it was intended to mirror command-line string arrays in C, where the name of the command is stored in array element 0 and the first argument (or option) is stored in array element 1. To access the name of the command, use the Options::getCommand() function.