Humdrum lab 5
This lab is about the Wikifonia data set.
Update Humdrum tools
In case there have been changes to the humdrum tools programs, you can update them with this command:
cd $(which beat | sed 's/humdrum-tools.*/humdrum-tools/')
make update
make
Basic information
How many files:
ls *.krn | wc -l
6710
How many have lyrics:
grep -l "\*\*text" *.krn | wc -l
5460
How many have chords:
grep -l "\*\*mxhm" *.krn | wc -l
6282
How many have two or more verses:
grep -l "\*\*text.*\*\*text" *.krn | wc -l
2006
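The backslashes matter in these patterns: an unescaped * is a regular-expression repetition operator, so each asterisk of a spine label such as **text has to be escaped. Here is a minimal sketch of the two-verse test on made-up header lines, using only standard grep. The function name has_two_text_spines is hypothetical and not part of Humdrum.

```shell
# Hypothetical helper: succeeds if the input has a line with two **text labels.
has_two_text_spines() {
    grep -q '\*\*text.*\*\*text'
}

# Made-up header lines (Humdrum spines are tab-separated).
printf '**kern\t**text\t**text\n' | has_two_text_spines && echo "two or more verses"
printf '**kern\t**text\n' | has_two_text_spines || echo "fewer than two verses"
```

With grep -l in place of grep -q, the same pattern lists matching file names, which is what the commands above count.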
Bibliographic and basic information
Who are the 10 most represented composers in the data:
grep -h COM *.krn | sortcount | head -n 10
132 !!!COM: Unknown
121 !!!COM: Hungarian folk song
119 !!!COM: Traditional
91 !!!COM: Richard Rodgers
75 !!!COM: Irving Berlin
67 !!!COM: Hungarian song
65 !!!COM: Cole Porter
46 !!!COM: Harry Warren
45 !!!COM: George Gershwin
40 !!!COM: Harold Arlen
Most repeated titles:
grep OTL * -h | sortcount | head -n 10
7 !!!OTL: A Daisy A Day
6 !!!OTL: Amazing Grace
5 !!!OTL: Cabaret
5 !!!OTL: Take Five
4 !!!OTL: test
4 !!!OTL: Unforgettable
4 !!!OTL: You'll Never Walk Alone
4 !!!OTL: Nuages
4 !!!OTL: Birk's Works
4 !!!OTL: This Is My Song
Note that "sortcount" is a Humdrum Extras script which is equivalent (in this case) to the unix command "sort | uniq -c | sort -nr".
Which files contain the title "A Daisy A Day":
grep "OTL.*A Daisy A Day" *.krn -l
WF3959.krn
WF3960.krn
WF3961.krn
WF3962.krn
WF3963.krn
WF3964.krn
WF3967.krn
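The sortcount equivalence noted above can be tried without the Humdrum tools installed. The sketch below wraps the standard pipeline in a function and runs it on a few made-up title records. The name sortcount_sketch is hypothetical, and the real script may format its counts differently.

```shell
# Hypothetical stand-in for sortcount, built only from standard Unix tools.
sortcount_sketch() {
    sort | uniq -c | sort -nr
}

# Made-up sample records in the style of the title listing.
printf '%s\n' \
    '!!!OTL: Summertime' \
    '!!!OTL: Liza' \
    '!!!OTL: Summertime' \
    '!!!OTL: Swanee' \
    '!!!OTL: Summertime' \
    '!!!OTL: Liza' |
    sortcount_sketch | head -n 2
```

This prints the Summertime record with a count of 3, followed by the Liza record with a count of 2. The first sort groups identical lines so that uniq -c can count them, and the final sort -nr puts the largest counts first.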
To view a file in VHV from macOS:
cat WF3959.krn | pbcopy
Then paste into the text region at https://verovio.humdrum.org (command-A to select all of the old text, then command-V to paste the new score).
List the titles of all pieces where George Gershwin is the composer:
grep OTL $(grep -li COM.*Gershwin *.krn) | sort -k2
WF2190.krn:!!!OTL: 'S Wonderful!
WF2191.krn:!!!OTL: A FOGGY DAY
WF2267.krn:!!!OTL: A Foggy Day
WF2186.krn:!!!OTL: A Woman Is A Sometime Thing
WF2192.krn:!!!OTL: Bidin' My Time
WF2178.krn:!!!OTL: Blues
WF2193.krn:!!!OTL: But Not For Me
WF2194.krn:!!!OTL: By Strauss
WF2195.krn:!!!OTL: Clap yo' hands
WF2185.krn:!!!OTL: Do It Again!
WF2196.krn:!!!OTL: Embraceable You
WF2197.krn:!!!OTL: Fascinating Rhythm
WF2268.krn:!!!OTL: For You, For Me, For Evermore
WF2198.krn:!!!OTL: How Long Has This Been Going On
WF2199.krn:!!!OTL: I Got Plenty o' Nuttin'
WF2103.krn:!!!OTL: I Got Rhythm
WF2200.krn:!!!OTL: I Got Rhythm
WF2222.krn:!!!OTL: I Loves You Porgy
WF2201.krn:!!!OTL: I Was Doing All Right
WF2202.krn:!!!OTL: I Was Doing All Right
WF2179.krn:!!!OTL: I loves you Porgy
WF2184.krn:!!!OTL: I'll Build A Stairway To Paradise
WF2203.krn:!!!OTL: I've Got A Crush On You
WF2204.krn:!!!OTL: Isn't It A Pity
WF2221.krn:!!!OTL: It Ain't Necessarily So
WF2205.krn:!!!OTL: Let's Call the Whole Thing Off
WF2220.krn:!!!OTL: Liza
WF2219.krn:!!!OTL: Liza (All the clouds'll roll away)
WF2269.krn:!!!OTL: Love Is Here To Stay
WF2206.krn:!!!OTL: Love Walked In
WF2223.krn:!!!OTL: My Man's Gone Now
WF2207.krn:!!!OTL: Nice Work If You Can Get It
WF2208.krn:!!!OTL: Oh Lady Be Good
WF2189.krn:!!!OTL: SUMMERTIME
WF2995.krn:!!!OTL: Shoes With Wings On
WF2183.krn:!!!OTL: Somebody Loves Me
WF2209.krn:!!!OTL: Someone To Watch Over Me
WF2210.krn:!!!OTL: Soon
WF2211.krn:!!!OTL: Strike Up The Band
WF2180.krn:!!!OTL: Summertime
WF2181.krn:!!!OTL: Summertime
WF2187.krn:!!!OTL: Summertime
WF2212.krn:!!!OTL: Summertime
WF2224.krn:!!!OTL: Swanee
WF2213.krn:!!!OTL: That Certain Feeling
WF2214.krn:!!!OTL: The Man I Love
WF2225.krn:!!!OTL: The Simple Life
WF2188.krn:!!!OTL: There's A Boat Dat's Leaving Soon For New York
WF2215.krn:!!!OTL: They All Laughed
WF2216.krn:!!!OTL: They Can't Take That Away From Me
WF2217.krn:!!!OTL: They Can't Take That Away From Me
WF2218.krn:!!!OTL: Who Cares
Texture
How many contain more than one **kern spine (i.e., are polyphonic, probably piano):
grep -l "\*\*kern.*\*\*kern" *.krn | wc -l
58
WF5118.krn is an example:
This one is interesting because it has invisible chords in the top staff which are realizing the harmonic chords above the staff.
How many songs have chords in the **kern data, i.e. several notes sounding at once (this takes a long time to calculate: at about 70 songs per second, roughly 95 seconds):
for i in *.krn
do
   extractx -i kern $i | serialize | ridx -H | grep " " | wc -l
done | grep -v "^ *0$" | wc -l
365
How many songs do not have chords:
for i in *.krn
do
   extractx -i kern $i | serialize | ridx -H | grep " " | wc -l
done | grep "^ *0$" | wc -l
6345
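The loop pattern above (one count per file, then a filter on zero) can be tried on scratch files without any Humdrum tools. This is a sketch under stated assumptions: the three files and their contents are made up, per_file_counts is a hypothetical helper, and grep -c stands in for "grep | wc -l" because it prints the count without the leading padding that some wc implementations add.

```shell
# Scratch directory with three made-up files; two contain a token with a space.
dir=$(mktemp -d)
printf '4c 4e 4g\n4d\n' > "$dir/a.txt"
printf '4c\n4d\n' > "$dir/b.txt"
printf '2f 2a\n' > "$dir/c.txt"

# Hypothetical helper: print one number per file, the count of lines with a space.
per_file_counts() {
    for i in "$1"/*.txt
    do
        grep -c " " "$i"
    done
}

# Files with at least one such line, and files with none.
with=$(per_file_counts "$dir" | grep -vc '^0$')
without=$(per_file_counts "$dir" | grep -c '^0$')
echo "with: $with, without: $without"
rm -r "$dir"
```

The two totals always add up to the number of files, which gives a quick sanity check on the real figures (365 + 6345 = 6710).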
Duration
What is the duration of all songs if played back-to-back and at the specified tempo without repeats?
gettime -T *.krn | tail -n 1
286:50:23.1354 hours
What are the longest songs:
gettime --simple -T *.krn | sort -k2 -nr | head -n 10
WF6618.krn: 3120
WF0181.krn: 3120
WF0182.krn: 1864
WF3616.krn: 1420
WF6336.krn: 1134
WF5131.krn: 909
WF6068.krn: 785
WF5004.krn: 696
WF3226.krn: 671
WF1249.krn: 664
The -k2 option means to sort by the second column of data. -n means to sort numerically rather than alphabetically, and -r means to sort by highest first.
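The effect of these sort options can be checked on a few made-up lines in the same "filename: number" layout. The file names and values below are invented for illustration.

```shell
# Made-up lines in the "filename: number" layout used by the listing.
durations='a.krn: 16
b.krn: 3120
c.krn: 664
d.krn: 8'

# Sort on the second column, numerically, largest first, and keep the top two.
printf '%s\n' "$durations" | sort -k2 -nr | head -n 2
```

This prints the b.krn line followed by the c.krn line. Without -n the values would be compared as text, so 664 would sort ahead of 3120.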
What are the shortest songs:
gettime --simple -T *.krn | sort -k2 -nr | tail -n 10
WF2814.krn: 16
WF2806.krn: 16
WF2795.krn: 16
WF2785.krn: 16
WF2856.krn: 14
WF2852.krn: 12
WF2799.krn: 12
WF6338.krn: 8
WF5609.krn: 8
The shortest song in VHV:
cat WF5609.krn | pbcopy
Meter
What meters occur in the database, and how common is each?
beat -Ca *.krn | beat -Ua | extractx -s '$1-$' | ridx -H | sortcount -p
65.89 4 4
15.44 3 4
11.67 2 2
3.24 2 4
2.78 6 8
0.52 12 8
0.16 5 4
0.14 9 8
0.09 6 4
0.06 3 8
0.01 3 2
0.01 7 4
0 2 8
0 7 8
0 9 4
0 10 8
0 17 16
0 1 2
0 5 8
0 1 4
0 4 8
The most common meter is 4/4, which accounts for about 66% of the measures.
-C means extract the count of the meter (the top number).
-U means extract the duration unit from the meter (the bottom number).
-C and -U are output once for each measure, so using them is a simple way of counting the number of measures in the scores. If you add the -F option along with these two options, every data line will display the metrical information.
-a means to append the analysis to the end of the lines (keeping the original input score).
The extractx option:
-s '$1-$'
means to extract from the second-to-last spine through the last spine. $ is the last spine, $1 is one before the last spine, $2 is two before the last spine, and so on.
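The spine selection described above can be mimicked with awk for readers who want to see what it keeps. This is only an illustration on plain tab-separated columns, not a replacement for extractx, which also understands Humdrum spine structure. The function name last_two_columns and the sample lines are made up.

```shell
# Hypothetical helper: keep the second-to-last and last tab-separated columns,
# a rough analogue of selecting spines '$1-$'.
last_two_columns() {
    awk -F '\t' 'NF >= 2 { print $(NF-1) "\t" $NF }'
}

# Made-up data lines: two original columns plus an appended count and unit.
printf '4c\t.\t4\t4\n8d\t.\t3\t4\n' | last_two_columns
```

Each output line holds just the two appended columns, which is why the meter pipeline is left with a count and a unit to tally.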
Chord labels
How many unique chord labels are there?
extractx -i mxhm * | ridx -H | sortcount | wc -l
1399
What are the most common ones:
extractx -i mxhm * | ridx -H | sortcount -p | head -n 10
7.21 C major
6.14 F major
4.94 G major
4.83 G dominant
4.07 C dominant
3.55 D dominant
3.28 B- major
2.81 E- major
2.3 D major
2.25 F dominant
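Judging from the output, the -p option of sortcount reports each value as a percentage of all lines rather than as a raw count. A rough sketch of that calculation with standard tools follows. The helper percent_count is hypothetical, and it always prints two decimals, whereas the listing above drops trailing zeros.

```shell
# Hypothetical helper: percentage of input lines for each distinct value,
# highest first. Counts come from uniq -c; awk converts them to percentages.
percent_count() {
    sort | uniq -c | sort -nr |
        awk '{ n[NR] = $1; sub(/^ *[0-9]+ /, ""); v[NR] = $0; t += n[NR] }
             END { for (i = 1; i <= NR; i++) printf "%.2f\t%s\n", 100 * n[i] / t, v[i] }'
}

# Made-up chord labels.
printf '%s\n' 'C major' 'G dominant' 'C major' 'F major' | percent_count
```

Here "C major" makes up half of the four input lines, so it is listed first at 50.00, and the other two labels follow at 25.00 each.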
How many chord qualities:
extractx -i mxhm * | ridx -H | sed 's/[^ ]* //; s/\/.*//' | sortcount | wc -l
80
Here are the 80 qualities:
93061 major
64155 dominant
31402 minor
27490 minor-seventh
8594 major-seventh
5861 dominant-ninth
5733 major-sixth
4138 diminished
2943 min
2912 7
2816 minor-sixth
2159 half-diminished
2154 suspended-fourth
1884 diminished-seventh
1738 augmented-seventh
1408 augmented
1208 C
1102 dominant-13th
1082 min7
1008 F
1008 maj7
967 dominant-seventh
892 G
878 minor-ninth
705 D
650 B-
592 major-ninth
355 E-
352 A
280 E
249 power
237 dominant-11th
222 suspended-second
202 minor-11th
165 minor-major
157 dim
129 maj
128 A-
96 augmented-ninth
84 9
71 other
66 B
62 6
62 major-minor
58 sus47
46 aug
46 D-
46 min9
36 G-
29 m7b5
23 major-13th
21 maj9
19 min6
19 none
17 pedal
16 dim7
16 maj69
15 F#
12 major B- major
8 major F major
7 C#
6 major .
5 minor D minor D minor
4 minor-13th
3 minor G minor
3 C-
3 minMaj7
3 D#
2 minor .
2 major F major F major
2 dominant C dominant
2 5b
2 major C major C major
1 major E- major
1 ma
1 major . .
1 major G major G major
1 minor . .
1 7sus
1
Counts by chord root:
extractx -i mxhm * | ridx -H | sed 's/ .*//' | sortcount
48078 C
45001 G
38459 F
33577 D
24361 A
23237 B-
16334 E-
15587 E
8279 A-
7869 B
3487 D-
3365 F#
1701 C#
1316 G-
824 G#
187 D#
168 C-
52 A#
28 F-
6 B--
4 B#
4 . F
3 B/D#
3 E#
3 C/G
1 . B-
1 A--
What is the most common 3-chord sequence:
extractx -i mxhm * | grep -v ^= | serialize | context -n 3 | ridx -H | sortcount | head -n 10
1779 C major G dominant C major (I V7 I)
1334 F major C dominant F major (I V7 I)
1301 C major F major C major (I IV I)
1062 D minor-seventh G dominant C major (ii7 V7 I)
994 G major D dominant G major (I V7 I)
939 G dominant C major G dominant (V7 I V7)
863 G major C major G major (V I V)
857 G dominant C major F major (V7 I IV)
812 F major B- major F major (V I V)
781 G minor-seventh C dominant F major (ii7 V7 I)
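The windowing step performed by "context -n 3" can be pictured as a sliding window over successive lines. The awk sketch below is a simplified stand-in, not the Humdrum tool: the function name ngrams is hypothetical, and it treats its whole input as one continuous stream with no awareness of Humdrum interpretations or barlines.

```shell
# Hypothetical helper: join each input line with the n-1 lines that follow it.
ngrams() {
    awk -v n="$1" '
        { buf[NR] = $0 }
        NR >= n {
            line = buf[NR - n + 1]
            for (i = NR - n + 2; i <= NR; i++) line = line " " buf[i]
            print line
            delete buf[NR - n + 1]   # oldest entry is no longer needed
        }'
}

# Made-up chord labels, one per line.
printf '%s\n' 'C major' 'G dominant' 'C major' 'F major' | ngrams 3
```

Four input chords yield two overlapping windows: "C major G dominant C major" and "G dominant C major F major". Piping such windows into a sort and count step produces a frequency table like the one above.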
Scale degrees
The key information is not present in the files. They need to be processed further for that. MusicXML input has key information, but it is often incorrect since people use it more for key signature information, and the "mode" part is usually left at "major". The finalis-tonic script can be used to add an approximate key.
for i in *.krn
do
   finalis-tonic $i | extractx -i '**kern' | deg -at | serialize | ridx -H | grep -v r
done | sortcount
Output includes chords and some other junk, but basic counts are:
161963 1
135309 5
102726 2
102275 3
86931 6
85740 4
50631 7
38827 7-
38765 3-
24511 6-
12031 4+
11188 2-
6041 5+
5822 2+
5803 1+
4907 5-
2822 1-
2669 6+
1874 4-
1222 7+
1126 3+
Looking at 5-note sequences:
for i in *.krn
do
   finalis-tonic $i | extractx -i '**kern' | deg -at | serialize -f | grep -v '^[r=]' \
      | context -n 5 | ridx -H
done | sortcount > /tmp/analysis-data.txt
head -n 25 /tmp/analysis-data.txt
6053 1 1 1 1 1
4417 5 5 5 5 5
2186 3 3 3 3 3
1872 4 4 4 4 4
1573 2 2 2 2 2
1293 3 2 1 2 3
1282 5 4 3 2 1
1189 6 6 6 6 6
1087 1 2 3 2 1
915 2 1 1 1 1
894 3 4 3 2 1
890 3- 3- 3- 3- 3-
883 3 3 3 2 1
855 3 2 1 1 1
852 1 1 1 1 2
825 6 5 4 3 2
806 5 5 5 5 4
786 3 2 1 2 1
785 5 5 5 5 6
761 3 3 3 3 2
761 7 7 7 7 7
753 7- 7- 7- 7- 7-
752 5 1 1 1 1
726 1 2 3 4 5
723 4 3 2 1 1
Key
Using entire contents of file to determine key:
keycor *.krn | sed 's/.*: //' | sortcount -p
22.33 C Major
15.05 F Major
11.87 G Major
9.21 E- Major
6.87 B- Major
5.82 A Minor
4.51 D Minor
4.39 D Major
4.16 C Minor
3.44 G Minor
3.03 E Minor
2.11 A- Major
1.78 A Major
1.41 F Minor
1.02 E Major
0.64 B- Minor
0.53 D- Major
0.39 E- Minor
0.39 B Minor
0.29 B Major
0.18 G- Major
0.16 F# Minor
0.16 C# Minor
0.08 F# Major
0.06 C# Major
0.04 D- Minor
0.04 A- Minor
0.04 G- Minor
0.02 G# Minor
0.02 G# Major
Using only the first 8 bars of the music for analysis of key:
for i in *.krn
do
   myank -m 1-8 $i | keycor
done | sed 's/.*: //' | sortcount -p
19.46 C Major
12.32 F Major
10.2 G Major
6.89 E- Major
6.7 A Minor
6.65 D Minor
5.64 B- Major
5.36 C Minor
5.21 D Major
4.3 G Minor
4.1 E Minor
2.41 F Minor
1.72 A Major
1.65 A- Major
1.31 B Minor
1.22 B Major
1.04 E Major
0.89 B- Minor
0.56 E- Minor
0.56 D- Major
0.48 F# Minor
0.45 C# Minor
0.3 A- Minor
0.23 G- Major
0.12 F# Major
0.08 C# Major
0.07 G# Minor
0.05 G- Minor
0.03 G# Major
0.02 D- Minor