Difference between revisions of "Humdrum lab 5"

From CCARH Wiki
(Replaced content with "This lab is about the wikifonia data. == Basic information == How many files ls *.krn | wc -l 6710")
This lab is about plotting data, and doing further analysis of the raw data extracted from Humdrum files.
There are several possibilities for plotting.  We will focus on the last one in the lab, but here are other possibilities:
== Load data into a spreadsheet ==
 
 
 
You can copy and paste data into a spreadsheet program such as Microsoft Excel or Google Sheets.
 
 
 
On macOS, try the command:
 
 
 
    humcat -s h://chorales | deg -at | serialize | ridx -H | egrep -v "=|r" | sortcount | pbcopy
 
 
 
pbcopy is used to copy data to the clipboard.
 
 
 
This will extract a count of scale-degrees in Bach chorales:
 
 
 
15628 5
 
14991 1
 
11710 3
 
10721 2
 
9435 4
 
7761 6
 
5742 7
 
4728 7-
 
1135 6+
 
1134 4+
 
556 3+
 
332 2-
 
326 1+
 
324 5+
 
50 3-
 
41 5-
 
32 2+
 
11 6-
 
7 1-
 
2 4-
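The counting step at the end of the pipeline can be illustrated in Python. This is only a minimal sketch, assuming that sortcount behaves like <code>sort | uniq -c | sort -nr</code> (count each distinct line, then list the most frequent first). The function name <code>sort_count</code> and the sample token list are hypothetical and are not taken from the chorale data.

```python
from collections import Counter


def sort_count(tokens):
    """Return (count, token) pairs, most frequent first."""
    counts = Counter(tokens)
    # Sort by descending count, then by token so that ties are deterministic.
    return sorted(
        ((n, tok) for tok, n in counts.items()),
        key=lambda pair: (-pair[0], pair[1]),
    )


# Hypothetical token stream standing in for the output of the pipeline
# before the counting step (one scale degree per line).
sample = ["5", "1", "5", "3", "7-", "1", "5"]

for count, token in sort_count(sample):
    print(count, token)
```

Each printed line has the same "count token" layout as the listing above.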
 
 
 
 
 
Open up a spreadsheet program and paste the resulting data into the spreadsheet.
 
 
 
 
 
[[File:excel-scalegrees-chorales.png|center|500px]]
 
 
 
Notice that some of the cells in the B column are left justified while others are right justified.  This is because Excel is autodetecting the format of each cell.  It is right justifying the numbers, and left justifying the text.
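The same number-versus-text distinction comes up when the pasted data is read by a script. The following is a minimal sketch, assuming two whitespace-separated columns (count, then scale degree) as in the listing above. The function name <code>parse_counts</code> is hypothetical. The count is converted to an integer, while the scale-degree label is deliberately kept as text so that "5" and "7-" are treated alike.

```python
def parse_counts(text):
    """Parse 'count label' lines into (int, str) pairs."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or malformed lines
        count, label = parts
        # Only the count is converted; the label stays a string,
        # even when it looks like a number (e.g. "5").
        rows.append((int(count), label))
    return rows


# A few rows copied from the scale-degree listing above.
pasted = """15628 5
14991 1
4728 7-
1135 6+
"""

rows = parse_counts(pasted)
print(rows)
print(all(isinstance(label, str) for _, label in rows))
```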
 
 
 
Mark the entire B column as text: click on the "B" at the top of the column, then right-click, choose "Format Cells..." from the context menu that appears, and select Text as the type for the column cells:
 
 
 
[[File:excel-format-cells-as-text.png|center|300px]]
 
 
 
Now all cells in the B column are text:
 
 
 
[[File:excel-cells-are-text.png|center|500px]]
 
 
 
Switch the order of the columns and then create a bar chart:
 
 
 
[[File:excel-chart1.png|center|500px]]
 
 
 
Compare this to the bar chart created further below with pandas/jupyter:
 
 
 
[[File:excel-bar-chart-jupyter.png|center|300px]]
 
 
 
 
 
== Plotting with Gnuplot ==
 
 
 
[http://www.gnuplot.info/ Gnuplot] is a handy command-line plotting program.  Here is an example of plotting the same data in gnuplot:
 
 
 
First save the data to a file:
 
 
 
          humcat -s h://chorales | deg -at | serialize | ridx -H | egrep -v "=|r" | sortcount > data.txt
 
 
 
On MacOS, install gnuplot with Homebrew:
 
 
          brew install gnuplot
 
 
 
Then create a file called plotbar with these contents:
 
 
 
  #!/usr/bin/env gnuplot
 
 
 
  set terminal svg size 800,500 enhanced font "Helvetica,20"
 
  set output "output.svg" 
 
 
 
  set style data histogram
 
  set style fill solid
 
  set title "Scale degrees used in Bach chorales"
 
 
 
  unset key
 
 
 
  plot "data.txt" using 1:xtic(2) linecolor rgb "#ff0088"
 
 
 
 
 
Run the script with this command:
 
 
 
  chmod 0755 plotbar
 
  ./plotbar
 
 
 
This should create a file called output.svg that looks like this:
 
 
 
 
 
[[File:gnuplot-barchart.png|center|500px]]
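The counts in data.txt can also be post-processed before plotting, for example by converting them to percentages of all notes counted. This is a minimal sketch, assuming the two-column "count label" layout saved above. The function name <code>relative_frequencies</code> is hypothetical, and the sample lines are the first three rows of the listing rather than the whole file.

```python
def relative_frequencies(lines):
    """Map each label to its share of the total count, in percent."""
    pairs = []
    for line in lines:
        parts = line.split()
        if len(parts) == 2:
            pairs.append((parts[1], int(parts[0])))
    total = sum(n for _, n in pairs)
    if total == 0:
        return {}  # nothing to normalize
    return {label: 100.0 * n / total for label, n in pairs}


# First three rows of the scale-degree listing, used as sample input.
sample_lines = ["15628 5", "14991 1", "11710 3"]

for label, pct in relative_frequencies(sample_lines).items():
    print(f"{label}\t{pct:.1f}%")
```

To process the real file, pass an open file object instead of the sample list, for example <code>relative_frequencies(open("data.txt"))</code>.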
 
 
 
 
 
== Jupyter/pandas/matplotlib ==
 
 
 
The rest of the lab examines how to load and plot similar data in a Jupyter notebook using pandas and matplotlib to display the barchart.
 
 
 
Check out the online version of the notebook here:  http://nbviewer.jupyter.org/url/notebooks.humdrum.org/jupyter/craig/barplots/barplots.ipynb
 
 
 
Also, there is a download button on that page to download a local copy of the original jupyter notebook file. 
 
 
 
To run jupyter on your computer, run the following commands (they may vary depending on your operating system and package manager, but these worked well for me on macOS):
 
 
 
      brew install python3
 
 
 
Also some necessary modules for python:
 
 
 
    pip3 install matplotlib
 
    pip3 install pandas
 
 
 
This will take a while.  Then install [https://github.com/jupyterlab/jupyterlab jupyterlab], which is the development version of the jupyter notebook web interface:
 
 
 
      pip3 install jupyterlab
 
 
 
A nice, but optional thing to do is install the [https://github.com/ian-r-rose/jupyterlab-toc table-of-contents tab plugin]:
 
 
 
    brew install nodejs
 
    jupyter-labextension install jupyterlab-toc
 
 
 
But I have had problems on most computers getting this to install without the installation process hanging...
 
 
 
 
 
To start jupyterlab, type in the terminal:
 
 
 
    jupyter-lab
 
 
 
If you want to use a specific browser because jupyter chose the wrong one:
 
 
 
     jupyter-lab --browser=chrome
 
    jupyter-lab --browser=firefox
 
    jupyter-lab --browser=safari
 
 
 
Here is the default window of jupyter-lab:
 
 
 
[[File:jupyter-lab-window.png|center|700px]]
 
 
 
To create a new notebook, click on the "Python 3" icon in the "Notebook" section of the Launcher.  You can load a notebook saved on the computer by using the file viewer on the left side of the window.  Click on the "Files" tab on the far left to hide the file menu.  Here is what the browser looks like after doing all of that and then typing a test command:
 
 
 
[[File:jupyter-lab-notebook2.png|center|700px]]
 
 
 
Now go to the webpage http://nbviewer.jupyter.org/url/notebooks.humdrum.org/jupyter/craig/barplots/barplots.ipynb
 
 
 
and download that notebook from the icon on the top right:
 
 
 
[[File:nbviewer-download-icon.png|center]]
 
 
 
You will probably have to copy the notebook to the same directory in which you started jupyter-lab.
 
 
 
Before running the commands in the notebook, you may have to update your humdrum tools installation:
 
 
 
     cd `which transpose | sed 's/humdrum-tools.*/humdrum-tools/'` && make update && make
 
 
 
This is necessary because I had to update the [http://extras.humdrum.org/man/transpose transpose] tool to allow a stream of multiple input files at one time.
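The update command locates the humdrum-tools directory by taking the path of the transpose program and cutting off everything after "humdrum-tools". The Python sketch below mirrors only that sed substitution. The function name <code>tools_root</code> and the example path are hypothetical, since the real install location depends on your system.

```python
import re


def tools_root(tool_path):
    """Truncate a tool path at its 'humdrum-tools' directory."""
    # Like sed 's/humdrum-tools.*/humdrum-tools/': the greedy ".*"
    # swallows the rest of the path after the first "humdrum-tools".
    return re.sub(r"humdrum-tools.*", "humdrum-tools", tool_path, count=1)


# Hypothetical result of `which transpose`.
example = "/Users/me/humdrum-tools/humextra/bin/transpose"
print(tools_root(example))
```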
 
 
 
=== Useful tips for working in jupyter notebooks ===
 
 
 
* To run a program in a cell, click in the cell with a mouse and then type {{keypress|shift-return}}.  This will evaluate the cell and print the results underneath.
 
 
 
* To add a new cell above the current cell, press {{keypress|esc}} to exit from editing the cell, then type {{keypress|a}}.  A new cell should be added above the current one.  Similarly, {{keypress|b}} will create a new cell below the current one.
 
 
 
* Text can be added to the page by converting a cell to the markdown format, and then typing markdown data.  To convert to markdown, click in the cell, press {{keypress|esc}} to defocus from the text, and then type {{keypress|m}}.  Then click in the cell again and add text.  When finished, press {{keypress|shift-enter}} to render the markdown as text.
 
 
 
* Here is a summary of Markdown syntax for adding text commentary to your notebooks: https://github.com/adam-p/markdown-here/wiki/Markdown-Cheatsheet.  Jupyter notebook markdown is somewhat more constrained than the GitHub flavor of markdown.
 
 
 
* 15 Nov 2017 blog about Jupyterlab:  [https://medium.com/@brianray_7981/jupyterlab-first-impressions-e6d70d8a175d JupyterLab first impressions]
 
 
 
* Video presentation on Jupyterlab: https://www.youtube.com/watch?v=w7jq4XgwLJQ
 
 
 
* Jupyterlab tutorial docs: https://media.readthedocs.org/pdf/jlab/debug-rtd/jlab.pdf
 
 
 
* Jupyterlab online docs: https://jupyterlab.readthedocs.io/en/stable/
 
 
 
Jupyterlab also allows multiple sets of tabbed workspaces, which is useful:
 
 
 
[[File:jupyterlab-tabbed-workspace.png|center|700px]]
 
 
 
== R markdown notebooks ==
 
 
 
The [https://www.r-project.org/ R language] is another useful language for statistical analysis and plotting of data.  This is not covered in the lab, but check out the website:  https://rmarkdown.rstudio.com/r_notebooks.html
 

Revision as of 17:08, 1 May 2018

This lab is about the wikifonia data.

Basic information

How many files

   ls *.krn | wc -l
   6710