Difference between revisions of "Humdrum lab 6"

From CCARH Wiki
 
(12 intermediate revisions by the same user not shown)
Line 63: Line 63:
 
[[File:excel-bar-chart-jupyter.png|center|300px]]
 
[[File:excel-bar-chart-jupyter.png|center|300px]]
  
 +
 +
== Datawrapper ==
 +
 +
Even easier than using a spreadsheet, try the datawrapper website to generate the same histogram:
 +
 +
https://www.datawrapper.de
 +
 +
This site "publishes" your plot and allows access to the underlying data.
 +
 +
<html>
 +
<iframe id="datawrapper-chart-vhkHL" src="//datawrapper.dwcdn.net/vhkHL/1/" scrolling="no" frameborder="0" allowtransparency="true" style="width: 0; min-width: 100% !important;" height="548"></iframe><script type="text/javascript">if("undefined"==typeof window.datawrapper)window.datawrapper={};window.datawrapper["vhkHL"]={},window.datawrapper["vhkHL"].embedDeltas={"100":648,"200":573,"300":573,"400":548,"500":548,"700":548,"800":548,"900":548,"1000":548},window.datawrapper["vhkHL"].iframe=document.getElementById("datawrapper-chart-vhkHL"),window.datawrapper["vhkHL"].iframe.style.height=window.datawrapper["vhkHL"].embedDeltas[Math.min(1e3,Math.max(100*Math.floor(window.datawrapper["vhkHL"].iframe.offsetWidth/100),100))]+"px",window.addEventListener("message",function(a){if("undefined"!=typeof a.data["datawrapper-height"])for(var b in a.data["datawrapper-height"])if("vhkHL"==b)window.datawrapper["vhkHL"].iframe.style.height=a.data["datawrapper-height"][b]+"px"});</script>
 +
</html>
 +
 +
Try clicking on the "Get the data" link on the chart.  This will download a CSV version of the data (the input was in TSV format).
 +
 +
The above chart was created with this code copied from the datawrapper website:
 +
 +
  <iframe id="datawrapper-chart-vhkHL" src="//datawrapper.dwcdn.net/vhkHL/1/" scrolling="no"
 +
      frameborder="0" allowtransparency="true" style="width: 0; min-width: 100% !important;" height="548">
 +
  </iframe>
 +
  <script type="text/javascript">
 +
      if("undefined"==typeof window.datawrapper)window.datawrapper={};window.datawrapper["vhkHL"]={},window.datawrapper["vhkHL"].embedDeltas=
 +
      {"100":648,"200":573,"300":573,"400":548,"500":548,"700":548,"800":548,"900":548,"1000":548},
 +
      window.datawrapper["vhkHL"].iframe=document.getElementById("datawrapper-chart-vhkHL"),window.datawrapper["vhkHL"]
 +
      .iframe.style.height=window.datawrapper["vhkHL"]
 +
      .embedDeltas[Math.min(1e3,Math.max(100*Math.floor(window.datawrapper["vhkHL"].iframe.offsetWidth/100),100))]+
 +
      "px",window.addEventListener("message",function(a){if("undefined"!=typeof a.data["datawrapper-height"])
 +
      for(var b in a.data["datawrapper-height"])if("vhkHL"==b)window.datawrapper["vhkHL"]
 +
      .iframe.style.height=a.data["datawrapper-height"][b]+"px"});</script>
 +
 +
Here is another variation:
 +
 +
<html>
 +
<iframe id="datawrapper-chart-BjMlq" src="//datawrapper.dwcdn.net/BjMlq/1/" scrolling="no" frameborder="0" allowtransparency="true" style="width: 0; min-width: 100% !important;" height="500"></iframe><script type="text/javascript">if("undefined"==typeof window.datawrapper)window.datawrapper={};window.datawrapper["BjMlq"]={},window.datawrapper["BjMlq"].embedDeltas={"100":625,"200":550,"300":525,"400":525,"500":500,"700":500,"800":500,"900":500,"1000":500},window.datawrapper["BjMlq"].iframe=document.getElementById("datawrapper-chart-BjMlq"),window.datawrapper["BjMlq"].iframe.style.height=window.datawrapper["BjMlq"].embedDeltas[Math.min(1e3,Math.max(100*Math.floor(window.datawrapper["BjMlq"].iframe.offsetWidth/100),100))]+"px",window.addEventListener("message",function(a){if("undefined"!=typeof a.data["datawrapper-height"])for(var b in a.data["datawrapper-height"])if("BjMlq"==b)window.datawrapper["BjMlq"].iframe.style.height=a.data["datawrapper-height"][b]+"px"});</script>
 +
</html>
 +
 +
This variant adds social-media buttons for telling everyone about your latest and greatest data plots.
  
 
== Plotting with Gnuplot ==
 
== Plotting with Gnuplot ==
Line 102: Line 139:
 
[[File:gnuplot-barchart.png|center|500px]]
 
[[File:gnuplot-barchart.png|center|500px]]
  
 +
 +
Gnuplot can be compiled with emscripten to be used directly inside of a webpage (without a server backend):
 +
* https://github.com/chhu/gnuplot-JS
  
 
== Jupyter/pandas/matplotlib ==
 
== Jupyter/pandas/matplotlib ==
Line 185: Line 225:
  
 
[[File:jupyterlab-tabbed-workspace.png|center|700px]]
 
[[File:jupyterlab-tabbed-workspace.png|center|700px]]
 +
 +
=== Further reading for jupyterlabs and pandas ===
 +
 +
* https://notebooks.azure.com/versae/libraries/cidr-data-visualization/html/data_visualization_filled.ipynb  This jupyter notebook covers more plotting possibilities with pandas, such as histograms and other python libraries such as seaborn.
 +
 +
* http://songhuiming.github.io/pages/2017/04/02/jupyter-and-pandas-display
  
 
== R markdown notebooks ==
 
== R markdown notebooks ==
Line 191: Line 237:
  
  
== Datawrapper ==
+
== Other data plotting systems ==
 +
 
 +
* Bokeh https://bokeh.pydata.org  This is a python visualization library
 +
* Seaborn https://seaborn.pydata.org Another python visualization library
 +
 
 +
For creating your own interactive plots on the web in Javascript:
 +
* D3 https://d3js.org
 +
* Example barchart using D3: https://bl.ocks.org/mbostock/3885304
 +
 
 +
 
  
https://www.datawrapper.de
+
{{humdrum_labs}}

Latest revision as of 01:16, 30 June 2018

This lab is about plotting data, and doing further analysis of the raw data extracted from Humdrum files.


There are several possibilities for plotting. We will focus on the last one in the lab, but here are other possibilities:


Load data into a spreadsheet

You can copy-and-paste data into a spreadsheet, such as Microsoft Excel, Google Sheets, or similar.

In MacOS, try the command:

    humcat -s h://chorales | deg -at | serialize | ridx -H | egrep -v "=|r" | sortcount | pbcopy

pbcopy is used to copy data to the clipboard.

This will extract a count of scale-degrees in Bach chorales:

15628	5
14991	1
11710	3
10721	2 
9435	4
7761	6 
5742	7
4728	7-
1135	6+
1134	4+ 
556	3+
332	2-
326	1+
324	5+
50	3-
41	5-
32	2+
11	6-
7	1-
2	4-
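
If you would rather inspect these counts in a program than in a spreadsheet, the following is a minimal Python sketch that parses tab-separated count/label lines like the ones above and prints each scale degree's share of the total. The function name parse_counts and the abbreviated sample data are illustrative assumptions; they are not part of the Humdrum tools.

```python
def parse_counts(text):
    """Parse lines of the form '<count><TAB><label>' into (label, count) pairs."""
    pairs = []
    for line in text.splitlines():
        line = line.strip()  # also drops stray trailing spaces after the label
        if not line:
            continue  # skip blank lines
        count, label = line.split("\t")
        pairs.append((label.strip(), int(count)))
    return pairs


# Abbreviated sample: the first three rows and the last row of the table above.
sample = "15628\t5\n14991\t1\n11710\t3\n2\t4-\n"

pairs = parse_counts(sample)
total = sum(count for _, count in pairs)
for label, count in pairs:
    print(f"{label}\t{count}\t{100 * count / total:.1f}%")
```

To use it on the real data, save the command output to a file (for example by replacing pbcopy with a redirect to a file) and pass the file contents to parse_counts.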


Open up a spreadsheet program and paste the resulting data into the spreadsheet.


Excel-scalegrees-chorales.png

Notice that some of the cells in the B column are left justified while others are right justified. This is because Excel is autodetecting the format of each cell. It is right justifying the numbers, and left justifying the text.
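
The autodetection described above can be imitated with a rough Python sketch: a label is treated as a number if it parses as one, and as text otherwise. This is only a simplified stand-in for what a spreadsheet does (Excel's real rules are more elaborate), and the helper name looks_numeric is hypothetical.

```python
def looks_numeric(cell):
    """Return True if the cell text parses as a number (a rough guess at spreadsheet behavior)."""
    try:
        float(cell)
        return True
    except ValueError:
        return False


# A few scale-degree labels from the B column.
labels = ["5", "1", "7-", "6+", "4+", "2"]
for label in labels:
    kind = "number (right-justified)" if looks_numeric(label) else "text (left-justified)"
    print(f"{label!r}: {kind}")
```

Labels with a + or - suffix, such as 7- and 6+, fail to parse as numbers, which is why they end up as text while plain digits do not.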

Mark the entire B column as text: click on the "B" at the top of the column, right-click, choose "Format Cell..." from the context menu that appears, and choose Text as the type for the column cells:

Excel-format-cells-as-text.png

Now all cells in the B column are text:

Excel-cells-are-text.png

Switch the order of the columns and then create a bar chart:

Excel-chart1.png

Compare this to the bar chart created further below with pandas/jupyter:

Excel-bar-chart-jupyter.png


Datawrapper

Even easier than using a spreadsheet, try the datawrapper website to generate the same histogram:

https://www.datawrapper.de

This site "publishes" your plot and allows access to the underlying data.

Try clicking on the "Get the data" link on the chart. This will download a CSV version of the data (the input was in TSV format).
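
The difference between the two formats is only the field separator. As a minimal sketch using Python's standard csv module, the tab-separated input can be converted to comma-separated text as follows. The function name tsv_to_csv is an assumption for illustration, and the file that Datawrapper actually serves may differ in details such as a header row.

```python
import csv
import io


def tsv_to_csv(tsv_text):
    """Convert tab-separated text to comma-separated text."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerows(reader)  # the csv module adds quoting only where a field needs it
    return out.getvalue()


# Three rows taken from the scale-degree counts earlier in this lab.
tsv = "15628\t5\n14991\t1\n4728\t7-\n"
print(tsv_to_csv(tsv), end="")
```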

The above chart was created with this code copied from the datawrapper website:

  <iframe id="datawrapper-chart-vhkHL" src="//datawrapper.dwcdn.net/vhkHL/1/" scrolling="no" 
     frameborder="0" allowtransparency="true" style="width: 0; min-width: 100% !important;" height="548">
  </iframe>
  <script type="text/javascript">
     if("undefined"==typeof window.datawrapper)window.datawrapper={};window.datawrapper["vhkHL"]={},window.datawrapper["vhkHL"].embedDeltas=
      {"100":648,"200":573,"300":573,"400":548,"500":548,"700":548,"800":548,"900":548,"1000":548},
     window.datawrapper["vhkHL"].iframe=document.getElementById("datawrapper-chart-vhkHL"),window.datawrapper["vhkHL"]
     .iframe.style.height=window.datawrapper["vhkHL"]
     .embedDeltas[Math.min(1e3,Math.max(100*Math.floor(window.datawrapper["vhkHL"].iframe.offsetWidth/100),100))]+
     "px",window.addEventListener("message",function(a){if("undefined"!=typeof a.data["datawrapper-height"])
     for(var b in a.data["datawrapper-height"])if("vhkHL"==b)window.datawrapper["vhkHL"]
     .iframe.style.height=a.data["datawrapper-height"][b]+"px"});</script>

Here is another variation:

This variant adds social-media buttons for telling everyone about your latest and greatest data plots.

Plotting with Gnuplot

Gnuplot is a handy command-line plotting program. Here is an example of plotting the same data in gnuplot:

First save the data to a file:

          humcat -s h://chorales | deg -at | serialize | ridx -H | egrep -v "=|r" | sortcount > data.txt

On MacOS, install gnuplot with Homebrew:

         brew install gnuplot

Then create a file called plotbar with these contents:

 #!/usr/bin/env gnuplot 
 
 set terminal svg size 800,500 enhanced font "Helvetica,20"
 set output "output.svg"  
 
 set style data histogram
 set style fill solid
 set title "Scale degrees used in Bach chorales"
 
 unset key
 
 plot "data.txt" using 1:xtic(2) linecolor rgb "#ff0088"


Run the script with this command:

  chmod 0755 plotbar
  ./plotbar

This should create a file called output.svg that looks like this:


Gnuplot-barchart.png


Gnuplot can be compiled with emscripten to be used directly inside of a webpage (without a server backend):

  • https://github.com/chhu/gnuplot-JS

Jupyter/pandas/matplotlib

The rest of the lab examines how to load and plot similar data in a Jupyter notebook using pandas and matplotlib to display the barchart.

Check out the online version of the notebook here: http://nbviewer.jupyter.org/url/notebooks.humdrum.org/jupyter/craig/barplots/barplots.ipynb

Also, there is a download button on that page to download a local copy of the original jupyter notebook file.
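
To give a rough idea of what such a notebook cell can look like, here is a minimal sketch that loads a few of the count/label rows with pandas and draws a bar chart with matplotlib. It is not the code from the linked notebook; the column names count and degree and the in-memory sample data are assumptions made for illustration, and it requires pandas and matplotlib (installation is described below).

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch also runs outside a notebook
import pandas as pd

# A few rows of the tab-separated scale-degree counts from earlier in this lab.
tsv = "15628\t5\n14991\t1\n11710\t3\n4728\t7-\n"

# Read the degree column as text so that labels such as "7-" and "5" are treated alike.
df = pd.read_csv(io.StringIO(tsv), sep="\t", header=None,
                 names=["count", "degree"], dtype={"degree": str})

ax = df.plot.bar(x="degree", y="count", legend=False,
                 title="Scale degrees used in Bach chorales")

# Render the figure to an in-memory PNG; in a notebook the chart is shown inline instead.
buf = io.BytesIO()
ax.get_figure().savefig(buf, format="png")
```

Reading the labels as text mirrors the spreadsheet step of formatting the label column as Text.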

To run jupyter on your computer, run these commands (they may vary depending on your operating system and other installation systems, but this worked well for me in MacOS):

     brew install python3

Also some necessary modules for python:

    pip3 install matplotlib
    pip3 install pandas

This will take a while. Then install jupyterlab, which is the development version of the jupyter notebook web interface:

     pip3 install jupyterlab

A nice, but optional thing to do is install the table-of-contents tab plugin:

    brew install nodejs
    jupyter-labextension install jupyterlab-toc

But I have had problems on most computers getting this to install without the installation process hanging...


To start jupyterlab, type in the terminal:

    jupyter-lab

If you want to use a specific browser because jupyter chose the wrong one:

   jupyter-lab --browser=chrome
   jupyter-lab --browser=firefox
   jupyter-lab --browser=safari

Here is the default window of jupyter-lab:

Jupyter-lab-window.png

To create a new notebook, click on the "Python 3" icon in the "Notebook" section of the Launcher. You can load a notebook saved on the computer by using the file viewer on the left side of the window. Click on the "Files" tab on the far left to hide the file menu. Here is what the browser looks like after doing all of that and then typing a test command:

Jupyter-lab-notebook2.png

Now go to the webpage http://nbviewer.jupyter.org/url/notebooks.humdrum.org/jupyter/craig/barplots/barplots.ipynb

and download that notebook from the icon on the top right:

Nbviewer-download-icon.png

You will probably have to copy the notebook to the same directory in which you started jupyter-lab.

Before running the commands in the notebook, you may have to update your humdrum tools installation:

   cd `which transpose | sed 's/humdrum-tools.*/humdrum-tools/'` && make update && make

This is necessary because I had to update the transpose tool to allow a stream of multiple input files at one time.

Useful tips for working in jupyter notebooks

  • To run a program in a cell, click in the cell with a mouse and then type shift-return. This will evaluate the cell and print the results underneath.
  • To add a new cell above the current cell, press esc to exit from editing the cell, then type a. A new cell should be added above the current one. Similarly, b will create a new cell below the current one.
  • Text can be added to the page by converting a cell to the markdown format, and then typing markdown data. To convert to markdown, click in the cell, press esc to defocus from the text, and then type m. Then click in the cell again and add text. When finished, press shift-enter to render the markdown as formatted text.

Jupyterlab also allows multiple sets of tabbed workspaces, which is useful:

Jupyterlab-tabbed-workspace.png

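Further reading for jupyterlabs and pandas

  • https://notebooks.azure.com/versae/libraries/cidr-data-visualization/html/data_visualization_filled.ipynb This jupyter notebook covers more plotting possibilities with pandas, such as histograms, and other python libraries such as seaborn.
  • http://songhuiming.github.io/pages/2017/04/02/jupyter-and-pandas-display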

R markdown notebooks

The R language is another useful language for statistical analysis and plotting of data. This is not covered in the lab, but check out the website: https://rmarkdown.rstudio.com/r_notebooks.html


Other data plotting systems

  • Bokeh https://bokeh.pydata.org This is a python visualization library
  • Seaborn https://seaborn.pydata.org Another python visualization library

For creating your own interactive plots on the web in Javascript:

  • D3 https://d3js.org
  • Example barchart using D3: https://bl.ocks.org/mbostock/3885304



Lab 1 (intro) Lab 2 (Essen) Lab 3 (searching) Lab 4 (JRP) Lab 5 (Wikifonia) Lab 6 (bar chart) Lab 7 (regular expressions) Lab 8 (chorck & cint)