Friday, August 25, 2017

Data visualization with Golang

Overview

When I plot data to check its behavior and decide on an approach, I usually use Python with matplotlib. These days that is one of the best choices from a data science point of view. But for the last three or four weeks I have been using Go, and as a data scientist I feel obliged to know how to plot data in it, at least at a basic level.
This is a note on basic plotting in Golang.


Environment

Data


This time, I just used normally distributed random numbers.
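As a minimal sketch of this data step, standard normal samples can be drawn with math/rand alone. The helper name normalSamples and the fixed seed 42 are my own illustrative assumptions, not part of the original code; the seed is only there to make runs reproducible.

```go
package main

import (
	"fmt"
	"math/rand"
)

// normalSamples returns n draws from the standard normal distribution.
// The helper name and the explicit seed are illustrative assumptions.
func normalSamples(n int, seed int64) []float64 {
	r := rand.New(rand.NewSource(seed))
	out := make([]float64, n)
	for i := range out {
		out[i] = r.NormFloat64()
	}
	return out
}

func main() {
	xs := normalSamples(1000, 42)
	sum := 0.0
	for _, x := range xs {
		sum += x
	}
	// The sample mean should be close to 0 for standard normal data.
	fmt.Printf("n=%d mean=%.3f\n", len(xs), sum/float64(len(xs)))
}
```

Since plotter.Values is a slice of float64, a slice built this way can be converted with plotter.Values(xs) before plotting.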

Library


gonum/plot provides the typical plot functions. You can install it with “go get”. If an error about hg appears, install Mercurial; on a Mac you can do that with brew.

Make Simple graphs

Code


The code below draws three types of simple plots: a box plot, a bar plot, and a histogram.
The plots have no decoration at all, not even color.

package main
import (
    "github.com/gonum/plot"
    "github.com/gonum/plot/plotter"
    "github.com/gonum/plot/vg"
    "math/rand"
)

func main() {
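    // Make data: 1000 draws from the standard normal distribution.
    var values plotter.Values
    for i := 0; i < 1000; i++ {
        values = append(values, rand.NormFloat64())
    }
    boxPlot(values)
    // The bar plot uses only the first four values.
    barPlot(values[:4])
    histPlot(values)
}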
func boxPlot(values plotter.Values) {
    p, err := plot.New()
    if err != nil {
        panic(err)
    }
    p.Title.Text = "box plot"

    box, err := plotter.NewBoxPlot(vg.Length(15), 0.0, values)
    if err != nil {
        panic(err)
    }
    p.Add(box)

    if err := p.Save(3*vg.Inch, 3*vg.Inch, "box.png"); err != nil {
        panic(err)
    }
}

func barPlot(values plotter.Values) {
    p, err := plot.New()
    if err != nil {
        panic(err)
    }
    p.Title.Text = "bar plot"

    bar, err := plotter.NewBarChart(values, 15)
    if err != nil {
        panic(err)
    }
    p.Add(bar)

    if err := p.Save(3*vg.Inch, 3*vg.Inch, "bar.png"); err != nil {
        panic(err)
    }
}

func histPlot(values plotter.Values) {
    p, err := plot.New()
    if err != nil {
        panic(err)
    }
    p.Title.Text = "histogram plot"

    hist, err := plotter.NewHist(values, 20)
    if err != nil {
        panic(err)
    }
    p.Add(hist)

    if err := p.Save(3*vg.Inch, 3*vg.Inch, "hist.png"); err != nil {
        panic(err)
    }
}

When you execute this code, three simple plot images are saved.


Let’s check them one by one.

Box plot


A box plot shows the quantile points and the outliers, so you can check the distribution of the given values.

func boxPlot(values plotter.Values) {
    p, err := plot.New()
    if err != nil {
        panic(err)
    }
    p.Title.Text = "box plot"

    box, err := plotter.NewBoxPlot(vg.Length(15), 0.0, values)
    if err != nil {
        panic(err)
    }
    p.Add(box)

    if err := p.Save(3*vg.Inch, 3*vg.Inch, "box.png"); err != nil {
        panic(err)
    }
}
The code above makes a simple box plot image. The call to plotter.NewBoxPlot() sets up the box plot. Roughly, its three arguments are the width of the box, the location of the box along the axis, and the data to plot.
The saved plot looks like this.
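To make the quantile points and outliers concrete, here is a rough sketch that computes them by hand with the standard library. It assumes linear interpolation between order statistics and the common 1.5 × IQR rule for outliers; gonum/plot may compute its quantiles and whiskers differently, and the names quantile and boxStats are hypothetical helpers of mine.

```go
package main

import (
	"fmt"
	"sort"
)

// quantile returns the p-quantile (0 <= p <= 1) of already sorted data,
// using linear interpolation between neighbouring order statistics.
func quantile(sorted []float64, p float64) float64 {
	if len(sorted) == 0 {
		return 0
	}
	pos := p * float64(len(sorted)-1)
	lo := int(pos)
	if lo >= len(sorted)-1 {
		return sorted[len(sorted)-1]
	}
	frac := pos - float64(lo)
	return sorted[lo] + frac*(sorted[lo+1]-sorted[lo])
}

// boxStats returns Q1, the median, Q3, and the values lying outside
// the fences Q1 - 1.5*IQR and Q3 + 1.5*IQR (an assumed outlier rule).
func boxStats(values []float64) (q1, med, q3 float64, outliers []float64) {
	s := append([]float64(nil), values...) // copy so the input is not reordered
	sort.Float64s(s)
	q1, med, q3 = quantile(s, 0.25), quantile(s, 0.5), quantile(s, 0.75)
	iqr := q3 - q1
	low, high := q1-1.5*iqr, q3+1.5*iqr
	for _, v := range s {
		if v < low || v > high {
			outliers = append(outliers, v)
		}
	}
	return q1, med, q3, outliers
}

func main() {
	q1, med, q3, out := boxStats([]float64{1, 2, 3, 4, 5, 6, 7, 8, 9, 100})
	fmt.Println("Q1:", q1, "median:", med, "Q3:", q3, "outliers:", out)
}
```

The box spans Q1 to Q3 with a line at the median, and points beyond the fences are the ones drawn separately as outliers.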



Bar plot


Actually, I don’t use bar plots much. But sometimes they work well for emphasis.

func barPlot(values plotter.Values) {
    p, err := plot.New()
    if err != nil {
        panic(err)
    }
    p.Title.Text = "bar plot"

    bar, err := plotter.NewBarChart(values, 15)
    if err != nil {
        panic(err)
    }
    p.Add(bar)

    if err := p.Save(3*vg.Inch, 3*vg.Inch, "bar.png"); err != nil {
        panic(err)
    }
}

This is as simple as the box plot. The function plotter.NewBarChart() takes two arguments: the first is the target data and the second is the width of each bar.
The plot image looks like this.




Histogram


The histogram is one of the fundamental tools of data analysis.

func histPlot(values plotter.Values) {
    p, err := plot.New()
    if err != nil {
        panic(err)
    }
    p.Title.Text = "histogram plot"

    hist, err := plotter.NewHist(values, 20)
    if err != nil {
        panic(err)
    }
    p.Add(hist)

    if err := p.Save(3*vg.Inch, 3*vg.Inch, "hist.png"); err != nil {
        panic(err)
    }
}

There is one point to note about the histogram. plotter.NewHist() takes two arguments, and the second is the number of bins. The histogram looks different depending on the number of bins, which can sometimes lead you to misjudge the data.
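The effect of the bin count can be seen without drawing anything. The sketch below counts the same data into equal-width bins for several bin numbers. It is my own simplified binning, not gonum/plot's actual implementation, and the helper name binCounts and the seed are assumptions.

```go
package main

import (
	"fmt"
	"math/rand"
)

// binCounts splits the range [min, max] of values into n equal-width bins
// and counts how many values fall into each. The maximum value is placed
// in the last bin. It returns nil for empty input or a non-positive n.
func binCounts(values []float64, n int) []int {
	if n <= 0 || len(values) == 0 {
		return nil
	}
	lo, hi := values[0], values[0]
	for _, v := range values {
		if v < lo {
			lo = v
		}
		if v > hi {
			hi = v
		}
	}
	counts := make([]int, n)
	width := (hi - lo) / float64(n)
	for _, v := range values {
		i := 0
		if width > 0 {
			i = int((v - lo) / width)
		}
		if i >= n {
			i = n - 1 // clamp the maximum into the last bin
		}
		counts[i]++
	}
	return counts
}

func main() {
	r := rand.New(rand.NewSource(1)) // fixed seed, only for reproducibility
	values := make([]float64, 1000)
	for i := range values {
		values[i] = r.NormFloat64()
	}
	// Few bins hide detail; many bins make the shape noisy.
	for _, n := range []int{5, 20, 100} {
		fmt.Println(n, "bins:", binCounts(values, n))
	}
}
```

With few bins the bell shape is smoothed into a handful of bars, and with many bins individual bars start to fluctuate, so it is worth trying more than one setting before drawing conclusions.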




Make graph fancy


I made three types of simple graphs, but they are too simple. Even though I’m not visualization-oriented, I don’t like them as they are. I’ll try to make them look better, focusing on the histogram.
The Histogram struct has Bins, Width, FillColor, and draw.LineStyle. So I’ll set FillColor and the LineStyle.
First, just the color.

func histPlot(values plotter.Values) {
    p, err := plot.New()
    if err != nil {
        panic(err)
    }
    p.Title.Text = "histogram plot"

    hist, err := plotter.NewHist(values, 20)
    if err != nil {
        panic(err)
    }
    hist.FillColor = plotutil.Color(2)
    p.Add(hist)

    if err := p.Save(3*vg.Inch, 3*vg.Inch, "hist.png"); err != nil {
        panic(err)
    }
}


Setting FillColor to the result of plotutil.Color() gives the graph a fill color. This needs "github.com/gonum/plot/plotutil" added to the imports. You can easily change the color by changing the value of the argument.
Next, LineStyle.

func histPlot(values plotter.Values) {
    p, err := plot.New()
    if err != nil {
        panic(err)
    }
    p.Title.Text = "histogram plot"

    hist, err := plotter.NewHist(values, 20)
    if err != nil {
        panic(err)
    }
    hist.FillColor = plotutil.Color(2)
    hist.LineStyle.Color = plotutil.Color(4)
    p.Add(hist)

    if err := p.Save(3*vg.Inch, 3*vg.Inch, "hist_cl.png"); err != nil {
        panic(err)
    }
}

The LineStyle struct has Color, Width, Dashes, and DashOffs. In this code, I set only Color. The image looks like this; the outlines are drawn in red.