Abstract
In this article, I'll try Julia's HTTP server. Concretely, the goal is to build an HTTP server that executes k-means prediction. In machine learning tasks, it is common to deploy a trained model behind an HTTP server and post data to it. As a first step toward that in Julia, I'll try it here. The package used is HTTP.jl. Note that this article only covers the initial steps and does not follow best practices. When you build an HTTP server for a real machine learning task, I strongly recommend reading the official documentation after this article.
Try sample
In Julia, we have several choices for an HTTP server. In this article, I'll use one of them, HTTP.jl. First, let's check the server side. The following code is based on the official site's sample code, with some lines omitted. Save it as server.jl.
using HTTP

HTTP.listen() do request::HTTP.Request
    try
        return HTTP.Response("Hello")
    catch e
        return HTTP.Response(404, "Error: $e")
    end
end
From your terminal, you can launch the server.
julia server.jl
I- Listening on: 127.0.0.1:8081
Next, the client side. As an initial step, I'll send a simple HTTP request message and check the response. With HTTP.request(), we can use GET, POST, and other methods.
Let's use the GET method and check the response.
julia> using HTTP
julia> res = HTTP.request("GET", "http://localhost:8081")
HTTP.Messages.Response:
"""
HTTP/1.1 200 OK
Transfer-Encoding: chunked
Hello"""
The type of the response is HTTP.Messages.Response. It has five fields.
julia> typeof(res)
HTTP.Messages.Response
julia> fieldnames(res)
5-element Array{Symbol,1}:
:version
:status
:headers
:body
:request
As a response, we expect the body to be "Hello". We can check by accessing the body field. It is an Array of UInt8; by passing it to String(), we can see the expected output.
julia> res.body
5-element Array{UInt8,1}:
0x48
0x65
0x6c
0x6c
0x6f
julia> String(res.body)
"Hello"
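One caveat worth knowing here (this is Julia itself, not HTTP.jl): in recent Julia versions, calling String() on a Vector{UInt8} takes ownership of the bytes and empties the vector, so call copy() first if you still need the raw bytes afterwards. A minimal sketch:

```julia
# String(v) on a Vector{UInt8} takes ownership: v is emptied afterwards.
bytes = UInt8[0x48, 0x65, 0x6c, 0x6c, 0x6f]  # the bytes of "Hello"

s = String(copy(bytes))  # copy first, so `bytes` keeps its 5 elements
```

For a one-off inspection of res.body this does not matter, but it can surprise you if you convert the body and then try to reuse it.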
Embed a function to calculate 2x
Now I'll set an arbitrary function on the server side. As an example, I'll make it return two times the value posted from the client side. The server-side code is below. There are probably several ways to parse the request message; it is worth checking the alternatives.
using HTTP

twoTimes = function(x)
    return 2x
end

HTTP.listen() do request::HTTP.Request
    try
        # parse inside the try block so malformed input is also caught
        body = parse(Float64, String(request.body))
        return HTTP.Response(string(twoTimes(body)))
    catch e
        return HTTP.Response(404, "Error: $e")
    end
end
After launching the server, post the value "3" from the Julia REPL. The response body is "6.0".
julia> res = HTTP.request("POST", "http://localhost:8081", [], "3")
HTTP.Messages.Response:
"""
HTTP/1.1 200 OK
Transfer-Encoding: chunked
6.0"""
julia> res.body
3-element Array{UInt8,1}:
0x36
0x2e
0x30
julia> String(res.body)
"6.0"
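The server's parse-and-format round trip can be exercised without a running server. A minimal sketch of the same logic (the helper name handleBody is mine, not part of HTTP.jl):

```julia
# The same transformation the server applies to a request body:
twoTimes(x) = 2x
handleBody(body::AbstractString) = string(twoTimes(parse(Float64, body)))

handleBody("3")  # "6.0": parse(Float64, "3") gives 3.0, doubled and stringified
```

This also explains why posting "3" comes back as "6.0" rather than "6": the body is parsed as a Float64 before doubling.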
Embed k-means
Finally, I'll embed the k-means predict function into the server. First of all, we need to touch on what "prediction" means for k-means. For k-means itself, please check the following article. In k-means, as a consequence of clustering, we obtain the centroids of the clusters. So, in the prediction phase, the nearest centroid to an incoming data point indicates the cluster that point belongs to.
Roughly speaking, if the server-side program holds the centroid information, we can predict the cluster of an incoming data point by calculating the distance between the point and each centroid.
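This nearest-centroid rule fits in a few lines. A sketch with toy centroids, using norm() from the LinearAlgebra standard library for Euclidean distance and argmin (indmin in older Julia), independent of the repository's calcDist:

```julia
using LinearAlgebra  # norm() for Euclidean distance

# Toy centroids for two clusters
centroids = [[0.0, 0.0], [10.0, 10.0]]

# Predicted cluster = index of the closest centroid
predictCluster(point) = argmin([norm(point - c) for c in centroids])

predictCluster([9.0, 9.5])  # 2: closer to [10.0, 10.0]
```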
You can get the k-means code I'll use here from the following repository.
If you clone the code above, the preparation is done.
We will follow these steps.
- make data for k-means
- k-means clustering
- embed k-means information to server side
- post the new data point from client side
make data for k-means
The following code makes the data and loads the necessary packages.

include("./Clustering/src/kmeans.jl")
using DataFrames
using Distributions
using PyPlot
function makeData()
    groupOne = rand(MvNormal([10.0, 10.0], 10.0 * eye(2)), 100)
    groupTwo = rand(MvNormal([0.0, 0.0], 10.0 * eye(2)), 100)
    groupThree = rand(MvNormal([15.0, 0.0], 10.0 * eye(2)), 100)
    return hcat(groupOne, groupTwo, groupThree)'
end
data = makeData()
scatter(data[1:100, 1], data[1:100, 2], color="blue")
scatter(data[101:200, 1], data[101:200, 2], color="red")
scatter(data[201:300, 1], data[201:300, 2], color="green")
As you can see, the data is composed of three distributions.
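One portability note: eye() was removed in Julia 1.0, so makeData() as written only runs on older Julia. On a modern Julia, the same diagonal covariance matrix can be built from the identity I in the LinearAlgebra standard library (a sketch of just the matrix construction):

```julia
using LinearAlgebra

# Modern equivalent of the old 10.0 * eye(2): a diagonal covariance matrix
covMat = 10.0 * Matrix{Float64}(I, 2, 2)
# covMat is the 2x2 matrix with 10.0 on the diagonal
```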
k-means clustering
We can do the clustering with the following code.

result = kMeans(DataFrame(data), 3)

The returned value is a composite type. With the fieldnames() function, we can check its fields.

fieldnames(result)
6-element Array{Symbol,1}:
:x
:k
:estimatedClass
:centroids
:iterCount
:costArray
The outcome of the clustering is as follows. As you know, the data is composed of three groups: groupOne, groupTwo and groupThree. Here, the correspondence between group and cluster label is as below.
- groupOne: 3
- groupTwo: 1
- groupThree: 2
println(result.estimatedClass)
[3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 2, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 1, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 1, 3, 3, 3, 3, 3, 3, 3, 3, 1, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 3, 1, 1, 1, 1, 1, 1, 3, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 3, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2]
The only information the server side needs is the centroids. By default, the centroids field keeps the centroids from every iteration.
println(result.centroids)
Array[Array{Float64,1}[[2.39863, -3.66735], [14.4972, 1.66306], [4.40598, 6.32293]], Array{Float64,1}[[-0.278207, -1.37692], [14.5054, 2.35933], [6.62845, 8.77454]], Array{Float64,1}[[-0.339455, -0.343924], [14.5123, 1.29915], [9.11203, 10.33]], Array{Float64,1}[[-0.144399, 0.0686747], [14.5367, 0.318053], [10.3518, 10.4564]], Array{Float64,1}[[-0.0691016, 0.197866], [14.5357, -0.00769022], [10.6975, 10.3584]], Array{Float64,1}[[-0.0691016, 0.197866], [14.3853, -0.188593], [10.9187, 10.3307]], Array{Float64,1}[[-0.062882, 0.300898], [14.3853, -0.188593], [11.0214, 10.329]], Array{Float64,1}[[-0.062882, 0.300898], [14.3853, -0.188593], [11.0214, 10.329]]]
What we need is only the final result.

println(result.centroids[end])
Array{Float64,1}[[-0.062882, 0.300898], [14.3853, -0.188593], [11.0214, 10.329]]
embed k-means information to server side
In a practical situation, we should build a proper system to deliver the centroid information to the server side. But that is not the point here, so I'll write the centroids directly into the code. The server calculates the distance between the incoming data point and each centroid. The index of the nearest centroid, which corresponds to the cluster the data point belongs to, is returned as the response.
using HTTP

include("./Clustering/src/kmeans.jl")

centroids = [[-0.062882, 0.300898], [14.3853, -0.188593], [11.0214, 10.329]]

function findNearestCentroid(centroids, dataPoint)
    distances = []
    for centroid in centroids
        push!(distances, calcDist(centroid, dataPoint))
    end
    return indmin(distances)
end
HTTP.listen() do request::HTTP.Request
    try
        # parse inside the try block so malformed input is also caught
        body = parse.(Float64, split(String(request.body), ","))
        return HTTP.Response(string(findNearestCentroid(centroids, body)))
    catch e
        return HTTP.Response(404, "Error: $e")
    end
end
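The broadcasted parse.(Float64, split(...)) line is the only request-specific part of the server and can be checked on its own:

```julia
# How the server turns a comma-separated body into a Vector{Float64}
body = "12.3454,11.0062"
point = parse.(Float64, split(body, ","))
# point == [12.3454, 11.0062]
```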
post the new data point from client side
The following three data points are for prediction. The variable names correspond to the expected clusters.

three = rand(MvNormal([10.0, 10.0], 10.0 * eye(2)), 1)
one = rand(MvNormal([0.0, 0.0], 10.0 * eye(2)), 1)
two = rand(MvNormal([15.0, 0.0], 10.0 * eye(2)), 1)
@show(one)
@show(two)
@show(three)
one = [-2.80662; -0.859332]
two = [8.60476; -4.64956]
three = [12.3454; 11.0062]
If we post the data points, the predictions are done on the server. From the responses, we can see the predictions are correct.
julia> res = HTTP.request("POST", "http://localhost:8081", [], "-2.80662,-0.859332")
HTTP.Messages.Response:
"""
HTTP/1.1 200 OK
Transfer-Encoding: chunked
1"""
julia> res = HTTP.request("POST", "http://localhost:8081", [], "8.60476,-4.64956")
HTTP.Messages.Response:
"""
HTTP/1.1 200 OK
Transfer-Encoding: chunked
2"""
julia> res = HTTP.request("POST", "http://localhost:8081", [], "12.3454,11.0062")
HTTP.Messages.Response:
"""
HTTP/1.1 200 OK
Transfer-Encoding: chunked
3"""