This is a SageMath worksheet that accompanies the Apache Spark Databricks notebook in Scala on K-Means clustering of a small sample of the 1M Songs data. The notebook is used in the course on Scalable Data Engineering Science, freely available from https://lamastex.github.io/scalable-data-science/sds/2/x/
This is a SageMath (Python) worksheet to get you started with interactively visualizing points in 3D.
This is a support notebook in SageMath (strictly speaking, a worksheet) for the Scalable Data Science course. It is mostly used as a visual cognitive tool.
SageMath is perhaps the largest open-source effort in mathematical computing, and you can use it for serious mathematical work.
See the FAQ at http://doc.sagemath.org/html/en/faq/index.html for why you might want to use SageMath for your own research (it is Python-based).
Finally, CoCalc, which this worksheet runs on, is free for light workloads, so you can do your homework, do research, collaborate with your colleagues in social-media fashion, and so on here.
For the plotting we do next, see the docs here: https://doc.sagemath.org/html/en/reference/plot3d/sage/plot/plot3d/shapes2.html
The 3D interactive visualization possibilities are described here: http://sagemath.wikispaces.com/point3d http://sagemath.wikispaces.com/plot3d (see the 10-minute YouTube video in the link).
Plotting the points from a csv file
See https://ask.sagemath.org/question/9393/how-to-plot-data-from-a-file/.
The file was downloaded from the display in the Databricks notebook at https://lamastex.github.io/scalable-data-science/sds/2/2/.
The first 10 lines of the file look like this:
There are 1000 rows in the file, which has been uploaded to this SageMath worksheet in CoCalc. The file is in the current directory, at the path passed to the Python open() function below.
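As a minimal sketch of the reading step, the snippet below parses a CSV into 3D points and cluster labels using only the Python standard library. The column names, the sample values, and the assumption that the first three columns are coordinates and the fourth is a cluster label are all hypothetical; the actual layout of the uploaded file may differ. An in-memory string stands in for the file so that the sketch runs anywhere.

```python
import csv
import io

# Hypothetical stand-in for the uploaded file: the header names and
# the values are made up for illustration only.
SAMPLE_CSV = """col_a,col_b,col_c,prediction
1.0,2.0,3.0,0
4.0,5.0,6.0,1
7.0,8.0,9.0,0
"""


def read_points(handle):
    """Return (points, labels) from an open CSV handle.

    Assumes (hypothetically) one header row, three numeric coordinate
    columns, and an integer cluster label in the fourth column.
    """
    reader = csv.reader(handle)
    next(reader)  # skip the header row
    points, labels = [], []
    for row in reader:
        if not row:  # tolerate blank lines
            continue
        points.append(tuple(float(v) for v in row[:3]))
        labels.append(int(row[3]))
    return points, labels


points, labels = read_points(io.StringIO(SAMPLE_CSV))
print(points[:2], labels[:2])
```

With a real file, the same function could be called as `read_points(open(path))`, where `path` is whatever file name was used when uploading.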
Let's just see these points in 3D, interactively, using primitive graphics objects.
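One possible way to do this, sketched under assumptions, is to group the points by cluster label and draw each group with Sage's `point3d` in its own color. The sample points, the labels, and the color palette are made up, and this is not necessarily how the worksheet's own cell does it. The Sage import is guarded so the grouping logic also runs in plain Python.

```python
from collections import defaultdict

# Made-up sample data: three 3D points and their cluster labels.
points = [(1.0, 2.0, 3.0), (4.0, 5.0, 6.0), (7.0, 8.0, 9.0)]
labels = [0, 1, 0]
palette = ["red", "blue", "green", "orange"]  # hypothetical color choice


def group_by_cluster(points, labels):
    """Map each cluster label to the list of points assigned to it."""
    groups = defaultdict(list)
    for point, label in zip(points, labels):
        groups[label].append(point)
    return dict(groups)


groups = group_by_cluster(points, labels)

try:
    # Only available inside a Sage session or worksheet.
    from sage.all import point3d
except ImportError:
    point3d = None

if point3d is not None:
    # One point3d graphics object per cluster; adding them overlays the groups.
    scene = sum(
        point3d(pts, size=5, color=palette[label % len(palette)])
        for label, pts in sorted(groups.items())
    )
    scene.show()
```

Grouping first keeps the number of graphics objects equal to the number of clusters rather than the number of points.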
We can compare the clusterings in 3D with and without taking the log of duration, and understand their 2D scatter plots.
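The log transform itself is a small step. The sketch below assumes, hypothetically, that duration is the first coordinate of each point and that all durations are positive; the sample values are made up. It also shows how a 2D projection for a scatter plot could be taken from the transformed points.

```python
import math

# Made-up sample points; the first coordinate is ASSUMED to be duration.
points = [(100.0, 2.0, 3.0), (250.0, 5.0, 6.0)]


def log_duration(points, index=0):
    """Return a copy of points with the natural log applied to one coordinate."""
    transformed = []
    for point in points:
        coords = list(point)
        coords[index] = math.log(coords[index])  # requires a positive value
        transformed.append(tuple(coords))
    return transformed


logged = log_duration(points)

# A 2D projection (first two coordinates) suitable for a scatter plot.
pairs_2d = [(p[0], p[1]) for p in logged]
```

Plotting `points` and `logged` side by side is one way to see how the transform changes the spread along the duration axis.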
To manipulate the rendering of the interactive 3D plot above, uncomment the line, put the cursor after the '.', and hit TAB to see the available methods.
Also, don't forget the SageMath docs at http://doc.sagemath.org/html/en/index.html (Sage has arithmetic, geometry, cryptography, calculus, and a lot more). Finally, CoCalc is free for small learning workloads.