Kernel: Julia 1.0

Julia with DataFrames and Queryverse on CoCalc

shell commands - one-time setup

  • open a .term file

    JULIA_DEPOT_PATH="/home/user/qvtest" julia-1
    ... type "]" to enter pkg mode
    pkg> add DataFrames
    ... takes less than a minute
    pkg> add Queryverse
    ... takes 3 minutes
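The same one-time setup can probably be scripted instead of typed in pkg mode. The following is a minimal sketch, assuming the script is saved under a hypothetical name such as install.jl and started with the same JULIA_DEPOT_PATH setting as above; it uses the standard Pkg API rather than the interactive prompt.

```julia
# install.jl (hypothetical file name, not part of the original notebook)
# Run as: JULIA_DEPOT_PATH="/home/user/qvtest" julia-1 install.jl
import Pkg

# Install the two packages that were added interactively in pkg mode
Pkg.add(["DataFrames", "Queryverse"])

# Print the first depot so you can confirm where the packages were installed
println(first(DEPOT_PATH))
```

Install times should be similar to the interactive route, since the same packages are downloaded and built.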

Jupyter notebook... follow the steps below

  • select Julia 1.x Jupyter kernel

Note: the first "using Queryverse" can take up to 10 minutes; after that it takes about 30 seconds.

References

- Hal Snyder

VERSION
v"1.0.3"
DEPOT_PATH[1] = "/home/user/qvtest"
"/home/user/qvtest"
using Queryverse
df = DataFrame(name=["John", "Sally", "Kirk"], age=[23., 42., 59.], children=[3,5,2])

3 rows × 3 columns

     name    age      children
     String  Float64  Int64
1    John    23.0     3
2    Sally   42.0     5
3    Kirk    59.0     2
x = @from i in df begin
    @where i.age>30. && i.children > 2
    @select {Name=lowercase(i.name)}
    @collect DataFrame
end

1 rows × 1 columns

     Name
     String
1    sally
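The query above can also be written with the pipe-style operators from Query.jl, which Queryverse re-exports. This is a sketch of an equivalent formulation, not a cell from the original notebook; the variable name x2 is made up for illustration.

```julia
using Queryverse

df = DataFrame(name=["John", "Sally", "Kirk"], age=[23., 42., 59.], children=[3,5,2])

# Same filter and projection as the @from query, expressed as a pipeline:
# keep rows with age > 30 and more than 2 children, then lowercase the name.
x2 = df |>
    @filter(_.age > 30 && _.children > 2) |>
    @map({Name = lowercase(_.name)}) |>
    DataFrame
```

The pipeline form tends to read more naturally when a query is a simple chain of steps, while the @from block is closer to LINQ-style query syntax.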
save("mydata.csv", df)
# display first few lines of a text file (prints up to lines + 1 lines)
function fhead(fname, lines=4)
    open(fname) do file
        for i in enumerate(eachline(file))
            println(i[2])
            if i[1] > lines
                break
            end
        end
    end
end
fhead (generic function with 2 methods)
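A helper like the one above can also be written with Iterators.take from Base, which avoids the manual counter. The sketch below is an alternative, not the notebook's own code: the name head_lines is hypothetical, and it returns exactly n lines as a vector instead of printing them.

```julia
# Hypothetical alternative to fhead: return at most `n` lines of a file.
function head_lines(fname::AbstractString, n::Integer=4)
    # The do-block form of open closes the file even though we stop reading early
    open(fname) do io
        collect(Iterators.take(eachline(io), n))
    end
end

# Usage on a small temporary file
path, io = mktemp()
write(io, "a\nb\nc\n")
close(io)
foreach(println, head_lines(path, 2))
```

Returning the lines rather than printing them makes the helper easier to reuse, for example in a check on a CSV header.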
fhead("mydata.csv")
"name","age","children"
"John",23.0,3
"Sally",42.0,5
"Kirk",59.0,2
using VegaLite, VegaDatasets
dataset("cars") |> @vlplot(
    :point,
    x=:Horsepower,
    y=:Miles_per_Gallon,
    color=:Origin,
    width=400,
    height=400
)
[Plot output not shown: point plot of Miles_per_Gallon vs. Horsepower, colored by Origin]
cars = dataset("cars"); typeof(cars)
VegaDatasets.VegaDataset
# default number of rows when displaying DataFrame
ENV["LINES"] = 3
3
df = DataFrame(cars)

406 rows × 9 columns

     Miles_per_Gallon  Cylinders  Origin  Weight_in_lbs  Displacement  Acceleration  Name                       Year        Horsepower
     Float64⍰          Int64      String  Int64          Float64       Float64       String                     String      Int64⍰
1    18.0              8          USA     3504           307.0         12.0          chevrolet chevelle malibu  1970-01-01  130
2    15.0              8          USA     3693           350.0         11.5          buick skylark 320          1970-01-01  165
3    18.0              8          USA     3436           318.0         11.0          plymouth satellite         1970-01-01  150
cars |> @filter(_.Origin=="USA" && _.Weight_in_lbs>4000) |> DataFrame

67 rows × 9 columns

     Miles_per_Gallon  Cylinders  Origin  Weight_in_lbs  Displacement  Acceleration  Name               Year        Horsepower
     Float64⍰          Int64      String  Int64          Float64       Float64       String             String      Int64⍰
1    15.0              8          USA     4341           429.0         10.0          ford galaxie 500   1970-01-01  198
2    14.0              8          USA     4354           454.0         9.0           chevrolet impala   1970-01-01  220
3    14.0              8          USA     4312           440.0         8.5           plymouth fury iii  1970-01-01  215
cars |> @filter(_.Origin=="USA" && _.Weight_in_lbs>4000) |> save("us_heavy_cars.csv")
fhead("us_heavy_cars.csv")
"Miles_per_Gallon","Cylinders","Origin","Weight_in_lbs","Displacement","Acceleration","Name","Year","Horsepower"
15.0,8,"USA",4341,429.0,10.0,"ford galaxie 500","1970-01-01",198
14.0,8,"USA",4354,454.0,9.0,"chevrolet impala","1970-01-01",220
14.0,8,"USA",4312,440.0,8.5,"plymouth fury iii","1970-01-01",215
14.0,8,"USA",4425,455.0,10.0,"pontiac catalina","1970-01-01",225

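To check a saved file, it can be read back with load, which Queryverse re-exports from FileIO alongside save. The sketch below repeats the filter-and-save step so that it runs on its own; the variable name heavy is made up for illustration, and column types after the round trip may differ from the original dataset.

```julia
using Queryverse, VegaDatasets

# Write the filtered rows to CSV, as in the cell above
dataset("cars") |>
    @filter(_.Origin == "USA" && _.Weight_in_lbs > 4000) |>
    save("us_heavy_cars.csv")

# Read the CSV back into a DataFrame and report its shape
heavy = load("us_heavy_cars.csv") |> DataFrame
println(size(heavy))
```

If the round trip preserved everything, the shape should match the row and column counts shown for the filtered DataFrame earlier.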
Use command-line Julia and X11 mode for Voyager

commands

  1. open .x11 file in CoCalc

  2. in terminal pane (upper left), type the following

    JULIA_DEPOT_PATH="/home/user/qvtest" julia-1
    ... in julia REPL
    using Queryverse
    using VegaLite, VegaDatasets
    cars = dataset("cars");
    cars |> Voyager()
    ... wait for x11 pane to show data exploration GUI
    ... the X11 interface may be slow, depending on your ping time to CoCalc servers

  3. UI operations

    1. Click in "Data Voyager" title bar to get pointer focus in that pane

    2. In Fields menu, hover cursor over "+" to right of "A Cylinders" until it highlights, then drag into Encoding column, "x" value.

    3. In Fields menu, hover cursor over "+" to right of "# Horsepower" until it highlights, then drag into Encoding column, "y" value.

    4. Observe display of Horsepower vs. Cylinders.

  4. screen capture

  5. Watch the YouTube video "Intro to the Queryverse, a Julia data science stack" by David Anthoff