Author: Hal Snyder
Description: Short demo notebook for Queryverse with Julia-1
Compute Environment: Ubuntu 18.04 (Deprecated)

# Julia with DataFrames and Queryverse on CoCalc

## shell commands - one-time setup

• open a .term file
JULIA_DEPOT_PATH="/home/user/qvtest" julia-1
... type "]" to enter pkg mode
pkg> add DataFrames ... takes less than a minute
pkg> add Queryverse ... takes 3 minutes
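
The same installation can also be scripted through the `Pkg` standard library instead of typing into pkg mode. This is a minimal sketch, assuming Julia was started with the same `JULIA_DEPOT_PATH` as above; `install_demo_packages` is a hypothetical helper name, not part of the original notebook.

```julia
using Pkg

# Hypothetical helper: add each package by name to the active environment.
# Packages are installed into the first entry of DEPOT_PATH.
function install_demo_packages(pkgs = ["DataFrames", "Queryverse"])
    for name in pkgs
        Pkg.add(name)
    end
end

# Uncomment to run the installation (needs network access):
# install_demo_packages()
```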


## Jupyter notebook... follow the steps below

• select Julia 1.x Jupyter kernel

Note that `using Queryverse` takes up to 10 minutes the first time and about 30 seconds after that.

## References

- Hal Snyder

In [1]:
VERSION

v"1.0.3"
In [2]:
DEPOT_PATH[1] = "/home/user/qvtest"

"/home/user/qvtest"
In [43]:
using Queryverse

In [4]:
df = DataFrame(name=["John", "Sally", "Kirk"], age=[23., 42., 59.], children=[3,5,2])



3 rows × 3 columns

|   | name   | age     | children |
|---|--------|---------|----------|
|   | String | Float64 | Int64    |
| 1 | John   | 23.0    | 3        |
| 2 | Sally  | 42.0    | 5        |
| 3 | Kirk   | 59.0    | 2        |
In [5]:
x = @from i in df begin
@where i.age>30. && i.children > 2
@select {Name=lowercase(i.name)}
@collect DataFrame
end


1 rows × 1 columns

|   | Name   |
|---|--------|
|   | String |
| 1 | sally  |
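
To make the logic of the query above explicit, here is a rough plain-Julia equivalent that needs no packages. It is a sketch only: it holds the same three rows as a vector of named tuples (the variable names `people` and `result` are chosen for illustration) and applies the `@where` condition and the `@select` projection in one comprehension.

```julia
# Same sample data as the DataFrame above, held as a vector of named tuples.
people = [
    (name = "John",  age = 23.0, children = 3),
    (name = "Sally", age = 42.0, children = 5),
    (name = "Kirk",  age = 59.0, children = 2),
]

# Filter (the @where step) and project (the @select step) in one comprehension.
result = [(Name = lowercase(p.name),) for p in people
          if p.age > 30.0 && p.children > 2]

println(result)
```

Only Sally passes both conditions: John fails the age test and Kirk fails the children test.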
In [6]:
save("mydata.csv", df)

In [15]:
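# display first few lines of a text file
# (the signature line was lost in the export; the default of 5 lines is an assumption)
function fhead(fname, lines=5)
    open(fname) do file
        for (n, line) in enumerate(eachline(file))
            println(line)
            # stop once `lines` lines have been printed
            n >= lines && break
        end
    end
end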

fhead (generic function with 2 methods)
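
A variant of the same idea returns the lines instead of printing them, which makes it easier to reuse the result. This is a sketch under stated assumptions: `head_lines` is a hypothetical name, and the temporary file exists only to make the example self-contained.

```julia
# Hypothetical variant of fhead: return the first n lines as a vector of strings.
function head_lines(fname, n = 5)
    open(fname) do file
        # Iterators.take stops reading after n lines (or at end of file).
        collect(Iterators.take(eachline(file), n))
    end
end

# Write a small temporary file to try it on.
path, io = mktemp()
write(io, "a\nb\nc\n")
close(io)

first_two = head_lines(path, 2)
all_lines = head_lines(path)
```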
In [14]:
fhead("mydata.csv")

"name","age","children" "John",23.0,3 "Sally",42.0,5 "Kirk",59.0,2
In [16]:
using VegaLite, VegaDatasets

In [17]:
dataset("cars") |>
@vlplot(
:point,
x=:Horsepower,
y=:Miles_per_Gallon,
color=:Origin,
width=400,
height=400
)

[Output omitted: scatter plot of Miles_per_Gallon against Horsepower, colored by Origin]
In [29]:
cars = dataset("cars");
typeof(cars)

In [41]:
# default number of rows when displaying DataFrame
ENV["LINES"] = 3

3
In [42]:
df = DataFrame(cars)


406 rows × 9 columns

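|   | Miles_per_Gallon | Cylinders | Origin | Weight_in_lbs | Displacement | Acceleration | Name | Year | Horsepower |
|---|---|---|---|---|---|---|---|---|---|
|   | Float64⍰ | Int64 | String | Int64 | Float64 | Float64 | String | String | Int64⍰ |
| 1 | 18.0 | 8 | USA | 3504 | 307.0 | 12.0 | chevrolet chevelle malibu | 1970-01-01 | 130 |
| 2 | 15.0 | 8 | USA | 3693 | 350.0 | 11.5 | buick skylark 320 | 1970-01-01 | 165 |
| 3 | 18.0 | 8 | USA | 3436 | 318.0 | 11.0 | plymouth satellite | 1970-01-01 | 150 |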
In [35]:
cars |>
@filter(_.Origin=="USA" && _.Weight_in_lbs>4000) |> DataFrame


67 rows × 9 columns

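|   | Miles_per_Gallon | Cylinders | Origin | Weight_in_lbs | Displacement | Acceleration | Name | Year | Horsepower |
|---|---|---|---|---|---|---|---|---|---|
|   | Float64⍰ | Int64 | String | Int64 | Float64 | Float64 | String | String | Int64⍰ |
| 1 | 15.0 | 8 | USA | 4341 | 429.0 | 10.0 | ford galaxie 500 | 1970-01-01 | 198 |
| 2 | 14.0 | 8 | USA | 4354 | 454.0 | 9.0 | chevrolet impala | 1970-01-01 | 220 |
| 3 | 14.0 | 8 | USA | 4312 | 440.0 | 8.5 | plymouth fury iii | 1970-01-01 | 215 |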
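
The `|>` in the cell above is Julia's ordinary pipe operator: `x |> f` means `f(x)`. The sketch below shows the same filter-then-collect shape in plain Julia, without Queryverse. The rows are a small illustrative sample, not the full cars dataset, and the non-USA row is made up for the example; `is_heavy_us` is a hypothetical helper name.

```julia
# Illustrative rows only; the non-USA row is made up for this sketch.
sample = [
    (Name = "ford galaxie 500", Origin = "USA", Weight_in_lbs = 4341),
    (Name = "chevrolet chevelle malibu", Origin = "USA", Weight_in_lbs = 3504),
    (Name = "hypothetical import", Origin = "Europe", Weight_in_lbs = 4100),
]

# Same condition as the @filter call above.
is_heavy_us(car) = car.Origin == "USA" && car.Weight_in_lbs > 4000

# `x |> f` is the same as `f(x)`, so these two lines are equivalent.
heavy_piped = sample |> (rows -> filter(is_heavy_us, rows))
heavy_direct = filter(is_heavy_us, sample)
```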
In [37]:
cars |>
@filter(_.Origin=="USA" && _.Weight_in_lbs>4000) |>
save("us_heavy_cars.csv")

In [38]:
fhead("us_heavy_cars.csv")

"Miles_per_Gallon","Cylinders","Origin","Weight_in_lbs","Displacement","Acceleration","Name","Year","Horsepower" 15.0,8,"USA",4341,429.0,10.0,"ford galaxie 500","1970-01-01",198 14.0,8,"USA",4354,454.0,9.0,"chevrolet impala","1970-01-01",220 14.0,8,"USA",4312,440.0,8.5,"plymouth fury iii","1970-01-01",215 14.0,8,"USA",4425,455.0,10.0,"pontiac catalina","1970-01-01",225

# Use command-line Julia and X11 mode for Voyager

## commands

1. open .x11 file in CoCalc

2. in terminal pane (upper left), type the following

JULIA_DEPOT_PATH="/home/user/qvtest" julia-1
... in julia REPL
using Queryverse
cars = dataset("cars");
cars |> Voyager()
... wait for x11 pane to show data exploration GUI
... the X11 interface may be slow, depending on your ping time to CoCalc servers

3. UI operations

   1. Click in the "Data Voyager" title bar to give that pane pointer focus.
   2. In the Fields menu, hover the cursor over the "+" to the right of "A Cylinders" until it highlights, then drag it into the Encoding column, "x" value.
   3. In the Fields menu, hover the cursor over the "+" to the right of "# Horsepower" until it highlights, then drag it into the Encoding column, "y" value.
   4. Observe the display of Horsepower vs. Cylinders.

4. screen capture

5. Watch the YouTube video "Intro to the Queryverse, a Julia data science stack" by David Anthoff
