
Kernel: Python 2 (SageMath)

Rmagic in IPython

NOTE: When running in CoCalc, this notebook must be run using the Classical Jupyter notebook.

This file is part of the examples collection of the Sagemath Cloud.

Rmagic is an extension for IPython. Check out the full IPython notebook for additional details! It is based on RPy2 and lets you talk seamlessly to an underlying R session from an IPython notebook.

To activate it, run the %load_ext cell below. There are only a few core commands:

  • %R runs a line of R code; the return value can be assigned via var = %R ....

  • %%R -i <input> -o <output> ... runs the entire cell in R; -i and -o specify the input and output variables.

  • %Rpush ... sends the data of a given Python variable to R.

  • %Rpull ... retrieves a variable from R and defines it in the Python namespace.

  • %Rget ... is similar to %Rpull, but only returns the data without defining a variable.
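The commands in the list above can be combined in a single cell. The following is a minimal sketch, not part of the original notebook: it assumes the extension has already been loaded with %load_ext rpy2.ipython, and the array x and the R variable y are hypothetical names chosen for illustration. Comments are kept on their own lines because anything after a line magic is passed to the magic as arguments.

```python
# IPython/Jupyter cell; assumes `%load_ext rpy2.ipython` has already been run.
import numpy as np

# x is a hypothetical Python-side array used only for illustration.
x = np.array([1.0, 2.0, 3.0])

# Copy x into the R session under the same name.
%Rpush x

# Run one line of R that uses the pushed data and creates y inside R.
%R y <- cumsum(x)

# Fetch the data of y; the Python name is chosen by the assignment.
y = %Rget y
```

Using %Rpull y instead of the last line would define y in Python directly.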

%pylab inline
%load_ext rpy2.ipython
Populating the interactive namespace from numpy and matplotlib
import numpy as np
%R print(seq(10))
%R print(summary(factor(c("a", "b", "b", "a", "c", "a", "c"))))
 [1]  1  2  3  4  5  6  7  8  9 10
a b c
3 2 2
%%R
v <- 5.5
a <- seq(10) + v
print(summary(a))
print(sd(a))
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
   6.50    8.75   11.00   11.00   13.25   15.50
[1] 3.02765
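As a cross-check on the R output above, the same summary can be reproduced on the Python side without R. This is a minimal sketch that assumes Python 3.8 or later (newer than the Python 2 kernel used in this notebook) for statistics.quantiles; with the "inclusive" method it reproduces the quartiles printed above for this data.

```python
import statistics

v = 5.5
a = [i + v for i in range(1, 11)]   # same values as seq(10) + v in R

# Quartiles; the "inclusive" method matches the R output for this data.
q1, median, q3 = statistics.quantiles(a, n=4, method="inclusive")

summary = {
    "Min.": min(a),
    "1st Qu.": q1,
    "Median": median,
    "Mean": statistics.mean(a),
    "3rd Qu.": q3,
    "Max.": max(a),
}
sd = statistics.stdev(a)            # sample standard deviation, like R's sd()

print(summary)
print(round(sd, 5))
```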

The variable a exists only in the R session, so evaluating it in Python raises an error:

a
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-5-60b725f10c9c> in <module>()
----> 1 a

NameError: name 'a' is not defined

%Rget pulls and converts the data into Python:

a = %Rget a
a
array([ 6.5, 7.5, 8.5, 9.5, 10.5, 11.5, 12.5, 13.5, 14.5, 15.5])
type(a)
numpy.ndarray

%Rpull is similar, but also defines the variable in the Python namespace:

%Rpull v
v
array([ 5.5])

%%R runs the given cell in R, and -o [variable] copies that variable to Python afterwards:

%%R -o b
b <- c(2,3,4,3,5,6,5,6,7,8)
print(paste(length(a), "==", length(b), "?"))
[1] "10 == 10 ?"
print(b)
[ 2. 3. 4. 3. 5. 6. 5. 6. 7. 8.]

Plots

Plotting is straightforward. When a cell produces several plots, each one is displayed.

%R plot(a, b, type='b')
Image in a Jupyter notebook
%%R -o lmod
lmod <- lm(b ~ a)
print(lmod)

Call:
lm(formula = b ~ a)

Coefficients:
(Intercept)            a
       -1.7          0.6

Interested in the coefficients?

Use R's $ operator, which extracts a named component of the model object, inside a %R call to retrieve the coefficients as a NumPy array.

coeffs = %R lmod$coefficients
coeffs
array([-1.7, 0.6])
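To see where these two numbers come from, the fit can be recomputed by hand. The sketch below is not part of the original notebook; it uses the textbook closed-form least-squares formulas for a single predictor on the same a and b values, and the names s_ab and s_aa are illustrative.

```python
a = [i + 5.5 for i in range(1, 11)]   # predictor, same values as in R above
b = [2, 3, 4, 3, 5, 6, 5, 6, 7, 8]    # response, same values as in R above

n = len(a)
mean_a = sum(a) / n
mean_b = sum(b) / n

# Closed-form least-squares estimates for one predictor:
# slope = S_ab / S_aa, intercept = mean(b) - slope * mean(a)
s_ab = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
s_aa = sum((x - mean_a) ** 2 for x in a)
slope = s_ab / s_aa
intercept = mean_b - slope * mean_a

print(round(intercept, 4), round(slope, 4))
```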
%R print(summary(lmod))

Call:
lm(formula = b ~ a)

Residuals:
   Min     1Q Median     3Q    Max
 -1.00  -0.35   0.10   0.40   0.80

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) -1.70000    0.79162  -2.147    0.064 .
a            0.60000    0.06963   8.617 2.55e-05 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.6325 on 8 degrees of freedom
Multiple R-squared:  0.9027,	Adjusted R-squared:  0.8906
F-statistic: 74.25 on 1 and 8 DF,  p-value: 2.549e-05
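Several entries of this summary follow directly from the residuals of the fit. The following sketch recomputes them with the standard formulas for simple linear regression. It is an illustration on the same data, not how R computes them internally, and all variable names are assumptions.

```python
import math

a = [i + 5.5 for i in range(1, 11)]   # predictor, same values as in R above
b = [2, 3, 4, 3, 5, 6, 5, 6, 7, 8]    # response, same values as in R above

n = len(a)
mean_a = sum(a) / n
mean_b = sum(b) / n
s_aa = sum((x - mean_a) ** 2 for x in a)
s_bb = sum((y - mean_b) ** 2 for y in b)
s_ab = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))

slope = s_ab / s_aa
intercept = mean_b - slope * mean_a

# Residual sum of squares and residual standard error on n - 2 degrees of freedom.
rss = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(a, b))
dof = n - 2
sigma = math.sqrt(rss / dof)

# Standard errors and t values of the two coefficients.
se_slope = sigma / math.sqrt(s_aa)
se_intercept = sigma * math.sqrt(1 / n + mean_a ** 2 / s_aa)
t_slope = slope / se_slope
t_intercept = intercept / se_intercept

# Goodness of fit; with one predictor the F statistic equals t_slope squared.
r_squared = 1 - rss / s_bb
adj_r_squared = 1 - (1 - r_squared) * (n - 1) / dof
f_statistic = t_slope ** 2

print(round(sigma, 4), round(se_intercept, 5), round(se_slope, 5))
print(round(r_squared, 4), round(adj_r_squared, 4), round(f_statistic, 2))
```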

The next cell plots in a 2x2 grid via R's par command and sets the output canvas size to 800x600 pixels with -w and -h.

%%R -w 800 -h 600
par(mfrow=c(2,2))
plot(lmod)
Image in a Jupyter notebook
%%R -o faithful
library(datasets)
print(summary(faithful))
   eruptions        waiting
 Min.   :1.600   Min.   :43.0
 1st Qu.:2.163   1st Qu.:58.0
 Median :4.000   Median :76.0
 Mean   :3.488   Mean   :70.9
 3rd Qu.:4.454   3rd Qu.:82.0
 Max.   :5.100   Max.   :96.0
%%R
library(lattice)
print(wireframe(volcano, shade = TRUE, zlab = "",
                aspect = c(61.0/87, 0.5),
                light_source = c(10,0,10)))
Image in a Jupyter notebook

Advanced Example: PCA

%%R -o pca_usarrest
library("stats")
pca_usarrest <- princomp(USArrests, cor=TRUE)
print(summary(pca_usarrest))
Importance of components:
                          Comp.1    Comp.2    Comp.3     Comp.4
Standard deviation     1.5748783 0.9948694 0.5971291 0.41644938
Proportion of Variance 0.6200604 0.2474413 0.0891408 0.04335752
Cumulative Proportion  0.6200604 0.8675017 0.9566425 1.00000000
%%R
print(summary(pca_usarrest))
Importance of components:
                          Comp.1    Comp.2    Comp.3     Comp.4
Standard deviation     1.5748783 0.9948694 0.5971291 0.41644938
Proportion of Variance 0.6200604 0.2474413 0.0891408 0.04335752
Cumulative Proportion  0.6200604 0.8675017 0.9566425 1.00000000
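The three rows of this table are linked: each component's variance is the square of its standard deviation, and the proportions are those variances divided by their total. A minimal sketch, using only the standard deviations printed above as input (the variable names are illustrative):

```python
from itertools import accumulate

# Standard deviations of the four components, copied from the R output above.
sdev = [1.5748783, 0.9948694, 0.5971291, 0.41644938]

variances = [s ** 2 for s in sdev]
total = sum(variances)   # close to 4, the number of standardized variables

proportion = [v / total for v in variances]
cumulative = list(accumulate(proportion))

for name, p, c in zip(["Comp.1", "Comp.2", "Comp.3", "Comp.4"], proportion, cumulative):
    print(name, round(p, 5), round(c, 5))
```

Because princomp was called with cor=TRUE, the analysis runs on the correlation matrix, so the component variances sum to the number of variables.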
%%R
biplot(pca_usarrest)
Image in a Jupyter notebook

The pca_usarrest variable references a data structure from R. R functions can be applied to it directly via RPy2.

from rpy2 import robjects as ro
print(ro.r.summary(pca_usarrest))
Importance of components:
                          Comp.1    Comp.2    Comp.3     Comp.4
Standard deviation     1.5748783 0.9948694 0.5971291 0.41644938
Proportion of Variance 0.6200604 0.2474413 0.0891408 0.04335752
Cumulative Proportion  0.6200604 0.8675017 0.9566425 1.00000000
%R print(help(sum))
R Help on ‘sum’

sum                    package:base                    R Documentation

Sum of Vector Elements

Description:

     ‘sum’ returns the sum of all the values present in its arguments.

Usage:

     sum(..., na.rm = FALSE)

Arguments:

     ...: numeric or complex or logical vectors.

   na.rm: logical.  Should missing values (including ‘NaN’) be removed?

Details:

     This is a generic function: methods can be defined for it directly
     or via the ‘Summary’ group generic.  For this to work properly,
     the arguments ‘...’ should be unnamed, and dispatch is on the
     first argument.

     If ‘na.rm’ is ‘FALSE’ an ‘NA’ or ‘NaN’ value in any of the
     arguments will cause a value of ‘NA’ or ‘NaN’ to be returned,
     otherwise ‘NA’ and ‘NaN’ values are ignored.

     Logical true values are regarded as one, false values as zero.
     For historical reasons, ‘NULL’ is accepted and treated as if it
     were ‘integer(0)’.

     Loss of accuracy can occur when summing values of different signs:
     this can even occur for sufficiently long integer inputs if the
     partial sums would cause integer overflow.  Where possible
     extended-precision accumulators are used, but this is
     platform-dependent.

Value:

     The sum.  If all of ‘...’ are of type integer or logical, then the
     sum is integer, and in that case the result will be ‘NA’ (with a
     warning) if integer overflow occurs.  Otherwise it is a length-one
     numeric or complex vector.

     *NB:* the sum of an empty set is zero, by definition.

S4 methods:

     This is part of the S4 ‘Summary’ group generic.  Methods for it
     must use the signature ‘x, ..., na.rm’.

     ‘plotmath’ for the use of ‘sum’ in plot annotation.

References:

     Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) _The New S
     Language_.  Wadsworth & Brooks/Cole.

See Also:

     ‘colSums’ for row and column sums.

Examples:

     ## Pass a vector to sum, and it will add the elements together.
     sum(1:5)

     ## Pass several numbers to sum, and it also adds the elements.
     sum(1, 2, 3, 4, 5)

     ## In fact, you can pass vectors into several arguments, and everything gets added.
     sum(1:2, 3:5)

     ## If there are missing values, the sum is unknown, i.e., also missing, ....
     sum(1:5, NA)

     ## ... unless we exclude missing values explicitly:
     sum(1:5, NA, na.rm = TRUE)
array(['/projects/sage/sage/local/lib/R//library/base/help/sum'], dtype='|S54')