CoCalc Public Fileswww / books / sagebook / sagebook.texOpen with one click!
Author: William A. Stein
Compute Environment: Ubuntu 18.04 (Deprecated)
1
% for book
2
\documentclass{report}
3
\voffset=-0.05\textheight \textheight=1.1\textheight
4
\hoffset=-0.05\textwidth \textwidth=1.1\textwidth
5
6
% for in-class presentation
7
%\documentclass[landscape,12pt]{report}
8
%\voffset=-0.3\textheight \textheight=1.5\textheight
9
%\hoffset=-0.35\textwidth \textwidth=1.7\textwidth
10
11
12
\bibliographystyle{amsalpha}
13
\include{macros}
14
\DeclareMathOperator{\Li}{Li}
15
16
\renewcommand{\todo}[1]{}
17
18
19
\usepackage{pdfsync}
20
\usepackage[hypertex]{hyperref}
21
22
\usepackage{graphicx}
23
\newcommand{\sagegraphic}[2]{\begin{center}\includegraphics[width=#1\textwidth]{graphics/#2}\end{center}}
24
25
\title{Sage for Power Users}
26
27
\author{William Stein}
28
29
\begin{document}
30
%\sf\LARGE % for in-class presentation
31
32
\maketitle
33
\tableofcontents
34
35
%\part{Sage, Python and Cython}
36
37
\chapter*{Preface}
38
This is a book about Sage \url{http://sagemath.org}, which is a large
39
free open source software project that I started in 2005, whose
40
``mission statement'' is to create a viable free open source
41
alternative to the commercial programs Magma, Maple, Mathematica, and
42
Matlab. I have given many talks, tutorials, and workshops on Sage,
43
and this book records what I have personally found to be the most
44
important key ideas that are needed to make effective use of Sage. My
45
intention is that you read the whole book cover-to-cover, and have
46
thus kept the book intentionally short.
47
48
49
% Background
50
I assume that you have some previous computer programming experience,
51
but not necessarily in Python. Though I'll make reference to many
52
mathematical topics when illustrating how to use Sage, do not worry if
53
some are not familiar to you.
54
55
% License
56
This book is licensed under the Creative Commons Attribution 3.0
57
license\footnote{See \url{http://creativecommons.org/licenses/by/3.0/}.}, so
58
it will always be freely available in many formats.
59
60
% Overview of book
61
\todo{Put an overview of the book -- chapter by chapter -- here.}
62
63
64
\chapter{Introduction to Sage}
65
\section{Motivation}
66
67
I started the Sage project in early 2005 in order to create a viable
68
free open source mathematical software package that I could use for my
69
research. I was frustrated with not being allowed to easily change or
70
understand the internals of closed source systems\footnote{For me,
71
this was a powerful niche program called ``Magma''.} and I had a
72
deep concern that my students and colleagues could not easily use the
73
commercially distributed software that I had spent many years
74
developing (and contributing). I started Sage as a new project
75
instead of switching to another system, since the capabilities of any
76
available software for number theory at the time were far behind
77
many key features of commercial systems.\footnote{For example, Magma's tools
78
for linear algebra over the rational numbers and finite fields were
79
vastly superior to anything available anywhere else.} Several
80
hundred people have since become involved in Sage development, and the
81
goals have broadened substantially.
82
83
Sage uses a mainstream programming language, unlike all popular
84
mathematics software, including Maple,
85
Mathematica, and Matlab, which each use their own
86
special-purpose languages written just for mathematics. One works
87
with Sage using Python, which is one of the world's most popular
88
general purpose scripting languages. By using Python, one can use
89
almost anything ever written in Python directly in Sage. And
90
there is much useful Python code out there that addresses a wide
91
range of application areas.
92
93
Instead of writing many of the core libraries from scratch like most
94
math software systems have done in Sage I assembled together the best
95
open source software out there, and built on it\footnote{Sage includes
96
over 500,000 lines of code that does not come from third party
97
projects.}. Also, the complete system is easily buildable from
98
source on a range of computers. There are challenges: some of the
99
upstream libraries can be difficult to understand, are written in a
100
range of languages, and have different conventions than Sage. Thus it
101
is important to strongly encouraging good relations with the projects
102
that create many of the components of Sage.
103
104
A wide and vibrant community of developers and users have become
105
involved with Sage. Due to the broad interests of this large
106
community of developers, Sage has grown into a project with the
107
following specific goal:
108
\begin{quote}
109
{\bf Mission Statement:} Provide a viable free open source alternative
110
to Magma, Maple, Mathematica, and Matlab.
111
\end{quote}
112
%The Sage development model has matured. There are regular releases of
113
%Sage, and all code that goes into Sage is peer reviewed.
114
115
\section{What is Sage?}
116
117
Sage is a free open-source mathematics software system licensed under
118
the GNU Public License (GPL). It combines the power of about 100
119
open-source packages with a large amount of new code to provide a free
120
open source platform for mathematical computation. Sage has many
121
notable features.
122
123
\begin{itemize}
124
\item Sage is free, due mainly to the volunteer effort of hundreds of
125
people and generous funding from the National Science Foundation,
126
private donations, and other organizations such as Google and
127
Microsoft. There are no license codes or copy protection. Sage is
128
also open source, so there are absolutely no secret or proprietary
129
algorithms anywhere in Sage. There is nothing that you are not
130
allowed to see or change.
131
132
\item Sage uses the mainstream programming language Python. Learning
133
Sage will make you proficient in this popular, widely used, and well
134
supported free programming language, which you will likely also use
135
for other non-mathematics projects. Moreover, Sage features the
136
Cython compiler, which allows one to combine Python, C/C++/Fortran
137
libraries, and native machine types for potentially huge speedups.
138
139
\item Sage is uniquely able to combine functionality from dozens of
140
other mathematical software programs and programming languages via
141
smart psuedoterminal interfaces. You can combine Lisp, Mathematica,
142
and C code to attack a single problem.
143
144
\item Sage has both a sophisticated multiuser web-based graphical user
145
interface and a powerful command line interface. Sage can also be
146
made to work with any other Python interactive development environment (IDE).
147
148
\item Sage may have the widest range of mathematical capabilities of
149
any single mathematical software system available. Sage and its
150
components are developed by an active and enthusiastic worldwide
151
community of people from many areas of mathematics, science, and
152
engineering.
153
154
\item Modifications to Sage are publicly peer reviewed, and what goes
155
into Sage is decided via community discussions; no matter who you
156
are, if you have a brilliant idea, the energy, and can clearly argue
157
that something should go into Sage, it probably will. Known bugs in
158
Sage, and all discussions about them are available for all to see.
159
\end{itemize}
160
161
Sage is nothing like Magma, Maple, Mathematica, and Matlab, in which
162
details of their implementations of algorithms is secret, their list
163
of bugs is concealed, how they decided what got included in each
164
release is under wraps, their custom programming language locks you
165
in, and you must fight with license codes, copy protection and
166
intentionally crippled web interfaces.
167
168
\section{``This unique American idea of the
169
entrepreneurial company.''}
170
171
The Mathematica documentation has an argument for why looking at the internals
172
of mathematical software is not necessary.
173
\begin{quote}
174
``Particularly in more advanced applications of Mathematica, it may
175
sometimes seem worthwhile to try to analyze internal algorithms in
176
order to predict which way of doing a given computation will be the
177
most efficient. And there are indeed occasionally major improvements
178
that you will be able to make in specific computations as a result
179
of such analyses.
180
181
But most often the analyses will not be worthwhile. For the
182
internals of Mathematica are quite complicated, and even given a
183
basic description of the algorithm used for a particular purpose, it
184
is usually extremely difficult to reach a reliable conclusion about
185
how the detailed implementation of this algorithm will actually
186
behave in particular circumstances.''
187
188
-- {\tiny \url{http://reference.wolfram.com/mathematica/tutorial/WhyYouDoNotUsuallyNeedToKnowAboutInternals.html}}
189
\end{quote}
190
191
Wolfram, who founded the company that sells Mathematica, admits that
192
the mathematical community hates some of what he has done, arguing
193
that a closed source commercial model is the only approach that
194
can possibly work.
195
196
\begin{quote}
197
``There's another thing, quite honestly, that that community has a
198
hard time with. They sort of hate one aspect of what I have done,
199
which is to take intellectual developments and make a company out of
200
them and sell things to people.
201
202
My own view of that, which has hardened over the years, is, my god,
203
that's the right thing to do. If you look at what's happened with
204
TeX, for example, which went in the other direction... well,
205
Mathematica could not have been brought to where it is today if it
206
had not been done as a commercial effort. The amount of money that
207
has to be spent to do all the details of development, you just can't
208
support that in any other way than this unique American idea of the
209
entrepreneurial company.''
210
211
-- Stephen Wolfram, 1993, Doctor Dobbs Journal Interview
212
\end{quote}
213
214
215
\noindent{}For the last 20 years, Matlab, Mathematica, and the other
216
commercial systems have dominanted with on the order of a hundred
217
million dollars a year in revenue. If the Sage project succeeds at
218
its goals (still a big if), it will have proved that Wolfram is wrong
219
and radically change the landscape of computational mathematics.
220
221
\section{Getting Started}
222
The easiest way to get started with Sage right {\em now} is to visit
223
\url{http://480.sagenb.org} and login using OpenID by clicking one of
224
the buttons at the bottom right. This should work with nearly any
225
operating system and browser combination\footnote{I recommend that you
226
avoid using Internet Explorer if at all possible.}. Using Sage via
227
the above webpage is fine if you just want to use Sage via the
228
notebook, e.g., for learning Python (Chapter~\ref{ch:python}) and
229
Cython (Chapter~\ref{ch:cython}).
230
231
232
There are some situations where you will instead want to install Sage
233
on your own computer, or get an account on a command-line server on
234
which Sage is installed:
235
236
\begin{enumerate}
237
\item You want to use the Sage command line interface.
238
\item You want to use the interactive command line profiler and
239
debugger, which haven't been properly ported to the notebook yet
240
(see Chapter~\ref{ch:profdeb}).
241
\item You want to modify Sage and contribute back new code (see
242
Chapter~\ref{ch:contrib}).
243
\item You want to interface nonfree software with Sage (see
244
Chapter~\ref{ch:interfaces}). It would be illegal for me to allow
245
just anybody to run Maple/Mathematica/etc. code at
246
\url{http://480.sagenb.org}.
247
\item You do not have access to the Internet.
248
249
\end{enumerate}
250
251
252
\begin{remark}
253
Eliminating all but the last reason above are current goals of the
254
Sage project. A command line interface should be added to the
255
notebook, and it should support the profiler and debugger. It
256
should be possible to edit all files in the source code of Sage, use
257
revision control systems, etc., completely via the web. Even the
258
legal issue involving nonfree software could be resolved by hooking
259
into our University's authentication system, just as you
260
authenticate for off-campus access to library resources.
261
\end{remark}
262
263
\section{A Tour}
264
265
Sage uses the basic user-interface principle of ``question and
266
answer'' found in many other mathematical software systems. You enter
267
input written in a well-defined language and, after pressing the
268
\fbox{\sf return} key in the command line interface or pressing
269
\fbox{\sf shift+return} in the notebook interface, Sage evaluates your
270
input and returns the result.
271
272
A traditional test that Sage is working is to compute 2+2:
273
\begin{lstlisting}
274
sage: 2 + 2
275
4
276
\end{lstlisting}
277
278
We factor a whole number.
279
\begin{lstlisting}
280
sage: factor(2012)
281
2^2 * 503
282
\end{lstlisting}
283
Thus $2012 = 2\times 2 \times 503$.
284
Sage can also factor negative numbers and rational numbers:
285
\begin{lstlisting}
286
sage: factor(-2012/2015)
287
-1 * 2^2 * 5^-1 * 13^-1 * 31^-1 * 503
288
\end{lstlisting}
289
290
The language that Sage uses is almost the same as the
291
Python programming language.
292
One difference between Sage and Python is that \verb|^| means
293
exponentiation in Sage but exclusive or in Python.
294
Another difference is that integer division results in a rational
295
number in Sage, but is floor division in Python.
296
\begin{lstlisting}
297
sage: 2^3
298
8
299
sage: 2012/6
300
1006/3
301
\end{lstlisting}
302
303
We can also factor symbolic expressions using Sage. To introduce a
304
symbolic variable, use the {\sf var} command.
305
\begin{lstlisting}
306
sage: var('x,y')
307
(x, y)
308
sage: F = factor(x^2 - 4*sin(y)^2); F
309
(x - 2*sin(y))*(x + 2*sin(y))
310
\end{lstlisting}%link
311
312
If you want to put any result in a \LaTeX{} document\footnote{\LaTeX{}
313
is the dominant tool for producing {\em professional} quality mathematical
314
papers and books; it is free and open source and you should learn
315
it.}, use the {\sf latex} command:
316
%link
317
\begin{lstlisting}
318
sage: latex(F)
319
{\left(x - 2 \, \sin\left(y\right)\right)} {\left(x + 2 \,
320
\sin\left(y\right)\right)}
321
\end{lstlisting}
322
which looks like this:
323
$$
324
{\left(x - 2 \, \sin\left(y\right)\right)} {\left(x + 2 \,
325
\sin\left(y\right)\right)}
326
$$
327
328
Sage knows Calculus:
329
\begin{lstlisting}
330
sage: integrate(e^x * sin(x), x)
331
1/2*(sin(x) - cos(x))*e^x
332
sage: derivative(1/2*(sin(x) - cos(x))*e^x).expand()
333
e^x*sin(x)
334
\end{lstlisting}
335
336
Sage can plot functions:
337
\begin{lstlisting}
338
sage: g = plot(sin(x) + (1-x^2), (x, 0, 2)); g
339
\end{lstlisting}%link
340
\sagegraphic{.4}{plot1}
341
To include this plot in a document, save it as a PDF file:
342
%link
343
\begin{lstlisting}
344
sage: g.save('plot1.pdf')
345
\end{lstlisting}
346
347
We numerically find a root of $\sin(x) + (1-x^2)$ between 0 and 2, as follows:
348
\begin{lstlisting}
349
sage: find_root(sin(x) + (1-x^2), 0, 2)
350
1.4096240040025754
351
\end{lstlisting}
352
353
You can use some other programming languages directly from Sage, such as Lisp:
354
\begin{lstlisting}
355
sage: s = "(defun factorial(n)"
356
sage: s += " (if (= n 1) 1 (* n (factorial (- n 1)))))"
357
sage: lisp(s)
358
FACTORIAL
359
sage: lisp('(factorial 10)')
360
3628800
361
\end{lstlisting}
362
363
Or use Mathematica (this won't work if you don't have Mathematica):
364
\begin{lstlisting}
365
sage: mathematica('Integrate[Sin[x^2],x]') # optional - Mathematica
366
Sqrt[Pi/2]*FresnelS[Sqrt[2/Pi]*x]
367
\end{lstlisting}
368
369
Or use Magma, over the web (this should work as long as you have an
370
Internet connection, since it just uses
371
\url{http://magma.maths.usyd.edu.au/calc/}):
372
\begin{lstlisting}
373
sage: magma_free("Factorisation(2012)")
374
[ <2, 2>, <503, 1> ]
375
\end{lstlisting}
376
377
% \chapter{Using Sage: The Notebook and Command Line}
378
379
380
% \section{The Notebook}
381
382
383
384
% \subsection{Documentation}
385
386
% \subsection{Creating interacts}
387
388
% \section{The Command Line}\label{sec:line}
389
390
\part{Programming Sage}
391
392
\chapter{Python}\label{ch:python}
393
394
Sage uses the Python programming language, which is relatively easy to
395
learn and fun to use. This chapter is a quick Sage-oriented
396
introduction to Python, which you should supplement with a book.
397
Fortunately, the two best books on Python are free: {\em The Python
398
Tutorial} (see \url{http://docs.python.org/}) and {\em Dive Into
399
Python} (see \url{http://www.diveintopython.net/}).
400
%The best non-free Python reference manual is {\em Python in a Nutshell} (see
401
%\url{http://oreilly.com/catalog/9780596001889}).
402
403
\section{What is Python?}
404
Python is a popular free open source language with {\em no} particular
405
company pushing it. Many big companies such as Google use Python, and
406
support its development. From \url{http://python.org}:
407
\begin{quote}
408
``Python is a programming language that lets you work more quickly
409
and integrate your systems more effectively. You can learn to use
410
Python and see almost immediate gains in productivity and lower
411
maintenance costs.''
412
\end{quote}
413
\begin{itemize}
414
\item {\em Work more quickly}: you get stuff done instead of
415
fighting with the language and environment for silly reasons
416
\item {\em Integrate your systems}: Python is particular useful at
417
creating big systems out of possibly messy collections of software
418
tools.
419
\item {\em Maintenance costs}: Python code is more likely to be readable
420
and hackable.
421
\end{itemize}
422
423
Sage is a big integrated system built out of {\em several million}
424
lines of possibly messy software, code written using Sage tends to
425
be readable and hackable, and people use Sage since it helps them
426
get stuff done immediately.
427
428
\section{The Sage Preparser}
429
When you type commands into Sage, the computer programming language
430
you use is (almost) Python. Each line of code gets automatically run
431
through a preparser before it is sent to the Python interpreter. To
432
see exactly what changes occur, use the {\tt preparse} command:
433
\begin{lstlisting}
434
sage: preparse('a = 2.5^3')
435
"a = RealNumber('2.5')**Integer(3)"
436
\end{lstlisting}
437
As you can see, decimal literals get wrapped using the {\tt
438
RealNumber} command, so when you type 2.5, Python will see {\tt
439
RealNumber('2.5')}. Similarly, integer literals get wrapped using
440
{\tt Integer}. Finally, the caret symbol \verb|^| is replaced by {\tt
441
**}, which is Python's exponentiation operator. One motivation for
442
doing all this is that in Magma, Maple, Mathematica, Matlab and \LaTeX the
443
\verb|^| operator is exponentiation, and making Sage have the same
444
behavior as those systems helps minimize confusion (whereas in Python \verb|^| is
445
``exclusive or''). The preparse does a few other things, but not much
446
more.
447
448
If you want to turn off the preparser, type {\tt preparser(False)}:
449
%skip
450
\begin{lstlisting}
451
sage: preparser(False)
452
sage: 2/3 + 2^3
453
1
454
\end{lstlisting}
455
\begin{lstlisting}
456
sage: preparser(True)
457
sage: 2/3 + 2^3
458
26/3
459
\end{lstlisting}
460
461
Read more about the preparser at \url{http://www.sagemath.org/doc/reference/sage/misc/preparser.html}.
462
463
464
\section{Variables}
465
%http://wiki.wstein.org/10/480b/lectures/lec2
466
467
In Python you create a variable by writing {\tt var = expression}; for example,
468
\begin{lstlisting}
469
sage: a = 2
470
sage: b = 3
471
sage: a + b
472
5
473
\end{lstlisting}
474
475
You can also include several assignment statements on the same line if you separate them
476
with a semicolon:
477
\begin{lstlisting}
478
sage: c = 7; d = 15; e = 5
479
sage: c + d + e
480
27
481
\end{lstlisting}
482
You do {\em not} have to end lines with a semicolon.
483
You can also assign the same value to several variables at once:
484
\begin{lstlisting}
485
sage: c = d = 10
486
sage: c + d
487
20
488
\end{lstlisting}
489
490
We have only used integers as the expression, but
491
Python supports many other types of objects, such as lists, which
492
we make using square brackets:
493
\begin{lstlisting}
494
sage: v = [7, 15, 5]; v
495
[7, 15, 5]
496
\end{lstlisting}
497
498
The most important gotcha (and feature) of variables in Python is that
499
variables are a {\em reference} to a Python object, not a new copy of
500
that object. Thus in the example below, both {\tt v} and {\tt w}
501
``reference'' exactly the same Python list:
502
\begin{lstlisting}
503
sage: v = [1, 2, 3]
504
sage: w = v
505
sage: w[0] = 10
506
sage: v
507
[10, 2, 3]
508
\end{lstlisting}
509
Continuing the above example:
510
\begin{lstlisting}
511
sage: v[1] = 5
512
sage: w
513
[10, 5, 3]
514
\end{lstlisting}
515
516
If you want a copy of an object, use the \verb|copy| command.
517
\begin{lstlisting}
518
sage: v = [1,2,3]
519
sage: w = v
520
sage: z = copy(w)
521
sage: v[0] = 10
522
sage: print w
523
[10, 2, 3]
524
sage: z
525
[1, 2, 3]
526
\end{lstlisting}
527
528
The {\tt copy} function only copies the {\em references} in the list:
529
\begin{lstlisting}
530
sage: v = [[1,2], 3, 4]
531
sage: w = copy(v)
532
sage: w[1] = 10
533
sage: w[0][0] = 5
534
sage: v
535
[[5, 2], 3, 4]
536
sage: w
537
[[5, 2], 10, 4]
538
\end{lstlisting}
539
540
To recursively make a new copy of everything (as much as possible), use
541
{\tt deepcopy}:
542
\begin{lstlisting}
543
sage: v = [[1,2], 3, 4]
544
sage: w = deepcopy(v)
545
sage: w[1] = 10; w[0][0] = 5
546
sage: v
547
[[1, 2], 3, 4]
548
sage: w
549
[[5, 2], 10, 4]
550
\end{lstlisting}
551
552
You probably won't have to use {\tt deepcopy} often. In over 500,000
553
lines of code in the core Sage library, deepcopy is used around 177
554
times:
555
\begin{lstlisting}
556
sage -grep deepcopy |wc -l
557
177
558
\end{lstlisting}
559
560
The main reason many people are very confused by variables being
561
references in Python is that most other mathematical software
562
(including both Mathematica and Matlab) works differently. For
563
example, in Matlab assignment to a variable creates a new copy. For
564
example, noting that arrays in Matlab are 1-based instead of 0-based,
565
\begin{lstlisting}
566
$ matlab
567
>> v = [1,2,3]
568
v =
569
1 2 3
570
>> w = v
571
w =
572
1 2 3
573
>> v(1) = 10
574
v =
575
10 2 3
576
>> w
577
w =
578
1 2 3
579
\end{lstlisting}
580
581
And in Mathematica,
582
\begin{lstlisting}
583
In[27]:= v = {1,2,3};
584
In[28]:= w = v;
585
In[29]:= v[[1]] = 10;
586
In[30]:= v
587
Out[30]= {10, 2, 3}
588
In[31]:= w
589
Out[31]= {1, 2, 3}
590
\end{lstlisting}
591
592
But of course in Sage:
593
\begin{lstlisting}
594
sage: v = [1,2,3]
595
sage: w = v
596
sage: v[0] = 10
597
sage: v
598
[10, 2, 3]
599
sage: w
600
[10, 2, 3]
601
\end{lstlisting}
602
603
\begin{remark}
604
Another subtle difference in various computer
605
languages is that exponentiation may associate either
606
left to right or right to left. For example,
607
\begin{lstlisting}
608
sage: 3^3^3
609
7625597484987
610
\end{lstlisting}
611
But in Matlab, we have
612
\begin{lstlisting}
613
$ matlab
614
>> 3^3^3
615
19683
616
\end{lstlisting}
617
Finally, in Maple we have
618
\begin{lstlisting}
619
> 3^3^3
620
syntax error, ambiguous use of `^`, please use parentheses:
621
\end{lstlisting}
622
Thus watch out: of the two possible design choices about the
623
meaning of \verb|3^3^3|, we quickly find three design decisions
624
made in practice!
625
\end{remark}
626
627
Like in Magma, Maple, Matlab, and Mathematica, you do not have to
628
explicitly declare the type of a variable, and it can have several
629
different types in a single snippet of code. This is
630
different to the situation with C, C++ and Java\footnote{Sage
631
is ``dynamically typed'', whereas C, C++ and Java are
632
``statically typed''.}.
633
Use the {\tt
634
type} function to determine the type of a variable,
635
and the {\tt id} function to find out the memory location
636
of the object that a variable references.
637
\begin{lstlisting}
638
sage: a = 10
639
sage: type(a)
640
<type 'sage.rings.integer.Integer'>
641
sage: id(a) # random; memory location a points at
642
4468006416
643
sage: a = "hello world"
644
sage: type(a)
645
<type 'str'>
646
sage: id(a) # random; new memory location a now points at
647
4507478816
648
\end{lstlisting}
649
650
\section{Control Flow}
651
The basic control flow statements in Python are {\tt if}, {\tt while} and {\tt for}.
652
The {\bf if} statement lets you choose between alternative code at runtime.
653
Here is an example:
654
%skip
655
\begin{lstlisting}
656
a = 2; b = 3
657
if a > b:
658
print(1)
659
print("-----")
660
elif a == b:
661
print(2)
662
else:
663
print(3)
664
\end{lstlisting}
665
The Python interpreter evaluates the expression right after {\tt if}
666
and before the colon, and if it evaluates to {\tt True}, then all of
667
the code that is indented before the {\tt elif} or {\tt else} is
668
executed. Otherwise, the expression right after {\tt elif} is
669
evaluated, and if {\tt True}, then the indented code directly below it
670
is evaluated. Otherwise the code under the final {\tt else} is
671
evaluated. The {\tt elif} and {\tt else} are optional, and you can have
672
any number of {\tt elif} blocks.
673
674
675
Unlike nearly every other programming language, there are no explicit
676
begin and end markers around the block of code that will get evaluated
677
when a branch of the if statement is satisfied. Instead the code is
678
indented. There are at least two advantages to Python's choice:
679
(1) you will type and read less, and
680
(2) you will not be fooled by
681
misleading indentation in code like this C code:
682
\begin{lstlisting}
683
if (a>b)
684
printf("1");
685
printf("-----");
686
\end{lstlisting}
687
688
Python's {\tt while} statement repeatedly executes the code indented
689
below it until the expression between the {\tt while} and the colon
690
evaluates to False, or until an explicit break statement is executed.
691
Here is an example:
692
%skip
693
\begin{lstlisting}
694
i = 5
695
while i > 0:
696
print(i)
697
i = i - 1
698
if i == 20:
699
break
700
\end{lstlisting}
701
When you evaluate this code, you'll see the following output:
702
%skip
703
\begin{lstlisting}
704
5
705
4
706
3
707
2
708
1
709
\end{lstlisting}
710
Each time the indented block of code is executed,
711
the number $i$ is printed out, then the line {\tt i = i - 1} replaces
712
{\tt i} by an integer that is one smaller. Once $0$ is reached, the while
713
loop terminates.
714
715
If instead, we set {\tt i = 25} at the top, and evaluate the code, we see:
716
%skip
717
\begin{lstlisting}
718
25
719
24
720
23
721
22
722
21
723
\end{lstlisting}
724
This is because the {\tt if} statement evaluates to {\tt True} once
725
$i$ hits $20$, and the {\tt break} statement causes
726
the while loop to terminate.
727
728
Use the Python {\tt for} loop to iterate over each element in a
729
list or any other ``iterable'' object.
730
For example,
731
%skip
732
\begin{lstlisting}
733
for i in [1, 2, 3, 4, 5]:
734
print(i, i*i)
735
\end{lstlisting}
736
will make a table of squares:
737
%skip
738
\begin{lstlisting}
739
(1, 1)
740
(2, 4)
741
(3, 9)
742
(4, 16)
743
(5, 25)
744
\end{lstlisting}
745
You can also use {\tt break} in a {\tt for} loop.
746
747
There are many ways to make lists to iterate over
748
(see Section~\ref{sec:lists}), for example:
749
\begin{lstlisting}
750
sage: range(10)
751
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
752
sage: range(5,20)
753
[5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
754
sage: [1..10]
755
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
756
sage: [n^2 for n in [1..10]]
757
[1, 4, 9, 16, 25, 36, 49, 64, 81, 100]
758
sage: [1,3,..,10]
759
[1, 3, 5, 7, 9]
760
sage: xrange(10^10,10^10+10^9) # a "lazy" list
761
xrange(10000000000, 11000000000)
762
\end{lstlisting}
763
764
For example,
765
%skip
766
\begin{lstlisting}
767
for i in xrange(10^10, 10^10+10^9):
768
print(i)
769
if i > 10^10 + 5: break
770
\end{lstlisting}
771
results in
772
\begin{lstlisting}
773
10000000000
774
10000000001
775
10000000002
776
10000000003
777
10000000004
778
10000000005
779
10000000006
780
\end{lstlisting}
781
782
\section{Functions}
783
%http://uw.sagenb.org/home/pub/28/
784
785
Use \verb|def| to define a function in Python.
786
%skip
787
\begin{lstlisting}
788
def foo(a, bar, w=10):
789
if a:
790
print bar
791
# a block of code that is indented
792
print a, bar, w
793
\end{lstlisting}
794
The syntax is similar to the syntax of {\tt if}, {\tt for}, and {\tt
795
while}: a keyword, something, a colon, then an indented block of
796
code that gets executed under certain circumstances. More precisely,
797
define a function put {\tt def}, then the name of the function, then
798
in parenthesis the inputs to the function with possible default values
799
(e.g., {\tt w=10} above makes {\tt w} default to {\tt 10} if {\tt w}
800
is not specified). When the function is called, the input variables
801
to the function are set to reference the inputs, and the code in the
802
body of the function is executed.
803
804
\begin{lstlisting}
805
sage: foo(1, 'abc', 5)
806
abc
807
1 abc 5
808
sage: foo(1, 'xyz')
809
xyz
810
1 xyz 10
811
\end{lstlisting}
812
813
You can explicitly specify the input variables as follows,
814
which can make reading your code later easier:
815
\begin{lstlisting}
816
sage: foo(bar='gold', a=False, w=3)
817
False gold 3
818
\end{lstlisting}
819
820
Unlike the situation with C/C++/Java, there is {\bf absolutely no way}
821
in the Python language to explicitly declare that the types of the
822
inputs and outputs of a function. You can put constraints on types
823
explicitly using the {\tt isinstance} function, or using decorators
824
(see Section~\ref{sec:decorators}).
825
\begin{lstlisting}
826
def g(a, b):
827
if not isinstance(a, Integer):
828
raise TypeError
829
return a+b
830
\end{lstlisting}
831
Then we have:
832
\begin{lstlisting}
833
sage: g(2, 3)
834
5
835
sage: g('sage', 'math')
836
Traceback (click to the left of this block for traceback)
837
...
838
TypeError
839
\end{lstlisting}
840
841
Returning to the function {\tt foo} defined above, it will
842
just work with any inputs for which {\tt +} is defined.
843
For example, {\tt a}, {\tt b}, and {\tt c} could be strings or lists:
844
\begin{lstlisting}
845
sage: foo('a', 'b', 'c')
846
a= a b= b c= c
847
'abc'
848
sage: f([1,2], [3,4], [5,6])
849
a= [1, 2] b= [3, 4] c= [5, 6]
850
[1, 2, 3, 4, 5, 6]
851
\end{lstlisting}
852
Thus illustrates something in Python called ``duck typing''. So long
853
as an object quacks like a duck (in our case, something that supports
854
+), then we just treat it like a duck. In this sense, all Python functions
855
are extremely generic.
856
857
Any variables that are created in the body of the function are
858
{\em local to the function}, unless you explicitly use the {\tt
859
global} keyword. For example,
860
861
%skip
862
\begin{lstlisting}
863
c = 1; d = 1
864
def bar(a, b):
865
global d
866
c = a; d = b
867
print c, d
868
\end{lstlisting}
869
870
When we call {\tt bar}, the global variable {\tt d} gets changed, but
871
the global {\tt c} does not change:
872
%skip
873
\begin{lstlisting}
874
sage: bar(5, 10)
875
5 10
876
sage: print c, d
877
1 10
878
\end{lstlisting}
879
880
You can also have functions nested within functions (etc.), where the
881
nested function is completey hidden within the scode of the function
882
that contains it:
883
\begin{lstlisting}
884
c = 1; d = 1
885
def bar(a, b):
886
global d # this is a rare beast.
887
c = a; d = b
888
print c, d
889
def foo(x, y):
890
c = 'fun'; d = 'stuff'
891
print c, d
892
foo(c,d)
893
print c,d
894
\end{lstlisting}
895
Running this, note that the global c is not changed, and locally
896
within {\tt foo} we have yet another pair of variables also called
897
{\tt c} and {\tt d} that have nothing to do with the global {\tt c},
898
{\tt d} or the {\tt c}, {\tt d} defined at the top of {\tt bar}.
899
\begin{lstlisting}
900
sage: bar(5,10)
901
5 10
902
fun stuff
903
5 10
904
sage: c,d
905
(1, 10)
906
\end{lstlisting}
907
908
909
As illustrated above, a Python function can have side effects, and
910
behave differently depending on global variables. Thus Python
911
``functions'' are different than the functions $f:X\to Y$ in
912
mathematics. In mathematics, $f(x)$ depends only on $x$, not on the
913
state of some global variable, the time of day, phase of the moon,
914
etc, but in Python {\tt f(x)} can depend on more than just {\tt x}.
915
The following Python function evaluates to $x^2$ when the number of
916
seconds since the beginning is even, and $x^3$ when it is odd:
917
918
%skip
919
\begin{lstlisting}
920
def f(x):
921
import time
922
if int(time.time()) % 2 == 0:
923
return x^2
924
else:
925
return x^3
926
\end{lstlisting}
927
928
Here we imported a Python module called {\tt time} using the Python
929
{\tt import} command.
930
In case you are wondering ``what the heck is 'time''', you can type
931
\begin{lstlisting}
932
sage: import time
933
sage: help(time)
934
\end{lstlisting}
935
into Sage. In the notebook, you'll get a link to a webpage all about
936
the {\tt time} Python module. Python includes an {\em enormous}
937
standard library of modules, and you should read all about them at
938
\url{http://docs.python.org/library/}. I have more than once
939
reimplemented functionality that is already in one of the standard
940
modules because I didn't think to look at the above web page. Want to
941
use Python to parse a webpage? create JSON or XML? use regular
942
expressions? walk a directory tree? compress a file? use a SQLite
943
database? Then consult the Python standard library.
944
945
Returning to our function {\tt f} defined above, when we run it, we
946
might get:
947
%skip
948
\begin{lstlisting}
949
sage: f(7)
950
49
951
sage: f(7)
952
343
953
\end{lstlisting}
954
955
956
Sage (but not Python) also has a notion of Calculus-style functions.
957
For example,
958
\begin{lstlisting}
959
sage: f(x,y) = sin(x) + e^cos(y)
960
sage: f(2,pi)
961
e^(-1) + sin(2)
962
sage: f.integrate(x)
963
(x, y) |--> x*e^cos(y) - cos(x)
964
sage: plot3d(f, (x,-pi,pi), (y,-2*pi,2*pi), viewer='tachyon')
965
\end{lstlisting}
966
\sagegraphic{.6}{3dplot1}
967
968
969
\subsection{Further remarks on passing variables to functions}
970
We mentioned above that Python uses call by reference semantics.
971
The following example helps clarify this point very explicitly. First
972
we create a list and note where it is stored in memory (at address
973
69421536 on my computer right now).
974
\begin{lstlisting}
975
sage: v = [1,2,3]
976
sage: id(v) # random - memory location of v
977
69421536
978
\end{lstlisting}%link
979
Next we define a function that
980
prints where in memory its input {\tt w} is stored, and modifies {\tt w}:
981
%link
982
\begin{lstlisting}
983
sage: def foo(w):
984
... print "location of w =", id(w)
985
... w.append('hello')
986
... print "w =", w
987
\end{lstlisting}%link
988
When we call foo with {\tt v}, note that the variable {\tt w}
989
points to the same memory location as {\tt v}:
990
%link
991
\begin{lstlisting}
992
sage: foo(v)
993
location of w = 69421536
994
w = [1, 2, 3, 'hello']
995
\end{lstlisting}%link
996
Moreover, and it's critical you understand this, the list {\tt v}
997
has now changed!
998
%link
999
\begin{lstlisting}
1000
sage: v
1001
[1, 2, 3, 'hello']
1002
\end{lstlisting}%link
1003
If we want foo to modify a copy of {\tt v} instead, we have to
1004
explicitly use the {\tt copy} function:
1005
%link
1006
\begin{lstlisting}
1007
sage: foo(copy(v))
1008
location of w = 69535936
1009
w = [1, 2, 3, 'hello', 'hello']
1010
\end{lstlisting}%link
1011
And this worked fine, as expected:
1012
%link
1013
\begin{lstlisting}
1014
sage: v
1015
[1, 2, 3, 'hello']
1016
\end{lstlisting}
1017
1018
This illustrates part of the ``Zen of Python'':
1019
\begin{center}
1020
\LARGE\bf Explicit is better than implicit.
1021
\end{center}
1022
To see the rest of the Zen of Python, type {\tt import this} into Sage.
1023
1024
\subsection{Gotcha: Default Arguments}
1025
Consider the following function \verb|my_append(a,L)| which appends {\tt a} to {\tt L}, but whose second argument is optional, so \verb|my_append(a)|
1026
just creates a new list {\tt [a]}:
1027
\begin{lstlisting}
1028
sage: def my_append(a, L=[]):
1029
... L.append(a)
1030
... print L
1031
sage: my_append(1)
1032
[1]
1033
sage: my_append(2) # what?
1034
[1, 2]
1035
sage: my_append(3) # what? what?
1036
[1, 2, 3]
1037
\end{lstlisting}
1038
1039
What happened? You might have expected to see output
1040
of {\tt [1]}, then {\tt [2]}, then {\tt [3]}.
1041
Let's modify the function \verb|my_append| to also print the memory
1042
location of {\tt L}.
1043
\begin{lstlisting}
1044
sage: def my_append(a, L=[]):
1045
... L.append(a)
1046
... print L, id(L)
1047
sage: my_append(1) # random memory location
1048
[1] 69438424
1049
sage: my_append(2) # same random memory location
1050
[1, 2] 69438424
1051
sage: my_append(3) # same random memory location
1052
[1, 2, 3] 69438424
1053
\end{lstlisting}
1054
1055
When the function \verb|my_append| is first encountered by the Python
1056
interpreter, it evaluates each of the default arguments. When Python
1057
sees {\tt L=[]}, it creates a list in memory at location 69438424.
1058
Each time you call \verb|my_append| and don't specify the second argument, that
1059
same list---at address 69438424---is used, {\em and modified} in
1060
this case.
1061
1062
\subsection{Gotcha: Recursion}
1063
1064
Python supports recursive functions, but they are cripled in that
1065
there is by default a fairly small limit on the depth of recursion
1066
(you can increase this). This is because Python does not have ``tail
1067
recursion'' like a language such as lisp.
1068
1069
%skip
1070
\begin{lstlisting}
1071
def my_factorial(n):
1072
if n == 1: return n
1073
assert n > 0
1074
return n * my_factorial(n-1)
1075
\end{lstlisting}
1076
1077
\noindent{}This works fine:
1078
%skip
1079
\begin{lstlisting}
1080
sage: my_factorial(20)
1081
2432902008176640000
1082
\end{lstlisting}
1083
But:
1084
%skip
1085
\begin{lstlisting}
1086
sage: my_factorial(1000)
1087
Traceback (click to the left of this block for traceback)
1088
...
1089
RuntimeError: maximum recursion depth exceeded in cmp
1090
\end{lstlisting}
1091
1092
So be careful when writing recursive functions. Often recursive
1093
functions will never ever be called with a big depth. However, if you
1094
need to write a recursive function that will be called with a big
1095
depth, you can simply increase the {\tt recursionlimit} as illustrated
1096
below.
1097
%skip
1098
\begin{lstlisting}
1099
sage: import sys
1100
sage: sys.getrecursionlimit()
1101
1000
1102
sage: sys.setrecursionlimit(1000000)
1103
sage: a = my_factorial(1000) # works fine!
1104
\end{lstlisting}
1105
1106
As an aside, you can in fact write little lisp programs using Sage if
1107
you want, since Sage includes an embedded lisp interpreter.
1108
For example,
1109
\begin{lstlisting}
1110
sage: lisp.eval('(defun factorial (n) (if (= n 1) 1 (* n (factorial (- n 1)))))')
1111
'FACTORIAL'
1112
sage: lisp('(factorial 10)')
1113
3628800
1114
sage: lisp(10).factorial()
1115
3628800
1116
\end{lstlisting}
1117
1118
1119
\subsection{Style}
1120
There is a standard coding style that almost everybody uses when
1121
writing Python code. Read about it in the Python tutorial:
1122
\begin{center}\large
1123
\url{http://docs.python.org/tutorial/controlflow.html#intermezzo-coding-style}
1124
\end{center}
1125
1126
Here is a stylish example:
1127
\begin{lstlisting}
1128
def good_function(a, b = 10):
1129
"""
1130
This is a good function.
1131
1132
This function has a docstring and is named using
1133
lower_case_with_underscores.
1134
1135
It takes as input integers a and b and outputs something computed
1136
using them. (Notice that the above line is <= 79 characters.)
1137
"""
1138
c = 0
1139
1140
for i in range(a):
1141
# add i-th power of b to a and
1142
# put spaces around operators (comment on line of its own).
1143
c = b**i + a
1144
1145
# Another block, and a correctly named class (CamelCase).
1146
class UselessWrapper(int):
1147
pass
1148
1149
return UselessWrapper(c)
1150
\end{lstlisting}
1151
1152
1153
1154
\section{Classes}
1155
1156
Python classes are typically used to define your own new data type
1157
(though they can be used in other ways as well). New classes are easy
1158
to define, and support standard object-oriented features such as
1159
``multiple inheritance'' and ``operator overloading''.
1160
1161
Here is a trivial example of a class:
1162
%skip
1163
\begin{lstlisting}
1164
class CoolThing(object):
1165
def foo(self, xyz):
1166
print self, xyz
1167
\end{lstlisting}
1168
Let's try it out:
1169
\begin{lstlisting}
1170
sage: z = CoolThing()
1171
sage: z.foo('abc')
1172
<__main__.CoolThing object at 0x...> abc
1173
sage: type(z)
1174
<class '__main__.CoolThing'>
1175
\end{lstlisting}
1176
1177
The line {\tt class CoolThing(object):} starts declaration of
1178
the class {\tt CoolThing}, which derives from the builtin
1179
class {\tt object}. Typing {\tt z = CoolThing()} creates
1180
a new instance of the class with the variable {\tt z}
1181
referencing it. The {\tt foo} method defined above is a function
1182
that can only be used with instances, which we call
1183
by writing {\tt z.foo('abc')}. Note that the first argument
1184
to {\tt def foo(self, xyz)} is {\tt self}, which refers to the
1185
particular instance of the class.
1186
1187
Next, we make a more complicated class, which also illustrates
1188
how to customize creation of new objects using the
1189
\verb|__init__| ``dunder method'', define how our
1190
objects print themselves using \verb|__repr__|,
1191
and define how {\tt +} and {\tt *} implement
1192
arithmetic using \verb|__add__| and \verb|__mul__|.
1193
1194
% skip
1195
\begin{lstlisting}
1196
class MyRational:
1197
def __init__(self, n, d):
1198
self._n = Integer(n); self._d = Integer(d)
1199
def __repr__(self):
1200
return '%s/%s'%(self._n, self._d)
1201
def __add__(self, right):
1202
return MyRational(self._n*right._d + self._d*right._n,
1203
self._d*right._d)
1204
def __mul__(self, right):
1205
return MyRational(self._n*right._n, self._d*right._d)
1206
def reduced_form(self):
1207
"""Return the reduced form of this rational number."""
1208
a = self._n / self._d
1209
return MyRational(a.numerator(), a.denominator())
1210
\end{lstlisting}
1211
1212
Once we define the above class, we have our own new version of
1213
``rational numbers''.
1214
\begin{lstlisting}
1215
sage: a = MyRational(2,6); b = MyRational(2, 3)
1216
sage: print a, b
1217
2/6 2/3
1218
sage: a.reduced_form()
1219
1/3
1220
sage: c = a + b; c
1221
18/18
1222
sage: c.reduced_form()
1223
1/1
1224
\end{lstlisting}
1225
However, notice that subtraction doesn't work:
1226
\begin{lstlisting}
1227
sage: a - b
1228
Traceback (most recent call last):
1229
...
1230
TypeError: unsupported operand type(s) for -: 'instance' and 'instance'
1231
\end{lstlisting}
1232
1233
This is because we didn't define a \verb|__sub__| method. You can add
1234
that method, which looks just like the \verb|__add__| method, except
1235
with the {\tt +} replaced by a {\tt -}, and subtraction will work.
1236
Alternatively, we can define a derived class that also defines a
1237
\verb|__sub__| method, as follows:
1238
\begin{lstlisting}
1239
class MyRational2(MyRational): # inheritence (multiple also fully supported)
1240
def __sub__(self, right):
1241
return MyRational2(self._n*right._d - self._d*right._n,
1242
self._d*right._d)
1243
\end{lstlisting}
1244
This has absolutely no impact on the original {\tt MyRational} class:
1245
\begin{lstlisting}
1246
sage: MyRational(2,6) - MyRational(2, 3)
1247
Traceback (most recent call last):
1248
...
1249
TypeError: unsupported operand type(s) for -: 'MyRational' and 'MyRational'
1250
\end{lstlisting}
1251
However, instances of {\tt MyRational2} support subtraction, in addition
1252
to the multiplication and addition defined above:
1253
\begin{lstlisting}
1254
sage: a = MyRational2(2,6); b = MyRational2(2, 3)
1255
sage: print a, b
1256
2/6 2/3
1257
sage: a + b
1258
18/18
1259
sage: a - b
1260
-6/18
1261
\end{lstlisting}
1262
1263
Big caveat (!): If you do {\tt a+b}, then the resulting object
1264
is an instance of {\tt MyRational}, not of {\tt MyRational2}!
1265
1266
\begin{lstlisting}
1267
sage: type(a-b)
1268
<class '__main__.MyRational2'>
1269
sage: type(a+b)
1270
<class '__main__.MyRational'>
1271
\end{lstlisting}
1272
1273
This is because the \verb|__add__| method is execute, which explicitly
1274
refers to MyRational. You can make the code more robust regarding
1275
derivation by using \verb|self.__class__|, as illustrated below:
1276
\begin{lstlisting}
1277
class MyRational3(object):
1278
def __init__(self, n, d): # called to initialize object
1279
self._n = Integer(n); self._d = Integer(d)
1280
def __add__(self, right): # called to implement self + right
1281
return self.__class__(self._n*right._d + self._d*right._n,
1282
self._d*right._d)
1283
1284
class MyRational4(MyRational3):
1285
def __sub__(self, right): # called to implement self + right
1286
return self.__class__(self._n*right._d - self._d*right._n,
1287
self._d*right._d)
1288
\end{lstlisting}
1289
1290
Now things work better:
1291
\begin{lstlisting}
1292
sage: a = MyRational4(2,6); b = MyRational4(2, 3)
1293
sage: type(a-b), type(a+b)
1294
(<class '__main__.MyRational4'>, <class '__main__.MyRational4'>)
1295
\end{lstlisting}
1296
1297
1298
Here is another example that illustrates a default class attribute.
1299
%skip
1300
\begin{lstlisting}
1301
class MyClass:
1302
"""
1303
A simple example class.
1304
"""
1305
# a Python object attribute; this is basically a default
1306
# piece of data that is available to each instance of the
1307
# class, but can be changed in the instance without changing
1308
# it in the class. (See example below.)
1309
i = 12345
1310
1311
# A function attribute. Again, this is available to each
1312
# instance, and can be changed in the instance without
1313
# changing the class object itself.
1314
def f(self):
1315
return 'hello world'
1316
\end{lstlisting}
1317
First notice that {\tt MyClass} itself is just another Python object
1318
(we can have variables reference it, pass it into functions, etc.):
1319
%skip
1320
\begin{lstlisting}
1321
sage: MyClass
1322
<class __main__.MyClass at 0x...>
1323
sage: MyClass.i
1324
12345
1325
sage: MyClass.f
1326
<unbound method MyClass.f>
1327
sage: MyClass.__doc__
1328
'A simple example class.'
1329
\end{lstlisting}
1330
We ``call'' {\tt MyClass} to create an instance \verb|x| of it:
1331
%skip
1332
\begin{lstlisting}
1333
sage: x = MyClass(); x
1334
<__main__.MyClass instance at 0x...>
1335
\end{lstlisting}
1336
We can then call methods of the instance \verb|x| and get access to its attributes.
1337
%skip
1338
\begin{lstlisting}
1339
sage: x.f()
1340
'hello world'
1341
sage: x.i
1342
12345
1343
\end{lstlisting}
1344
We can also change the attributes and methods of {\tt x}.
1345
%skip
1346
\begin{lstlisting}
1347
sage: x.i = 50
1348
sage: def g(): return "goodbye"
1349
sage: x.f = g
1350
sage: x.f()
1351
'goodbye'
1352
\end{lstlisting}
1353
1354
This does not change the attributes or methods of {\tt MyClass} or
1355
new instances of {\tt MyClass}.
1356
%skip
1357
\begin{lstlisting}
1358
sage: y = MyClass(); y.i
1359
12345
1360
sage: y.f()
1361
'hello world'
1362
\end{lstlisting}
1363
We could change those if we wanted to though, as follows:
1364
%skip
1365
\begin{lstlisting}
1366
sage: def g(self): return "goodbye"
1367
sage: MyClass.f = g
1368
sage: y = MyClass()
1369
sage: y.f()
1370
'goodbye'
1371
\end{lstlisting}
1372
1373
1374
As you can see, Python is a {\em dynamic} language. The above is all
1375
happening at runtime. This is different than static languages such as
1376
C/C++/Java. It has pros and cons, with the main con being that Python
1377
can be slower. We will learn about Cython soon, which is similar to
1378
Python but gives you the option of surrending some of the dynamic
1379
features of Python in exchange for faster (but less dynamic) static
1380
semantics.
1381
1382
1383
\subsection{Creating a Number}
1384
The next example illustrates how to use self and some ``dunder''
1385
(=double underscore) methods:
1386
1387
%skip
1388
\begin{lstlisting}
1389
class Number:
1390
def __init__(self, x):
1391
# called when Number is instantiated
1392
self.x = x
1393
def __repr__(self):
1394
# defines how Number prints
1395
return "The Number %s"%self.x
1396
def __add__(self, right):
1397
# defines how "+" works
1398
return Number(self.x + right.x)
1399
\end{lstlisting}
1400
1401
1402
Now we create a number {\tt n}, print it, and add it (using +) to
1403
another number.
1404
%skip
1405
\begin{lstlisting}
1406
sage: n = Number(37)
1407
sage: n
1408
The Number 37
1409
sage: n + Number(15)
1410
The Number 52
1411
\end{lstlisting}
1412
1413
Try to add subtraction and multiplication to the class {\tt Number}
1414
right now. The names of the relevant dunder methods are \verb|__sub__|
1415
and \verb|__mul__|.
1416
1417
1418
1419
See \url{http://docs.python.org/reference/datamodel.html} for long
1420
lists of dunder methods.
1421
1422
1423
1424
\section{Data Types: list, tuple, set, str, and file}\label{sec:datatypes}
1425
1426
% http://wiki.wstein.org/10/480b/lectures/lec7
1427
% http://wiki.wstein.org/10/480b/lectures/lec8
1428
1429
\subsection{Lists}\label{sec:lists}
1430
A list in Python is a finite ordered ``list'' of any Python objects at
1431
all. Many useful operations are supported, along with a handy ``list
1432
comprehension'' notation that makes building lists easy.
1433
1434
First we create a list, whose entries are an integer, a string, a data
1435
type, and another list with a list in it. Note that {\tt v} has type
1436
{\tt list}.
1437
\begin{lstlisting}
1438
sage: v = [3, 'hello', Integer, ['a', [1,2]]]
1439
sage: type(v)
1440
<type 'list'>
1441
sage: v
1442
[3, 'hello', <type 'sage.rings.integer.Integer'>, ['a', [1, 2]]]
1443
\end{lstlisting}%link
1444
Lists in Python are 0-based, in that {\tt v[0]} is
1445
the first entry in the list. {\em Remember this!}
1446
%link
1447
\begin{lstlisting}
1448
sage: v[0]
1449
3
1450
sage: v[1]
1451
'hello'
1452
\end{lstlisting}%link
1453
You can also index into the list from the other side by using negative numbers:
1454
%link
1455
\begin{lstlisting}
1456
sage: v[-1]
1457
['a', [1, 2]]
1458
sage: v[-2]
1459
<type 'sage.rings.integer.Integer'>
1460
\end{lstlisting}%link
1461
You can slice lists. When slicing you specify a start and stop point,
1462
and take all the elements between. Keep in mind that it includes the
1463
starting point you specify, but excludes the endpoint.
1464
%link
1465
\begin{lstlisting}
1466
sage: v[1:]
1467
['hello', <type 'sage.rings.integer.Integer'>, ['a', [1, 2]]]
1468
sage: v[0:3]
1469
[3, 'hello', <type 'sage.rings.integer.Integer'>]
1470
sage: v[0:3:2] # just the even-indexed positions
1471
[3, <type 'sage.rings.integer.Integer'>]
1472
\end{lstlisting}
1473
Use {\tt len} to get the length of a list. New Sage/Python users often get very frustrated trying to figure out how to find the length of a list. Just memorize this right now!
1474
%link
1475
\begin{lstlisting}
1476
sage: len(v)
1477
4
1478
\end{lstlisting}%link
1479
You can also sort, append to, delete elements from, extend, etc., lists.
1480
See the Python documentation.
1481
%link
1482
\begin{lstlisting}
1483
sage: w = copy(v)
1484
sage: w.sort(); w
1485
[3, ['a', [1, 2]], 'hello', <type 'sage.rings.integer.Integer'>]
1486
sage: w.extend([1,2,3,4]); w
1487
[3, ['a', [1, 2]], 'hello', <type 'sage.rings.integer.Integer'>,
1488
1, 2, 3, 4]
1489
\end{lstlisting}
1490
1491
You can build lists in place using list comprehension, which is a lot like "set building notation" in mathematics. For example:
1492
%link
1493
\begin{lstlisting}
1494
sage: [n*(n+1)/2 for n in range(1, 10) if n%2 == 1]
1495
[1, 6, 15, 28, 45]
1496
\end{lstlisting}
1497
1498
1499
The basic structure of a list comprehension is the following (there are
1500
more complicated forms):
1501
%skip
1502
\begin{lstlisting}
1503
[ <expression(i)> for i in <iterable> <optional if condition> ]
1504
\end{lstlisting}
1505
1506
Notice above that {\tt for n in range(1,10)} and {\tt if n\%2 == 1}
1507
are both valid snippets of Python code. Aside from possible scoping
1508
issues, list comprehensions are basically equivalent to combining a
1509
{\tt for} loop with an {\tt if} statement in them, where you append to
1510
a list. To illustrate this, note that you can literally almost
1511
rearrange the code of such a {\tt for} loop into a list comprehension, for
1512
example:
1513
%skip
1514
\begin{lstlisting}
1515
z = []
1516
for n in range(1, 10):
1517
if n % 2 == 1:
1518
z.append(n*(n+1)/2)
1519
\end{lstlisting}
1520
1521
If you evaluate the above code, then print {\tt z}, you'll see
1522
%skip
1523
\begin{lstlisting}
1524
sage: z
1525
[1, 6, 15, 28, 45]
1526
\end{lstlisting}
1527
1528
If you want to be effective with Sage/Python, you must master lists.
1529
1530
\subsection{Tuples}
1531
1532
Tuples are similar to lists, except you can't change which objects are
1533
stored in a tuple. Also, there is no tuple-comprehension; you have to
1534
make a list {\tt v}, then change it into a tuple by typing {\tt
1535
tuple(v)}. You can however, change the objects themselves if they
1536
are mutable.
1537
1538
\begin{lstlisting}
1539
sage: v = (3, 'hello', Integer, ['a', [1,2]]); type(v)
1540
<type 'tuple'>
1541
sage: v[0] = 5 # nope!
1542
Traceback (most recent call last):
1543
...
1544
TypeError: 'tuple' object does not support item assignment
1545
sage: v[3].append('change a mutable entry'); v
1546
(3, 'hello', <type 'sage.rings.integer.Integer'>,
1547
['a', [1, 2], 'change a mutable entry'])
1548
\end{lstlisting}
1549
1550
{\bf BIG FAT WARNING:} The following looks like a ``tuple
1551
comprehension'' (if there were such a thing), but it isn't one:
1552
\begin{lstlisting}
1553
sage: w = (n*(n+1)/2 for n in range(1, 10) if n%2 == 1); type(w)
1554
<type 'generator'>
1555
\end{lstlisting}%link
1556
1557
Notice that you can't index into {\tt w}:
1558
%link
1559
\begin{lstlisting}
1560
sage: w[0]
1561
Traceback (click to the left of this block for traceback)
1562
...
1563
TypeError: 'generator' object is unsubscriptable
1564
\end{lstlisting}%link
1565
You can iterate over {\tt w} though:
1566
%link
1567
\begin{lstlisting}
1568
sage: for n in w: print n,
1569
1 6 15 28 45
1570
\end{lstlisting}%link
1571
Here, we get no output since {\tt w} is ``used up''.
1572
%link
1573
\begin{lstlisting}
1574
sage: for n in w: print n,
1575
\end{lstlisting}
1576
1577
1578
Anyway, if you want to make a tuple using a list comprehension, be
1579
explicit, like so:
1580
\begin{lstlisting}
1581
sage: tuple( n*(n+1)/2 for n in range(1, 10) if n%2 == 1 )
1582
(1, 6, 15, 28, 45)
1583
\end{lstlisting}
1584
1585
1586
1587
\subsection{Strings}
1588
1589
A string is a finite immutable (unchangeable) sequence of characters.
1590
Python supports a wonderful range of string processing functions. To
1591
make a string literal:
1592
\begin{itemize}
1593
\item Enclose it is in either single or double quotes (just be
1594
consistent) -- if you use single quotes you can use double quotes in
1595
your string without escaping them, and vice versa.
1596
\item For a multiline string use three single or double quotes in a
1597
row -- then you can include newlines directly in your string.
1598
\item There are many escape characters for including special
1599
characters in strings, e.g., \verb|'\n'| for ``newline''. If you put
1600
the letter {\tt r} right before the quotes you get a raw string, for
1601
which a backslash just stays a backslash and you can't escape anything;
1602
this is often useful for \LaTeX{} code.
1603
\end{itemize}
1604
1605
The following examples illustrates some of the above ways of creating strings.
1606
1607
\begin{lstlisting}
1608
sage: s = "this is a string's string using double quotes"; s
1609
"this is a string's string using double quotes"
1610
sage: print s
1611
this is a string's string using double quotes
1612
sage: s = 'this is a string"s using single quotes'; s
1613
'this is a string"s using single quotes'
1614
\end{lstlisting}
1615
1616
%skip
1617
\begin{lstlisting}
1618
s = """this is a
1619
multiline string."""
1620
1621
s = r"""Consider \sin(x) +
1622
\cos(y) and add \pi."""
1623
\end{lstlisting}
1624
1625
Strings in Python are extremely flexible and easy to manipulate. You
1626
can slice them exactly like lists, find substrings, concatenate, etc.
1627
1628
\begin{lstlisting}
1629
sage: s = "This is a string."; s[:10]
1630
'This is a '
1631
sage: s[10:]
1632
'string.'
1633
sage: s[::2] # get just the even indexed characters
1634
'Ti sasrn.'
1635
sage: s.find('a')
1636
8
1637
sage: s + " Yes, a string."
1638
'This is a string. Yes, a string.'
1639
sage: s.replace('a', 'b')
1640
'This is b string.'
1641
\end{lstlisting}
1642
1643
The join method is also amazingly useful. If {\tt s} is a string,
1644
then {\tt s.join([list of strings])} joins together the list of
1645
strings putting {\tt s} between each.
1646
1647
\begin{lstlisting}
1648
sage: ', '.join(['Stein', 'William', 'Arthur'])
1649
'Stein, William, Arthur'
1650
\end{lstlisting}
1651
1652
Other useful methods are upper and capitalize:
1653
1654
\begin{lstlisting}
1655
sage: s = 'this is lower case'; s.upper()
1656
'THIS IS LOWER CASE'
1657
sage: s.capitalize()
1658
'This is lower case'
1659
\end{lstlisting}
1660
1661
Finally, the string formating operator \% appears constantly in Python
1662
code and is extremely useful to know about. Basically, you just put
1663
\%s's in your string, and these get replaced by the string
1664
representations of a tuple of Python objects. Here's how you use it:
1665
\begin{lstlisting}
1666
sage: 'Hi %s. Meet %s.'%('Mom', 2/3)
1667
'Hi Mom. Meet 2/3.'
1668
\end{lstlisting}
1669
1670
Really what just happened was we created a string and a tuple, and
1671
used the mod operator on them, as illustrated below.
1672
\begin{lstlisting}
1673
sage: s = 'Hi %s. Meet %s.'
1674
sage: t = ('Mom', 2/3)
1675
sage: s % t
1676
'Hi Mom. Meet 2/3.'
1677
\end{lstlisting}
1678
1679
There are many other formating options besides just \%s. E.g., \%f is useful for numerical computations.
1680
1681
\begin{lstlisting}
1682
sage: '%.2f %.3f'%(.5, 7/11)
1683
'0.50 0.636'
1684
\end{lstlisting}
1685
1686
Above, \%.2f formats the string with 2 decimal digits after the point,
1687
and \%.3f with 3 decimal digits.
1688
1689
\subsection{Sets}
1690
1691
A set consists of unique elements with no ordering. You know what is
1692
or isn't in the set and can interate over it. The elements of a set
1693
must be immutable, since otherwise there would be no way to guarantee
1694
objects stay unique after putting them together in a set. Lists are
1695
{\em not immutable} so can't be put in a set, but strings can
1696
be.
1697
\begin{lstlisting}
1698
sage: v = ['this', 'is', 'what', 'this', 'is']; v
1699
['this', 'is', 'what', 'this', 'is']
1700
sage: X = set(v); X
1701
set(['this', 'is', 'what'])
1702
sage: type(X)
1703
<type 'set'>
1704
sage: X[0] # makes no sense
1705
Traceback (most recent call last):
1706
...
1707
TypeError: 'set' object does not support indexing
1708
sage: 'this' in X
1709
True
1710
sage: 'that' in X
1711
False
1712
sage: for a in X: print a,
1713
this is what
1714
\end{lstlisting}
1715
Here is how to use the set data structure to obtain
1716
the distinct types appearing in a list:
1717
\begin{lstlisting}
1718
sage: v = [1/2, 5/8, 2.5, 5/2, 3.8]
1719
sage: t = [type(a) for a in v]; t
1720
[<type 'sage.rings.rational.Rational'>,
1721
<type 'sage.rings.rational.Rational'>,
1722
<type 'sage.rings.real_mpfr.RealLiteral'>,
1723
<type 'sage.rings.rational.Rational'>,
1724
<type 'sage.rings.real_mpfr.RealLiteral'>]
1725
sage: list(set(t))
1726
[<type 'sage.rings.real_mpfr.RealLiteral'>,
1727
<type 'sage.rings.rational.Rational'>]
1728
\end{lstlisting}
1729
1730
If you create your own class, you can decide whether or not Python
1731
should consider it immutable by whether or not you define a
1732
\verb|__hash__| dunder method. If defined, then your object is
1733
considered immutable, and is allowed in sets. First, notice that
1734
sets can't contain lists.
1735
\begin{lstlisting}
1736
sage: v = [[1,2], [1,4]]
1737
sage: set(v)
1738
Traceback (most recent call last):
1739
...
1740
TypeError: unhashable type: 'list'
1741
\end{lstlisting}
1742
However, nothing stops us from making a class that derives from list and has a hash method:
1743
\begin{lstlisting}
1744
class NaughtyList(list):
1745
def __hash__(self): # a 32 or 64-bit int; equal objects should have the same hash.
1746
return hash(str(self))
1747
\end{lstlisting}
1748
\begin{lstlisting}
1749
sage: v = [NaughtyList([1,2]), NaughtyList([1,4])]; v
1750
[[1, 2], [1, 4]]
1751
sage: X = set(v); X
1752
set([[1, 2], [1, 4]])
1753
\end{lstlisting}
1754
Do something naughty:
1755
\begin{lstlisting}
1756
sage: v[1][1] = 2
1757
sage: X
1758
set([[1, 2], [1, 2]])
1759
sage: v[0] == v[1]
1760
True
1761
\end{lstlisting}
1762
The set doesn't know:
1763
\begin{lstlisting}
1764
sage: X
1765
set([[1, 2], [1, 2]])
1766
\end{lstlisting}
1767
1768
1769
1770
\subsection{Files}
1771
1772
It is straightforward to open, read, write, append to, and close files
1773
on disk. For example, below we create a file {\tt foo}, write to it,
1774
cose it, open it, then read it.
1775
1776
\begin{lstlisting}
1777
sage: F = open('foo','w')
1778
sage: F
1779
<open file 'foo', mode 'w' at 0x...>
1780
sage: F.write('hello there')
1781
sage: F.close()
1782
sage: print open('foo').read()
1783
hello there
1784
\end{lstlisting}
1785
1786
In the Sage notebook each input cell is executed in a different
1787
directory. Thus if you just create a file in one cell, you can't
1788
easily open and read it in another cell. The best workaround is to use
1789
the {\tt DATA} variable, which is a string that contains the name of a
1790
single directory that all cells have access to, and which you can
1791
upload/download files to and from using the Data menu.
1792
1793
\begin{lstlisting}
1794
sage notebook: open(DATA + 'foo','w').write('hi')
1795
sage notebook: print open(DATA + 'foo').read()
1796
hi
1797
sage notebook: os.system('ls -l %s'%DATA)
1798
total 4
1799
-rw-r--r-- 1 sagenbflask sagenbflask 2 ... ... foo
1800
0
1801
sage notebook: print DATA
1802
/sagenb/flask/sage_notebook.sagenb/home/.../.../data/
1803
\end{lstlisting}
1804
1805
Another important topic involving files is how to read in interesting
1806
files, e.g., png image files, wav audio files, csv files, Excel
1807
spreadsheets, etc. There are various ways of loading a huge range of
1808
interesting files into Sage, but unfortunately there is still no
1809
single simple command that parses them all. This would be a good idea
1810
for a student project.
1811
1812
1813
\section{Exception Handling}
1814
1815
Python fully supports exception handling, which allows us to raise and
1816
handle error conditions eloquently. The syntax in Python for
1817
exception handling is as simple and straightforward as you can
1818
possibly imagine.
1819
1820
We would like to write a function \verb|formal_sum| that
1821
takes as input two arbitrary objects, and adds them (using +) if {\em
1822
possible}, and if not creates their sum in some formal sense.
1823
Our first attempt, of course, does not just magically just work:
1824
\begin{lstlisting}
1825
def formal_sum(a, b):
1826
return a + b
1827
\end{lstlisting}
1828
Then:
1829
\begin{lstlisting}
1830
sage: formal_sum(2, 3) # good
1831
5
1832
sage: formal_sum(5, [1,2,3]) # nope
1833
Traceback (most recent call last):
1834
...
1835
TypeError: unsupported operand parent(s) for '+':
1836
'Integer Ring' and '<type 'list'>'
1837
\end{lstlisting}
1838
How can we {\em know} whether or not \verb|a + b| will work?
1839
You could try to do something really complicated by attempting
1840
to predict (via looking at code?) whether \verb|__add__| will work,
1841
but that way insanity lies. Instead, use {\em exception handling}:
1842
1843
\begin{lstlisting}
1844
class FormalSum(object):
1845
def __init__(self, a, b):
1846
self.a = a; self.b = b
1847
def __repr__(self):
1848
return '%s + %s'%(self.a, self.b)
1849
1850
def formal_sum(a, b):
1851
try:
1852
return a + b
1853
except TypeError:
1854
return FormalSum(a,b)
1855
\end{lstlisting}
1856
The class {\tt FormalSum} block above defines a new Python class whose instances
1857
represent the formal sum of the two attributes {\tt a} and {\tt b}, and which
1858
print themselves nicely. The function \verb|formal_sum| tries to add
1859
{\tt a} and {\tt b}, and if this causes a {\tt TypeError}, it instead creates
1860
a {\tt FormalSum} instance.
1861
\begin{lstlisting}
1862
sage: formal_sum(3, 8)
1863
11
1864
sage: formal_sum(5, [1,2,3])
1865
5 + [1, 2, 3]
1866
sage: formal_sum(5, 'five')
1867
5 + five
1868
\end{lstlisting}
1869
1870
1871
1872
For our next example, instead of catching an error, we create a
1873
function {\tt divide} that raises an error if the second input {\tt d}
1874
equals {\tt 0}.
1875
1876
%skip
1877
\begin{lstlisting}
1878
def divide(n, d):
1879
if d == 0:
1880
raise ZeroDivisionError, "divide by 0!?!"
1881
return n/d
1882
\end{lstlisting}
1883
1884
Try it out:
1885
1886
%skip
1887
\begin{lstlisting}
1888
sage: divide(5, 7)
1889
5/7
1890
sage: divide(5, 0)
1891
Traceback (most recent call last):
1892
...
1893
ZeroDivisionError: divide by 0!?!
1894
\end{lstlisting}
1895
1896
Typically, if you try to divide numbers at the denominator is 0, Sage
1897
will raise a {\tt ZeroDivisionError}. Just as above, we can catch
1898
this case if we want, and return something else, as illustrated below:
1899
1900
%skip
1901
\begin{lstlisting}
1902
def divide2(n, d):
1903
try:
1904
return divide(n, d) # or just put "n/d"
1905
except ZeroDivisionError:
1906
return 'infinity'
1907
\end{lstlisting}
1908
1909
%skip
1910
\begin{lstlisting}
1911
sage: divide2(5, 3)
1912
5/3
1913
sage: divide2(5, 0)
1914
'infinity'
1915
\end{lstlisting}
1916
1917
This web page \url{http://docs.python.org/lib/module-exceptions.html}
1918
lists all the standard builtin exceptions along with what each means.
1919
Some common exceptions that often appear in the context of mathematics
1920
are: {\tt TypeError, ZeroDivisionError, ArithmeticError, ValueError,
1921
RuntimeError, NotImplementedError, OverflowError, IndexError}.
1922
We illustrate each of these below:
1923
1924
\begin{lstlisting}
1925
sage: ''.join([1,2])
1926
Traceback (most recent call last):
1927
...
1928
TypeError: sequence item 0: expected string, sage.rings.integer.Integer found
1929
sage: 1/0
1930
Traceback (most recent call last):
1931
...
1932
ZeroDivisionError: Rational division by zero
1933
sage: factor(0)
1934
Traceback (most recent call last):
1935
...
1936
ArithmeticError: Prime factorization of 0 not defined.
1937
sage: CRT(2, 1, 3, 3)
1938
Traceback (most recent call last):
1939
...
1940
ValueError: No solution to crt problem since gcd(3,3) does not divide 2-1
1941
sage: find_root(SR(1), 0, 5)
1942
Traceback (most recent call last):
1943
...
1944
RuntimeError: no zero in the interval, since constant expression is not 0.
1945
sage: RealField(50)(brun)
1946
Traceback (most recent call last):
1947
...
1948
NotImplementedError: brun is only available up to 41 bits
1949
sage: float(5)^float(902830982304982)
1950
Traceback (most recent call last):
1951
...
1952
OverflowError: (34, 'Numerical result out of range')
1953
sage: v = [1,2,3]
1954
sage: v[10]
1955
Traceback (most recent call last):
1956
...
1957
IndexError: list index out of range
1958
\end{lstlisting}
1959
1960
The key points to remember about exceptions are:
1961
\begin{enumerate}
1962
\item Three keywords: {\tt try}, {\tt except}, {\tt raise}
1963
\item How to catch multiple possible exceptions correctly (there is a gotcha here -- see below!).
1964
\item One more keyword: {\tt finally}
1965
\end{enumerate}
1966
1967
There is more to exceptions, but these are the key points. We
1968
illustrate the last two below in a contrived example.
1969
1970
1971
%skip
1972
\begin{lstlisting}
1973
def divide(n, d):
1974
try:
1975
return n/d
1976
except (ZeroDivisionError, ValueError), msg:
1977
print msg
1978
return '%s/%s'%(n,d)
1979
except TypeError, NotImplementedError:
1980
# the above line is PURE EVIL(!)
1981
print "NotImplementedError is now '%s'"%NotImplementedError
1982
print "What have I done?!"
1983
finally:
1984
print "The finally block is *always* executed."
1985
\end{lstlisting}
1986
1987
Now try it out:
1988
%skip
1989
\begin{lstlisting}
1990
sage: divide(2,3)
1991
The finally block is *always* executed.
1992
2/3
1993
sage: divide(2, 0)
1994
Rational division by zero
1995
The finally block is *always* executed.
1996
'2/0'
1997
sage: divide('hi', 'mom')
1998
NotImplementedError is now 'unsupported operand type(s) for /: 'str' and 'str''
1999
What have I done?!
2000
The finally block is *always* executed.
2001
\end{lstlisting}
2002
2003
The form of the except statement is:
2004
%skip
2005
\begin{lstlisting}
2006
except [single exception], message
2007
\end{lstlisting}
2008
%skip
2009
\begin{lstlisting}
2010
except (tuple,of,exceptions), message</p>
2011
\end{lstlisting}
2012
2013
For example,
2014
\begin{lstlisting}
2015
try:
2016
import foobar
2017
1/0
2018
except (ZeroDivisionError, ImportError), msg:
2019
print "oops --", msg
2020
\end{lstlisting}
2021
outputs {\tt oops -- No module named foobar}
2022
and
2023
\begin{lstlisting}
2024
try:
2025
1/0
2026
import foobar
2027
except (ZeroDivisionError, ImportError), msg:
2028
print "oops --", msg
2029
\end{lstlisting}
2030
outputs {\tt oops -- Rational division by zero}.
2031
2032
An extremely confusing error, which has cost me hours of frustration,
2033
is to write
2034
%skip
2035
\begin{lstlisting}
2036
except exception1, exception2:
2037
\end{lstlisting}
2038
2039
The result is that if exception1 occurs, then exception2 is set to the
2040
error message. Don't make the same mistake.
2041
2042
For example, if we evaluate
2043
\begin{lstlisting}
2044
try:
2045
1/0
2046
import foobar
2047
except ZeroDivisionError, ImportError:
2048
print "oops --"
2049
\end{lstlisting}
2050
then evaluate
2051
\begin{lstlisting}
2052
try:
2053
import foobar
2054
1/0
2055
except ImportError, ZeroDivisionError:
2056
print "oops --"
2057
\end{lstlisting}
2058
we see
2059
\begin{lstlisting}
2060
Traceback (click to the left of this block for traceback)
2061
...
2062
ImportError: No module named foobar
2063
\end{lstlisting}
2064
2065
Wait, what just happened above? We appear to have totally broken
2066
Python!? Actually, we have smashed the {\tt ImportError} variable,
2067
making it point at the {\tt ZeroDivisionError} message above!
2068
2069
\begin{lstlisting}
2070
sage: ImportError
2071
ZeroDivisionError('Rational division by zero',)
2072
\end{lstlisting}
2073
We can fix this for now using the {\tt reset} command, which resets a variable
2074
to its default state when Sage started up:
2075
\begin{lstlisting}
2076
sage: reset('ImportError')
2077
sage: ImportError
2078
<type 'exceptions.ImportError'>
2079
\end{lstlisting}
2080
2081
2082
Another {\em major mistake} I made once\footnote{Actually, I made it
2083
several hundred times in 2005--2006!}
2084
with exceptions is illustrated in the following example code:
2085
%skip
2086
\begin{lstlisting}
2087
def divide(n, d):
2088
if d == 0:
2089
raise ZeroDivisionError, "error dividing n(=%s) by d(=%s)"%(n,d)
2090
\end{lstlisting}
2091
2092
It's so friendly and nice having a helpful error message that explains
2093
what went wrong in the division, right? (No!):
2094
%skip
2095
\begin{lstlisting}
2096
sage: divide(3948,0)
2097
Traceback (click to the left of this block for traceback)
2098
...
2099
ZeroDivisionError: error dividing n(=3948) by d(=0)
2100
\end{lstlisting}
2101
2102
But if we put a large value of $n$ as input, then several seconds (or
2103
minutes!) will be spent just creating the error message. It's
2104
ridiculous that divide2 below takes over 3 seconds, given that all the
2105
time is spent creating an error message that we just ignore.
2106
%skip
2107
\begin{lstlisting}
2108
def divide2(n,d):
2109
try:
2110
divide(n, d)
2111
except ZeroDivisionError, msg:
2112
return 'infinity'
2113
\end{lstlisting}
2114
2115
%skip
2116
\begin{lstlisting}
2117
sage: n = 3^(10^7)
2118
sage: time divide2(n, 0)
2119
'infinity'
2120
Time: CPU 3.45 s, Wall: 3.46 s
2121
\end{lstlisting}
2122
2123
Once the Sage developer David Harvey spent a long time tracking down
2124
why certain power series arithmetic in Sage was so slow for his
2125
application. It turned out that deep in the code there was a
2126
try/except block in which the error message itself took over a minute
2127
to construct, and then it was immediately discarded. {\bf Moral:}
2128
{\em be very careful when constructing the error message that you
2129
include along with an exception!}
2130
2131
2132
\section{Decorators}\label{sec:decorators}
2133
2134
The definition of decorators is remarkably simple,
2135
but using them is subtle, powerful, and potentially dangerous.
2136
From PEP 318 (see \url{http://www.python.org/dev/peps/pep-0318}), we have
2137
the following new notation in Python (note the first line with the mysterious
2138
@ sign):
2139
%skip
2140
\begin{lstlisting}
2141
@dec1
2142
def func(arg1, arg2, ...):
2143
pass
2144
\end{lstlisting}
2145
This is equivalent to:
2146
2147
%skip
2148
\begin{lstlisting}
2149
def func(arg1, arg2, ...):
2150
pass
2151
func = dec2(dec1(func))
2152
\end{lstlisting}
2153
That's it!
2154
2155
To motivate the point of decorators, let's make a function called {\tt
2156
echo} that takes as input a function {\tt f}, and returns a new
2157
function that acts just like {\tt f}, except that it prints all of its
2158
inputs. Here we use {\tt *args} and {\tt **kwds}, which is something
2159
that we have not discussed before. In Python, use {\tt *args} to
2160
refer to all of the positional inputs to a function, and {\tt **kwds}
2161
to refer to all of the keyword inputs. When you do this, {\tt args}
2162
is a Python tuple containing the positional inputs in order, and {\tt kwds}
2163
is a dictionary of the {\tt keyword=value} pairs. You can pass args and
2164
kwds on to another function (as illustrated below) by typing {\tt *args}
2165
and {\tt **kwds}.
2166
2167
%skip
2168
\begin{lstlisting}
2169
def echo(f):
2170
def g(*args, **kwds):
2171
print "args =", args
2172
print "kwds =", kwds
2173
return f(*args, **kwds)
2174
return g
2175
\end{lstlisting}
2176
2177
Now, let's try it out. Define a function:
2178
%skip
2179
\begin{lstlisting}
2180
def add_em_up(a, b, c):
2181
return a + b + c
2182
\end{lstlisting}
2183
Now use it:
2184
%skip
2185
\begin{lstlisting}
2186
sage: add_em_up(1, 2, 3)
2187
6
2188
\end{lstlisting}
2189
2190
The following works, but it sort of looks funny.
2191
%skip
2192
\begin{lstlisting}
2193
sage: add_em_up = echo(add_em_up)
2194
sage: add_em_up(1, 2, 3)
2195
args = (1, 2, 3)
2196
kwds = {}
2197
6
2198
\end{lstlisting}
2199
2200
Using a decorator right when we define \verb|add_em_up| is much, much cleaner:
2201
%skip
2202
\begin{lstlisting}
2203
@echo
2204
def add_em_up(a, b, c):
2205
return a + b + c
2206
\end{lstlisting}
2207
Now we have:
2208
%skip
2209
\begin{lstlisting}
2210
sage: add_em_up(1, 2, 3)
2211
args = (1, 2, 3)
2212
kwds = {}
2213
6
2214
\end{lstlisting}
2215
2216
Here's another example of a very handy decorator (only available in the Sage notebook):
2217
2218
%skip
2219
\begin{lstlisting}
2220
@interact
2221
def add_em_up(a=1, b=[1..10], c=(1..10)):
2222
return a + b + c
2223
\end{lstlisting}
2224
\begin{center}
2225
\includegraphics[width=.5\textwidth]{graphics/interact}
2226
\end{center}
2227
2228
A hope you can {\em sense the possibilities...}. Here we do type checking:
2229
2230
%skip
2231
\begin{lstlisting}
2232
class returns:
2233
def __init__(self, typ):
2234
self._typ = typ
2235
def __call__(self, f):
2236
return lambda *args, **kwds: self._typ(f(*args, **kwds))
2237
2238
2239
@returns(float)
2240
def f(n,m):
2241
"""Returns n + m."""
2242
return n + m
2243
\end{lstlisting}
2244
Let's try it out:
2245
%skip
2246
\begin{lstlisting}
2247
sage: f(2,3)
2248
5.0
2249
sage: type(f(5,6))
2250
<type 'float'>
2251
sage: f('4', '123')
2252
4123.0
2253
\end{lstlisting}
2254
2255
Here's another example I use all the time. If you put {\tt
2256
@parallel(ncpus)} before a function {\em and} you call the function
2257
using a list as input, then the function gets evaluated at each
2258
element of the list in parallel, and the results are returned as an
2259
iterator. If you call the function without giving a list as input, it
2260
just works as usual (not in parallel).
2261
2262
2263
%skip
2264
\begin{lstlisting}
2265
@parallel(10)
2266
def f(n):
2267
sleep(1) # make function seem slow
2268
return n*(n+1)/2
2269
\end{lstlisting}
2270
First, try it not in parallel, which takes a long time.
2271
%skip
2272
\begin{lstlisting}
2273
%time
2274
sage: for n in [1..10]: print n, f(n)
2275
1 1
2276
2 3
2277
3 6
2278
4 10
2279
5 15
2280
6 21
2281
7 28
2282
8 36
2283
9 45
2284
10 55
2285
CPU time: 0.00 s, Wall time: 10.00 s
2286
\end{lstlisting}
2287
2288
Now try it in parallel:
2289
%skip
2290
\begin{lstlisting}
2291
%time
2292
sage: for X in f([1..10]): print X
2293
(((1,), {}), 1)
2294
(((2,), {}), 3)
2295
(((3,), {}), 6)
2296
(((4,), {}), 10)
2297
(((5,), {}), 15)
2298
(((6,), {}), 21)
2299
(((7,), {}), 28)
2300
(((8,), {}), 36)
2301
(((9,), {}), 45)
2302
(((10,), {}), 55)
2303
CPU time: 0.19 s, Wall time: 1.32 s
2304
\end{lstlisting}
2305
2306
2307
2308
\section{The Ecosystem}
2309
2310
The Sage distributuion itself consists of about 100 open source
2311
programs and libraries, which (like Linux) are developed by a loosely
2312
knit international group of programmers. Many of these programs are
2313
written as Python libraries.
2314
2315
Any software engineer knows that a programming language is much more
2316
than just the formal language specification or even a particular
2317
implementation. It's also the user community, the general pace of
2318
development, and---most importantly---the collections of tools and
2319
libraries that are available in that language, especially the free
2320
ones. Python excels in available tools, as the following list of many
2321
of the Python-based components of Sage attests:
2322
2323
\begin{itemize}
2324
\item Pycrypto -- fast implementations of many cryptosystems.
2325
\item Cython -- a Python compiler and tool for efficient use of C/C++
2326
libraries from Python. We will have much more to say about Cython
2327
in Chapter~\ref{ch:cython}.
2328
\item IPython -- interactive interpreter shell
2329
\item Jinja2 -- HTML and other templating tools; popular for web
2330
applications.
2331
\item Moinmoin -- a standalone wiki, e.g., the one used by \url{http://wiki.sagemath.org}.
2332
\item PIL -- Python imaging library (a ``programmable Photoshop'')
2333
\item Pygments -- HTML source code highlighting
2334
\item SQLalchemy -- abstracts interface to most SQL databases and an
2335
object:relational mapper.
2336
\item Sphinx -- ReST documentation system for Python, which is used by
2337
many Python projects (including Sage).
2338
\item Twisted -- a networking framework; everything from web applications
2339
to email to ssh servers are implemented in Twisted.
2340
\item ZODB -- The Zope object-oriented database
2341
\item arpack -- A sparse numerical linear algebra library.
2342
\item CVXopt -- A library for solving convex (and other) optimization problems.
2343
\item Docutils -- related to Python documentation
2344
\item easy-install -- you can do \verb|easy_install foobar| to
2345
install any of the over 13,000 Python packages available at
2346
\url{http://pypi.python.org/}.
2347
2348
\item gd -- very quickly draw png images with lines, arcs, etc.
2349
2350
\item matplotlib -- the canonical Python 2d graphics library
2351
2352
\item mpmath -- arbitrary precision floating point mathematics
2353
special functions, numerical integration, matrices, etc.
2354
2355
\item NumPy -- an $n$-dimensional array library, which is the
2356
fundamental package needed for scientific computing with Python.
2357
2358
\item pexpect -- control command-line subprocesses
2359
2360
\item rpy2 -- fast compiled interface to the R statistics program,
2361
which is also included in Sage.
2362
2363
\item sage -- the Sage library; mainly implements mathematical
2364
algorithms, especially symbolic ones.
2365
2366
\item sagenb -- the Sage notebook web application (can be used
2367
standalone separate from Sage).
2368
2369
\item sagetex -- allows you to embed Sage in \LaTeX{} documents
2370
2371
\item SciPy -- a large library of numerical functions that are useful
2372
in mathematics, science, and engineering, including numerical
2373
integration, optimization, statistics, differential equations, etc.
2374
2375
\item setuptools -- package for distributing and working with standalone python packages
2376
2377
\item SymPy -- a lightweight Python library for symbolic mathematics.
2378
\end{itemize}
2379
2380
\section{Exercise: Build Python from Source}
2381
If your computer operating system is Linux or OS X (with XCode
2382
installed), it is an easy ``exercise'' to build the Python language
2383
from source. This is particularly relevant if you want to understand
2384
Python more deeply, since you can change anything you want in the
2385
interpreter itself, recompile, and try out the result!
2386
2387
First, go to \url{http://python.org/download/} and download some
2388
version of Python. I am using OS X (with XCode installed) and choose
2389
Python 3.2. In a few seconds I have the file
2390
{\tt Python-3.2.tar.bz2} in my {\sf Downloads} folder.
2391
Using the {\sf Terminal} application, I navigate to that folder, extract
2392
Python, configure and build it, which takes under 2 minutes (!).
2393
%skip
2394
\begin{lstlisting}
2395
deep:~ wstein$ cd Downloads
2396
deep:Downloads wstein$ tar xf Python-3.2.tar.bz2
2397
deep:Downloads wstein$ cd Python-3.2
2398
deep:Python-3.2 wstein$ ./configure; time make -j8
2399
...
2400
real 1m18.284s
2401
user 1m59.552s
2402
sys 0m9.980s
2403
deep:Python-3.2 wstein$
2404
\end{lstlisting}
2405
2406
And now let's try it out:
2407
\begin{lstlisting}
2408
deep:Python-3.2 wstein$ ./python.exe
2409
Python 3.2 (r32:88445, Mar 30 2011, 10:20:45)
2410
[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin
2411
Type "help", "copyright", "credits" or "license" for more
2412
information.
2413
>>> 2 + 2
2414
4
2415
\end{lstlisting}
2416
2417
For fun, let's change something in the core of Python, recompile, and
2418
observe our change. On line 288 of {\tt
2419
Python-3.2/Objects/listobject.c}, I insert a line that calls the C
2420
{\sf printf} function to print out some graffiti:
2421
\begin{lstlisting}
2422
...
2423
PyObject *
2424
PyList_GetItem(PyObject *op, Py_ssize_t i)
2425
{
2426
printf("Hi Mom!\n"); /* I added this! */
2427
if (!PyList_Check(op)) {
2428
PyErr_BadInternalCall();
2429
...
2430
\end{lstlisting}
2431
2432
I then type {\tt make} again, wait a few seconds, and try out Python again:
2433
\begin{lstlisting}
2434
deep:Python-3.2 wstein$ ./python.exe
2435
Python 3.2 (r32:88445, Mar 30 2011, 10:25:56)
2436
[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin
2437
Type "help", "copyright", "credits" or "license" for more
2438
information.
2439
Hi Mom!
2440
...
2441
Hi Mom!
2442
>>> v = [1,2,3]
2443
>>> v[0]
2444
1
2445
>>> v['a']
2446
Traceback (most recent call last):
2447
File "<stdin>", line 1, in <module>
2448
Hi Mom!
2449
Hi Mom!
2450
Hi Mom!
2451
Hi Mom!
2452
Hi Mom!
2453
TypeError: list indices must be integers, not str
2454
\end{lstlisting}
2455
2456
Interestingly, the function \verb|PyList_GetItem| appears to not be
2457
called when we use an integer to access a list, but it is used when we
2458
try to access the list with anything else.
2459
2460
For more information about how the Python source code is laid out,
2461
see the README file, especially the section at the end called
2462
``Distribution structure''.
2463
2464
\chapter{Cython}\label{ch:cython}
2465
2466
Cython is a compiled variant of the Python language, that can be used
2467
to write very, very fast code for Sage, and makes it possible to
2468
efficiently use code from existing C/C++ libraries in Sage, or write
2469
part of a program in C and the other part in Sage.
2470
2471
Additional references for learning Cython include the Cython Users
2472
Guide (see \url{http://docs.cython.org/src/userguide/}) as you can,
2473
and the other Cython documentation (see \url{http://docs.cython.org/}).
2474
2475
\section{An Example: Speeding up a Simple Function}
2476
2477
Let's start with a first simple example. We write a brute force
2478
Python program that computes a double precision (53-bit) floating
2479
point approximation to
2480
$$
2481
f(n) = \sin(1) + \sin(2) + \sin(3) + \cdots + \sin(n-1) + \sin(n),
2482
$$
2483
for $n$ any positive integer.
2484
Let's start with the most naive and straightforward
2485
implementation of this:
2486
\begin{lstlisting}
2487
def python_sum_symbolic(n):
2488
return float( sum(sin(n) for n in xrange(1,n+1)) )
2489
\end{lstlisting}
2490
2491
As a sanity check for each of our implementations, we compute $f(1000)$:
2492
\begin{lstlisting}
2493
sage: python_sum_symbolic(1000)
2494
0.813969634073164
2495
\end{lstlisting}
2496
2497
Let's benchmark it (these benchmarks were done under OS X on a 2.66GHz
2498
Intel Core i7 laptop).
2499
\begin{lstlisting}
2500
sage: timeit('python_sum_symbolic(1000)')
2501
5 loops, best of 3: 193 ms per loop
2502
\end{lstlisting}
2503
2504
Next, try a bigger input. We use time instead, since that only runs
2505
the function once, and this is going to take awhile:
2506
2507
\begin{lstlisting}
2508
sage: time python_sum_symbolic(10^4)
2509
1.6338910217924467
2510
Time: CPU 15.88 s, Wall: 17.64 s
2511
\end{lstlisting}
2512
2513
2514
This is really, really, shockingly slow. By the end of this section,
2515
you'll see how to compute $f(n)$ over {\bf 17 million times} faster!
2516
2517
2518
One reason \verb|python_sum_symbolic| is so bad, is that the
2519
\verb|sin| function is a ``symbolic function'', in that it keeps
2520
track of exact values, etc. For example,
2521
\begin{lstlisting}
2522
sage: sum(sin(n) for n in xrange(1, 10+1))
2523
sin(1) + sin(2) + sin(3) + sin(4) + sin(5) + sin(6) +
2524
sin(7) + sin(8) + sin(9) + sin(10)
2525
\end{lstlisting}
2526
2527
So the \verb|python_sum_symbolic| function is computing a huge formal
2528
sum, then converting each summand to a float, and finally adding them
2529
up. This is wasteful. Our next implementation using a different sin
2530
function, one that is standard with Python, which immediately turns
2531
its input into a float (53-bit double precision number), computes
2532
\verb|sin|, and returns the result as a float.
2533
2534
\begin{lstlisting}
2535
def python_sum(n):
2536
from math import sin
2537
return sum(sin(i) for i in xrange(1, n+1))
2538
\end{lstlisting}
2539
2540
2541
How does this do?
2542
\begin{lstlisting}
2543
sage: python_sum(1000) # right answer
2544
0.8139696340731659
2545
sage: timeit('k = python_sum(1000)')
2546
625 loops, best of 3: 185 microseconds per loop
2547
sage: timeit('k = python_sum(10^4)')
2548
125 loops, best of 3: 2.07 ms per loop
2549
\end{lstlisting}
2550
Thus \verb|python_sum| is over {\em one thousand times faster}
2551
than \verb|python_sum_symbolic|!
2552
2553
\begin{lstlisting}
2554
sage: 193e-3 / 185e-6
2555
1043.24324324324
2556
sage: 15.88 / 2.07e-3
2557
7671.49758454106
2558
\end{lstlisting}
2559
2560
2561
Perhaps there is some win in not using \verb|sum|?
2562
\begin{lstlisting}
2563
def python_sum2(n):
2564
from math import sin
2565
s = float(0)
2566
for i in range(1, n+1):
2567
s += sin(i)
2568
return s
2569
\end{lstlisting}
2570
2571
Try it out: nope, no win at all (!):
2572
\begin{lstlisting}
2573
sage: timeit('k = python_sum2(10^4)')
2574
125 loops, best of 3: 2.11 ms per loop
2575
\end{lstlisting}
2576
2577
2578
Next we try using Cython to make our code faster.
2579
Note that our
2580
``rewritten'' program looks identical---the only difference so far is
2581
that we told Sage to compile the program using Cython by putting {\tt
2582
\%cython} at the beginning of the block (if you are using the
2583
command line instead of the notebook, put the code without
2584
\verb|%cython|
2585
in a file \verb|foo.pyx|, then type \verb|load foo.pyx|.)
2586
2587
%skip
2588
\begin{lstlisting}
2589
%cython
2590
def cython_sum(n):
2591
from math import sin
2592
return sum(sin(i) for i in xrange(1, n+1))
2593
\end{lstlisting}
2594
2595
If you evaluate the above code in the Sage notebook, you'll see that
2596
two linked files appear after the input cell:
2597
\begin{enumerate}
2598
\item A file that ends in .c: this is the C program that the above
2599
code got turned into. This is compiled and linked automatically
2600
into the running copy of Sage.
2601
\item A file that ends in .html: this is an annotated version of the
2602
above Cython program; double click on a line to see
2603
the corresponding C code.
2604
\end{enumerate}
2605
2606
Is the Cython program any faster?
2607
%skip
2608
\begin{lstlisting}
2609
sage: cython_sum(1000) # it works
2610
0.8139696340731659
2611
sage: timeit('cython_sum(1000)')
2612
625 loops, best of 3: 144 microseconds per loop
2613
sage: timeit('k = cython_sum(10^4)')
2614
625 loops, best of 3: 1.52 ms per loop
2615
\end{lstlisting}
2616
It's faster, but only about 30\% faster. This is very typical
2617
of what you should expect by simply putting \verb|%cython| above Python
2618
code.
2619
\begin{lstlisting}
2620
sage: 2.07/1.52
2621
1.36184210526316
2622
\end{lstlisting}
2623
2624
2625
The Cython program is not {\em that} much faster, because the computer
2626
is doing essentially the same thing in both functions.
2627
In the case of \verb|python_sum|,
2628
the Python interpreter is carrying out a sequence of operations
2629
(calling functions in the Python C library), and in the case of
2630
\verb|cython_sum|, a C program is running (the compiled Cython
2631
module), which is simply calling pretty much the same functions in the
2632
Python C library.
2633
2634
To get a further speedup, we declare certain variables to have C data
2635
types and use a C version of the sin function.
2636
2637
%skip
2638
\begin{lstlisting}
2639
%cython
2640
cdef extern from "math.h":
2641
double sin(double)
2642
2643
def cython_sum_typed_lib(long n):
2644
cdef long i
2645
return sum(sin(i) for i in range(1, n+1))
2646
\end{lstlisting}
2647
The differences are that we declared {\tt n} to be long and we added a
2648
new line \verb|cdef long i|, which declares {\tt i} to
2649
also be long. This tells Cython to treat {\tt n} and {\tt i}
2650
as being of data type {\tt long}, which is a 32 or 64-bit integer,
2651
depending on the computer you're using.
2652
This is the same as the long
2653
datatype in C.
2654
We also call the \verb|sin| function in the math C library.
2655
Let's see if this is faster.
2656
2657
\begin{lstlisting}
2658
sage: cython_sum_typed_lib(1000)
2659
0.8139696340731659
2660
sage: timeit('cython_sum_typed_lib(1000)')
2661
625 loops, best of 3: 85.2 µs per loop
2662
sage: timeit('cython_sum_typed_lib(10^4)')
2663
625 loops, best of 3: 951 µs per loop
2664
sage: 2.07e-3 / 778e-6
2665
2.66066838046272
2666
\end{lstlisting}
2667
2668
So now our coding is beating the pure Python code by a factor of nearly 3.
2669
2670
In general, you have to be more careful, e.g., {\tt long} integers
2671
{\em overflow}. They may be either 32 or 64-bit
2672
depending on the computer you are using. The following example
2673
illustrates overflow:
2674
%skip
2675
\begin{lstlisting}
2676
%cython
2677
def longmul(long a, long b):
2678
return a*b
2679
\end{lstlisting}
2680
Now let's try it:
2681
2682
%skip
2683
\begin{lstlisting}
2684
sage: longmul(2^10, 2^20)
2685
1073741824
2686
sage: longmul(2^20, 2^50) # overflows!
2687
0
2688
sage: 2^20 * 2^50
2689
1180591620717411303424
2690
\end{lstlisting}
2691
2692
2693
For our next optimization, we use a for loop instead of the
2694
sum command:
2695
\begin{lstlisting}
2696
%cython
2697
cdef extern from "math.h":
2698
double sin(double)
2699
2700
def cython_sum_typed_lib_loop(long n):
2701
cdef long i
2702
cdef double s = 0
2703
for i in range(1, n+1):
2704
s += sin(i)
2705
return s
2706
\end{lstlisting}
2707
In Cython, this is really worth it.
2708
\begin{lstlisting}
2709
sage: cython_sum_typed_lib_loop(1000)
2710
0.8139696340731659
2711
sage: timeit('cython_sum_typed_lib_loop(10^4)')
2712
625 loops, best of 3: 298 µs per loop
2713
sage: 2.07e-3 / 298e-6
2714
6.94630872483221
2715
\end{lstlisting}
2716
We thus obtain a speedup of a factor of about 7 by switching
2717
from Python to Cython, and implementing exactly the same algorithm.
2718
Way under the hood, the same \verb|sin| function (providing by the
2719
math library on the operating system) is being called, but
2720
the Cython version of the function avoids a lot of overhead.
2721
2722
2723
2724
Another approach to this particular numerical problem is to use the
2725
numpy library, which allows one to evaluate a function on all entries
2726
in an array, and sum the entries. This is called ``vectorization''.
2727
Here's what we get in this particular example:
2728
2729
\begin{lstlisting}
2730
def sum_numpy(n):
2731
import numpy
2732
return sum(numpy.sin(numpy.array(range(1, n+1))))
2733
\end{lstlisting}
2734
2735
Let's try it out:
2736
\begin{lstlisting}
2737
sage: sum_numpy(1000)
2738
0.81396963407316592
2739
sage: timeit('sum_numpy(10^4)')
2740
625 loops, best of 3: 1.33 ms per loop
2741
sage: 2.07e-3 / 1.33e-3
2742
1.55639097744361
2743
\end{lstlisting}
2744
So for this benchmark, our Numpy implementation is slightly better than
2745
pure Python, but Cython is still much faster.
2746
2747
Finally, we try a different algorithm, both using Python and Cython.
2748
The symbolic capabilities of Sage can be used to find closed
2749
form expressions for certain formal sums.
2750
\begin{lstlisting}
2751
sage: var('i, n')
2752
sage: f = sum(sin(i), i, 1, n).full_simplify(); f
2753
1/2*((cos(1) - 1)*sin(n*arctan(sin(1)/cos(1))) +
2754
sin(1)*cos(n*arctan(sin(1)/cos(1))) - sin(1))/(cos(1) - 1)
2755
\end{lstlisting}
2756
Thus (and I had no idea before trying this),
2757
$$
2758
f(n) = \frac{{\left(\cos\left(1\right) - 1\right)} \sin\left(n \arctan\left(\frac{\sin\left(1\right)}{\cos\left(1\right)}\right)\right) + \sin\left(1\right) \cos\left(n \arctan\left(\frac{\sin\left(1\right)}{\cos\left(1\right)}\right)\right) - \sin\left(1\right)}{2 \, {\left(\cos\left(1\right) - 1\right)}}
2759
$$
2760
2761
Using this, we can give an algorithm to compute this sum that is much
2762
faster than anything above, and scales much better as well
2763
to larger $n$.
2764
\begin{lstlisting}
2765
def sum_formula(n):
2766
from math import sin, cos, atan as arctan
2767
return (1/2*((cos(1) - 1)*sin(n*arctan(sin(1)/cos(1))) +
2768
sin(1)*cos(n*arctan(sin(1)/cos(1))) - sin(1))/(cos(1) - 1))
2769
\end{lstlisting}
2770
How does it do?
2771
\begin{lstlisting}
2772
sage: sum_formula(1000)
2773
0.8139696340731664
2774
sage: timeit('sum_formula(10^4)')
2775
625 loops, best of 3: 36.5 µs per loop
2776
sage: 2.07e-3 / 36.5e-6
2777
56.7123287671233
2778
\end{lstlisting}
2779
Finally, we implement the closed formula, but instead in Cython.
2780
2781
\begin{lstlisting}
2782
%cython
2783
2784
cdef extern from "math.h":
2785
double sin(double)
2786
double cos(double)
2787
double atan(double)
2788
2789
def sum_formula_cython(double n):
2790
return (.5*((cos(1) - 1)*sin(n*atan(sin(1)/cos(1))) +
2791
sin(1)*cos(n*atan(sin(1)/cos(1))) - sin(1))/(cos(1) - 1))
2792
\end{lstlisting}
2793
2794
This is 40 times faster than our Python implementation!
2795
\begin{lstlisting}
2796
sage: sum_formula_cython(1000)
2797
0.8139696340731664
2798
sage: timeit('sum_formula_cython(10^4)')
2799
625 loops, best of 3: 906 ns per loop
2800
sage: 36.5e-6 / 906e-9
2801
40.2869757174393
2802
\end{lstlisting}
2803
And it is over 2000 times faster than \verb|python_sum|.
2804
2805
\begin{lstlisting}
2806
sage: 2.07e-3 / 906e-9
2807
2284.76821192053
2808
\end{lstlisting}
2809
It is a whopping {\bf 17 million times} faster than our first attempt!
2810
\begin{lstlisting}
2811
sagre: 15.88 / 906e-9
2812
1.75275938189845e7
2813
\end{lstlisting}
2814
2815
2816
2817
\section{Using External C/C++ Code}
2818
2819
Cython is absolutely critical to the design of Sage, and potentially
2820
very important to your own work, because it makes it possible to
2821
efficiently make use of data types and functions defined in any C/C++
2822
library. Since there is an enormous amount of useful, fast, debugged
2823
C/C++ code out there, Cython gives your Sage and Python programs
2824
access to vast amounts of useful capabilities. Also, when used
2825
correctly, there is no overhead in calling out to the C libraries,
2826
unlike the situation with SWIG, ctypes, and many other approaches
2827
to writing C library wrappers.
2828
2829
\subsection{Simple random example}
2830
Here's a first simple example. Type {\tt man random} on the command line
2831
(or Google it) to find out about the random C library function:
2832
\begin{lstlisting}
2833
RANDOM(3) BSD Library Functions Manual RANDOM(3)
2834
2835
NAME
2836
initstate, random, setstate, srandom, srandomdev -- better
2837
random number generator; routines for changing generators
2838
2839
LIBRARY
2840
Standard C Library (libc, -lc)
2841
2842
SYNOPSIS
2843
#include <stdlib.h>
2844
2845
char *
2846
initstate(unsigned seed, char *state, size_t size);
2847
2848
long
2849
random(void);
2850
...
2851
\end{lstlisting}
2852
2853
Despite {\tt random} being a function defined in the standard C
2854
library, we can still call it from Cython, as follows:
2855
%skip
2856
\begin{lstlisting}
2857
%cython
2858
cdef extern from "stdlib.h": # (1)
2859
long random() # (2)
2860
2861
def random_nums(int n): # (3)
2862
cdef int i # (4)
2863
v = [random() for i in range(n)] # (5)
2864
return v
2865
\end{lstlisting}
2866
2867
Let's try it out:
2868
%skip
2869
\begin{lstlisting}
2870
sage: random_nums(5)
2871
[1315705257, 1147455227, 1571270137, 1106977565, 1805149207]
2872
sage: timeit('v = random_nums(10^5)')
2873
125 loops, best of 3: 5.56 ms per loop
2874
\end{lstlisting}
2875
2876
It's interesting to see how this compares to pure Python. Here's the same program in Python:
2877
%skip
2878
\begin{lstlisting}
2879
%python
2880
import random
2881
k = 2**31-1
2882
def py_random_nums(n):
2883
return [random.randint(0,k) for i in range(n)]
2884
\end{lstlisting}
2885
So the speedup is by a factor of nearly 50:
2886
%skip
2887
\begin{lstlisting}
2888
sage: py_random_nums(5)
2889
[317567506, 1289482476, 1766134327, 1216261810, 1427493671]
2890
sage: timeit('v = random_nums(10^5)')
2891
5 loops, best of 3: 251 ms per loop
2892
sage: 251/5.56
2893
45.1438848920863
2894
\end{lstlisting}
2895
2896
Finally we explain the above code line by line. (TODO)
2897
2898
2899
2900
\subsection{Adding rational numbers using MPIR}
2901
2902
We next consider a more mathematical example: arithmetic with
2903
arbitrary precision rational numbers. The MPIR C library (which is
2904
included with Sage, but can also be downloaded separately for free for
2905
any standard operating system from \url{http://mpir.org/}) provides
2906
highly optimized arithmetic with arbitrary precision integers and
2907
rational numbers.\footnote{MPIR and GMP \url{http://gmplib.org/} are
2908
basically the same for our discussion; technically they are
2909
``forks'' of each other, but export essentially the same functions.}
2910
We could make use of MPIR by reading the documentation for MPIR and
2911
using {\tt cdef extern} as above. Fortunately, all of the necessary
2912
{\tt cdef extern} declarations needed to use MPIR are already declared
2913
in Sage. You can view all the declarations from the notebook by
2914
navigating to {\tt <url of notebook server>/src/libs/gmp}.
2915
2916
Let's use MPIR directly to create two rational numbers and add them
2917
together. The code below is complicated and illustrates many issues
2918
and techniques, so we will explain it in great depth. Once you
2919
understand this, you can deal with many issues that will come up with
2920
Cython.
2921
2922
%skip
2923
\begin{lstlisting}
2924
%cython
2925
from sage.libs.gmp.all cimport * # (1)
2926
def add_rationals(bytes a, bytes b): # (2)
2927
cdef mpq_t x, y, z # (3)
2928
mpq_init(x); mpq_init(y); mpq_init(z) # (4)
2929
mpq_set_str(x, a, 10) # base 10 string # (5)
2930
mpq_set_str(y, b, 10)
2931
mpq_add(z, x, y) # (6)
2932
cdef int n = (mpz_sizeinbase (mpq_numref(z), 10) # (7)
2933
+ mpz_sizeinbase (mpq_denref(z), 10) + 3)
2934
cdef char* s = <char*>sage_malloc(sizeof(char)*n) # (8)
2935
if not s: raise MemoryError # (9)
2936
cdef bytes c = mpq_get_str(s, 10, z) # (10)
2937
mpq_clear(x); mpq_clear(y); mpq_clear(z) # (11)
2938
sage_free(s) # (12)
2939
return c
2940
\end{lstlisting}
2941
2942
Now let's try it out:
2943
%skip
2944
\begin{lstlisting}
2945
sage: add_rationals('2/3', '-5/21')
2946
'3/7'
2947
sage: 2/3 - 5/21
2948
3/7
2949
sage: add_rationals('1/29048203984092834823049',
2950
'-394/29302938402384092834')
2951
'-11444963066794174536188472/851197732045760533660225724673976778930866'
2952
\end{lstlisting}
2953
2954
Timings suggest we didn't mess up:
2955
%skip
2956
\begin{lstlisting}
2957
sage: timeit("add_rationals('2/3', '-5/21')")
2958
625 loops, best of 3: 1.29 µs per loop
2959
sage: timeit('2/3 - 5/21')
2960
625 loops, best of 3: 2.16 µs per loop
2961
\end{lstlisting}
2962
Here's a simplistic check that we probably didn't screw up and
2963
introduce any memory leaks. (Go up to the code and comment out some
2964
frees to see how this changes.)
2965
2966
%skip
2967
\begin{lstlisting}
2968
sage: print get_memory_usage()
2969
sage: timeit("add_rationals('9012038409238411/13',
2970
'-4/9082309482309487')",number=10^6)
2971
sage: get_memory_usage()
2972
917.5625
2973
1000000 loops, best of 3: 1.72 µs per loop
2974
917.5625
2975
\end{lstlisting}
2976
2977
Finally, we will go line by line through the code and explain exactly
2978
what is going on and why. TODO
2979
2980
2981
\section{Important Cython Language Constructions}
2982
In this section we systematically go through {\em the most important}
2983
standard Cython language constructions. We will not talk about using
2984
numpy from Cython, dynamic memory allocation, or subtleties of the C
2985
language in this section. Instead we cover declaring and using cdef'd
2986
variables, explicit type casts, declaring external data types and
2987
functions, defining new Cython cdef'd functions, and declaring new
2988
Cython cdef'd classes that can have C attributes.
2989
2990
\subsection{Declaring Cython Variables Using {\tt cdef}}
2991
\begin{lstlisting}
2992
cdef type_name variable_name1, variable_name2, ...
2993
\end{lstlisting}
2994
The single most important statement that Cython adds to Python is
2995
\begin{lstlisting}
2996
cdef type_name
2997
\end{lstlisting}
2998
This allows you to declare a variable to have a type. The possibilities for the type include:
2999
\begin{itemize}
3000
\item C data type: int, float, double, char. Each can be modified by: short, long, signed, unsigned.
3001
\item Certain Python types, including: list, dict, str, object (=Python object), etc.
3002
\item Name of a known {\tt cdef class} (see below). You may have to cimport the class.
3003
\item More complicated C/C++ data types: struct, C++ class, typedef, etc., that have been declared using some other method described below.
3004
\end{itemize}
3005
3006
\begin{lstlisting}
3007
%cython
3008
def C_type_example():
3009
# ^ = exclusive or -- no preparsering in Cython!
3010
cdef int n=5/3, x=2^3
3011
cdef long int m=908230948239489394
3012
cdef float y=4.5969
3013
cdef double z=2.13
3014
cdef char c='c'
3015
cdef char* s="a C string"
3016
print n, x, m, y, z, c, s
3017
\end{lstlisting}
3018
When we run the above function, we get the following. Note the lack
3019
of preparsing, and that the char variable {\tt c} is treated as a
3020
number.
3021
\begin{lstlisting}
3022
sage: C_type_example()
3023
1 1 908230948239489394 4.59689998627 2.13 99 a C string
3024
\end{lstlisting}
3025
3026
\begin{lstlisting}
3027
%cython
3028
def type_example2(x, y):
3029
cdef list v
3030
cdef dict z
3031
v = x
3032
z = y
3033
\end{lstlisting}
3034
3035
\begin{lstlisting}
3036
sage: type_example2([1,2], {'a':5})
3037
sage: type_example2(17, {'a':5})
3038
Traceback (most recent call last):
3039
...
3040
TypeError: Expected list, got sage.rings.integer.Integer
3041
sage: type_example2([1,2], 17)
3042
Traceback (most recent call last):
3043
...
3044
TypeError: Expected dict, got sage.rings.integer.Integer
3045
\end{lstlisting}
3046
3047
3048
We can also define a new Cython cdef'd class and use that type in cdef:
3049
\begin{lstlisting}
3050
%cython
3051
cdef class MyNumber:
3052
cdef int n
3053
def __init__(self, k):
3054
self.n = k
3055
def __repr__(self):
3056
return repr(self.n)
3057
3058
def type_example3(MyNumber z):
3059
cdef MyNumber w = MyNumber(5)
3060
print z, w
3061
\end{lstlisting}
3062
Use it:
3063
3064
\begin{lstlisting}
3065
sage: type_example3(MyNumber(10))
3066
10 5
3067
sage: type_example3(10)
3068
Traceback (most recent call last):
3069
...
3070
TypeError: Argument 'z' has incorrect type...
3071
\end{lstlisting}
3072
3073
Terrifying caveat!: Any type can also be {\tt None}, which will cause
3074
horrible segfaults. Use \verb|not None| to deal with this.
3075
3076
\begin{lstlisting}
3077
sage: type_example3(None) # this could be very bad
3078
None 5
3079
\end{lstlisting}
3080
3081
This is how to be safe:
3082
\begin{lstlisting}
3083
%cython
3084
cdef class MyNumber:
3085
cdef int n
3086
def __init__(self, k):
3087
self.n = k
3088
def __repr__(self):
3089
return repr(self.n)
3090
3091
def type_example4(MyNumber z not None):
3092
cdef MyNumber w = MyNumber(5)
3093
print z, w
3094
\end{lstlisting}
3095
3096
Now None is rejected:
3097
\begin{lstlisting}
3098
sage: type_example4(MyNumber(4))
3099
4 5
3100
sage: type_example4(None)
3101
Traceback (most recent call last):
3102
...
3103
TypeError: Argument 'z' has incorrect type ...
3104
\end{lstlisting}
3105
3106
3107
For the Cython source code of Sage integers, in the Sage library see
3108
\verb|rings/integer.pxd| and \verb|rings/integer.pyx|. Also, browse
3109
\verb|libs/gmp/| for the definition of functions such as
3110
\verb|mpz_set| below.
3111
3112
\begin{lstlisting}
3113
%cython
3114
from sage.rings.integer cimport Integer # note the cimport!
3115
def unsafe_mutate(Integer n, Integer m):
3116
mpz_set(n.value, m.value)
3117
\end{lstlisting}
3118
3119
\begin{lstlisting}
3120
sage: n = 15
3121
sage: print n, id(n)
3122
15 54852752
3123
sage: unsafe_mutate(n, 2011)
3124
sage: print n, id(n)
3125
2011 54852752
3126
\end{lstlisting}
3127
3128
3129
\subsection{Explicit casts}
3130
\begin{lstlisting}
3131
<data_type> foo
3132
\end{lstlisting}
3133
3134
If you need to force the compiler to treat a variable of one data type
3135
as another, you have to use an explicit cast. In Java and C/C++ you
3136
would use parenthesis around a type name, as follows:
3137
3138
\begin{lstlisting}
3139
int i = 1;
3140
long j = 3;
3141
i = (int)j;
3142
\end{lstlisting}
3143
3144
In Cython, you use angle brackets (note: in Cython this particular
3145
cast isn't strictly necessary to get the code to compile, but in Java
3146
it is):
3147
3148
\begin{lstlisting}
3149
%cython
3150
cdef int i = 1
3151
cdef long j = 3
3152
i = <int> j
3153
print i
3154
\end{lstlisting}
3155
3156
Here's an example where we convert a Python string to a char* (i.e., a
3157
pointer to an array of characters), then change one of the characters,
3158
thus mutating an immutable string.
3159
3160
\begin{lstlisting}
3161
%cython
3162
def unsafe_mutate_str(bytes s, n, c):
3163
cdef char* t = <char*>s
3164
t[n] = ord(c)
3165
\end{lstlisting}
3166
Try it out:
3167
\begin{lstlisting}
3168
sage: s = 'This is an immutable string.'
3169
sage: print s, id(s), hash(s)
3170
This is an immutable string. 72268152 -5654925717092887818
3171
sage: unsafe_mutate_str(s, 9, ' ')
3172
sage: unsafe_mutate_str(s, 11, ' ')
3173
sage: unsafe_mutate_str(s, 12, ' ')
3174
print s, id(s), hash(s)
3175
This is a mutable string. 72268152 -5654925717092887818
3176
sage: hash('This is a mutable string.')
3177
-7476166060485806082
3178
\end{lstlisting}
3179
3180
\subsection{Declaring External Data Types and Functions}
3181
3182
In order for Cython to make use of a function or data type defined in
3183
external C/C++ library, Cython has to {\em explicitly} be told what
3184
the input and output types are for that function and what the function
3185
should be called. Cython will then generate appropriate C/C++ code
3186
and conversions based on these assumptions. There are a large number
3187
of files in Sage and Cython itself that declare all the functions
3188
provided by various standard libraries, but sometimes you want to make
3189
use of a function defined elsewhere, e.g., in your own C/C++ library,
3190
so you have to declare things yourself. The purpose of the following
3191
examples is to illustrate how to do this. It is also extremely useful
3192
to look at the Sage library source code for thousands of additional
3193
nontrivial working examples.
3194
\begin{lstlisting}
3195
cdef extern from "filename.h":
3196
declarations ...
3197
\end{lstlisting}
3198
3199
The following examples illustrates several different possible
3200
declarations. We'll describe each line in detail. This first example
3201
declares a single type of round function on doubles -- it's as
3202
straightforward as it gets.
3203
\begin{lstlisting}
3204
%cython
3205
cdef extern from "math.h":
3206
double round(double)
3207
3208
def f(double n):
3209
return round(n)
3210
\end{lstlisting}
3211
Try it out:
3212
\begin{lstlisting}
3213
sage: f(10.53595)
3214
11.0
3215
\end{lstlisting}
3216
3217
Now suppose we want a version of round that returns a long. By
3218
consulting the man page for round, we find that there is a round
3219
function declared as follows:
3220
\begin{lstlisting}
3221
long int lround(double x);
3222
\end{lstlisting}
3223
We can declare it exactly like the above, or we can use a C ``name
3224
specifier'', which let's us tell Cython we want to call the function
3225
{\tt round} in our Cython code, but when Cython generates code it should
3226
actually emit {\tt lround}. This is what we do below.
3227
3228
\begin{lstlisting}
3229
%cython
3230
cdef extern from "math.h":
3231
long int round "lround"(double)
3232
3233
def f(double n):
3234
return round(n)
3235
\end{lstlisting}
3236
3237
\begin{lstlisting}
3238
sage: f(10.53595)
3239
11
3240
\end{lstlisting}
3241
3242
Another case when using C name specifiers is useful if you want to be
3243
able to call both a C library version of a function and a builtin
3244
Python function with the same name.
3245
3246
\begin{lstlisting}
3247
%cython
3248
cdef extern from "stdlib.h":
3249
int c_abs "abs"(int i)
3250
3251
def myabs(n):
3252
print abs(n)
3253
print c_abs(n)
3254
\end{lstlisting}
3255
Now use it:
3256
\begin{lstlisting}
3257
sage: myabs(-10)
3258
10
3259
10
3260
\end{lstlisting}
3261
3262
We can also declare data types and variables using {\tt cdef extern}.
3263
To write the code below, I used the {\tt man} command on my computer
3264
several times on each referenced function. I knew the relevant
3265
functions because I read a book on the C programming language when I
3266
was a freshman; learning the basics of the C programming language and
3267
standard libraries is a very good idea if you want to be able to make
3268
effective use of Cython... or computers in general, since most systems
3269
programming is done in C.
3270
3271
Coming up with the declarations below is a little bit of an art form,
3272
in that they are not exactly what is given from the man pages, though
3273
they are close. Just realize that the declarations you give here do
3274
exactly one thing: they inform Cython about what C code it should
3275
generate, e.g., it will convert the