Contact
CoCalc Logo Icon
StoreFeaturesDocsShareSupport News AboutSign UpSign In
| Download
Views: 45900
Kernel: Python 3

Contexts

Contexts are central concept of Vcsn: they denote typing information about automata, rational expressions, etc. This information is alike a function type: an input type (the label), and an output type (the weight).

Contexts are created by the vcsn.context function which takes a string as input. This string follows the following syntax:

<context> ::= <labelset> , <weightset>

i.e., a context name is composed of a labelset name, then a comma, then a weightset name.

Labelsets

Different LabelSets model multiple variations on labels, members of a monoid:

  • letterset< genset >
    Fully defined by an alphabet AA, its labels being just letters. It is simply denoted by AA. It corresponds to the usual definition of an NFA.

  • nullableset< labelset >
    Denoted by A?A^?, also defined by an alphabet AA, its labels being either letters or the empty word. This corresponds to what is often called ε\varepsilon-NFAs.

  • wordset< genset >
    Denoted by AA^*, also defined by an alphabet AA, its labels being (possibly empty) words on this alphabet.

  • oneset
    Denoted by {1}\{1\}, containing a single label: 1, the empty word.

  • tupleset< labelset1 , labelset2 , ..., labelsetn >
    Cartesian product of LabelSets, L1××LnL_1 \times \cdots \times L_n. This type implements the concept of transducers with an arbitrary number of "tapes". The concept is developed more in-depth here: Transducers.

Gensets

The gensets define the types of the letters, and sets of the valid letters. There is currently a single genset type.

  • char_letters
    Specify that the letters are implemented as char. Any char will be accepted. The genset is said to be "open".

  • char_letters(abc...)
    Specify that the letters are implemented as char, and the genset is closed to {a, b, c}. Any other char will be rejected.

Abbreviations for Labelsets

There are a few abbreviations that are accepted.

  • lal_char: letterset<char_letters>

  • lal_char(abc): letterset<char_letters(abc)>

  • lan_char: nullableset<letterset<char_letters>>

  • law_char: wordset<letterset<char_letters>>

Weightsets

The WeightSets define the semiring of the weights. Builtin weights include:

  • b
    The classical Booleans: B,,,,\langle \mathbb{B}, \vee, \wedge, \bot, \top \rangle

  • z
    The integers coded as ints: Z,+,×,0,1\langle \mathbb{Z}, +, \times, 0, 1 \rangle

  • q
    The rationals, coded as pairs of ints: Q,+,×,0,1\langle \mathbb{Q}, +, \times, 0, 1 \rangle

  • qmp
    The rationals, with support for multiprecision: Qmp,+,×,0,1\langle \mathbb{Q}_\text{mp}, +, \times, 0, 1 \rangle

  • r
    The reals, coded as doubles: R,+,×,0,1\langle \mathbb{R}, +, \times, 0, 1 \rangle

  • nmin
    The tropical semiring, coded as unsigned ints: N{},min,+,,0\langle \mathbb{N} \cup \{\infty\}, \min, +, \infty, 0 \rangle

  • zmin
    The tropical semiring, coded as ints: Z{},min,+,,0\langle \mathbb{Z} \cup \{\infty\}, \min, +, \infty, 0 \rangle

  • rmin
    The tropical semiring, coded as floatss: R{},min,+,,0\langle \mathbb{R} \cup \{\infty\}, \min, +, \infty, 0 \rangle

  • log
    The log semiring, coded as doubles: R{,+},log,+,+,0\langle \mathbb{R} \cup \{-\infty, +\infty\}, \oplus_\mathrm{log}, +, +\infty, 0 \rangle (where log\oplus_\mathrm{log} denotes x,ylog(exp(x)+exp(y))x, y \rightarrow - \mathrm{log}(\exp(-x) + \exp(-y)).

  • f2
    The field: F2,,,0,1\langle \mathbb{F}_2, \oplus, \wedge, 0, 1 \rangle (where \oplus denotes the "exclusive or").

  • tupleset
    Cartesian product of WeightSets, W1××WnW_1 \times \cdots \times W_n.

Examples

The usual framework for automaton is to use letters as labels, and Booleans as weights:

import vcsn vcsn.context('lal<char(abc)>, b')

{a,b,c}B\{a, b, c\}\to\mathbb{B}

If instead of a simple accepter that returns "yes" or "no", you want to compute an integer, work in Z\mathbb{Z}:

vcsn.context('lal<char(abc)>, z')

{a,b,c}Z\{a, b, c\}\to\mathbb{Z}

To use words on the usual alphabet as labels:

vcsn.context('law<char(a-z)>, z')

{a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y,z}Z\{a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p, q, r, s, t, u, v, w, x, y, z\}^*\to\mathbb{Z}

kk-tape Automata

To create a "classical" two-tape automaton:

vcsn.context('lat<lal<char(a-f)>, lal<char(A-F)>>, b')

{a,b,c,d,e,f}×{A,B,C,D,E,F}B\{a, b, c, d, e, f\} \times \{A, B, C, D, E, F\}\to\mathbb{B}

Multiple Weights

To compute a Boolean and an integer:

vcsn.context('lal<char(ab)>, lat<b, z>')

{a,b}B×Z\{a, b\}\to\mathbb{B} \times \mathbb{Z}

The following automaton is almost able to recognize anbna^nb^n: it accepts only words of anbma^nb^m (aka aba^*b^*) and return (n,m)(n, m). One still has to check that n=mn = m.

zmin2 = vcsn.context('lal<char(ab)>, lat<zmin, zmin>') zmin2

{a,b}Zmin×Zmin\{a, b\}\to\mathbb{Z}_{\text{min}} \times \mathbb{Z}_{\text{min}}

ab = zmin2.expression('(<1,0>a)*(<0,0>b)* & (<0,0>a)*(<0,1>b)*') ab

(1,0a)b&a(0,1b)\left( \left\langle 1,0 \right\rangle \,a\right)^{*} \, {b}^{*} \& {a}^{*} \, \left( \left\langle 0,1 \right\rangle \,b\right)^{*}

a = ab.automaton() a
Image in a Jupyter notebook
print(a.shortest(len = 4).format('list'))
<0,0>\e <1,0>a <0,1>b <2,0>aa <1,1>ab <0,2>bb <3,0>aaa <2,1>aab <1,2>abb <0,3>bbb <4,0>aaaa <3,1>aaab <2,2>aabb <1,3>abbb <0,4>bbbb

Boss

The interpretation of the following monster is left as an exercise for the reader:

vcsn.context('''nullableset< lat< lal<char(ba)>, lat< lan<char(vu)>, law<char(x-z)> > > > , lat<expressionset<nullableset<lat<lan<char(fe)>, lan<char(hg)>>>, lat<r, q>>, lat<b, z> > ''')

({a,b}×({u,v})?×{x,y,z})?RatE[({e,f})?×({g,h})?R×Q]×B×Z(\{a, b\} \times (\{u, v\})^? \times \{x, y, z\}^*)^?\to\mathsf{RatE}[(\{e, f\})^? \times (\{g, h\})^?\to\mathbb{R} \times \mathbb{Q}] \times \mathbb{B} \times \mathbb{Z}