Contact
CoCalc Logo Icon
StoreFeaturesDocsShareSupport News AboutSign UpSign In
| Download
Views: 45909
Kernel: Python 3

Derived-term Automata for Extended Weighted Rational Expressions

This page is a complement to Derived-term Automata for Extended Weighted Rational Expressions presented at ICTAC 2016. This page exists in several forms:

More information is available here:

You may change the cells, and run then. To run a cell in a notebook, hit "Control-Enter" (in which case the focus stays on the same cell) or "Shift-Enter" (focus goes to the next cell). Beware that depending on the requested operations, Vcsn may generate and compile code, which may be a really slow process on small machines (about a minute): be patient! However, the code is compiled only once: successive uses will be way faster.

To run all the cells anew, select "Restart & Run All" in the "Kernel" menu above.

import vcsn

Valid Expressions

Beware that the so-called ''context'' (which, in Vcsn parlance denotes the type of labels (letters or words, etc.) and the semiring of weights) has an extreme importance. In particular, an expression might be valid in a given context, and invalid in another.

As an example, a{a^*}^* is valid in B\mathbb{B}, so we can build its derived-term automaton.

e = vcsn.B.expression('a**'); e.is_valid()
True
e.derived_term()
Image in a Jupyter notebook

However, the construction will fail in Q\mathbb{Q}, since the constant term for aa^* is 1, which is not starrable in Q\mathbb{Q} (indeed, 1=kN1k1^* = \sum_{k \in \mathbb{N}} 1^k is not a value in Q\mathbb{Q}).

e = vcsn.Q.expression('a**'); e.is_valid()
False
try: e.derived_term() except RuntimeError as err: print("error:", err)
error: Q: value is not starrable: 1 while computing expansion of: a** while computing derived-term of: a**

Example E3\mathsf{E}_3: A Lex-like Scanner

Desugaring

We introduce the "context" (i.e., the automaton type) that corresponds to labels that are letters (lal_char) and weights that are rational numbers (q).

Q = vcsn.context('lal_char(ab), q') Q

{a,b}Q\{a, b\}\to\mathbb{Q}

Vcsn supports the <+ operator. It is desugared in a combination of conjuction and complement.

e3 = Q.expression('<2>(ab) <+ <3>[ab]{+}') e3

2(ab)+(ab)c&3((a+b)(a+b)) \left\langle 2 \right\rangle \,\left(a \, b\right) + \left(a \, b\right)^{c} \& \left\langle 3 \right\rangle \,\left(\left(a + b\right) \, \left(a + b\right)^{*}\right)

The Derived-Term Automaton

The expansion and the derived-term automaton of E3\mathsf{E}_3 follow.

e3.expansion()

a[2b3bc&(a+b)]b[3(a+b)]a \odot \left[\left\langle 2\right\rangle b \oplus \left\langle 3\right\rangle {b}^{c} \& \left(a + b\right)^{*}\right] \oplus b \odot \left[\left\langle 3\right\rangle \left(a + b\right)^{*}\right]

e3.derived_term()
Image in a Jupyter notebook

Example E1\mathsf{E}_1: A Simple Expression

Polynomials and Derivatives

Z = vcsn.context('lal_char, z') e1 = Z.expression('<5>\e + <2>ace + <6>bce + <4>ade + <3>bde') e1

5ε+2(ace)+4(ade)+6(bce)+3(bde) \left\langle 5 \right\rangle \,\varepsilon + \left\langle 2 \right\rangle \,\left(a \, c \, e\right) + \left\langle 4 \right\rangle \,\left(a \, d \, e\right) + \left\langle 6 \right\rangle \,\left(b \, c \, e\right) + \left\langle 3 \right\rangle \,\left(b \, d \, e\right)

The derivatives of E1\mathsf{E}_1 with respect to aa and bb are the following polynomials.

e1.derivation('a')

2ce4de\left\langle 2\right\rangle c \, e \oplus \left\langle 4\right\rangle d \, e

e1.derivation('b')

6ce3de\left\langle 6\right\rangle c \, e \oplus \left\langle 3\right\rangle d \, e

The expansion of E1\mathsf{E}_1 is as follows. The notation is slightly different from the paper, where for higher clarity the weights in the polynomials are separated from the expressions by a \otimes. In the implementation, since the weight is always displayed, this separation is useless, and not put, for conciseness.

e1.expansion()

5a[2ce4de]b[6ce3de]\left\langle 5\right\rangle \oplus a \odot \left[\left\langle 2\right\rangle c \, e \oplus \left\langle 4\right\rangle d \, e\right] \oplus b \odot \left[\left\langle 6\right\rangle c \, e \oplus \left\langle 3\right\rangle d \, e\right]

The Derived-Term Automaton

The derived-term automaton can be computed using derivations or expansions.

e1.derived_term('expansion')
Image in a Jupyter notebook
e1.derived_term('derivation')
Image in a Jupyter notebook

The Deterministic Derived-Term Automaton

e1.derived_term(deterministic=True)
Image in a Jupyter notebook

Section 4.2: An Infinite Automaton

There exists no finite weighted deterministic automaton for the following expression.

e = Q.expression('a* + (<2>a)*') e

a+(2a){a}^{*} + \left( \left\langle 2 \right\rangle \,a\right)^{*}

e.derived_term()
Image in a Jupyter notebook
a = e.derived_term(deterministic=True, lazy=True) a
Image in a Jupyter notebook

Repeated evaluations uncover parts of this automaton.

print(a('')); a
2
Image in a Jupyter notebook
print(a('aa')); a
5
Image in a Jupyter notebook
print(a('aaaa')); a
17
Image in a Jupyter notebook

Additional Operators

The following automata demonstrate the support of the shuffle operator (denoted ':' in ASCII, and '\between' in LaTeX\LaTeX), and the infiltrate operator ('&:' and '\uparrow'). See also the documentation of the corresponding operations on automata: automaton.shuffle and automaton.infiltrate.

exp = vcsn.context('lal, q').expression e1 = exp('<2>a<3>a : <5>a<6>a', 'trivial') e1

2a3a5a6a \left\langle 2 \right\rangle \,a \, \left\langle 3 \right\rangle \,a \between \left\langle 5 \right\rangle \,a \, \left\langle 6 \right\rangle \,a

The 'trivial' argument above restricts the identities to the trivial ones, otherwise additional identities would defeat our attempt to "taint" the as below:

exp('<2>a<3>a : <5>a<6>a')

6(aa)30(aa) \left\langle 6 \right\rangle \,\left(a \, a\right) \between \left\langle 30 \right\rangle \,\left(a \, a\right)

e1.derived_term()
Image in a Jupyter notebook
e2 = exp('<2>a<3>a &: <5>a<6>a', 'trivial') e2

2a3a5a6a \left\langle 2 \right\rangle \,a \, \left\langle 3 \right\rangle \,a \uparrow \left\langle 5 \right\rangle \,a \, \left\langle 6 \right\rangle \,a

e2.derived_term()
Image in a Jupyter notebook

Broken Derived Terms

"Breaking" variants "split" the polynomials at each step. In short, it means that no state will be labeled by an addition: rather the addition is split into several states. As a consequence, the automaton might have several initial states.

e3 = exp('[ab]{2}') e3.derived_term()
Image in a Jupyter notebook
e3.derived_term(breaking=True)
Image in a Jupyter notebook