Contact
CoCalc Logo Icon
StoreFeaturesDocsShareSupport News AboutSign UpSign In
| Download
Views: 45903
Kernel: Python 3

Questions taken from Stackoverflow

This page lists questions about automata and other rational/regular expressions that were asked on Stackoverflow, and where Vcsn seems to be an appropriate tool to compute the answer.

import os # This trick ensures that we always use the same random seed, # hence running this documentation always gives the same result. os.environ['VCSN_SEED'] = '1'

Build a Regular Expression and Finite Automata

The set of all strings beginning with 101 and ending with 01010.

First, let's define our "context": we work with "labels are letters" (lal), on the alphabet {0,1}\{0, 1\}. We don't use weights, or rather, we use the traditional Boolean weights: B\mathbb{B}.

import vcsn c = vcsn.context('lal_char(01), b') c

{0,1}→B\{0, 1\}\to\mathbb{B}

Then, we build our expression using an unusual operator: E&F\mathsf{E} \& \mathsf{F} denotes the conjunction of expressions E\mathsf{E} and F\mathsf{F}. In this case (unweighted/Boolean automata), it denotes exactly the intersection of languages.

e = c.expression('(101[01]*)&([01]*01010)') e

1 0 1 (0+1)∗&(0+1)∗ 0 1 0 1 01 \, 0 \, 1 \, \left(0 + 1\right)^{*} \& \left(0 + 1\right)^{*} \, 0 \, 1 \, 0 \, 1 \, 0

We want to normalize this extended expression (it has conjunction and complement operators) into a basic expression. To this end, we first convert it to an automaton.

a = e.automaton() a
Image in a Jupyter notebook

and then we convert this automaton into a basic expression:

a.expression()

1 (0 1+0 1 (0+1)∗ 0 1) 0 1 01 \, \left(0 \, 1 + 0 \, 1 \, \left(0 + 1\right)^{*} \, 0 \, 1\right) \, 0 \, 1 \, 0

Or, in ASCII:

print(a.expression())
1(01+01(0+1)*01)010

Regular expression to match text that doesn't contain a word?

I'd like to know if it's possible to match lines that don't contain a specific word (e.g. hede) using a regular expression?

First, let's define that alphabet we work on: from aa to zz for instance.

import vcsn c = vcsn.context('lal_char(a-z), b') c

{a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y,z}→B\{a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p, q, r, s, t, u, v, w, x, y, z\}\to\mathbb{B}

Then we define our expression, which is extended (it uses the complement operator), so to normalize it, we first convert it into automaton (with _expression_.automaton), from which we extract a basic expression (with _automaton_.expresion).

e = c.expression('(hede){c}') e

(h e d e)c\left(h \, e \, d \, e\right)^{c}

a = e.automaton() a
Image in a Jupyter notebook
a.expression()

ε+h (ε+e (ε+d))+([^h]+h ([^e]+e ([^d]+d ([^e]+e [^])))) [^]∗\varepsilon + h \, \left(\varepsilon + e \, \left(\varepsilon + d\right)\right) + \left([\hat{}h] + h \, \left([\hat{}e] + e \, \left([\hat{}d] + d \, \left([\hat{}e] + e \, [\hat{}]\right)\right)\right)\right) \, {[\hat{}]}^{*}

Or, in ASCII (+ is usually denoted |; \e denotes the empty word; and [^] denotes any character, usually written .):

print(a.expression())
\e+h(\e+e(\e+d))+([^h]+h([^e]+e([^d]+d([^e]+e[^]))))[^]*

Convert finite state machine to regular expression

Is there a tool (or an algorithm) to convert a finite state machine into a regular expression?

Vcsn is tool for rational expressions and automata. See http://vcsn.lrde.epita.fr.

import vcsn

Build a random automaton aa of 4 states, labeled by letters to choose in a,b,c{a, b, c}. Then "lift" it into an automaton bb labeled by expressions.

a = vcsn.context('lal(abc), b').random_automaton(4, num_final=2) b = a.lift() b
Image in a Jupyter notebook

Eliminate state 2, then 3, etc. The order does not matter for correction (any order gives a correct result), but the quality of the result does depend on this order.

b = b.eliminate_state(2) b
Image in a Jupyter notebook
b = b.eliminate_state(3); b
Image in a Jupyter notebook
b = b.eliminate_state(1); b
Image in a Jupyter notebook

Eliminate the last state, and read the answer.

b = b.eliminate_state(0); b
Image in a Jupyter notebook

Alternatively, you can use the method automaton.expression to ask for the result.

a.expression()

(b+c+c (b+c+b b (a+b))∗ b)∗ (ε+c (b+c+b b (a+b))∗ b b)\left(b + c + c \, \left(b + c + b \, b \, \left(a + b\right)\right)^{*} \, b\right)^{*} \, \left(\varepsilon + c \, \left(b + c + b \, b \, \left(a + b\right)\right)^{*} \, b \, b\right)

You may see this algorithm run "interactively" using %demo.

a = vcsn.context('lal(abc), b').random_automaton(10) b = a.lift() %demo b eliminate_state
MIME type unknown not supported
MIME type unknown not supported

Constructing a Regular Expression from a Finite Automata

I'm trying to construct a regular expression from a Finite Automaton but found my self completely stuck with this one.

import vcsn
%%automaton a $ -> q3 q1 -> q2 a q1 -> q3 b q2 -> q2 a, b q3 -> q4 a q3 -> q3 b q4 -> q4 a q4 -> q1 b q3 -> $ q4 -> $
Image in a Jupyter notebook
a.expression()

(b+a a∗ b b)∗ (ε+a a∗)\left(b + a \, {a}^{*} \, b \, b\right)^{*} \, \left(\varepsilon + a \, {a}^{*}\right)

How to find the right quotient of a language given two languages?

(This is the question with typos in L1L_1 and L2L_2 fixed.)

If L1={anbm∣n⩾1,m⩾0}∪{ba}L_1= \{a^n b^m \mid n \geqslant 1, m \geqslant 0 \} \cup \{ba\} and L2={bm∣m⩾1}L_2= \{b^m \mid m \geqslant 1 \}. I am not getting how the DFA for L1/L2L_1/L_2 is constructed in the second figure using the DFA for L1L_1, please tell me the approach.

First, let's define our "context": we work with "labels are letters" (lal), on the alphabet {a,b}\{a, b\}. We don't use weights, or rather, we use the traditional Boolean weights: B\mathbb{B}.

import vcsn c = vcsn.context('lal(ab), b') c

{a,b}→B\{a, b\}\to\mathbb{B}

Then define the first automaton, Figure 4.1.

%%automaton a1 $ -> q0 q1 -> $ q0 -> q1 a q0 -> q3 b q1 -> q1 a q1 -> q2 b q2 -> $ q2 -> q2 b q2 -> q5 a q3 -> q4 a q3 -> q5 b q4 -> $ q4 -> q5 a, b q5 -> q5 a, b
Image in a Jupyter notebook
a1.is_deterministic() and a1.is_complete()
True

Automaton A2\mathcal{A}_2 represents the language L2L_2:

a2 = c.expression('b{+}').automaton() a2
Image in a Jupyter notebook

We may now compute the quotient, which is indeed the automaton of Figure 4.2. It is better looking once trimmed.

a1 / a2
Image in a Jupyter notebook
(a1 / a2).trim()
Image in a Jupyter notebook