Contact
CoCalc Logo Icon
StoreFeaturesDocsShareSupport News AboutSign UpSign In
| Download
Views: 45903
Kernel: Python 3

context.levenshtein

Generate a transducer encoding the Levenshtein distance for two alphabets.

The resulting transducer can be composed with transducers representing two languages to compute the distance between the two languages (among other things, like the edit-distance automaton).

Preconditions:

  • the labelset has exactly two tapes

  • the labelsets must have the empty word

  • the weightset must be nmin (N,min,+\langle \mathbb{N}, \min, +\rangle)

References:

Examples

import vcsn c = vcsn.context('lat<lan_char(abc), lan_char(bce)>, nmin') l = c.levenshtein() l
Image in a Jupyter notebook

The Levenshtein automaton only has one state, but has n2n^2 transitions, for a common alphabet between tapes of size nn.

To show its use, we will create automata for two languages, a1 and a2, and compute the edit-distance automaton.

a1 = vcsn.context('lan_char(abc), b').expression("bac+cab").derived_term().strip().partial_identity() a1
Image in a Jupyter notebook
a2 = vcsn.context('lan_char(bce), b').expression("bec+bebe").automaton().cominimize().strip().partial_identity() a2
Image in a Jupyter notebook
edit = a1.compose(l).compose(a2).trim() edit
Image in a Jupyter notebook

The automaton can be evaluated on one tape to get the edit distance between a word and a language.

exp = edit.lift(1).proper().evaluate("bac") exp

3((b)(e)(b)(e))+1((b)(e)(c+2((b)(e))))+2((b)(e)(1(c)+2((b)(e))))+1(3((b)(e)(b)(e))+1((b)(e)(1(c)+2((b)(e))))) \left\langle 3 \right\rangle \,\left(\left(b\right) \, \left(e\right) \, \left(b\right) \, \left(e\right)\right) + \left\langle 1 \right\rangle \,\left(\left(b\right) \, \left(e\right) \, \left(c + \left\langle 2 \right\rangle \,\left(\left(b\right) \, \left(e\right)\right)\right)\right) + \left\langle 2 \right\rangle \,\left(\left(b\right) \, \left(e\right) \, \left( \left\langle 1 \right\rangle \,\left(c\right) + \left\langle 2 \right\rangle \,\left(\left(b\right) \, \left(e\right)\right)\right)\right) + \left\langle 1 \right\rangle \,\left( \left\langle 3 \right\rangle \,\left(\left(b\right) \, \left(e\right) \, \left(b\right) \, \left(e\right)\right) + \left\langle 1 \right\rangle \,\left(\left(b\right) \, \left(e\right) \, \left( \left\langle 1 \right\rangle \,\left(c\right) + \left\langle 2 \right\rangle \,\left(\left(b\right) \, \left(e\right)\right)\right)\right)\right)

vcsn.context("lan_char(bce), nmin").expression(exp.format('text'), 'distributive')

1(bec)3(bebe) \left\langle 1 \right\rangle \,\left(b \, e \, c\right) \oplus \left\langle 3 \right\rangle \,\left(b \, e \, b \, e\right)