Algorithms for Acyclic Weighted Finite-State Automata with Failure Arcs

Svete, Anej; Dayan, Benjamin; Vieira, Tim; Cotterell, Ryan; Eisner, Jason


arXiv:2301.06862 (cs)

[Submitted on 17 Jan 2023 (v1), last revised 11 Jul 2023 (this version, v2)]


Authors:Anej Svete, Benjamin Dayan, Tim Vieira, Ryan Cotterell, Jason Eisner


Abstract: Weighted finite-state automata (WFSAs) are commonly used in NLP. Failure transitions are a useful extension for compactly representing backoffs or interpolation in $n$-gram models and CRFs, which are special cases of WFSAs. The pathsum in ordinary acyclic WFSAs is efficiently computed by the backward algorithm in time $O(|E|)$, where $E$ is the set of transitions. However, this does not allow failure transitions, and preprocessing the WFSA to eliminate failure transitions could greatly increase $|E|$. We extend the backward algorithm to handle failure transitions directly. Our approach is efficient when the average state has outgoing arcs for only a small fraction $s \ll 1$ of the alphabet $\Sigma$. We propose an algorithm for general acyclic WFSAs which runs in time $O{\left(|E| + s |\Sigma| |Q| T_\text{max} \log{|\Sigma|}\right)}$, where $Q$ is the set of states and $T_\text{max}$ is the size of the largest connected component of failure transitions. When the failure transition topology satisfies a condition exemplified by CRFs, the $T_\text{max}$ factor can be dropped, and when the weight semiring is a ring, the $\log{|\Sigma|}$ factor can be dropped. In the latter case (ring-weighted acyclic WFSAs), we also give an alternative algorithm with complexity $\displaystyle O{\left(|E| + |\Sigma| |Q| \min(1,s\pi_\text{max}) \right)}$, where $\pi_\text{max}$ is the size of the longest failure path.
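To make the baseline in the abstract concrete, the following is a minimal sketch of the ordinary backward algorithm on an acyclic WFSA, together with the naive preprocessing step that replaces failure transitions with explicit arcs. It is not the paper's algorithm: it only illustrates the $O(|E|)$ pathsum and why expansion can increase $|E|$. The real semiring (ordinary `+` and `*`), the dictionary-based representation, and the names `backward_pathsum` and `expand_failures` are all assumptions made for illustration.

```python
def backward_pathsum(states_topo, arcs, final, start):
    """Pathsum of an acyclic WFSA without failure arcs (real semiring assumed).

    states_topo: states in topological order (sources first).
    arcs: dict state -> list of (symbol, next_state, weight).
    final: dict state -> final weight.
    Each arc is visited exactly once, so the loop is O(|E|).
    """
    beta = {}
    for q in reversed(states_topo):
        total = final.get(q, 0.0)
        for _symbol, r, w in arcs.get(q, []):
            total += w * beta[r]  # beta[r] is ready: r comes after q
        beta[q] = total
    return beta[start]


def expand_failures(arcs, fail, alphabet):
    """Naively replace failure arcs with explicit arcs.

    fail: dict state -> (fallback_state, weight).
    For each state and symbol, follow the failure chain until some state
    has an arc for that symbol. This may create up to |Sigma| arcs per
    state, which is the blow-up the abstract warns about.
    """
    expanded = {}
    for q in set(arcs) | set(fail):
        out = []
        for a in alphabet:
            p, w_acc = q, 1.0
            while True:
                matches = [(r, w) for (sym, r, w) in arcs.get(p, []) if sym == a]
                if matches:
                    out.extend((a, r, w_acc * w) for r, w in matches)
                    break
                if p not in fail:
                    break  # no arc for this symbol anywhere on the chain
                nxt, w_fail = fail[p]
                w_acc *= w_fail
                p = nxt
        expanded[q] = out
    return expanded


# Hypothetical toy automaton: state 0 backs off to state 1 on unseen symbols.
alphabet = ["a", "b"]
arcs = {
    0: [("a", 3, 0.5)],
    1: [("a", 3, 0.25), ("b", 3, 0.75)],
}
fail = {0: (1, 0.5)}
final = {3: 1.0}

expanded = expand_failures(arcs, fail, alphabet)
total = backward_pathsum([0, 1, 3], expanded, final, start=0)
print(total)  # 0.875
```

In this toy case, state 0 keeps its explicit `a` arc and gains one `b` arc through the failure transition, so the expanded automaton has more arcs than the original. The paper's contribution is to compute the same pathsum without materializing those extra arcs.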

Comments:	9 pages, Proceedings of EMNLP 2022
Subjects:	Data Structures and Algorithms (cs.DS); Computation and Language (cs.CL)
Cite as:	arXiv:2301.06862 [cs.DS]
	(or arXiv:2301.06862v2 [cs.DS] for this version)
	https://doi.org/10.48550/arXiv.2301.06862

Submission history

From: Anej Svete
[v1] Tue, 17 Jan 2023 13:15:44 UTC (169 KB)
[v2] Tue, 11 Jul 2023 09:08:33 UTC (148 KB)
