Weighted Automata for Speech and Text Processing in NLP

Authors

  • Asfand Butt, Department of Software Engineering, Sindh Madressatul Islam University, City Campus, Karachi, Pakistan
  • Murtaza Mutafa, Department of Software Engineering, Sindh Madressatul Islam University, City Campus, Karachi, Pakistan
  • Muhammad Hassan, Department of Software Engineering, Sindh Madressatul Islam University, City Campus, Karachi, Pakistan
  • Aliza Nadeem, Department of Software Engineering, Sindh Madressatul Islam University, City Campus, Karachi, Pakistan
  • Syeda Ayeha, Department of Software Engineering, Sindh Madressatul Islam University, City Campus, Karachi, Pakistan
  • Maria Memon, Department of Computer Science and Information Technology, Benazir Bhutto Shaheed University, Lyari, Karachi, Pakistan

DOI:

https://doi.org/10.55927/modern.v5i2.41

Keywords:

Natural Language Processing (NLP), Recurrent Neural Networks (RNN), Weighted Automata, Automatic Speech Recognition (ASR)

Abstract

Weighted automata offer an interpretable structure for modeling sequential behavior, making them useful for speech and text processing within natural language processing. This study examines the integration of weighted finite-state automata (WFA) and weighted finite-state transducers (WFSTs) into NLP pipelines, with a particular focus on interpretability and on the post-processing refinement of automatic speech recognition (ASR) output. A central contribution is the translation of the latent decision patterns of recurrent neural networks (RNNs) into transparent weighted automata. By examining state paths, activation regions, and recurring transitions, the proposed method converts the inner logic of an RNN into an understandable set of automata, explaining how the model handles sequences. This enriches interpretability and allows the inferred rules to be reused in downstream symbolic integrations. To ensure that the extracted automata accurately model the RNN's behavior, the study uses decision-guided extraction methods, including selective sampling of informative inputs, clustering algorithms that match automaton states to the network's hidden-state dynamics, and estimation of transition weights that reflect the RNN's confidence.
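The extraction pipeline the abstract describes (discretize hidden states, count transitions, normalize counts into weights) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names (`quantize`, `extract_wfa`) are invented for this sketch, and simple grid rounding stands in for whatever clustering algorithm the study actually uses to group hidden states.

```python
from collections import defaultdict

def quantize(h, grid=0.5):
    """Map a continuous hidden-state vector to a discrete automaton
    state by rounding each coordinate to a grid (a crude stand-in for
    the clustering step described in the abstract)."""
    return tuple(round(x / grid) * grid for x in h)

def extract_wfa(trajectories):
    """Estimate a weighted automaton from sampled RNN runs.

    Each trajectory is (symbols, hiddens), where hiddens has one more
    entry than symbols (the hidden state before and after each input
    symbol). Transition weights are relative frequencies, so they can
    be read as the RNN's empirical confidence in each transition.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for symbols, hiddens in trajectories:
        states = [quantize(h) for h in hiddens]
        for sym, src, dst in zip(symbols, states, states[1:]):
            counts[src][(sym, dst)] += 1
    # Normalize outgoing counts per source state into weights
    return {
        src: {edge: c / sum(outs.values()) for edge, c in outs.items()}
        for src, outs in counts.items()
    }

# Toy 1-D trajectories standing in for sampled RNN hidden-state runs
runs = [
    (["a", "b"], [(0.0,), (0.9,), (0.1,)]),
    (["a"],      [(0.0,), (0.1,)]),
]
wfa = extract_wfa(runs)
```

In this toy example the state `(0.0,)` has two outgoing `"a"` transitions, each with weight 0.5, so the automaton makes the RNN's competing continuations and their relative frequencies directly inspectable.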

References

Black, A., & Lenzo, K. (2001). Building Synthetic Voices. CSLU Technical Report.

Chen, S., & Wang, X. (2019). Homophone Resolution in ASR Outputs via WFST–Based Rule Systems. Proceedings of Interspeech.

Graves, A., & Schmidhuber, J. (2005). Framewise Phoneme Classification with Bidirectional LSTM and other Neural Network Architectures. Neural Networks, 18(5–6), 602–610.

Gupta, R., et al. (2023). Dynamic Tagging for Adaptive Text Normalization. Transactions of the Association for Computational Linguistics, 11, 103–118.

Hernández, P., & Silva, M. (2024). Multilingual ASR Normalization: A Case Study on Spanish and Portuguese. Proceedings of Interspeech.

Li, Y., & Zhao, L. (2022). Hybrid Neuro–Symbolic Approaches for ASR Post‑Processing. Journal of Artificial Intelligence Research, 75, 345–368.

Mohri, M., Pereira, F., & Riley, M. (2002). Weighted Finite-State Transducers in Speech Recognition. Computer Speech & Language, 16(1), 69–88.

Müller, K., & Bernstein, J. (2024). Lightweight Automata for Edge-Device ASR Correction. Proceedings of the Conference on Embedded Machine Learning.

Rogahn, C., et al. (2021). On-Device Streaming Text Normalization for ASR using WFSTs. IEEE Workshop on Spoken Language Technology.

Sproat, R., et al. (2016). Multilingual Text Normalization for ASR: A WFST-Based Approach. Proceedings of Interspeech.

Tyers, F., & Lichtenstein, M. (2020). Toward Transparent ASR: Extracting Interpretable Automata from Transformer-Based Speech Models. Proceedings of ICASSP.

Weiss, J., Suzuki, T., & Neubig, G. (2019). Interpreting Neural Sequence Models using Finite-State Proxies. Proceedings of ACL.

Weiss, R. J., et al. (2017). Sequence-to-Sequence Models for Speech Recognition — A Comparison with WFST Decoders. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 25(12), 2187–2196.

Zeyer, A., Mauser, A., Ney, H., & Zeiler, S. (2018). Decision‑guided Automata Extraction from RNNs for Sequence Modeling. Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP).

Zhang, Q., & Li, H. (2025). Evaluation Metrics for Model Interpretability: Proposals and Case Studies. Journal of Machine Learning Interpretability, 2(1), 45–62.

Published

2026-04-27

How to Cite

Asfand Butt, Murtaza Mutafa, Muhammad Hassan, Aliza Nadeem, Syeda Ayeha, & Maria Memon. (2026). Weighted Automata for Speech and Text Processing in NLP. Indonesian Journal of Contemporary Multidisciplinary Research, 5(2), 441–450. https://doi.org/10.55927/modern.v5i2.41

Section

Articles