Enumeration-Based Generic State Machine

Introduction
Insights
Representation of a Transition System
Enumeration Types
Non-Enumeration Types
Transition Labels
State Graph
Undirected Graph Edges
Advanced Graph Calculations
Deterministic Finite-State Acceptor
Transition Side Effect
Invalid Input
Deterministic Finite-State Transducer
Moore Machine Versus Mealy machine
Future Work

Introduction

This article is the fifth article of a small series of articles on enumeration types:

The present article describes the project Generic state machine. Please see the Please see the comprehensive API documentation. The generic classes TransitionSystem, Acceptor, and Transducer provide the functionality of transition systems and finite-state machines. They rely on enumeration-type generic parameters representing the sets of states and the input and output alphabets.

Insights

The system is powered by four ideas:

The sets of states, input, and input alphabets are sets represented by enumeration types.
The generic class TransitionSystem is used as a base class for the finite-state machine classes.
The transition systems are labeled in a special way: a transition label carries the delegate instances representing the arbitrary side effect of a transition.
The transition relation over the set of states is complemented by the relation representing invalid transitions. The transition between each pair of states in this relation is not allowed, but its label carries a delegate instance used to explain why it is not allowed.

To see the power of this approach, let’s start with a transition system.

Representation of a Transition System

Let’s say, we have some system with three states, A, B, and C. Let’s make a Cartesian square out of them:

	A	B	C
A	×
B		×
C			×

The “white” area of the table represents the set of elements of the Cartesian square. Let’s say the crossed cells represent some subset of the Cartesian square. Then we can say that this subset is a relation over the set S = {A, B, C}. That is, for each pair (x, y), x ∈ S, y ∈ S, we can say that the relation holds between x and y, or not. In particular, the relation can represent the set of valid transitions of a transition system. Let’s say that the cells of the first column of a table {A, B, C} represent the starting states of some transition, and the cells of the first row of a table {A, B, C} represent the final state of the transition. Then we can define the relation T ⊂ S, so for each pair (x, y), x ∈ T, y ∈ T the transition from x to y is permitted. We usually denote it as x ⭢ y.

In the table shown above we crossed out the identical relation; it states that “each element is related to itself”. If does not have to be the case, as a relation can be any arbitrary subset of S, including an empty relation (no two elements are related) or a full relation (any element is related to any other element).

For example, we can create a relation representing circular transitions A ⭢ B ⭢ C ⭢ A ⭢ …:

	A	B	C
A		×
B			×
C	×

So, we can formally define a transition system as a triple (S, T, s₀) where:

S is a set of states,
T ⊂ S
s₀ is the initial state.

Note that in the context of this article, we will consider only finite sets S. That said, we will talk only about finite-state transition systems. Therefore, we will have to consider only final-state machines and limit our considerations to deterministic systems.

Enumeration Types

An enumeration type is a perfect type used to represent a finite set, because this is the ultimate abstraction focused only on one aspect: object identity and uniqueness, everything else is consciously reduced to a bare technical minimum. The identity is challenged by the fact that two different enumeration members can be assigned the same underlying value, but the members are still unique during the compile time. Also, unfortunately, enumerations do not enumerate.

Both problems are comprehensively solved by the internal representation of the sets. They are represented as a dictionary or enumeration wrappers built via reflection.

public abstract class TransitionSystemBase<STATE> {

    internal protected delegate void TraverseHandler<ELEMENT>(string name, ELEMENT element);
    internal protected void Traverse<ELEMENT>(TraverseHandler<ELEMENT> handler) {
        FieldInfo[] fields = typeof(ELEMENT).GetFields(BindingFlags.Static | BindingFlags.Public);
        foreach (var field in fields) {
            if (field.GetCustomAttributes(typeof(ExcludeAttribute), inherit: false).Length > 0)
                continue;
            ELEMENT element = (ELEMENT)field.GetValue(null);
            handler(field.Name, element);
        } //loop
    } //Traverse

    internal class Element<ELEMENT> {
        internal Element(string name, ELEMENT element) {
            Name = name; UnderlyingMember = element;
        } //Element
        internal string Name { get; init; }
        internal ELEMENT UnderlyingMember { get; init; }
    } //Element

}

The generic method Traverse is used for different enumeration types, not only for STATE. This is because an Acceptor also has to traverse the input alphabet set, and a Transducer — also output alphabet set.

Non-Enumeration Types

We could apply the System.Enum generic constraint since C# 7.3. I could not do it because I have developed this approach since 2009. But do we really need it? Surprisingly, no. What happens when we use some other type to instantiate the classes for STATE, INPUT, or OUTPUT?

Nothing wrong. The Traverse code will simply pick up all public static fields of the type and interpret them as set elements. Such an exotic example is demonstrated here. In this example, the type double is used, and the states are double.NegativeInfinity, double.MinValue, double.Epsilon, double.MaxValue, double.PositiveInfinity, and double.NaN.

Transition Labels

Let’s look at a Cartesian square again. When we mark some cells with a cross, what information it adds? Obviously, nothing but a Boolean value: saying if the transition is valid or not. So, formally we can represent a transition table as a matrix, a rank-two array of Boolean values. But is it good enough, technically?

It doesn’t seem so. First, for many applications, the matrix would be a sparse matrix, so the arrays would be a waste of memory. More importantly, this is not informative enough. For many practical purposes, we need some labeling.

The idea is to have the same label for both the relation representing the valid transitions and the relation representing invalid ones.

In the first case, the label will provide the delegate instance called each time the transition is performed. This is the most practically important feature. First of all, I have used it for hardware automation. When the transition system walks between states, it can actually drive the machinery. At the same time, the state machine itself is abstracted from the hardware operations.

In the second case of invalid transitions, the label provides a delegate instance used to explain why it is not allowed.

For further detail, please see the source code, TransitionLabel, StateTransitionAction, InvalidStateTransitionAction, TryTransitionTo.

State Graph

Taking into account all of the above, the state graph is represented by the dictionary

readonly Dictionary<StateGraphKey, TransitionLabel> stateGraph = new();

The elements of this dictionary represent both valid and invalid transitions, or the edges of the graph. Technically, all the edges are directional, because the transition from x ⭢ y is not the same as y ⭢ x. To describe the valid transition between both x ⭢ y and y ⭢ x we would need to add two different dictionary elements using TransitionSystem.AddValidStateTransition. The ValidStateTransitionAction delegate instance can be the same, as the starting and finishing states are passed to it.

However, this approach would be far from the optimum. If the transition is valid in two ways, we can use the flag undirected much better, adding only one element to the State Graph dictionary. Let’s take a look at this implementation.

Undirected Graph Edges

The problem of the undirected graph edge is completely resolved by using a special implementation of System.GetHashCode and System.Equals:

    class StateGraphKey {
        internal StateGraphKey(State start, State finish, bool undirected = false) {
            StartState = start; FinishState = finish;
            IsUndirected = undirected;
        }
        public override int GetHashCode() { // important!
            return StartState.Name.GetHashCode()
                ^ FinishState.Name.GetHashCode();
        }
        public override bool Equals(object @object) { // important!
            if (@object == null) return false;
            if (@object is not StateGraphKey objectStateGraphKey) return false;
            bool nameMatch = (objectStateGraphKey.StartState.Name == StartState.Name
                && objectStateGraphKey.FinishState.Name == FinishState.Name);
            return IsUndirected
                ? nameMatch || 
                    (objectStateGraphKey.StartState.Name == FinishState.Name
                    && objectStateGraphKey.FinishState.Name == StartState.Name)
                : nameMatch;
        } //Equals
        internal State StartState { get; init; }
        internal State FinishState { get; init; }
        internal bool IsUndirected { get; init; }
    }

This way, when we inquire for the validity of a transition from x to y or perform TryTransitionTo, we get the same state dictionary element as for the transition from y to x in case IsUndirected is true.

Now, let’s go ahead and get to the finite-state machine classes.

Advanced Graph Calculations

TransitionSystem implements some advanced calculations on the state graph. It includes:

Labyrinth. This method returns all the permitted paths between the states start and finish.
FindDeadEnds. This method finds all “dead ends”, the states not visited along any paths between the states start and finish.
LongestPaths. This property finds the longest path length in the state graph. When this number is found, it finds all the possible paths with this length and returns all these paths.
MaximumPaths This property finds the maximum number of paths between all STATE-to-STATE pairs. When this number is found, it returns the pairs having this number of paths between them.

All these methods and properties depend on Labyrinth. Note that the properties LongestPaths and MaximumPaths involve NP-hard calculations.

Deterministic Finite-State Acceptor

Basically, an acceptor can be considered a transition system with an additional function: it gets some transition based on the input signal.

Formally, a deterministic finite-state machine or deterministic finite-state acceptor is a quintuple (Σ, S, s₀, δ, F), where:

Σ is the input alphabet (a finite non-empty set of symbols)
S is the finite set of states, the same as in the transition system
s₀ ∈ S is the initial state, the same as in the transition system
δ is the state-transition function: δ: S ⨉ Σ ⭢ S
F ⊂ S is the set of final states, defined implicitly via δ

The input signal causes an acceptor instance to perform a transition, but only for the signal-state pairs where the state-transition function is defined.

In the Acceptor implementation, the state-transition function is defined via the calls to Acceptor.AddStateTransitionFunctionPart, and the transition is performed using the call to Acceptor.TransitionSignal.

Transition Side Effect

When an acceptor signal causes a successful transition, it causes the same side effect as the TransitionSystem.TryTransitionTo.

However, if the valid transition is not defined via the base-class method TransitionSystem.AddValidStateTransition the transition is performed as soon as it is defined via AddStateTransitionFunctionPart. In other words, even though the transition system’s transition side effect is respected, the validity of a transition in Acceptor is defined by its state-transition function.

A similar thing happens to the invalid input.

Invalid Input

The invalid input is defined via AddInvalidInput. Also, the input is invalid, if the state transition function is not defined for the pair of input and CurrentState. However, if some signal causes an invalid input, the transition label is respected.

Deterministic Finite-State Transducer

An acceptor can be considered as an acceptor with another additional function: the output function. The input signal causes a transducer instance to perform a transition for the signal-state pairs where the state-transition function is defined. Additionally, upon the successful transition, a transducer instance produces an output signal.

Formally, a finite-state transducer is a sextuple (Σ, Γ, S, s₀, δ, ω), where:

Σ is the input alphabet (a finite non-empty set of symbols)
Γ is the output alphabet (a finite non-empty set of symbols)
S is a finite non-empty set of states
s₀ ∈ S is the initial state
δ is the state-transition function: δ: S ⨉ Σ ⭢ S
ω is the output function ω: S ⨉ Σ ⭢ Γ (ω: S ⭢ Γ)

In the Transducer implementation, the output function is defined via the calls to Transducer.AddOutputFunctionPart, and the transition is performed using the call to Transducer.Signal.

Moore Machine Versus Mealy machine

The same class Transducer is used for the implementation of the Moore Machine and Mealy machine. When an output function is defined using AddOutputFunctionPart, the user can pass the instance of either MooreMachineOutputAction or MealyMachineOutputAction.

If an output function of a state machine uses a mixture of Moore and Mealy functions for different function arguments, it is, formally speaking, a Mealy machine.

Future Work

Operations on finite-state transducers can be implemented.

Besides, the stochastic behavior can be easily added to the state machine.

December 29, 2024 – January 3, 2025

This documentation is generated from the extended Markdown documentation using Extensible Markdown for Visual Studio Code.

Contents