Diagnostic Classifier Demo

‘Diagnostic classification’ is an approach to opening the black box of deep learning algorithms: classifiers are trained to ‘read out’ the hidden states of a neural network in order to investigate what type of information they represent. In this demo, you can test how good a specific deep learning model, the Gated Recurrent Unit (GRU), is at solving simple arithmetic problems and, more importantly, which strategy the model follows. Type in an expression like ( ( 5 + 3 ) - ( 7 + -5 ) ), hit process and see what happens. Below, we give a short explanation of the arithmetic language and the model used in this demo; you can also download the scientific article.


Enter an expression and click evaluate to see the model in action.

The arithmetic language

The arithmetic language has words for all the numbers from -10 to +10, for brackets ( and ), and for the operators + and -. A sentence in the arithmetic language could for instance look like this:

( five minus ( two plus six ) )

The meaning of the sentences in the arithmetic language corresponds to the outcome of the arithmetic expression. The meaning of the expression above is thus -3.

The meaning of a sentence in the arithmetic language can be computed in different ways. One option is to first compute the meaning of the smallest units and then combine them. In the example above, this means first computing the meaning of ( two plus six ), which is 8, and then combining this with five to compute the meaning of the whole expression (-3). We call this the recursive strategy.

Recursively computing the meaning of a sentence that is read from left to right requires a stack that keeps track of the intermediate outcomes of all smaller units and of the way they should be combined.
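
The recursive strategy can be sketched in a few lines of Python. This is a minimal illustration, not the code behind the demo: the function name recursive_trace is hypothetical, expressions are assumed to be fully bracketed and written in the symbolic form accepted by the demo input (tokens separated by spaces), and which value counts as the intermediate outcome at each step is an assumption.

```python
def recursive_trace(expression):
    """Return the intermediate outcome after each token (recursive strategy)."""
    result_stack = []  # outcomes of enclosing units that still wait for an operand
    mode_stack = []    # operator (+1 or -1) that will combine each waiting outcome
    result, mode = 0, 1
    trace = []
    for token in expression.split():
        if token == "(":
            # Start a new unit: remember the outer outcome and its pending operator.
            result_stack.append(result)
            mode_stack.append(mode)
            result, mode = 0, 1
        elif token == ")":
            # The unit is complete: combine it with the outcome that was waiting.
            outer, operator = result_stack.pop(), mode_stack.pop()
            result = outer + operator * result
        elif token == "+":
            mode = 1
        elif token == "-":
            mode = -1
        else:
            # A number such as 5 or -5.
            result += mode * int(token)
        trace.append(result)
    return trace


# ( five minus ( two plus six ) ) in symbolic form
print(recursive_trace("( 5 - ( 2 + 6 ) )"))
```

The last element of the returned list is the meaning of the whole expression. Note that the inner unit ( 2 + 6 ) is evaluated to 8 on its own before it is subtracted from 5.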

Computing the meaning of an arithmetic expression with two different strategies

Another way to compute the meaning of an arithmetic expression is to keep a running prediction of the outcome and add or subtract numbers as they come in. We call this the cumulative strategy. For instance, to compute the meaning of the example sentence, two would be directly subtracted from the subtotal 5 (giving 3), instead of first being integrated with six; six is then subtracted as well, giving -3. This requires keeping a stack of operators to determine whether the next number should be added or subtracted.
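
As a rough sketch of this running-total approach, the Python function below keeps one subtotal and a stack of operators. The name cumulative_trace is hypothetical, and the input is assumed to be a fully bracketed expression in the symbolic form accepted by the demo, with tokens separated by spaces.

```python
def cumulative_trace(expression):
    """Return the running total after each token (running-total strategy)."""
    mode_stack = []      # modes that were active when each bracket was opened
    total, mode = 0, 1   # mode is +1 (add next number) or -1 (subtract it)
    trace = []
    for token in expression.split():
        if token == "(":
            mode_stack.append(mode)
        elif token == ")":
            # Leaving a bracket restores the mode that held when it was opened.
            mode = mode_stack.pop()
        elif token == "+":
            pass  # a plus leaves the current mode unchanged
        elif token == "-":
            mode = -mode  # a minus flips the current mode
        else:
            total += mode * int(token)
        trace.append(total)
    return trace


# ( five minus ( two plus six ) ) in symbolic form
print(cumulative_trace("( 5 - ( 2 + 6 ) )"))
```

Unlike the recursive strategy, no intermediate outcomes are stored: the stack only holds operators, and every number is folded into the subtotal as soon as it is read.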

GRU model

We trained a GRU model to compute the meaning of sentences in the arithmetic language. In this GRU model, the words (for instance ‘three’ or ‘(’) are represented as 2-dimensional vectors. To investigate which strategy the model follows, we trained several diagnostic classifiers on its hidden layer activations. You can view the predictions of these diagnostic classifiers in the demo above by typing in an expression and indicating which diagnostic classifiers you would like to run. This way, you can investigate which information is represented in the hidden states and see which sentences are more difficult to process than others.
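
To make the idea of a diagnostic classifier concrete, here is a toy sketch of a read-out model. Everything in it is an assumption made for illustration: the hidden states are random synthetic vectors rather than activations of the actual GRU, the hidden size of 4 is arbitrary, the target feature is an invented linear function of the state, and the read-out is assumed to be linear and trained with plain stochastic gradient descent.

```python
import random

random.seed(0)
HIDDEN_SIZE = 4  # hypothetical; not the size used in the demo

# Synthetic stand-ins: in a real experiment these would be recorded GRU hidden
# states and the feature values (e.g. an intermediate outcome) at the same steps.
TRUE_WEIGHTS = [0.5, -1.0, 2.0, 0.0]
states = [[random.uniform(-1, 1) for _ in range(HIDDEN_SIZE)] for _ in range(300)]
targets = [sum(w * x for w, x in zip(TRUE_WEIGHTS, h)) + random.gauss(0, 0.01)
           for h in states]


def predict(weights, bias, state):
    """Linear read-out of one hidden state."""
    return sum(w * x for w, x in zip(weights, state)) + bias


def train_readout(states, targets, lr=0.05, epochs=100):
    """Fit the linear read-out with stochastic gradient descent on squared error."""
    weights, bias = [0.0] * len(states[0]), 0.0
    for _ in range(epochs):
        for state, target in zip(states, targets):
            error = predict(weights, bias, state) - target
            weights = [w - lr * error * x for w, x in zip(weights, state)]
            bias -= lr * error
    return weights, bias


def mean_squared_error(weights, bias, states, targets):
    errors = [(predict(weights, bias, h) - y) ** 2 for h, y in zip(states, targets)]
    return sum(errors) / len(errors)


# Train on the first 200 time steps and evaluate on the held-out remainder.
weights, bias = train_readout(states[:200], targets[:200])
mse = mean_squared_error(weights, bias, states[200:], targets[200:])
print(f"held-out MSE: {mse:.5f}")
```

A low held-out error indicates that the feature can be read out from the hidden states; a high error suggests the feature is not represented there, at least not in a form this kind of read-out can recover.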

We trained diagnostic classifiers for the following features:

This real-valued feature represents the intermediate outcome at every point in time, assuming that this outcome is computed using the recursive strategy.
This real-valued feature represents the intermediate outcome of the cumulative strategy.
This binary feature represents whether an expression is grammatical (this will thus only be the case at the end of the expression, when all brackets are closed).
This binary feature is relevant for the cumulative strategy. It expresses whether the next number should be added (1) or subtracted (0).
This binary feature describes whether the mode feature remains the same or changes.
This binary feature represents whether the representation is within the scope of at least 1 minus (in other words, this feature is true when a leaf node has at least one ancestor node which is a minus).
Similar to minus1depth, but for 2 minuses.
Similar to minus1depth, but for 3 minuses.
Similar to minus1depth, but for 4 minuses.
Keeping track of the minusdepth of a sentence requires counting. This real-valued feature stores how many brackets should still be closed for the minus1depth to go to 0.
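
As an illustration of how target values for such features can be derived from an expression, the sketch below computes two of the binary features at every time step: whether the expression read so far is grammatical, and the mode of the cumulative strategy. The function name feature_trace is hypothetical, the input is assumed to be fully bracketed and in symbolic form, and the exact time step at which each value is recorded is an assumption.

```python
def feature_trace(expression):
    """Return (token, grammatical, mode) for every time step of an expression."""
    open_brackets = 0
    mode_stack, mode = [], 1  # mode: +1 means add the next number, -1 subtract it
    rows = []
    for token in expression.split():
        if token == "(":
            open_brackets += 1
            mode_stack.append(mode)
        elif token == ")":
            open_brackets -= 1
            mode = mode_stack.pop()
        elif token == "-":
            mode = -mode
        grammatical = int(open_brackets == 0)  # 1 only once all brackets are closed
        rows.append((token, grammatical, int(mode == 1)))  # mode: add (1), subtract (0)
    return rows


for row in feature_trace("( 5 - ( 2 + 6 ) )"):
    print(row)
```

Values computed this way can serve as the targets that a diagnostic classifier is trained to predict from the hidden states.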

The figure below gives an example of the values that some of these features take for a long sentence. Use the demo to see how well they are predicted by the GRU model!

Expected values of some classifiers for a long example expression