Diagnostic Classifier Demo
‘Diagnostic classification’ is an approach to opening the black box
of deep learning algorithms, in which classifiers are trained to
‘read out’ the hidden states of neural networks in order to
investigate what type of information they represent.
In this demo, you can test how good a specific deep learning model,
the Gated Recurrent Unit (GRU), is at solving simple arithmetic problems
and, more importantly, what strategy the model is following.
Type in expressions like ( ( 5 + 3 ) - ( 7 + -5 ) ), hit process
and see what happens.
Below, we give a short explanation of the arithmetic language
and the model used in this demo. For more detail, you can download
the scientific article.
Demo
Enter an expression and click evaluate to see the model in action.
The arithmetic language
The arithmetic language has words for all the numbers from -10 to +10,
for brackets ( and ), and for the operators + and -. A sentence in
the arithmetic language could for instance look like this:
( five minus ( two plus six ) )
The meaning of a sentence in the arithmetic language corresponds
to the outcome of the arithmetic expression. The meaning of the expression
above is thus -3.
The meaning of a sentence in the arithmetic language can be computed in
different ways. One option is to first compute the meaning of the smallest
units and then combine them. In the example above, this means first computing
the meaning of ( two plus six ) and then combining this with five
to compute the meaning of the expression (-3).
We call this strategy a recursive strategy.
Recursively computing the meaning of a sentence that is read from left
to right requires a stack that keeps track of the intermediate outcomes
of all smaller units and of the way they should be combined.
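The recursive strategy can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code behind the demo: the function name `evaluate_recursive` is hypothetical, the expression is assumed to be fully bracketed, written with symbols rather than words, and separated by spaces, and the trace records the subtotal of the innermost open bracket after each token, which is one possible reading of "intermediate outcome".

```python
def evaluate_recursive(expression):
    """Evaluate a fully bracketed expression; return (outcome, trace).

    Assumes space-separated tokens: "(", ")", "+", "-" and integers.
    """
    stack = []                    # (subtotal, operator) of each enclosing bracket
    subtotal, operator = 0, "+"   # state of the innermost open bracket
    trace = []
    for token in expression.split():
        if token == "(":
            # Put the unfinished outer computation aside and start afresh.
            stack.append((subtotal, operator))
            subtotal, operator = 0, "+"
        elif token == ")":
            # The inner unit is complete: combine it with the saved outer state.
            outer, outer_operator = stack.pop()
            if outer_operator == "+":
                subtotal = outer + subtotal
            else:
                subtotal = outer - subtotal
        elif token in ("+", "-"):
            operator = token
        else:
            number = int(token)
            subtotal = subtotal + number if operator == "+" else subtotal - number
        trace.append(subtotal)
    return subtotal, trace


outcome, trace = evaluate_recursive("( 5 - ( 2 + 6 ) )")
print(outcome)  # -3
print(trace)    # [0, 5, 5, 0, 2, 2, 8, -3, -3]
```

Note how the subtotal drops back to 0 when the inner bracket opens: the 5 is parked on the stack until ( 2 + 6 ) has been fully computed.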
Figure: Computing the meaning of an arithmetic expression with two different strategies
Another way to compute the meaning of an arithmetic expression is to keep a
running prediction of the outcome and add or subtract numbers as they come in.
For instance, to compute the meaning of the example sentence, two would
be directly subtracted from the subtotal 5, instead of first being
integrated with six.
This requires keeping a stack with operators to understand whether the next
number should be added or subtracted.
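A minimal Python sketch of this cumulative strategy follows. It is an illustration rather than the demo's own implementation: the function name `evaluate_cumulative` is hypothetical, and the input is assumed to be a fully bracketed, space-separated expression in symbol form. The stack stores the add/subtract mode that was active when each bracket was opened, so that it can be restored when the bracket closes.

```python
def evaluate_cumulative(expression):
    """Evaluate a fully bracketed expression; return (outcome, trace).

    Assumes space-separated tokens: "(", ")", "+", "-" and integers.
    """
    mode = 1          # +1: add the next number, -1: subtract it
    mode_stack = []   # mode that was active when each open bracket started
    subtotal = 0
    trace = []
    for token in expression.split():
        if token == "(":
            mode_stack.append(mode)
        elif token == ")":
            mode = mode_stack.pop()
        elif token == "-":
            mode = -mode              # a minus flips the current mode
        elif token != "+":            # a plus leaves the mode unchanged
            subtotal += mode * int(token)
        trace.append(subtotal)
    return subtotal, trace


outcome, trace = evaluate_cumulative("( 5 - ( 2 + 6 ) )")
print(outcome)  # -3
print(trace)    # [0, 5, 5, 5, 3, 3, -3, -3, -3]
```

In contrast to the recursive strategy, the subtotal here never resets when a bracket opens: two is subtracted from 5 immediately, giving 3, and six is subtracted next.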
GRU model
We trained a GRU model to compute the meaning of sentences in the arithmetic
language. In this GRU model, the words (for instance three or ( ) are
represented as 2-dimensional vectors.
To investigate which strategy this model is following, we trained several
diagnostic classifiers on its hidden layer activations.
You can view the predictions of these diagnostic classifiers in the demo above
by typing in an expression and indicating which diagnostic classifiers you would
like to run. This way, you can investigate which information is represented
in the hidden states and understand which sentences are more difficult to
process than others.
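To make the idea of a diagnostic classifier concrete, here is a rough sketch for a real-valued feature. Everything in it is an assumption made for illustration: the page does not say which type of classifier was used, so a plain linear model fitted by gradient descent is assumed; the "hidden states" are random stand-ins rather than activations of the trained GRU; and the names `fit_linear_probe`, `predict` and `mean_squared_error` are hypothetical.

```python
import random


def predict(weights, bias, state):
    """Linear read-out of one hidden state."""
    return sum(w * h for w, h in zip(weights, state)) + bias


def mean_squared_error(weights, bias, states, targets):
    errors = [predict(weights, bias, s) - t for s, t in zip(states, targets)]
    return sum(e * e for e in errors) / len(errors)


def fit_linear_probe(states, targets, learning_rate=0.1, epochs=2000):
    """Fit target ~ weights . state + bias by gradient descent on the MSE."""
    dim, n = len(states[0]), len(states)
    weights, bias = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for state, target in zip(states, targets):
            error = predict(weights, bias, state) - target
            for i in range(dim):
                grad_w[i] += 2 * error * state[i] / n
            grad_b += 2 * error / n
        weights = [w - learning_rate * g for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b
    return weights, bias


# Stand-in data (NOT real GRU activations): 50 random 3-dimensional "hidden
# states" and a target feature that is, by construction, linearly decodable.
random.seed(0)
states = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(50)]
targets = [2.0 * h[0] - 1.0 * h[1] + 0.5 for h in states]

weights, bias = fit_linear_probe(states, targets)
print(mean_squared_error(weights, bias, states, targets))
```

With real data, the states would be the GRU's hidden-layer activations at each word and the targets would be the values of a feature such as a subtotal. A low error on held-out sentences then suggests that the feature can be read out from the hidden states.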
We trained diagnostic classifiers for the following features:
- subtotal_recursive
- This real-valued feature represents the intermediate outcome at every point in time, assuming that this outcome is computed using the recursive strategy.
- subtotal_cumulative
- This real-valued feature represents the intermediate outcome of the cumulative strategy.
- grammatical
- This binary feature represents whether the expression read so far is grammatical. This is only the case at the end of the expression, when all brackets are closed.
- mode
- This binary feature is relevant for the cumulative strategy. It expresses whether the next number should be added (1) or subtracted (0).
- mode_switch
- This binary feature describes whether the mode feature remains the same, or changes.
- minus1depth
- This binary feature represents whether the representation is within the scope of at least 1 minus (in other words, this feature is true when a leaf node has at least one ancestor node which is a minus).
- minus2depth
- Similar to minus1depth, but for 2 minuses.
- minus3depth
- Similar to minus1depth, but for 3 minuses.
- minus4depth
- Similar to minus1depth, but for 4 minuses.
- minus1depth_count
- Keeping track of the minus depth of a sentence requires counting. This real-valued feature stores how many brackets still have to be closed for minus1depth to go to 0.
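As an illustration of how target values for some of these features could be derived from an expression, the sketch below computes mode, mode_switch and grammatical for every token. It is a reading of the descriptions above, not the code used to train the classifiers: the function name `feature_targets` is hypothetical, the input is assumed to be a well-formed, space-separated expression in symbol form, and each value is taken after the token has been read.

```python
def feature_targets(expression):
    """Return per-token target values for three of the binary features."""
    mode = 1           # 1: the next number is added, 0: it is subtracted
    mode_stack = []    # mode that was active when each open bracket started
    depth = 0          # number of brackets that are still open
    targets = {"mode": [], "mode_switch": [], "grammatical": []}
    for token in expression.split():
        previous_mode = mode
        if token == "(":
            mode_stack.append(mode)
            depth += 1
        elif token == ")":
            mode = mode_stack.pop()
            depth -= 1
        elif token == "-":
            mode = 1 - mode
        targets["mode"].append(mode)
        targets["mode_switch"].append(int(mode != previous_mode))
        targets["grammatical"].append(int(depth == 0))
    return targets


targets = feature_targets("( 5 - ( 2 + 6 ) )")
print(targets["mode"])         # [1, 1, 0, 0, 0, 0, 0, 0, 1]
print(targets["mode_switch"])  # [0, 0, 1, 0, 0, 0, 0, 0, 1]
print(targets["grammatical"])  # [0, 0, 0, 0, 0, 0, 0, 0, 1]
```

The minus-depth features are left out of this sketch, because their exact definition depends on how the scope of a minus is counted.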
The figure below gives an example of the values that some of these features
take for a long sentence. Use the demo to see how well they are predicted
by the GRU model!
Figure: Expected values of some classifiers for a long example expression