Invariance of the recognition model to variation in speech rate.
A) The normal-length stimulus “eight” (400 ms, top panel) has been learned and is recognized successfully by the module “eight” (M8). For clarity, only the second-level causal states are shown (see Model). The same module, without any parameter adaptation, also recognizes a time-compressed version of the same stimulus (300 ms, middle panel). For comparison, the module trained on the digit “three” (M3) fails to reconstruct its expected dynamics when exposed to “eight” (bottom panel). B) Total prediction errors produced at the second-level hidden states by ten different modules (M0 to M9), each previously trained on the corresponding digit at normal length. All modules were exposed to the same 25% time-compressed “eight” stimulus. Module M8 (red arrow) produces the lowest prediction error, showing that prediction error can be used for classification even when the stimulus is time-compressed.
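The classification rule implied by panel B, choosing the module with the lowest total prediction error, can be illustrated with a minimal sketch. It rests on assumptions that are not taken from the paper: the total error is computed here as a sum of squared errors over time steps and hidden states, the helper names `total_prediction_error` and `classify` are hypothetical, and the error traces are synthetic placeholders rather than model output.

```python
import numpy as np

def total_prediction_error(errors):
    """Sum of squared prediction errors over all time steps and states.

    Assumption: the caption does not define how the total is computed;
    a sum of squares is used here purely for illustration.
    """
    errors = np.asarray(errors, dtype=float)
    return float(np.sum(errors ** 2))

def classify(per_module_errors):
    """Return the label of the module with the lowest total error, plus all totals."""
    totals = {label: total_prediction_error(e)
              for label, e in per_module_errors.items()}
    best = min(totals, key=totals.get)
    return best, totals

# Synthetic placeholder traces (30 time steps x 2 hidden states per module).
# These are NOT results from the paper: the module matching the stimulus is
# simply given a smaller error amplitude than the others.
per_module_errors = {
    f"M{d}": np.full((30, 2), 1.0 + 0.1 * d) for d in range(10)
}
per_module_errors["M8"] = np.full((30, 2), 0.1)

label, totals = classify(per_module_errors)
print(label, totals[label])
```

In this toy setting the rule is simply an arg-min over modules; how well it separates digits in practice depends on how the prediction errors are actually computed and normalized in the model.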