Tates, Alberto and Matran-Fernandez, Ana and Halder, Sebastian and Daly, Ian (2026) Decoding Speech Imagery or Just Noise?: A Symptom of the Replicability Crisis. Journal of Neural Engineering. (In Press)
Abstract
Objective. Speech Imagery (SI) has emerged as a promising paradigm for Brain-Computer Interface (BCI) control, attracting growing interest due to its intuitive nature: users interact with the system by internally saying a command. In this study, we investigate the replicability and reproducibility of SI decoding methods. These two aspects are critical in BCI research, where prior literature has highlighted that many studies suffer from incomplete methodological reporting or flawed evaluation procedures, making reproduction difficult. The inherent variability of brain signals further complicates the replication of results. Evaluating the reproducibility of SI decoding approaches is therefore essential to assess the true feasibility of SI as a viable BCI paradigm. Approach. To assess reproducibility, we selected two of the most widely used open-access SI datasets and attempted to reproduce four published decoding pipelines for each dataset. We followed each implementation step by step, documented missing or ambiguous information, detailed how we addressed it, and compared our decoding results to those originally reported. To assess replicability, we applied standard decoding pipelines across different time-frequency configurations to three open SI datasets and our own collected dataset. For context and validation, we conducted the same procedure on four publicly available and widely used motor imagery (MI) datasets. Main Results. All evaluated SI studies contained some form of missing methodological detail, and some did not include cross-validation procedures. Our reproduction attempts consistently yielded lower classification accuracies than originally reported, with discrepancies ranging from 2% to 39% (x̃ = 11.25 ± 12.41%). In the replication analysis, we found no consistent time-frequency patterns across SI datasets. Furthermore, only 36% of SI participants achieved classification accuracies above statistical significance thresholds, compared to 91% of the participants in MI datasets. Significance. This is the first comprehensive assessment of both reproducibility and replicability in SI decoding. Our findings raise important concerns about the reliability of current SI research and suggest that the feasibility of SI as a practical BCI paradigm may have been overestimated.
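The abstract compares participant accuracies against statistical significance thresholds but does not say how those thresholds were computed. One common choice in BCI evaluation is a one-sided binomial test against the theoretical chance level; the sketch below illustrates that idea only and is not taken from the paper. The function names, the `alpha=0.05` default, and the trial and class counts in the usage line are all assumptions made for illustration.

```python
from math import comb


def binomial_sf(k, n, p):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def significance_threshold(n_trials, n_classes, alpha=0.05):
    """Smallest accuracy whose one-sided binomial p-value falls below alpha.

    Assumes balanced classes, so the chance level is 1 / n_classes
    (an illustrative assumption, not the paper's stated procedure).
    """
    p_chance = 1.0 / n_classes
    for correct in range(n_trials + 1):
        if binomial_sf(correct, n_trials, p_chance) < alpha:
            return correct / n_trials
    return None  # alpha is too strict for this number of trials


# Hypothetical usage: 100 test trials, 2 balanced classes
threshold = significance_threshold(100, 2)
print(f"Accuracy needed to exceed chance: {threshold:.2f}")
```

Under this kind of test the threshold sits above the nominal chance level and moves closer to it as the number of trials grows, which is why a small test set can make near-chance accuracies look better than they are.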
| Item Type: | Article |
|---|---|
| Divisions: | Faculty of Science and Health > Computer Science and Electronic Engineering, School of |
| SWORD Depositor: | Unnamed user with email elements@essex.ac.uk |
| Depositing User: | Unnamed user with email elements@essex.ac.uk |
| Date Deposited: | 06 May 2026 08:56 |
| Last Modified: | 06 May 2026 08:57 |
| URI: | http://repository.essex.ac.uk/id/eprint/43210 |