Intelligent Systems Research Centre (ISRC)
Ulster University
Nominated Award:
Best Application of AI in an Academic Research Body
Website of Company (or Linkedin profile of Person):
https://www.linkedin.com/in/ciaran-cooney-42b031117/

My name is Ciaran Cooney. I am the lead author of the paper I am nominating. At present, I work as a Cognitive Scientist at Aflac Northern Ireland in Belfast. Prior to joining Aflac NI, I was a PhD researcher at Ulster University under the supervision of Professors Damien Coyle and Raffaella Folli. The work I am nominating was carried out while I was undertaking my PhD at the Intelligent Systems Research Centre (ISRC) at Ulster University. Research at the ISRC focuses on assistive communication through neural activity, computational neuroscience, augmented reality, computer vision and robotics.
My own PhD research focused on the development of methods for studying and decoding speech and speech-related processes directly from neural activity. This was interdisciplinary research incorporating aspects of neuroscience, linguistics, digital signal processing, and machine and deep learning. Some of the early work during my PhD examined the relative benefits of traditional machine learning approaches versus more modern deep learning methods. This work indicated that, although more susceptible to the effects of low data volume, deep learning methods could provide statistically significant improvements in decoding accuracy over more traditional approaches.
These results, alongside the development of technology enabling multimodal recording of neural activity, inspired the work on a multimodal deep learning architecture. My hope is that this work can lead to more bespoke decoding algorithms for imagined speech that will enhance the ability of those in need of such a device to communicate.
Reason for Nomination:
I am nominating work on the development of a novel Artificial Intelligence system used for decoding brain activity corresponding to imagined speech. This work is reported in the journal article, “A bimodal deep learning architecture for EEG-fNIRS decoding of overt and imagined speech”, published in IEEE Transactions on Biomedical Engineering in November 2021. The paper reports a novel deep learning architecture used to decode brain activity correlated with overt and imagined speech, and an innovative experimental paradigm targeted at understanding the effects of multiple conditions on speech decoding.
Work on the development of this novel decoding approach was carried out while I was a PhD researcher in the Intelligent Systems Research Centre at Ulster University. A brain-computer interface (BCI) enabling communication by means of imagined speech would have potentially transformative implications for those suffering from neuropathologies, such as Amyotrophic Lateral Sclerosis (ALS), that result in Locked-In Syndrome. Accurate decoding of speech processes from neural activity is essential to facilitate naturalistic communication for any user of such a device.
However, there are many critical obstacles impeding the development of a functional BCI predicated on imagined speech. These include the low signal-to-noise ratio of non-invasively acquired neural signals, difficulties in collecting high-volume, high-quality data, and the lack of bespoke solutions for decoding neural activity. As research into BCIs that enable language-based communication through decoded brain activity has steadily increased over the past decade, there has been a high degree of variability in the techniques used to decode speech.
Until recently, neural signals were typically recorded using a single method targeting a specific type of brain activity. Attempts to maximise the ability of non-invasive BCIs to provide consistent communication have resulted in a recent upsurge in efforts to leverage the different attributes of multiple modalities. Primary among these are bimodal data acquisition protocols combining the temporal resolution of electroencephalography (EEG) with the spatial resolution of functional near-infrared spectroscopy (fNIRS). Such bimodal data requires a corresponding bimodal approach to signal decoding. However, relatively few algorithms have been developed for decoding multiple streams of neural activity concurrently. This is not the case in many other fields of artificial intelligence, where multimodal deep learning architectures have been developed to learn joint representations from distinct modalities such as text and images.
The work I am nominating builds on research in multimodal deep learning by developing a bimodal network consisting of two convolutional sub-networks trained with a common loss function to extract data-specific features. The features are combined in a fusion operation before further feature extraction and classification layers. The convolutional sub-networks are effectively data-specific convolutional neural networks (one for EEG data, one for fNIRS) with a fully-connected linear output layer, rather than a standard classification layer. Building on work in computer vision and related fields, the outputs of the two sub-networks are combined using a method known as late fusion, in which the processed feature maps of the networks are combined at a later stage in the overall architecture, as illustrated in the sketch below. During training of the bimodal network, a single loss function is calculated and the weights of both sub-networks are updated concurrently, facilitating the learning of joint representations from the two data types.
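To make the late-fusion design concrete, the following is a minimal PyTorch sketch of the general pattern described above: two data-specific convolutional sub-networks ending in linear output layers, a concatenation-based fusion step, shared classification layers, and a single loss driving concurrent weight updates. The layer sizes, kernel shapes and the SubNetwork/BimodalNet names are illustrative placeholders, not the parameters of the published BiModNeuroCNN implementation.

import torch
import torch.nn as nn

class SubNetwork(nn.Module):
    """Data-specific convolutional sub-network with a linear (non-classification) output layer."""
    def __init__(self, in_channels, n_features=64):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(in_channels, 32, kernel_size=5, padding=2),
            nn.BatchNorm1d(32),
            nn.ELU(),
            nn.MaxPool1d(2),
            nn.Conv1d(32, 32, kernel_size=5, padding=2),
            nn.ELU(),
            nn.AdaptiveAvgPool1d(8),
        )
        self.fc = nn.Linear(32 * 8, n_features)  # linear output, not a classifier

    def forward(self, x):
        return self.fc(self.conv(x).flatten(1))

class BimodalNet(nn.Module):
    """Two sub-networks fused late, followed by shared feature-extraction and classification layers."""
    def __init__(self, eeg_channels, fnirs_channels, n_classes):
        super().__init__()
        self.eeg_net = SubNetwork(eeg_channels)
        self.fnirs_net = SubNetwork(fnirs_channels)
        self.classifier = nn.Sequential(
            nn.Linear(128, 64), nn.ELU(), nn.Linear(64, n_classes)
        )

    def forward(self, eeg, fnirs):
        # Late fusion: concatenate the processed features of each modality.
        fused = torch.cat([self.eeg_net(eeg), self.fnirs_net(fnirs)], dim=1)
        return self.classifier(fused)

# A single loss function drives weight updates in both sub-networks concurrently.
model = BimodalNet(eeg_channels=32, fnirs_channels=16, n_classes=4)
eeg = torch.randn(8, 32, 256)    # (batch, EEG channels, time samples)
fnirs = torch.randn(8, 16, 64)   # (batch, fNIRS channels, time samples)
labels = torch.randint(0, 4, (8,))
loss = nn.CrossEntropyLoss()(model(eeg, fnirs), labels)
loss.backward()

In practice each sub-network would be tailored to its modality; the essential point illustrated here is that one backward pass through a shared loss updates both branches at once, which is what allows joint representations to be learned from the two data types.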
The novel architecture was evaluated on data collected from participants while they undertook overt and imagined speech tasks. Both EEG and fNIRS data were recorded for each participant. Results presented in the paper demonstrated that the bimodal network significantly improved upon unimodal decoding for all imagined speech tasks. Most subjects’ results improved with the bimodal network, despite the sub-networks not being specifically tailored to the different data types and the duration of the fNIRS recordings not being optimal. Overall accuracies hinted at the potential for decoding speech from non-invasive neural recordings, despite a significant performance gap between overt and imagined speech. The development of this novel deep learning architecture for bimodal decoding of imagined speech demonstrates that advances in communication can be made through adoption of the latest advances in artificial intelligence.
The potential benefits for persons with limited capacity for verbal communication are enormous, and further research building on this work can lead to real strides in AI solutions that aid this development. To assist the community in enhancing this approach and to further research into greater advances, the code for this deep learning architecture has been made publicly available at https://github.com/cfcooney/BiModNeuroCNN.