Effects of high stimulus presentation rate on EEG template characteristics and performance of c-VEP based BCIs

Toygun Başaklar; Yiğit Tuncel; Yusuf Ziya Ider

doi:10.1088/2057-1976/ab0cee

1. Introduction

A brain-computer interface (BCI) is a communication channel between the external environment and the human brain through which brain activity is interpreted and/or directly translated into commands to control external devices [1]. Electroencephalography (EEG) based BCIs have been widely used in the field of neural engineering and clinical rehabilitation due to their non-invasiveness, portability, and high temporal resolution [1]. Among various BCI paradigms [2, 3], visual evoked potential (VEP) based BCIs have received increased interest in recent years [1, 4, 5].

The code-modulated visual evoked potential (c-VEP) paradigm has been reported to outperform other commonly used VEP based BCI paradigms, with the advantages of shorter training time, high information transfer rate (ITR), a high number of targets, high accuracy, and ease of use [6, 7]. In a c-VEP based BCI, a binary pseudorandom code sequence and its time lagged versions are assigned to different selectable targets and are used to modulate visual stimuli [6–10]. If a person focuses his/her gaze on one of the targets, a c-VEP is observed in the EEG recorded over the occipital lobe. An m-sequence is generally chosen as the binary pseudorandom coding sequence because of its good autocorrelation properties (close to zero correlation between itself and its shifted versions) [11].

The 2011 study of Bin et al [7] was groundbreaking in the BCI community since it reached the highest ITR (108 ± 12 bits min⁻¹) among EEG-based BCIs. They used the traditional template matching algorithm, with canonical correlation analysis (CCA) for multichannel processing, to decode c-VEPs. In their paper they suggested two new areas of investigation to increase the performance of c-VEP based BCIs: (i) studying the effects of the refresh rate of the screen on which the stimulus is presented and (ii) finding new coding sequences which yield sharp and early drops in the autocorrelation function of the c-VEP response [7].

Wittevrongel et al have investigated a 120 Hz stimulus presentation rate for the coding sequence, in addition to 60 Hz, with a novel decoding algorithm based on spatio-temporal beamforming. With their 32-target system, they have reported maximal median ITRs of 100.46 bits min⁻¹ and 172.82 bits min⁻¹ for 60 Hz and 120 Hz monitor refresh rates respectively [9]. Gembler et al have compared the performance and user friendliness of a c-VEP based BCI at three different refresh rates (60 Hz, 120 Hz and 200 Hz) and have reported average ITRs of 37.94 bits min⁻¹, 38.16 bits min⁻¹, and 37.22 bits min⁻¹ for 60 Hz, 120 Hz, and 200 Hz stimulus presentation rates respectively with their 16-target system. They have also stated that the 200 Hz stimulus presentation rate was the most user-friendly one [12].

In order to further increase the performance of the c-VEP paradigm, some other studies have suggested new decoding algorithms. Spüler et al have used one-class support vector machines (OCSVM) with an adaptation based on error-related potentials for target identification. They have achieved an average ITR of 144 bits min⁻¹ with their 32-target system [8]. Aminaka et al have also stated that an SVM with a linear kernel provides more accurate results than the traditional algorithms [13]. In a recent study, Dimitriadis and Marimpis have presented a new approach for a BCI system in which they implemented a cross-frequency coupling (CFC) estimator, namely phase-to-amplitude coupling (PAC) [14]. They have used three different publicly available BCI data sets, one of which comes from the study of Wittevrongel et al [9]. With this dataset, they have outperformed the results of Wittevrongel et al with an average ITR of 124.40 ± 11.68 bits min⁻¹ and 233.99 ± 15.75 bits min⁻¹ for 60 Hz and 120 Hz stimulus presentation rates respectively [14].

Additionally, some studies have investigated the stimulus properties. Aminaka et al have investigated the effect of the color of the stimuli in a c-VEP based BCI [15] while Isaksen et al have studied the optimal pseudorandom sequence selection [16]. Wei et al [10] have conducted experiments on a system with a low number of targets to show that stimulus size, stimulus color, proximity of the stimuli, length of the modulation sequence, and time lag between the codes of two adjacent stimuli are important parameters which affect the performance of the system.

Speed is a key factor for BCI applications to become more practical and more widely accepted. Since the display time of a single bit in the pseudorandom coding sequence is limited by the refresh rate of the monitor, the overall speed of a c-VEP based BCI depends in part on this rate. To the best of our knowledge, there are only a few c-VEP based BCI studies, as mentioned above, with stimulus presentation rates above the traditional 60 Hz [9, 12, 14], and they are confined to investigating the overall performance (ITR and accuracy) of the system. In contrast to these studies, our goal is not only to investigate the overall performance of a c-VEP based speller BCI but also to report the effects of high stimulus presentation rates on the characteristics of c-VEP responses. To this end, we have conducted three different experiments with refresh rates of 60 Hz, 120 Hz, and 240 Hz and have identified the salient properties of c-VEP responses in these experiments.

2. Methods

2.1. Experimental design

Twenty healthy subjects (no neurological or psychiatric disorders; denoted as S1, S2, ..., S20) with a mean age of 22.5 years (10 males, 10 females) participated in the experiments. All subjects had normal or corrected-to-normal vision. Prior to the experiments, all subjects signed an informed consent form approved by the ethical committee of Bilkent University. The form explained the objectives of the study and stated that flicker stimulation may cause epileptic seizures.

A speller BCI was designed in MATLAB (The MathWorks, Inc., Natick, MA, USA), using Psychtoolbox [17, 18]. The visual stimuli were presented on a 25-inch LCD monitor which had a maximum refresh rate of 240 Hz and a resolution of 1920 × 1080 pixels (Dell Alienware AW2518HF). The participants were seated in front of the screen at a distance of about 60 cm. There were 36 symbols (letters/numbers) which were placed as a 6 × 6 matrix on the screen (see figure 1). Ubuntu 16.04 with a low-latency kernel was chosen as the operating system of the computer on which the stimulus was presented in order to provide accurate timing. This computer is called the stimulus computer in the rest of this document. Particular attention was paid to Psychtoolbox's missed flip counter to make sure that the number of dropped frames was zero during all stages of an experiment.

**Figure 1.** Two frames were captured during the experiment for better understanding of our speller BCI. (a) A single frame was captured while each cell was flickering according to its own sequence. Each cell is either green if bit value of its sequence at that time is '1' or blue if it is '0'. (b) Letter I was highlighted during an online session (test stage) in order to give feedback to the user. Also, letter I and the previously identified letters were displayed at the bottom left corner. At the training stage, reference target was also highlighted in the same way at the beginning of the experiment.

The m-sequence is nearly orthogonal to its time lagged versions and thus, is generally chosen as a pseudorandom coding sequence in traditional c-VEP based BCIs. Therefore, we have selected an m-sequence with a length of 127 in our design:

$$\begin{array}{l}101000111100100010110011101010011111010000111000100100110110\\ 1011011110110001101001011101110011001010101111111000000100000\\ 110000\end{array}$$

This code was assigned to letter A, which is at the upper left corner of the symbol matrix. By introducing successive cyclic time lags of 3 bits (three frames) to this code, a total of 36 codes were obtained and these were assigned to the symbols in alphanumeric order. Each cell of this matrix is a 180 × 90 pixel (5.18 cm × 2.6 cm) rectangle with a letter/number positioned at its center. These cells flicker (green if the bit value is '1', blue if the bit value is '0') according to their own 127-bit m-sequence (see figure 1(a)). Parra et al have stated that green/blue flickering is the safest combination for avoiding photosensitive epilepsy [19]. The display time of a single bit in this sequence depends on the refresh rate of the screen. For instance, for a 60 Hz monitor, a single bit is displayed for 16.67 ms. At the beginning of every bit (at each new frame), a marker pulse was also transmitted from the stimulus computer to the EEG amplifier.

As previously mentioned, we have conducted three sets of experiments with different refresh rates. The first set, conducted at a 60 Hz refresh rate, is called E1 in the rest of this document. Similarly, the sets of experiments with 120 Hz and 240 Hz refresh rates are called E2 and E3 respectively.

In order to measure the actual refresh rates of our monitor, we designed the following testing procedure. A rectangular section of the screen (same size as the target cells) was switched between black and white at each frame. A PIN photodiode circuit, explained in our previous study [20], was used for measurement. Observing the output of this circuit on an oscilloscope, we found that the actual refresh rates were 59.94 Hz, 119.98 Hz, and 239.76 Hz for the selected refresh rates of 60 Hz, 120 Hz, and 240 Hz respectively.
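The code assignment described above can be sketched in a few lines of Python. This is only an illustration, not the authors' MATLAB/Psychtoolbox implementation: the symbol ordering (`SYMBOLS`, letters followed by digits), the direction of the cyclic lag, and the helper names `build_codebook` and `circular_autocorrelation` are assumptions introduced here.

```python
import numpy as np

# 127-bit m-sequence quoted in the text (assigned to letter A).
M_SEQUENCE = (
    "101000111100100010110011101010011111010000111000100100110110"
    "1011011110110001101001011101110011001010101111111000000100000"
    "110000"
)

# Assumed alphanumeric ordering: letters first, then digits (hypothetical).
SYMBOLS = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"
LAG_BITS = 3  # cyclic lag between consecutive targets, from the text


def build_codebook(sequence=M_SEQUENCE, symbols=SYMBOLS, lag_bits=LAG_BITS):
    """Return {symbol: code}; code k is the base code cyclically lagged by k*lag_bits bits."""
    base = np.array([int(b) for b in sequence], dtype=int)
    # A positive np.roll shift delays the sequence; the sign is an assumption.
    return {s: np.roll(base, k * lag_bits) for k, s in enumerate(symbols)}


def circular_autocorrelation(code):
    """Normalised circular autocorrelation of the bipolar (+1/-1) version of a code."""
    x = 2 * code - 1
    return np.array([np.dot(x, np.roll(x, lag)) for lag in range(len(x))]) / len(x)


codebook = build_codebook()
autocorr = circular_autocorrelation(codebook["A"])

# Display time of one bit (one frame) at each nominal refresh rate, in ms.
bit_duration_ms = {rate: 1000.0 / rate for rate in (60, 120, 240)}
```

Because the sequence length (127) is prime and the sequence is not constant, all of its cyclic shifts are distinct, so the 36 codes never coincide. Plotting `autocorr` is a quick way to inspect how close the off-peak values are to zero.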

2.2. Data acquisition

The EEG was recorded with a Brain Products V-Amp 16-channel EEG amplifier along with actiCAP, a standard 10–20 EEG cap with 32 electrode sites (Brain Products, Gilching, Germany). EEG was recorded from electrodes O1, Oz, O2, Pz, P3, P4, P7, and P8, referenced to the FCz electrode. The ground electrode was placed over the nasion, on the forehead. Active wet electrodes were used and their impedances were measured using ImpBox (Brain Products, Gilching, Germany). Electrode impedances were kept below 10 kΩ. The sampling rate was 2 kHz.

BCI2000 [21] and FieldTrip [22] were used together to record the EEG and marker pulses simultaneously and to transmit these signals to a MATLAB session on another computer (recorder computer) in real time. Pre-processing and classification (target identification) were done in MATLAB on this computer.

2.3. Data pre-processing and classification

There were two stages for each experiment, namely (i) the training stage and (ii) the test stage. At the training stage, subjects were asked to fixate their gaze on the reference target, letter A. The coding sequence was repeated 100 times and a raw signal ${X}_{k\times s}$ was recorded, where $k=8$ is the number of channels and $s=100\times n$ is the number of samples, with $n=\text{sampling rate}\times \text{duration of one sequence}$. For E1, E2, and E3, $n$ equals 4233, 2117, and 1058 respectively. Another channel of size $1\times s$, which contains the marker pulses, was recorded simultaneously. A 4–121 Hz band-pass filter and a 50 Hz notch filter were applied to each row of ${X}_{k\times s}$ to eliminate the 50 Hz interference, the DC offset, and slow components due to head movements. By utilizing the marker pulses, each EEG channel was averaged over the 100 repetitions of the coding sequence and a multichannel averaged EEG signal was obtained.

Canonical correlation analysis (CCA) was adopted to find spatial weighting coefficients as described in [23]. CCA finds a between-channels linear combination of the unaveraged (raw) EEG signals and another between-channels linear combination of the averaged EEG signals. The coefficients of the combinations are found such that the correlation between the combined unaveraged signal and the combined averaged signal is maximum. Multiplying the multichannel averaged EEG data with the obtained spatial weighting coefficients, the reference template $({T}_{{0}_{1\times n}})$ was calculated. Templates for all other symbols $({T}_{{1}_{1\times n}},\ldots ,{T}_{{35}_{1\times n}})$ were generated by circularly shifting each consecutive template by a time lag of 3 bits, starting with the reference template $({T}_{{0}_{1\times n}}).$

The training stage was the same for all three refresh rates. Since the time required to display a 127-bit code is $\tfrac{127}{60}=2.1167$ s for a 60 Hz monitor, the training time for E1 (100 repetitions) was 212 s. Similarly, for E2 and E3, training times were 106 s and 53 s respectively.
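The training stage can be sketched as follows. This is a minimal NumPy illustration under several assumptions, not the authors' MATLAB code: CCA is computed with a QR/SVD formulation, epochs are assumed to be contiguous and of equal length (the study aligns them with marker pulses), the weights applied to the averaged data are taken to be the CCA weights of the averaged signal, and the data are synthetic. The names `samples_per_sequence`, `cca_first_pair`, and `train_templates` are hypothetical.

```python
import numpy as np

FS = 2000        # sampling rate in Hz (from the text)
CODE_BITS = 127  # length of the m-sequence
LAG_BITS = 3     # lag between consecutive targets, in bits


def samples_per_sequence(refresh_rate, fs=FS, bits=CODE_BITS):
    """Number of samples n in one repetition of the code."""
    return int(round(fs * bits / refresh_rate))


def cca_first_pair(X, Y):
    """First canonical weight pair for X, Y given as (channels x samples)."""
    Xc = (X - X.mean(axis=1, keepdims=True)).T  # samples x channels, centred
    Yc = (Y - Y.mean(axis=1, keepdims=True)).T
    Qx, Rx = np.linalg.qr(Xc)
    Qy, Ry = np.linalg.qr(Yc)
    U, S, Vt = np.linalg.svd(Qx.T @ Qy)
    wx = np.linalg.solve(Rx, U[:, 0])   # weights for the raw signal
    wy = np.linalg.solve(Ry, Vt[0, :])  # weights for the averaged signal
    return wx, wy, S[0]                 # S[0] is the first canonical correlation


def train_templates(X, n, refresh_rate, n_targets=36):
    """Return raw-signal weights, the 36 templates, and the canonical correlation."""
    k, s = X.shape
    n_rep = s // n
    X = X[:, : n_rep * n]
    avg = X.reshape(k, n_rep, n).mean(axis=1)   # k x n averaged EEG
    Y = np.tile(avg, (1, n_rep))                # average repeated to match X
    wx, wy, rho = cca_first_pair(X, Y)
    t0 = wy @ avg                               # reference template, length n
    lag = int(round(LAG_BITS * FS / refresh_rate))  # 3 bits in samples
    templates = np.stack([np.roll(t0, i * lag) for i in range(n_targets)])
    return wx, templates, rho


# Synthetic demonstration at 240 Hz with 20 repetitions (toy data, not EEG).
rng = np.random.default_rng(0)
rate = 240
n = samples_per_sequence(rate)
response = np.sin(2 * np.pi * 15 * np.arange(n) / FS)  # toy evoked response
mixing = np.linspace(0.5, 1.5, 8)[:, None]             # toy channel gains
X = np.tile(mixing * response, (1, 20)) + rng.normal(size=(8, 20 * n))
wx, templates, rho = train_templates(X, n, rate)
```

With a 2 kHz sampling rate, a 3-bit lag corresponds to 100, 50, and 25 samples at 60 Hz, 120 Hz, and 240 Hz, so the circular shifts between templates are whole numbers of samples in all three experiments.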

At the test stage, subjects were required to write a sequence of 20 symbols. The coding sequence was repeated once, twice, and four times for E1, E2, and E3 respectively. The reason behind multiple repetitions of the coding sequence for E2 and E3 is explained in the Conclusions and Discussion section. Once the repetitions were complete, the stimulus computer raised a flag over TCP to indicate that the data were ready for target identification. The recorder computer then decided which symbol the subject had focused on. The band-pass and notch filters which were used at the training stage were also applied. By utilizing the marker pulses, each EEG channel was averaged over 1 (i.e. no averaging), 2, or 4 coding sequence repetitions for E1, E2, and E3 respectively and a multichannel averaged EEG signal ${S}_{8\times n}$ was obtained. This signal was multiplied with the spatial weighting coefficients obtained in the training stage for the unaveraged signals, and a spatially filtered signal of size $1\times n$ was obtained. Pearson's correlation coefficients were calculated between the templates found in the training stage $({T}_{{0}_{1\times n}},\ldots ,{T}_{{35}_{1\times n}})$ and this spatially filtered signal. The symbol of the template with the highest correlation was selected as the target symbol on which the subject had fixated his/her gaze. This information was transmitted over TCP from the recorder computer to the stimulus computer. Then, to give feedback to the subject and also to give him/her time to switch his/her gaze onto the next symbol, the cell that contains the selected symbol was highlighted in pink for 1 s and the symbol was also displayed at the bottom left corner of the screen (see figure 1(b)). This procedure continued for 20 symbols. The time required for the system to decide which symbol the subject looked at was 3.13 s (including the stimulation length of a sequence, classification, and the time needed to switch the gaze) and was the same for all three refresh rates.
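The target identification step described above amounts to a spatial filter followed by a maximum-correlation decision. The sketch below illustrates it on synthetic data; the symbol ordering, the function name `classify_trial`, and the toy templates and channel gains are assumptions made for this example and are not taken from the study's code.

```python
import numpy as np

# Assumed symbol ordering: letters first, then digits (hypothetical).
SYMBOLS = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"


def classify_trial(S, wx, templates, symbols=SYMBOLS):
    """Pick the symbol whose template correlates best with the filtered trial.

    S         : averaged test EEG, shape (channels, n)
    wx        : spatial weights for the unaveraged signal, shape (channels,)
    templates : one template per symbol, shape (len(symbols), n)
    """
    y = wx @ S                                   # spatially filtered signal
    yc = y - y.mean()
    tc = templates - templates.mean(axis=1, keepdims=True)
    # Pearson correlation between the trial and every template.
    r = (tc @ yc) / (np.linalg.norm(tc, axis=1) * np.linalg.norm(yc))
    return symbols[int(np.argmax(r))], r


# Synthetic demonstration for a 60 Hz style layout (n = 4233, lag = 100 samples).
rng = np.random.default_rng(1)
n, lag = 4233, 100
t0 = rng.normal(size=n)                          # toy reference template
templates = np.stack([np.roll(t0, i * lag) for i in range(36)])
gains = np.ones(8)                               # toy channel gains
wx = gains / (gains @ gains)                     # toy spatial filter
S = np.outer(gains, templates[13]) + rng.normal(size=(8, n))  # gaze on 14th target
symbol, r = classify_trial(S, wx, templates)
```

In this toy case the template is white noise, so the shifted templates are nearly uncorrelated and the decision is easy. The Results section shows that real templates, especially at 240 Hz, are far from this ideal.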

2.4. Performance evaluation and data analysis

For each experiment, ITR and classification accuracy values were calculated. We used the standard ITR formula [1] to evaluate the performance of our system:

$$\mathrm{ITR}=\frac{60}{T}\times \left({\mathrm{log}}_{2}N+P\,{\mathrm{log}}_{2}P+\left(1-P\right){\mathrm{log}}_{2}\frac{1-P}{N-1}\right) \tag{1}$$

where $N$ is the number of selectable targets, which is 36, $P$ is the accuracy of target identification, calculated as the number of correctly classified symbols divided by 20 (the length of the symbol sequence), and $T$ is the time required to make a selection (in seconds), which is 3.13 s.
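Equation (1) is straightforward to evaluate numerically. The function below is a small sketch; the name `itr_bits_per_min` and the explicit handling of the limiting cases $P=1$ and $P=0$ (where a term of the form $0\cdot \log_2 0$ is taken as zero) are choices made here, not details given in the text.

```python
import math


def itr_bits_per_min(accuracy, n_targets=36, selection_time_s=3.13):
    """Information transfer rate of equation (1), in bits per minute."""
    p, n = accuracy, n_targets
    bits = math.log2(n)
    if 0.0 < p < 1.0:
        bits += p * math.log2(p) + (1 - p) * math.log2((1 - p) / (n - 1))
    elif p == 0.0:
        # P*log2(P) -> 0 as P -> 0; only the error term remains.
        bits += math.log2(1 / (n - 1))
    # For p == 1.0 both correction terms vanish in the limit.
    return 60.0 / selection_time_s * bits


itr_perfect = itr_bits_per_min(1.0)       # upper bound for N = 36, T = 3.13 s
itr_chance = itr_bits_per_min(1.0 / 36)   # chance-level accuracy
```

With $N=36$ and $T=3.13$ s, perfect accuracy gives an upper bound of roughly 99 bits min⁻¹, and chance-level accuracy ($P=1/N$) gives zero.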

Power spectral density (PSD) estimates of the reference templates of all subjects for E1, E2 and E3 were calculated using the periodogram function of MATLAB to observe the change in the frequency content of c-VEP responses for different refresh rates.
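The Results section summarises each spectrum by the frequency below which 99% of the cumulative power lies. A NumPy sketch of such a measure is given below. It is a stand-in for, not a reproduction of, the MATLAB analysis: a plain rectangular-window, one-sided periodogram is assumed, and the function names are hypothetical.

```python
import numpy as np


def periodogram(x, fs):
    """One-sided periodogram PSD estimate with a rectangular window."""
    x = np.asarray(x, dtype=float)
    n = x.size
    psd = np.abs(np.fft.rfft(x)) ** 2 / (fs * n)
    # Fold negative frequencies into the one-sided estimate
    # (everything except DC, and except Nyquist when n is even).
    psd[1:] *= 2
    if n % 2 == 0:
        psd[-1] /= 2
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    return freqs, psd


def cumulative_power_frequency(x, fs, fraction=0.99):
    """Lowest frequency at which the cumulative power reaches `fraction` of the total."""
    freqs, psd = periodogram(x, fs)
    cumulative = np.cumsum(psd) / np.sum(psd)
    return freqs[np.searchsorted(cumulative, fraction)]


# Toy check: a 1 s, 15 Hz sinusoid sampled at 2 kHz has all its power at 15 Hz.
fs = 2000
t = np.arange(fs) / fs
f99 = cumulative_power_frequency(np.sin(2 * np.pi * 15 * t), fs)
```

For real templates, whose length is not a whole number of oscillation periods, spectral leakage from the rectangular window will push this frequency somewhat higher than in the idealised check.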

We have used principal component analysis (PCA) to observe how many distinguishable responses could be evoked with a 127-bit m-sequence at the three different refresh rates. For each subject, we have constructed a data matrix ${D}_{n\times 127}$ by circularly shifting the reference template by a time lag of $j$ bits, where $j=0,1,2,\ldots ,126.$ MATLAB's pca function was used to obtain the principal components of this data matrix.
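The PCA step can be sketched with an SVD of the column-centred data matrix. The code below is an illustration under stated assumptions: the rows of $D$ are treated as observations and its columns as variables, a lag of $j$ bits is rounded to the nearest whole sample (the text does not say how fractional lags were handled), and the function names are hypothetical.

```python
import numpy as np


def shifted_template_matrix(template, n_bits=127):
    """Build D (n x n_bits): column j is the template circularly lagged by j bits."""
    n = template.size
    # Lag of j bits expressed in samples; rounding is an assumption of this sketch.
    shifts = [int(round(j * n / n_bits)) for j in range(n_bits)]
    return np.column_stack([np.roll(template, s) for s in shifts])


def explained_variance_percent(D):
    """Percent variance explained by each principal component, in descending order."""
    Dc = D - D.mean(axis=0, keepdims=True)       # centre each column (variable)
    singular_values = np.linalg.svd(Dc, compute_uv=False)
    variances = singular_values ** 2
    return 100.0 * variances / variances.sum()


def n_components_for(D, threshold=95.0):
    """Number of leading components needed to reach `threshold` percent variance."""
    cumulative = np.cumsum(explained_variance_percent(D))
    return int(np.searchsorted(cumulative, threshold) + 1)


# Toy templates (10 samples per bit): a pure sinusoid versus white noise.
n = 1270
sinusoid = np.sin(2 * np.pi * 7 * np.arange(n) / n)   # 7 full cycles
noise = np.random.default_rng(2).normal(size=n)
k_sinusoid = n_components_for(shifted_template_matrix(sinusoid))
k_noise = n_components_for(shifted_template_matrix(noise))
```

The two toy cases bracket the behaviour of interest: all shifts of a pure sinusoid lie in a two-dimensional subspace, whereas shifts of a broadband template need many components. A template that becomes more sinusoidal therefore yields fewer components at the 95% level.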

A questionnaire was also given to the subjects after the experiments, asking at which refresh rate they felt most comfortable and whether they felt visual fatigue at any refresh rate.

3. Results

Figure 2 shows the ITR values and accuracies for each experiment as box plots. Average ITR and accuracy values are 85.87 bits min⁻¹ and 92% for E1, 94.21 bits min⁻¹ and 97% for E2, and 78.65 bits min⁻¹ and 87% for E3 respectively. From these results, we can state that the overall performance of E2 is better than that of the other two experiments. In order to test for statistical differences between experiments, Friedman's test, which is a non-parametric alternative to repeated measures ANOVA, was conducted on the accuracy values of the three experiments using JASP (JASP Team, Amsterdam, The Netherlands). We preferred Friedman's test over repeated measures ANOVA since our data do not satisfy the assumptions of normality in each group and of sphericity, which repeated measures ANOVA requires and Friedman's test does not. Since the $N$ and $T$ values in (1) are the same for all experiments, ITR depends only on the accuracy value. Although the relation between ITR and accuracy is nonlinear, Friedman's test operates on the ranks of the values across experiments, and therefore the statistical results do not change when the test is conducted on the ITR values. Friedman's test shows that refresh rate has a significant effect on the accuracy values (${\chi }^{2}\left(2\right)=14.16,$ $p\lt 0.001$). Conover's post-hoc pairwise comparisons show that the accuracy values of E2 are significantly different from the accuracy values of E1 $\left(p=0.003\right)$ and of E3 $\left(p\lt 0.001\right).$ Also, the lowest coefficient of variation and interquartile range values belong to E2 for both ITR and accuracy, which means that the results obtained at the 120 Hz refresh rate are the most consistent across subjects.

As for the results of the questionnaire, 11 out of 20 subjects stated that E3 was the most comfortable experiment, 5 subjects chose E2, and 4 subjects chose E1. Subjects did not report visual fatigue during the experiments for any of the refresh rates.

**Figure 2.** ITR and accuracy values for each experiment as box plots. Recall that E1 is the experiment with 60 Hz monitor refresh rate, E2 is the experiment with 120 Hz monitor refresh rate and E3 is the experiment with 240 Hz refresh rate. Each box represents the inter-quartile range. Horizontal line in each box represents the median and the ⊕ sign represents the mean of the data for each group.

In figure 3, the reference templates and the pseudorandom coding sequences are given for E1, E2, and E3 for subject S6 as an example. There seems to be no pattern resemblance between the coding sequences and the templates. Also, in figure 4, the PSD estimates of the reference templates for all subjects are shown. It appears that in all three cases the power spectra of the templates are similarly band-limited. Specifically, the frequencies below which 99% of the cumulative power lies were, on average, 28 ± 9 (s.d.) Hz, 28 ± 10 (s.d.) Hz, and 36 ± 12 (s.d.) Hz for E1, E2, and E3 respectively. This is so despite the fact that the bandwidth of the input signal increases significantly as the refresh rate is increased, the 3-dB cut-off frequencies being 30 Hz, 60 Hz, and 120 Hz for E1, E2, and E3 respectively. Additionally, the spectral densities concentrate within several frequency intervals and these frequency intervals vary between experiments. For E3 in particular, a single broad peak at 15 Hz appears in the spectrum for almost all subjects.

**Figure 3.** The pseudorandom coding sequences (red) and reference template of subject S6 (blue) obtained from E1, E2, and E3 from top to bottom. Left y-axis shows the microvolt values of the templates and the right y-axis shows the binary values of the pseudorandom coding sequences. Note that the duration of one code sequence is different at different refresh rates.

**Figure 4.** PSD estimates of reference templates of all subjects for E1, E2, and E3 from top to bottom. The results of each subject have their own colour, which is given in the legend of each graph. Note that the c-VEP responses are band-limited to 28 ± 9 (s.d.) Hz, 28 ± 10 (s.d.) Hz, and 36 ± 12 (s.d.) Hz for E1, E2, and E3 respectively. For this reason, the spectra are drawn up to 50 Hz.

Since target identification depends on the correlation coefficients, we further investigated the correlation between the EEG recorded at the test stage for a single symbol and the 36 templates. Figure 5 shows, for all three experiments, the correlation coefficients between the 36 templates and the EEG recorded during the test stage when subject S3 fixated his/her gaze on the letter B. For all refresh rates, the highest correlation coefficient corresponds to the 2nd template (letter B). It is observed that the variation of the correlation coefficients is almost periodic with respect to the time lag, especially in E3. This is not surprising because the reference template for the 240 Hz refresh rate approximates a single sinusoid with a certain frequency and thus the autocorrelation function of the reference template is also periodic with the same frequency. As a consequence of this periodicity, there are high peaks at certain time lags, as seen in figure 5, which may result in misclassification. In fact, we checked the misclassified symbols for E3 and found that they corresponded to those peaks. For example, figure 6 shows the correlation coefficients between the 36 templates and the EEG recorded during the test stage when subject S1 fixated his/her gaze on the letter N (top graph) and the letter X (bottom graph) for E3. It can be observed that the misclassifications occur at symbols located on the peaks: letter C (3rd target) was selected as the target symbol instead of letter N (14th target), and letter M (13th target) was selected instead of letter X (24th target). Similar behaviour of the correlation coefficients was observed for all subjects and all symbols.
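The periodicity argument can be illustrated numerically with an idealised template. The sketch below assumes that the 240 Hz reference template is exactly a 15 Hz sinusoid of one code length (1058 samples at 2 kHz), which is a simplification of the measured templates, and correlates it with its 36 circularly shifted copies. The function name `corr_with_shifts` is hypothetical.

```python
import numpy as np

FS = 2000                                     # sampling rate in Hz
REFRESH = 240                                 # refresh rate of E3 in Hz
n = int(round(FS * 127 / REFRESH))            # samples per code cycle (1058)
lag_samples = int(round(3 * FS / REFRESH))    # 3 bits = 25 samples = 12.5 ms

t = np.arange(n) / FS
template = np.sin(2 * np.pi * 15.0 * t)       # idealised 15 Hz template (assumption)


def corr_with_shifts(x, lag, n_targets=36):
    """Pearson correlation between x and each of its circularly shifted copies."""
    return np.array(
        [np.corrcoef(x, np.roll(x, k * lag))[0, 1] for k in range(n_targets)]
    )


r = corr_with_shifts(template, lag_samples)

# One 15 Hz period expressed in target steps of 12.5 ms each.
period_in_targets = (1.0 / 15.0) / (lag_samples / FS)
```

One period of a 15 Hz oscillation spans about 5.33 target steps, so high correlations recur every five to six targets. Both misclassifications reported above are offset by 11 targets, i.e. 137.5 ms, which is close to two periods (133.3 ms) and is therefore at least consistent with this idealised picture.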

**Figure 5.** Correlation coefficients between 36 templates and the recorded EEG, when subject S3 fixated his/her gaze on to the letter B on the screen at the online experiment (test stage), for E1, E2 and E3 from top to bottom. Note that the x-axis is time lag, and each consecutive template has a time lag of 0.05 s, 0.025 s and 0.0125 s for 60 Hz, 120 Hz and 240 Hz refresh rate respectively.

**Figure 6.** Correlation coefficients between the 36 templates and the recorded EEG during test stage when S1 fixated his/her gaze on to the letter N and letter X for E3. Top graph shows that letter C (3rd target) was decided as the target symbol instead of letter N (14th target) and the bottom graph shows that letter M (13th target) was decided as the target symbol instead of letter X (24th target). Note that x-axis shows the target indices.

The results of PCA of the 127 templates obtained from the experimental data of subject S1 are given in figure 7, where the percent variance of each principal component is plotted in descending order. The percentage of total variance explained by the first principal component increases in going from the 60 Hz refresh rate to the 240 Hz refresh rate. Furthermore, beyond a certain principal component, the percentage of explained variance becomes smaller in going from E1 to E3. To quantify this behaviour, we have calculated the number of principal components which constitute 95% of the cumulative variance of the data. When all subjects were considered, the average number of principal components which constitute 95% of the cumulative variance was found to be 74 ± 7 (s.d.), 52 ± 10 (s.d.), and 32 ± 9 (s.d.) for E1, E2 and E3 respectively. Hence, it may be conjectured that 74, 52, and 32 distinguishable responses can be evoked with a 127-bit m-sequence in E1, E2, and E3 respectively.

**Figure 7.** Detailed view of the percent variances of each principal component to observe how many distinguishable responses could be evoked with a 127-bit length m-sequence for 60 Hz, 120 Hz and 240 Hz refresh rates. Data matrix, ${D}_{n\times 127}$ was constructed using the reference template obtained from the experimental data of subject S1. The graphs belong to E1, E2, and E3 from left to right.

4. Conclusions and discussion

The main aim of this study is to investigate the changes in the characteristics of c-VEP responses, as well as the changes in performance, depending on the stimulus presentation rate, by utilizing a traditional c-VEP based speller BCI. To our knowledge, alterations in the characteristics of c-VEP responses according to the stimulus presentation rate have never been investigated thoroughly before. Also, this study is the first study which utilizes a monitor with a maximum refresh rate of 240 Hz to investigate the effects of high stimulus presentation rates in c-VEP based BCIs.

To provide reliable target identification in a VEP based BCI, the responses obtained for different targets should be orthogonal to each other. The m-sequence and its time lagged versions are generally assigned to different targets in c-VEP based BCIs since the m-sequence is nearly orthogonal to its time lagged versions. Even though these lagged m-sequences, which are the inputs to the visual system, are nearly orthogonal, we have observed that the c-VEP responses obtained for different targets are not exactly orthogonal, as revealed by the high correlation coefficients between many of them. This is exemplified in figure 5, where the variation of the correlation coefficients is almost periodic with respect to the time lag, especially in E3, which indicates that the templates are not orthogonal to each other. In fact, for the 240 Hz refresh rate, the reference template approximates a single sinusoidal wave with a frequency of 15 Hz (see figure 4). Thus, the autocorrelation function of the reference template is also periodic with the same frequency, which also supports the non-orthogonality of the templates. Non-orthogonal responses to orthogonal inputs may be observed even in a linear system. In addition, the system that we are studying, that is the visual system, is nonlinear, as shown by both experimental and mathematical modelling studies [20,24–30]. In fact, the visual system has severe nonlinearities such as bifurcation, chaotic behaviour, and period doubling. Thus, it is not surprising that the observed c-VEP responses for different targets are not exactly orthogonal to each other.

PCA was applied to observe how many distinguishable responses could be evoked with a 127-bit m-sequence and our experimental procedure at the three different refresh rates. This analysis shows that as the refresh rate increases, the number of well distinguishable responses decreases. It can be deduced that misclassification of some symbols is likely when a 127-bit m-sequence is used with a 36-target system at a 240 Hz refresh rate. In fact, misclassifications due to highly correlated c-VEP responses are demonstrated in figure 6. Therefore, it can be stated that a 240 Hz refresh rate may degrade the performance of BCIs with a high number of targets. However, a 240 Hz refresh rate can be a suitable choice for a BCI with a low number of targets if the time lags between the codes assigned to different targets are selected so that the corresponding templates have low correlation (i.e. time lags that yield high correlation coefficients due to the periodicity of the templates should be avoided).

We have also observed that the dependence of the PSD of the reference templates on the employed refresh rate does not seem to be that of a simple linear system. The frequency content of the c-VEP responses is limited to 28 ± 9 (s.d.) Hz, 28 ± 10 (s.d.) Hz, and 36 ± 12 (s.d.) Hz for E1, E2, and E3 respectively (see figure 4). However, as the refresh rate increases, the bandwidth of the input signal increases significantly, with a 3-dB cut-off frequency of 30 Hz, 60 Hz, and 120 Hz for E1, E2, and E3 respectively. Also, the overall amplitude of the input PSD at frequencies within this bandwidth decreases as the refresh rate is increased. Hence, if the system were linear, the PSD estimates of the reference templates at frequencies below 30 Hz should decrease in amplitude and higher frequency components should appear in the spectrum as the refresh rate is increased. However, in our experiments, although there is a slight increase in frequency content with increased refresh rate, the decrease in magnitude within the bandwidth is not observed. In addition, the spectral densities concentrate within several frequency intervals, which are even different for different refresh rates. For E3 in particular, a single broad peak at 15 Hz appears in the spectrum for almost all subjects. We believe that the observed alterations in the frequency content of c-VEP responses cannot be explained simply by band-limited behaviour and that the severe nonlinearities mentioned above may also play a role.

Our experimental results indicate that the average performance at 120 Hz is significantly higher and more consistent than in the other two experiments (see figure 2). Similarly, Wittevrongel et al stated that using a 120 Hz refresh rate results in higher performance than the traditional 60 Hz stimulus presentation rate [9]. Gembler et al compared the performance of three different refresh rates (60 Hz, 120 Hz, and 200 Hz) and reported very similar performance across refresh rates, with 120 Hz being the highest [12]. On the other hand, increasing the refresh rate from 60 Hz to 240 Hz drastically shortens the time required for training, from 212 s to 53 s. Also, 11 out of 20 subjects stated that E3 was the most comfortable experiment in terms of visual comfort and practicality, while 5 subjects chose E2 and 4 subjects chose E1. We can state that most of the participants prefer the 240 Hz refresh rate. The results of the questionnaire in the study of Gembler et al also indicate that participants found 200 Hz to be the most user-friendly and the least annoying refresh rate [12]. Therefore, with these advantages of the 240 Hz refresh rate, one may argue that it may be a preferable refresh rate, provided that a low number of targets is adopted as explained above.

As mentioned in the Data Pre-Processing and Classification section, the coding sequence was repeated 2 times (cycles) for E2 and 4 times (cycles) for E3 at the test stage of the experiments. The recorded EEG was then averaged over the 2 cycles for E2 and the 4 cycles for E3. To justify the multiple repetitions of the coding sequence for E2 and E3, we performed an offline performance analysis and calculated ITR and classification accuracy values using the single responses recorded for each cycle at the 120 Hz and 240 Hz refresh rates. Figure 8 shows the ITR values and accuracies calculated from the single responses for each experiment as box plots. The ITR values were calculated as in (1), where $T=2.08\,\mathrm{s}$ and $3.13\,\mathrm{s}$ for the 1st and the 2nd cycle of E2 respectively. Similarly, $T=1.54\,\mathrm{s}$, $2.08\,\mathrm{s}$, $2.6\,\mathrm{s}$, and $3.13\,\mathrm{s}$ for the 1st, 2nd, 3rd, and 4th cycles of E3 respectively. The first observation that draws attention is that the mean accuracy obtained from the 1st cycle is significantly lower than the accuracy obtained from the subsequent single cycles ($p<0.001$, Friedman's test and Conover's post-hoc pairwise comparisons) for both E2 and E3. The decrease in accuracy in the 1st cycle may be due to insufficient time being available for gaze shifting, but this needs to be investigated in further experimental studies. One may argue that the responses obtained from the 1st cycle need not be utilized. Therefore, for E3, we made a performance analysis by averaging the responses obtained from the last three cycles and calculated the mean ITR and the mean classification accuracy values as 77.75 bits min⁻¹ and 86% respectively. These values are very close to what we have observed in figure 2, in which the ITR and accuracy values were obtained by averaging all 4 cycles.
In fact, the mean accuracy value obtained by averaging over the last three cycles is not significantly different from the mean accuracy value obtained by averaging over all 4 cycles ($p=0.772$). Similarly, ITR values for 3-cycle and 4-cycle averaging are not significantly different. For E2, the mean accuracy and ITR values obtained from only the 2nd cycle are not significantly different from the mean accuracy and ITR values obtained by averaging over 2 cycles ($p=0.666$). In summary, including the 1st cycle in the averaged data does not have a detrimental effect on the overall performance.
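
Equation (1) is not reproduced in this section. Assuming it is the widely used Wolpaw ITR definition, the dependence of ITR on the trial duration $T$ can be sketched as follows. The number of targets (`n_targets = 36`) and the fixed accuracy (`0.85`) are placeholder assumptions chosen only for illustration and are not values reported here; only the $T$ values are taken from the text above. The function name `wolpaw_itr` is hypothetical.

```python
import math


def wolpaw_itr(accuracy, n_targets, trial_s):
    """ITR in bits per minute under the Wolpaw definition (assumed form of (1)).

    accuracy  : classification accuracy P in [0, 1]
    n_targets : number of selectable targets N (N >= 2)
    trial_s   : time T in seconds needed for one selection
    """
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must lie in [0, 1]")
    if n_targets < 2 or trial_s <= 0:
        raise ValueError("need n_targets >= 2 and trial_s > 0")
    p = accuracy
    bits = math.log2(n_targets)
    if p > 0.0:
        bits += p * math.log2(p)
    if p < 1.0:
        bits += (1.0 - p) * math.log2((1.0 - p) / (n_targets - 1))
    return bits * 60.0 / trial_s


# T values for the four cycles of E3, as given in the text.
e3_durations_s = [1.54, 2.08, 2.6, 3.13]

# Placeholder values (assumptions, not results of this study).
n_targets = 36
accuracy = 0.85

for cycle, t in enumerate(e3_durations_s, start=1):
    itr = wolpaw_itr(accuracy, n_targets, t)
    print(f"cycle {cycle}: T = {t} s, ITR = {itr:.1f} bits/min")
```

At a fixed accuracy the ITR is inversely proportional to $T$, so a shorter selection time raises the ITR only if the accuracy does not drop as it does for the 1st cycle.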

**Figure 8.** ITR and accuracy values calculated from the single responses recorded for each cycle for E1, E2, and E3 as box plots. Recall that E1 is the experiment with 60 Hz monitor refresh rate, E2 is the experiment with 120 Hz monitor refresh rate and E3 is the experiment with 240 Hz refresh rate. As mentioned in the *Data Pre-Processing and Classification* section, the coding sequence was repeated for 2 times (cycles) for E2 and 4 times (cycles) for E3 at the test stage of the experiments. Each box represents the inter-quartile range. Horizontal line in each box represents the median and the ⊕ sign represents the mean of the data for each group.

To sum up, our experimental results and analyses show that the response of the visual system to the m-sequence is considerably affected as the refresh rate increases. We conclude that these effects should be taken into consideration in the design of a c-VEP based speller BCI. Considering all results of this study together, namely results related to performance, training time, subjects' comfort during the experiments, and the characteristic changes in c-VEP responses with increased refresh rate, it can be claimed that, with averaging, the 120 Hz refresh rate is the best choice for BCIs with a high number of targets, while the 240 Hz refresh rate is a suitable choice for BCIs with a low number of targets.

Acknowledgments

This work was supported by The Scientific and Technological Research Council of Turkey (TUBITAK) under Grant 116E153.
