Histogram of Gradient Orientations of Signal Plots Applied To P300 Detection

published: 05 July 2019

doi: 10.3389/fncom.2019.00043

Rodrigo Ramele*, Ana Julia Villar and Juan Miguel Santos

Computer Engineering Department, Centro de Inteligencia Computacional, Instituto Tecnológico de Buenos Aires (ITBA),
Buenos Aires, Argentina

The analysis of Electroencephalographic (EEG) signals is of utmost importance to
aid in the diagnosis of mental disease and to increase our understanding of the
brain. Traditionally, clinical EEG has been analyzed in terms of temporal waveforms,
looking at rhythms in spontaneous activity, subjectively identifying troughs and peaks
in Event-Related Potentials (ERP), or by studying graphoelements in pathological sleep
stages. Additionally, the discipline of Brain Computer Interfaces (BCI) requires new
methods to decode patterns from non-invasive EEG signals. This field is developing
alternative communication pathways to transmit volitional information from the Central
Nervous System. The technology could potentially enhance the quality of life of patients
affected by neurodegenerative disorders and other mental illnesses. This work mimics
what electroencephalographers have been doing clinically, visually inspecting, and
categorizing phenomena within the EEG by the extraction of features from images of
1. INTRODUCTION
Although recent advances in neuroimaging techniques, particularly radio-nuclear and
Ramele et al. Histogram of Gradient Orientations of Signal Plots

One noteworthy aspect of this novel communication channel is the ability to transmit information from the Central Nervous System (CNS) to a computer device and from there use that information to control a wheelchair (Carlson and del R. Millan, 2013), as input to a speller application (Guger et al., 2009), in a Virtual Reality environment (Lotte et al., 2013) or as an aiding tool in a rehabilitation procedure (Jure et al., 2016). The holy grail of BCI is to implement a new complete and alternative pathway to restore lost locomotion (Wolpaw and Wolpaw, 2012).
EEG signals are remarkably complex and have been characterized as a multichannel non-stationary stochastic process. Additionally, they have high variability between different subjects and even between different moments for the same subject, requiring adaptive and co-adaptive calibration and learning procedures (Clerc et al., 2016). This poses an outstanding challenge that must be overcome in order to extract information from raw EEG signals.
BCI has gained mainstream public awareness with worldwide challenge competitions like Cybathlon (Riener and Seward, 2014; Novak et al., 2018) and was even broadcast during the inauguration ceremony of the 2014 Soccer World Cup. New developments have cleared the high bar of leaving the laboratory and are starting to be used in real-world environments (Huggins et al., 2016; Guger et al., 2017). However, they still lack the necessary robustness, and their performance is well behind any other method of human computer interaction, including any kind of detection of residual muscular movement (Clerc et al., 2016).
A few works have explored the idea of exploiting the signal waveform to analyze the EEG signal. In Alvarado-González et al. (2016), an approach based on Slope Horizontal Chain Code is presented, whereas in Yamaguchi et al. (2009) a similar procedure was implemented based on Mathematical Morphological Analysis. The seminal work of Bandt-Pompe Permutation Entropy (Berger et al., 2017) also succinctly explores this idea as a basis to establish the ordinal patterns of time series. In Ramele et al. (2016), the authors introduce a method for classification of rhythmic EEG events like Visual Occipital Alpha Waves and Motor Imagery Rolandic Central µ Rhythms using the Histogram of Gradient Orientations of signal plots. Inspired by that work, we propose a novel application of the developed method to classify and describe transient events, particularly the P300 Event Related Potential. The proposed approach is based on the waveform analysis of the shape of the EEG signal. The signal is drawn on a bidimensional image plot, vector gradients of pixels around the plot are obtained, and with them, the histogram of their orientations is calculated. This histogram is a direct representation of the waveform of the signal. The method is built by mimicking what electroencephalographers have regularly been doing for almost a century, as described in Hartman (2005): visually inspecting raw signal plots.
This paper reports a method to: (1) describe a procedure to capture the shape of a waveform of an ERP component, the P300, using histograms of gradient orientations extracted from images of signal plots, and (2) outline the way in which this procedure can be used to implement a P300-Based BCI Speller application. Its validity is verified by offline processing of two datasets, one from ALS patients and another from healthy subjects.
This article unfolds as follows: section 2.1 is dedicated to explaining the Feature Extraction method based on Histogram of Gradient Orientations of the Signal Plot, section 2.1.1 shows the preprocessing pipeline, section 2.1.2 describes the image generation of the signal plot, section 2.1.3 presents the feature extraction procedure while section 2.1.4 introduces the Speller Matrix Letter Identification procedure. In section 2.2, the experimental protocol is expounded. Section 3 shows the results of applying the proposed technique. In the final section 4 we present our remarks, conclusions, and future work.

The P300 (Farwell and Donchin, 1988; Knuth et al., 2006) is a positive deflection of the EEG signal which occurs around 300 ms after the onset of a rare and deviant stimulus that the subject is expected to attend. It is produced under the oddball paradigm (Wolpaw and Wolpaw, 2012) and it is consistent across different subjects. It has a lower amplitude (±5 µV) compared to basal EEG activity, reaching a Signal to Noise Ratio (SNR) of around −15 dB, estimated as the amplitude of the P300 response signal divided by the standard deviation of the background EEG activity (Hu et al., 2010). This signal can be used to implement a speller application by means of a Speller Matrix (Farwell and Donchin, 1988). This matrix is composed of 6 rows and 6 columns of numbers and letters. The subject can focus on one character of the matrix. Figure 1 shows an example of the Speller Matrix used in the OpenVibe open source software (Renard et al., 2010), where the flashes of rows and columns provide the deviant stimulus required to elicit this physiological response. Each time a row or a column that contains the desired letter flashes, the corresponding synchronized EEG signal should also contain the P300 signature and by detecting it, the selected letter can be identified.

2.1. Feature Extraction From Signal Plots
In this section, the signal preprocessing, the method for generating images from signal plots, the feature extraction procedure and the Speller Matrix identification are described. Figure 2 shows a scheme of the entire process.

2.1.1. Preprocessing Pipeline
The data obtained by the capturing device is digitized and a multichannel EEG signal is constructed.
The 6 rows and 6 columns of the Speller Matrix are intensified providing the visual stimulus. The number of a row or column is a location. A sequence of 12 randomly permuted locations l forms an intensification sequence. The whole set of 12 intensifications is repeated ka times.
• Signal Enhancement: This stage consists of the enhancement of the SNR of the P300 pattern above the level of basal EEG. The pipeline starts by applying a notch filter to the raw digital signal, a 4th-order 10 Hz lowpass Butterworth filter and finally


FIGURE 1 | Example of the 6 × 6 Speller Matrix used in the study obtained from the OpenVibe software. Rows and columns flash in random permutations.

FIGURE 2 | For each column and row, an averaged, standardized and scaled signal x̃l (n, c) is obtained from the segments Sli corresponding to the ka intensification
sequences with 1 ≤ i ≤ ka and location l varying between 1 and 12. From the averaged signal, the image I(l,c) of the signal plot is generated and each descriptor is
computed. By comparing each descriptor against the set of templates, the P300 ERP can be detected, and finally the desired letter from the matrix can be inferred.

a decimation with a Finite Impulse Response (FIR) filter of order 30 from the original sampling frequency down to 16 Hz (Krusienski et al., 2006).
• Artifact Removal: For every complete sequence of 12 intensifications of 6 rows and 6 columns, a basic artifact elimination procedure is implemented by removing


the entire sequence when any signal exceeds ±70 µV.
• Segmentation: For each of the 12 intensifications of one intensification sequence, a segment $S_i^l$ of a window of $t_{max}$ seconds of the multichannel signal is extracted, starting from the stimulus onset, corresponding to each row/column intensification $l$ and to the intensification sequence $i$. As intensifications are permuted in a random order, the segments are rearranged so that those corresponding to row flickering are labeled 1–6, whereas those corresponding to column flickering are labeled 7–12. Two of these segments should contain the P300 ERP signature time-locked to the flashing stimulus, one for the row, and one for the column.
• Signal Averaging: The P300 ERP is deeply buried under basal EEG so the standard approach to identify it is by point-to-point averaging the time-locked stacked signal segments. Hence the values which are not related to, and not time-locked to, the onset of the stimulus are canceled out (Liang and Bougrain, 2008).
This last step determines the operation of any P300 Speller. In order to obtain an improved signal in terms of its SNR, repetitions of the sequence of row/column intensification are necessary. At the same time, the more repetitions are needed, the slower information can be transferred, so there is a trade-off that must be carefully determined.
The procedure to obtain the point-to-point averaged signal goes as follows:
1. Highlight randomly the rows and columns from the matrix. There is one row and one column that should match the letter selected by the subject.
2. Repeat step 1 $k_a$ times, obtaining for each location $1 \le l \le 12$ the segments $S_1^l(n, c), \ldots, S_{k_a}^l(n, c)$ of the EEG signal, where the variables $1 \le n \le n_{max}$ and $1 \le c \le C$ correspond to sample points and channel, respectively. The parameter $C$ is the number of available EEG channels whereas $n_{max} = F_s t_{max}$ is the segment length and $F_s$ is the sampling frequency. The parameter $k_a$ is the number of repetitions of intensifications and it is an input parameter of the algorithm.
3. Compute the Ensemble Average by

$$x^l(n, c) = \frac{1}{k_a} \sum_{i=1}^{k_a} S_i^l(n, c) \tag{1}$$

for $1 \le n \le n_{max}$ and for the channels $1 \le c \le C$. This provides an averaged signal $x^l(n, c)$ for the twelve locations $1 \le l \le 12$.

2.1.2. Signal Plotting
Averaged signal segments are standardized and scaled for $1 \le n \le n_{max}$ and $1 \le c \le C$ by

$$\tilde{x}^l(n, c) = \left\lfloor \gamma \, \frac{x^l(n, c) - \bar{x}^l(c)}{\hat{\sigma}^l(c)} \right\rfloor \tag{2}$$

where $\gamma > 0$ is an input parameter of the algorithm and it is related to the image scale. In addition, $x^l(n, c)$ is the point-to-point averaged multichannel EEG signal for the sample point $n$ and for channel $c$. Lastly,

$$\bar{x}^l(c) = \frac{1}{n_{max}} \sum_{n=1}^{n_{max}} x^l(n, c)$$

and

$$\hat{\sigma}^l(c) = \left( \frac{1}{n_{max} - 1} \sum_{n=1}^{n_{max}} \left[ x^l(n, c) - \bar{x}^l(c) \right]^2 \right)^{\frac{1}{2}}$$

are the mean and estimated standard deviation of $x^l(n, c)$, $1 \le n \le n_{max}$, for each channel $c$.
Consequently, a binary image $I^{(l,c)}$ is constructed according to

$$I^{(l,c)}(z_1, z_2) = \begin{cases} 255 & \text{if } z_1 = \gamma n \text{ and } z_2 = \tilde{x}^l(n, c) + z^l(c) \\ 0 & \text{otherwise} \end{cases}$$

with 255 being white and representing the signal's value location and 0 for black which is the background contrast, forming a black-and-white plot of the signal. Pixel arguments $(z_1, z_2) \in \mathbb{N} \times \mathbb{N}$ iterate over the width (based on the length of the signal segment) and height (based on the peak-to-peak amplitude) of the newly created image with $1 \le n \le n_{max}$ and $1 \le c \le C$. The value $z^l(c)$ is the image vertical position where the signal's zero value has to be situated in order to fit the entire signal within the image for each channel $c$:

$$z^l(c) = \left\lfloor \frac{\max_n \tilde{x}^l(n, c) - \min_n \tilde{x}^l(n, c)}{2} \right\rfloor - \left\lfloor \frac{\max_n \tilde{x}^l(n, c) + \min_n \tilde{x}^l(n, c)}{2} \right\rfloor \tag{4}$$

where the minimization and maximization are carried out for $n$ varying between $1 \le n \le n_{max}$, and $\lfloor \cdot \rfloor$ denotes rounding down to the nearest integer.
In order to complete the plot $I^{(l,c)}$ from the pixels, the Bresenham (Bresenham, 1965; Ramele et al., 2016) algorithm is used to interpolate straight lines between each pair of consecutive pixels.

2.1.3. Feature Extraction: Histogram of Gradient Orientations
The work of Hubel and Wiesel (1962) on how the visual cortex senses features was the inspiration for the development of an algorithm to identify and decode salient local information from image regions. The Scale Invariant Feature Transform (SIFT) is a Computer Vision method proposed by Lowe (2004) which is composed of two parts, the SIFT Detector and the SIFT Descriptor. The former is the procedure to identify relevant areas of an image whereas the latter is the procedure to describe and characterize a region of an image (i.e., a patch) by calculating a histogram of the angular orientations of pixel gradients. In order to characterize EEG signal waveforms, this work proposes an alternative to the SIFT Descriptor, the Histogram of Gradient Orientations (HIST) algorithm.
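As a rough illustration of sections 2.1.1 to 2.1.3, the sketch below averages the segments, applies the standardization of Equation (2), draws the binary signal plot and computes a 128-value histogram of gradient orientations around a keypoint. It is a simplified NumPy stand-in under stated assumptions, not the authors' MATLAB/VLFeat implementation: indices are 0-based, a plain DDA line interpolation replaces the Bresenham algorithm, image row 0 corresponds to the lowest signal value, and each gradient is hard-assigned to one of the 8 orientation bins instead of being spread by the trilinear interpolation of the HIST descriptor. All function names and the synthetic test signal are hypothetical.

```python
import numpy as np

def ensemble_average(segments):
    """Point-to-point average of k_a stacked segments, shape (k_a, n_max)."""
    return np.asarray(segments, dtype=float).mean(axis=0)

def standardize_and_scale(x, gamma):
    """Equation (2): floor(gamma * (x - mean) / std), std uses n_max - 1."""
    x = np.asarray(x, dtype=float)
    return np.floor(gamma * (x - x.mean()) / x.std(ddof=1)).astype(int)

def signal_to_image(x_tilde, gamma):
    """Binary plot: 255 on the trace, 0 on the background (0-based indices)."""
    hi, lo = int(x_tilde.max()), int(x_tilde.min())
    z = (hi - lo) // 2 - (hi + lo) // 2       # vertical position of the zero level
    cols = gamma * np.arange(len(x_tilde))    # z1 = gamma * n
    rows = x_tilde + z                        # z2 = x_tilde + z
    img = np.zeros((hi - lo + 1, int(cols[-1]) + 1), dtype=np.uint8)
    for k in range(len(cols) - 1):
        c0, r0 = int(cols[k]), int(rows[k])
        c1, r1 = int(cols[k + 1]), int(rows[k + 1])
        steps = max(abs(c1 - c0), abs(r1 - r0))
        # Assumption: simple DDA interpolation standing in for Bresenham.
        for t in range(steps + 1):
            r = round(r0 + t * (r1 - r0) / steps)
            c = round(c0 + t * (c1 - c0) / steps)
            img[r, c] = 255
    return img, z

def orientation_histogram(img, keypoint, s):
    """4 x 4 blocks of 3s pixels, 8 hard orientation bins each: 128 values."""
    gy, gx = np.gradient(img.astype(float))   # finite differences
    mag = np.hypot(gx, gy)
    ang = np.degrees(np.arctan2(gy, gx)) % 360.0
    kc, kr = keypoint                         # (column, row) of the patch centre
    half, block = 6 * s, 3 * s                # patch side is 12 s pixels
    hist = np.zeros((4, 4, 8))
    for r in range(img.shape[0]):
        for c in range(img.shape[1]):
            i, j = (r - kr + half) // block, (c - kc + half) // block
            if 0 <= i < 4 and 0 <= j < 4 and mag[r, c] > 0:
                hist[i, j, int(ang[r, c] // 45) % 8] += mag[r, c]
    norm = np.linalg.norm(hist)
    return (hist / norm if norm > 0 else hist).ravel()

# Synthetic example: k_a = 10 noisy repetitions of a 16-sample segment.
rng = np.random.default_rng(0)
n = np.arange(16)
segments = np.sin(2 * np.pi * n / 16) + 0.5 * rng.standard_normal((10, 16))
x_tilde = standardize_and_scale(ensemble_average(segments), gamma=4)
image, z = signal_to_image(x_tilde, gamma=4)
descriptor = orientation_histogram(image, keypoint=(35, z), s=3)
print(image.shape, descriptor.shape)
```

The hard binning keeps the sketch short; replacing it with interpolated weights would bring it closer to the descriptor described in this section.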


For each generated image $I^{(l,c)}$, a keypoint $p_k$ is placed on a pixel $(x_{p_k}, y_{p_k})$ over the image plot and a window around the keypoint is considered: a local image patch. Its size is $X_p \times X_p$ pixels and it is constructed by dividing the window in 16 blocks of size $3s$ each, where $s$ is the scale of the local patch and it is an input parameter of the algorithm. It is arranged in a 4 × 4 grid and the pixel $p_k$ is the patch center, thus $X_p = 12s$ pixels.
A local representation of the signal shape within the patch can be described by obtaining the gradient orientations on each of the 16 blocks $B_{i,j}$ with $0 \le i, j \le 3$ and creating a histogram of gradients. In order to calculate the histogram, the interval [0, 360] of possible angles is divided in 8 bins, each one of 45 degrees. Hence, for each spatial bin $0 \le i, j \le 3$, corresponding to the indexes of each block $B_{i,j}$, the orientations are accumulated in a 3-dimensional histogram $h$ through the following equation:

$$h(\theta, i, j) = 3s \sum_{p \in I^{(l,c)}} w_{ang}\left(\angle J(p) - \theta\right) \, w_{ij}\left(\frac{p - p_k}{3s}\right) \lVert J(p) \rVert \tag{5}$$

where $p$ is a pixel from the image $I^{(l,c)}$, $\theta$ is the angle bin with $\theta \in \{0, 45, 90, 135, 180, 225, 270, 315\}$, $\lVert J(p) \rVert$ is the Euclidean norm of the gradient vector at the pixel $p$, computed using finite differences, and $\angle J(p)$ is the angle of the gradient vector.
The contribution of each gradient vector to the histogram calculated by Equation (5) is balanced by a trilinear interpolation. The scalar $w_{ang}(\cdot)$ and vector $w_{ij}(\cdot)$ functions are linear interpolations used by Lowe (2004) and Vedaldi and Fulkerson (2010) to provide a weighting contribution to the eight adjacent bins in the tridimensional histogram. They are calculated as

$$w_{ij}(v) = w(v_x - x_i) \, w(v_y - y_j) \tag{6}$$

with $0 \le i, j \le 3$ and

$$w_{ang}(\alpha) = \sum_{r=-1}^{1} w\left(\frac{8\alpha}{2\pi} + 8r\right) \tag{7}$$

where $x_i$ and $y_j$ are the spatial bin centers located in $x_i, y_j \in \{-\frac{3}{2}, -\frac{1}{2}, \frac{1}{2}, \frac{3}{2}\}$ and the interpolating function $w(\cdot)$ is defined as $w(z) = \max(0, 1 - |z|)$. The function parameter $v = (v_x, v_y)$ is a vector variable and $\alpha$ a scalar variable. Vector $v$ holds pixel coordinates $(v_x, v_y)$ normalized between −2 and 2 and, combined with the function $w(z)$, it produces zero for every combination of $(i, j)$ except for the 4 adjacent spatial bins. On the other hand, $r$ is an integer that can vary freely in the set $\{-1, 0, 1\}$ and $\alpha$ is the difference between the gradient orientation angle and the angle bin center in radians. By following this procedure, summands on Equation (7) are nullified except for the 2 adjacent angular bins.
These binning functions make up the trilinear interpolation that has a combined effect of sharing the contribution of each oriented gradient between their eight adjacent bins in a tridimensional cube in the histogram space, and zero everywhere else (Mortensen and Shapiro, 2005).
The fixed value of 3 is a magnification factor which corresponds to the number of pixels per block when $s = 1$. As the patch has 16 blocks and 8 bin angles are considered, for each location $l$ and channel $c$ a feature called descriptor $d^{(l,c)}$ of dimension 128 is obtained. The main differences between this implementation and the standard SIFT Descriptor are described in the Appendix.
Figure 3 shows an example of a patch and a scheme of the histogram computation. In Figure 3A a plot of the signal and the patch centered around the keypoint is shown. In Figure 3B the possible orientations on each patch are illustrated. Only the upper-left four blocks are visible. The first eight orientations of the first block are labeled from 1 to 8 clockwise. The orientations of the second block $B_{1,2}$ are labeled from 9 to 16. This labeling continues left-to-right, up-down until the eight orientations for all the sixteen blocks are assigned. They form the corresponding descriptor $d$ of 128 coordinates. Finally, in Figure 3C an enlarged image plot is shown where the oriented gradient vector for each pixel can be seen.

2.1.4. Speller Matrix Letter Identification

P300 ERP extraction
Segments corresponding to row flickering are labeled 1–6, whereas those corresponding to column flickering are labeled 7–12. The extraction process has the following steps:
• Step A: First highlight rows and columns from the matrix in a random permutation order and obtain the Ensemble Average as detailed in steps 1, 2, and 3 in section 2.1.1.
• Step B: Plot the signals $\tilde{x}^l(n, c)$, $1 \le n \le n_{max}$, $1 \le c \le C$, according to section 2.1.2 in order to generate the images $I^{(l,c)}$ for rows and columns $1 \le l \le 12$.
• Step C: Obtain the descriptors $d^{(l,c)}$ for rows and columns from $I^{(l,c)}$ in accordance with the method described in section 2.1.3.

Calibration
A trial, as defined by the BCI2000 platform (Schalk et al., 2004), is every attempt to select just one letter from the speller. A set of trials is used for calibration and once the calibration is complete it can be used to identify new letters from new trials.
During the calibration phase, two descriptors $d^{(l,c)}$ are extracted for each available channel, corresponding to the locations $l$ of a selection of one previously instructed letter from the set of calibration trials. These descriptors are the P300 templates, grouped together in a template set called $T^c$. The set is constructed using the steps described in section 2.1.1 and the steps A, B, and C of the P300 ERP extraction process.
Additionally, the best performing channel, $bpc$, is identified as the channel where the best Character Recognition Rate is obtained.

Letter identification
In order to identify the selected letter, the template set $T^{bpc}$ is used as a database. Thus, new unclassified descriptors $q^{(l,bpc)}$ are computed and they are compared against the descriptors belonging to the calibration template set $T^{bpc}$.
The Naive Bayes Nearest Neighbor (k-NBNN) (Boiman et al., 2008) is a discriminative (Wolpaw and Wolpaw, 2012) semi-supervised classification algorithm that allows the categorization of an image to one class by comparing the set of extracted


FIGURE 3 | (A) Example of a plot of the signal, a keypoint and the corresponding patch. (B) A scheme of the orientation’s histogram computation. Only the upper-left
four blocks are visible. The first eight orientations of the first block, are labeled from 1 to 8 clockwise. The orientation of the second block B1,2 is labeled from 9 to 16.
This labeling continues left-to-right, up-down until the eight orientations for all the sixteen blocks are assigned. They form the corresponding descriptor of 128
coordinates. The length of each arrow represents the value of the histogram on each direction for each block. (C) Vector field of oriented gradients. Each pixel is
assigned an orientation and magnitude calculated using finite differences.

descriptors to the most similar ones from template dictionaries. This work proposes an adapted version to obtain a unary classification scheme to identify the selected letter in the P300-Based BCI Speller, based on the features provided by the calculated descriptors.
• Step D: Match to the calibration template $T^{bpc}$ by computing

$$\hat{row} = \arg\min_{l \in \{1, \ldots, 6\}} \sum_{h=1}^{k} \left\lVert q^{(l,bpc)} - d_h^{(bpc)} \right\rVert^2 \tag{8}$$

and

$$\hat{col} = \arg\min_{l \in \{7, \ldots, 12\}} \sum_{h=1}^{k} \left\lVert q^{(l,bpc)} - d_h^{(bpc)} \right\rVert^2 \tag{9}$$

with $d_h^{(bpc)}$ belonging to the set $N_T(q^{(l,bpc)})$, which is defined, for the best performing channel, as $N_T(q^{(l,bpc)}) = \{ d_h^{(bpc)} \in T^{bpc} \, / \, d_h^{(bpc)} \text{ is one of the } k \text{ nearest neighbors of } q^{(l,bpc)} \}$. This set is obtained by sorting all the elements in $T^{bpc}$ based on distances between them and $q^{(l,bpc)}$, choosing the $k$ with smaller values, with $k$ a parameter of the algorithm.
By computing the aforementioned equations, the letter of the matrix can be determined from the intersection of the row $\hat{row}$ and column $\hat{col}$. Figure 2 shows a scheme of this process.

2.2. Experimental Protocol
To verify the validity of the proposed framework and method, the public dataset 008-2014 (Riccio et al., 2013) published on the BNCI-Horizon website (Brunner et al., 2014) by IRCCS Fondazione Santa Lucia, is used. Additionally, our own dataset with the same experimental conditions is generated. Both of them are utilized to perform an offline BCI Simulation to decode the spelled words from the provided signals.
The algorithm is implemented in MATLAB V2017a (Mathworks Inc., Natick, MA, USA). The algorithm described in section 2.1.3 is implemented on a modified version of the VLFeat (Vedaldi and Fulkerson, 2010) Computer Vision library. Furthermore, in order to enhance the impact of this paper and for the sake of reproducibility, the code of the entire algorithm, including the modified VLFeat library, has been made available at: https://bitbucket.org/itba/hist.
In the following sections the characteristics of the datasets and parameters of the identification algorithm are described.

2.2.1. P300 ALS Public Dataset
The experimental protocol used to generate this dataset is explained in Riccio et al. (2013) but can be summarized as follows: eight subjects with confirmed diagnoses but at different stages of ALS disease were recruited and agreed to perform the experiments. The Visual P300 detection task designed for this experiment consisted of spelling seven words of five letters each, using the traditional P300 Speller Matrix (Farwell and Donchin, 1988). The flashing of rows and columns provides the deviant stimulus required to elicit this physiological response. The first three words are used for calibration and the remaining four words for testing with visual feedback. A trial is every attempt to select a letter from the speller. It is composed of signal segments corresponding to ka = 10 repetitions of flashes of 6 rows and ka = 10 repetitions of flashes of 6 columns of the matrix, yielding 120 repetitions. Flashing of a row or a column is performed for 0.125 s, followed by a resting period (i.e., inter-stimulus interval) of the same length. After 120 repetitions an inter-trial pause is included before resuming with the following letter.
The recorded dataset was sampled at 256 Hz and it consisted of a scalp multichannel EEG signal for electrode channels Fz, Cz, Pz, Oz, P3, P4, PO7, and PO8, identified according to the 10–20 International System, for each one of the eight subjects. The recording device was a research-oriented digital EEG device (g.Mobilab, g.Tec, Austria) and the data acquisition and stimuli delivery were handled by the BCI2000 open source software (Schalk et al., 2004).
In order to assess and verify the identification of the P300 response, subjects are instructed to perform a copy-spelling task. They have to fix their attention on successive letters to copy a previously determined set of words, in contrast to a free-running operation of the speller where each user decides on their own what letter to choose.
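The timing of one trial and the resolution of the recognition rate follow directly from the protocol figures quoted for this dataset. The short calculation below is only a worked check of that arithmetic; the variable names are illustrative, and the inter-trial pause is left out because its length is not stated here.

```python
# Protocol figures quoted for the ALS dataset
rows, cols = 6, 6
k_a = 10                       # repetitions of the 12-flash sequence per letter
flash_s = 0.125                # flash duration in seconds
isi_s = 0.125                  # inter-stimulus interval, same length as the flash

flashes_per_trial = (rows + cols) * k_a
stimulation_s = flashes_per_trial * (flash_s + isi_s)

letters_total = 7 * 5          # seven words of five letters each
letters_calibration = 3 * 5    # first three words
letters_test = letters_total - letters_calibration
accuracy_step = 100 / letters_test   # smallest change in recognition rate, in %

print(flashes_per_trial, stimulation_s, letters_test, accuracy_step)
# prints: 120 30.0 20 5.0
```

Under these figures each letter requires 30 s of stimulation, and with 20 test letters per subject the character recognition rate can only change in steps of 5 percentage points.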


2.2.2. P300 for Healthy Subjects

We replicate the same experiment on healthy subjects using a wireless digital EEG device (g.Nautilus, g.Tec, Austria). The experimental conditions are the same as those used for the previous dataset, as detailed in section 2.2.1. The produced dataset is available in a public online repository (Ramele et al., 2017).
Participants are recruited voluntarily and the experiment is conducted anonymously in accordance with the Declaration of Helsinki published by the World Medical Association. No monetary compensation is handed out and all participants agree to and sign a written informed consent. This study is approved by the Departamento de Investigación y Doctorado, Instituto Tecnológico de Buenos Aires (ITBA). All healthy subjects have normal or corrected-to-normal vision and no history of neurological disorders. The experiment is performed with 8 subjects, 6 males, 2 females, 6 right-handed, 2 left-handed, average age 29.00 years, standard deviation 11.56 years, range 20–56 years.
EEG data is collected in a single recording session. Participants are seated in a comfortable chair, with their vision aligned to a computer screen located one meter in front of them. The handling and processing of the data and stimuli is conducted by the OpenVibe platform (Renard et al., 2010).
Gel-based active electrodes (g.LADYbird, g.Tec, Austria) are used on the same positions Fz, Cz, Pz, Oz, P3, P4, PO7, and PO8. Reference is set to the right ear lobe and ground is preset as the AFz position. Sampling frequency is slightly different, and is set to 250 Hz, which is the closest possible to the one used with the other dataset.

2.2.3. Parameters
The patch size is $X_P \times X_P = 12s \times 12s$ pixels, where $s$ is the scale of the local patch and it is an input parameter of the algorithm. The P300 event can have a span of 400 ms and its amplitude can reach 10 µV (Rao, 2013). Hence it is necessary to utilize a signal segment of size $t_{max} = 1$ second and a patch size $X_P$ that can capture an entire transient event. With this purpose in mind, the choice of the value of $s$ is essential.
We propose Equations (10) and (11) to compute the scale value in horizontal and vertical directions, respectively.

$$s_x = \frac{\gamma \, \lambda \, F_s}{12} \tag{10}$$

$$s_y = \frac{\gamma \, \Delta\mu V}{12} \tag{11}$$

where $\lambda$ is the length in seconds covered by the patch, $F_s$ is the sampling frequency of the EEG signal (downsampled to 16 Hz) and $\Delta\mu V$ corresponds to the amplitude in microvolts that can be covered by the height of the patch. The geometric structure of the patch is determined by the waveform to be captured, thus we found that by using $s = s_x = s_y = 3$ and $\gamma = 4$, the local patch and the descriptor can identify events of 9 µV of amplitude, with a span of $\lambda = 0.56$ s. This also determines that 1 pixel represents $\frac{1}{\gamma} = \frac{1}{4}$ µV on the vertical direction and $\frac{1}{F_s \gamma} = \frac{1}{64}$ s on the horizontal direction. The keypoints $p_k$ are located at $(x_{p_k}, y_{p_k}) = (0.55 F_s \gamma, z^l(c)) = (35, z^l(c))$ for the corresponding channel $c$ and location $l$ (see Equation 4). In this way the whole transient event is captured. Figure 4 shows a patch of a signal plot covering the complete amplitude (vertical direction) and the complete span of the signal event (horizontal direction).

FIGURE 4 | The scale of local patch is selected in order to capture the whole transient event. The size of the patch is $X_p \times X_p$ pixels. The vertical size consists of four blocks of size $3 s_y$ pixels, which is high enough to contain the signal $\Delta\mu V$, the peak-to-peak amplitude of the transient event. The horizontal size includes four blocks of $3 s_x$ and covers the entire duration in seconds of the transient signal event, $\lambda$.

The number of channels $C$ is equal to 8 for both datasets, and the number of intensification sequences $k_a$ is fixed to 10. The parameter $k$ used to construct the set $N_T(q^{(l,c)})$ is set to $k = 7$, which was found empirically to achieve better results. In addition, the norm used on Equations (8) and (9) is the cosine norm, and descriptors are normalized to [−1, 1].
Lastly, in order to assess the validity of the HIST method, the character recognition rate for both datasets is evaluated replicating the methodology proposed by the ALS dataset's publisher, since Riccio et al. (2013) did not report the Character Recognition Rate obtained for this dataset. Frequency filtering, data segmentation and artifact rejection are conducted according to section 2.1.1, yielding 16 × 8 samples per epoch. A multichannel feature consists of a vector of time points (Lotte et al., 2018), formed by concatenating all the channels (Krusienski et al., 2006). A single-channel variant consists of using time points from a single electrode and performing the analysis on a channel-by-channel basis. Three classification schemes are considered as well. The first is a multichannel version of the Stepwise Linear Discriminant Analysis (SWLDA) classification algorithm, which is the methodology proposed by the ALS dataset's publisher. Additionally, a single-channel and a multichannel variant of a linear kernel Support Vector Machine (SVM) (Scholkopf and Smola, 2001) classifier are


TABLE 1 | Character recognition rates for the public dataset of ALS patients using the Histogram of Gradient (HIST) calculated from single-channel plots.

Participant   bpc   HIST (%)   bpc   Single-channel SVM (%)
1             Cz    35         Cz    15
2             Fz    85         PO8   25
3             Cz    25         Fz    5
4             PO8   55         Oz    5
5             PO7   40         P3    25
6             PO7   60         PO8   20
7             PO8   80         Fz    30
8             PO7   95         PO7   85

Performance rates using single-channel signals with the SVM classifier are shown for comparison. The best performing channel bpc for each method is visualized.

TABLE 2 | Character recognition rates for the own dataset of healthy subjects using the Histogram of Gradient (HIST) calculated from single-channel plots.

Participant   bpc   HIST (%)   bpc   Single-channel SVM (%)
1             Oz    40         Cz    10
2             PO7   30         Cz    5
3             P4    40         P3    10
4             P4    45         P4    35
5             P4    60         P3    10
6             Pz    50         P4    25
7             PO7   70         P3    30
8             P4    50         PO7   10

Performance rates using single-channel signals with the SVM classifier are shown for comparison. The best performing channel bpc for each method is visualized.

utilized. SVM has been successfully used in several BCI Competitions (Rakotomamonjy and Guigue, 2008).
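The parameter values quoted in section 2.2.3 can be cross-checked against Equations (10) and (11). The snippet below is a minimal worked example of that arithmetic only; the variable names are illustrative, and the rounding of the horizontal scale and the truncation of the keypoint column are assumptions made to reproduce the reported integers.

```python
gamma = 4          # image scale parameter
fs = 16            # sampling frequency after decimation, in Hz
lam = 0.56         # time span covered by the patch, in seconds
delta_uv = 9       # amplitude covered by the patch height, in microvolts
t_max = 1          # segment length, in seconds

s_x = gamma * lam * fs / 12          # Equation (10)
s_y = gamma * delta_uv / 12          # Equation (11)
s = round(s_x)                       # assumption: rounded to the integer scale used

patch_px = 12 * s                    # patch side in pixels
n_max = fs * t_max                   # samples per segment
uv_per_pixel = 1 / gamma             # vertical resolution
s_per_pixel = 1 / (fs * gamma)       # horizontal resolution
keypoint_x = int(0.55 * fs * gamma)  # assumption: truncated to a pixel column

print(s_x, s_y, patch_px, patch_px * uv_per_pixel, patch_px * s_per_pixel, keypoint_x)
```

With these values the horizontal scale comes out slightly below 3 and the vertical scale is exactly 3, so a 36-pixel patch spans 9 µV vertically and 0.5625 s horizontally, in line with the event size quoted in the text.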

Table 1 shows the results of applying the HIST algorithm to the subjects of the public dataset of ALS patients. The percentage of correctly spelled letters is calculated while performing an offline BCI Simulation. From the seven words for each subject, the first three are used for calibration, and the remaining four are used for testing. The best performing channel bpc is reported as well. The target ratio is 1:36; hence the theoretical chance level is 2.8%. It can be observed that the best performance of the letter identification method is reached in a different channel depending on the subject being studied. Tables 1, 2 show for comparison the performance rates obtained using single-channel signals with the SVM classifier. The best performing channel, where the best letter identification rate was achieved, is also shown.
The Information Transfer Rate (ITR), or Bit Transfer Rate (BTR), in the case of reactive BCIs (Wolpaw and Wolpaw, 2012) depends on the amount of signal averaging required to transmit a valid and robust selection. Figure 5 shows the performance curves for varying intensification sequences for the subjects included in the dataset of ALS patients. It can be noticed that the percentage of correctly identified letters depends on the number of intensification sequences that are used to obtain the averaged signal. Moreover, when the number of intensification sequences tends to 1, which corresponds to single-intensification character recognition, the performance is reduced. As mentioned before, the SNR of the P300 obtained from only one segment of the intensification sequence is very low and the shape of its P300 component is not very well defined.
In Table 2 the results obtained for 8 healthy subjects are shown. It can be observed that the performance is above chance level. It is verified that the HIST method performs better at letter identification than SVM, which processes the

FIGURE 5 | Performance curves for the eight subjects included in the dataset of ALS patients. Three out of eight subjects achieved the necessary performance to implement a valid P300 speller.

Tables 3, 4 are presented in order to compare the performance of the HIST method versus multichannel SWLDA and SVM classification algorithms for both datasets. For the dataset of ALS patients, it is verified that the HIST method performs similarly to other methods like SWLDA or SVM, which use a multichannel feature (Quade test with p = 0.55), whereas for the dataset of healthy subjects significant differences are found (Quade test with p = 0.02), where only the HIST method achieves a different performance than SVM (with multiple comparisons, significant difference of level 0.05).
The P300 ERP consists of two overlapping components: the P3a and P3b, the former with frontocentral distribution while the latter is stronger in the centroparietal region (Polich, 2007). Hence, the standard practice is to find the stronger response on the central channel Cz (Riccio et al., 2013). However, Krusienski et al. (2006) show that the response may also arise in occipital regions. We
signals on a channel by channel strategy (Wilcoxon signed-rank found that by analyzing only the waveforms, occipital channels
test, p = 0.004 for both datasets). PO8 and PO7 show higher performances for some subjects.
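As a worked illustration of the single-channel comparison, the sketch below runs an exact Wilcoxon signed-rank test on the paired rates of Tables 1 and 2. It is not the authors' analysis code: the function names are hypothetical, and the choice of a one-sided exact test is an assumption, made because it is consistent with the reported p = 0.004 when HIST is better for all eight subjects.

```python
from itertools import product

# Recognition rates (%) copied from Tables 1 and 2.
ALS_HIST = [35, 85, 25, 55, 40, 60, 80, 95]
ALS_SVM = [15, 25, 5, 5, 25, 20, 30, 85]
HEALTHY_HIST = [40, 30, 40, 45, 60, 50, 70, 50]
HEALTHY_SVM = [10, 5, 10, 35, 10, 25, 30, 10]


def midranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        average_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = average_rank
        start = end + 1
    return ranks


def exact_signed_rank(x, y):
    """Return (W+, one-sided exact p) for the alternative 'x tends to exceed y'."""
    diffs = [a - b for a, b in zip(x, y) if a != b]  # zero differences are dropped
    ranks = midranks([abs(d) for d in diffs])
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    # Under the null hypothesis every sign pattern is equally likely,
    # so all 2^n patterns are enumerated (n = 8 here, i.e. 256 patterns).
    patterns = list(product((False, True), repeat=len(diffs)))
    as_extreme = sum(
        1 for signs in patterns
        if sum(r for r, s in zip(ranks, signs) if s) >= w_plus
    )
    return w_plus, as_extreme / len(patterns)


for name, hist, svm in (("ALS", ALS_HIST, ALS_SVM),
                        ("healthy", HEALTHY_HIST, HEALTHY_SVM)):
    w_plus, p = exact_signed_rank(hist, svm)
    print(f"{name}: W+ = {w_plus}, one-sided exact p = {p:.4f}")
```

In both datasets every participant scores higher with HIST, so only one of the 256 sign patterns is as extreme as the observed one and the one-sided exact p-value is 1/256, about 0.004.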


TABLE 3 | Character recognition rates and the best performing channel bpc for the public dataset of ALS patients using the Histogram of Gradient (HIST) (repeated here for comparison purposes).

Participant   bpc for HIST   HIST (%)   Multichannel SWLDA (%)   Multichannel SVM (%)
1             Cz             35         45                       40
2             Fz             85         30                       50
3             Cz             25         65                       55
4             PO8            55         40                       50
5             PO7            40         35                       45
6             PO7            60         35                       70
7             PO8            80         60                       35
8             PO7            95         90                       95

Performance rates obtained by SWLDA and SVM classification algorithms with a multichannel concatenated feature.

TABLE 4 | Character recognition rates and the best performing channel bpc for our own dataset of healthy subjects using the Histogram of Gradient (HIST) (repeated here for comparison purposes).

Participant   bpc for HIST   HIST (%)   Multichannel SWLDA (%)   Multichannel SVM (%)
1             Oz             40         65                       40
2             PO7            30         15                       10
3             P4             40         50                       25
4             P4             45         40                       20
5             P4             60         30                       20
6             Pz             50         35                       30
7             PO7            70         25                       30
8             P4             50         35                       20

Performance rates obtained by SWLDA and SVM classification algorithms with a multichannel concatenated feature.
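The HIST columns of Tables 3 and 4 can also be compared between the two groups with a Mann–Whitney U test. The sketch below is only an illustration: the function name is hypothetical, and the tie-corrected normal approximation with continuity correction is an assumption, since the text does not state which variant of the test was used.

```python
import math

# HIST recognition rates (%) copied from Tables 3 and 4.
ALS_HIST = [35, 85, 25, 55, 40, 60, 80, 95]
HEALTHY_HIST = [40, 30, 40, 45, 60, 50, 70, 50]


def mann_whitney_u(x, y):
    """Return (U of sample x, two-sided p) using a normal approximation."""
    pooled = sorted(x + y)
    n1, n2, n = len(x), len(y), len(x) + len(y)
    rank_of = {}      # midrank assigned to each distinct value
    tie_term = 0.0    # sum of t^3 - t over groups of t tied values
    i = 0
    while i < n:
        j = i
        while j + 1 < n and pooled[j + 1] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + j) / 2 + 1
        t = j - i + 1
        tie_term += t ** 3 - t
        i = j + 1
    u_x = sum(rank_of[v] for v in x) - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    # Continuity correction of 0.5; z is floored at zero so that p <= 1.
    z = max(0.0, abs(u_x - mean_u) - 0.5) / math.sqrt(var_u)
    p_two_sided = math.erfc(z / math.sqrt(2))
    return u_x, p_two_sided


u_als, p = mann_whitney_u(ALS_HIST, HEALTHY_HIST)
print(f"U (ALS) = {u_als}, two-sided p = {p:.2f}")
```

With these values the ALS group obtains U = 39.5 out of a possible 64, and the approximate two-sided p-value is about 0.46, so the sketch gives no evidence of a difference between the groups.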

As subjects have varying latencies and amplitudes of their P300 components, they also have a varying stability of the shape of the generated ERP (Nam et al., 2010). Figure 6 shows 10 sample P300 template patches for subjects 8 and 3 from the dataset of ALS patients. Consistent with the performance results, the P300 signature is clearer and more consistent for subject 8 (A), while for subject 3 (B) the characteristic pattern is more difficult to perceive.

FIGURE 6 | Ten sample P300 template patches for subjects 8 (A) and 3 (B) of the ALS Dataset. Downward deflection is positive polarity.

Additionally, the stability of the P300 component waveform has been extensively studied in patients with ALS (Sellers et al., 2006; Madarame et al., 2008; Nijboer and Broermann, 2009; Mak et al., 2012; McCane et al., 2015), where it was found that these patients have a stable P300 component, which was also sustained across different sessions. In line with these results, we do not find evidence of a difference in the performance obtained by analyzing the waveforms (HIST) between the group of patients with ALS and the healthy group of volunteers (Mann–Whitney U-Test, p = 0.46). In particular, the best performance is obtained for a subject from the ALS dataset for whom, based on visual observation, the shape of the P300 component is consistently identified.

It is important to remark that, when applied to binary images obtained from signal plots, the feature extraction method described in section 2.1.3 generates sparse descriptors. In this subspace we found that using the cosine metric yielded a significant performance improvement. On the other hand, the unary classification scheme based on the NBNN algorithm proved very beneficial for the P300 Speller Matrix, because this approach solves the unbalanced dataset problem that is inherent to the oddball paradigm (Tibon and Levy, 2015).

4. DISCUSSION

Among other applications of Brain Computer Interfaces, the goal of the discipline is to provide communication assistance to people affected by neuro-degenerative diseases, who are the most likely population to benefit from BCI systems and EEG processing and analysis.

In this work, a method to extract an objective metric from the waveform of the plots of EEG signals is presented. Its usage to implement a valid P300-Based BCI Speller application is expounded. Additionally, its validity is evaluated using a public dataset of ALS patients and our own dataset of healthy subjects.

It was verified that this method performs better at letter identification than other methods that process the signals on a channel-by-channel basis, and that its performance is even comparable to that of methods such as SWLDA or SVM, which use a multichannel feature. Furthermore, this method has the advantage that shapes of waveforms can be analyzed in an objective way. We observed that the shape of the P300 component is more stable in occipital channels, where the performance for identifying letters is higher. We additionally verified that the P300 signatures of ALS patients are stable when compared with those of healthy subjects.

We believe that descriptors based on the histogram of gradient orientations, as presented in this work, can also be used to derive a shape metric in the space of P300 signals, which can complement other time-domain metrics such as those defined by Mak et al. (2012). It is important to notice that the analysis of waveform shapes is usually performed qualitatively, based on visual inspection (Sellers et al., 2006), and a complementary methodology that offers a quantitative metric will be beneficial to this routine analysis of ERP waveforms.

The goal of this work is to answer the question of whether a P300 component can be determined solely by automatically inspecting its waveform. We conclude affirmatively, though two very important issues still remain.

First, the stability of the P300 in terms of its shape is crucial: the averaging procedure, montages, the signal-to-noise ratio and spatial filters are all non-physiological factors that affect the stability of the shape of the P300 ERP. We tested a preliminary approach to assess whether the morphological shape of the P300 of the averaged signal can be stabilized by applying different alignments of the stacked segments (see Figure 2), and we verified that performance is better when a correct segment alignment is applied. We applied Dynamic Time Warping (DTW) (Casarotto et al., 2005) to automate the alignment procedure but were unable to find a substantial improvement. Further work is needed to study the stability of the shape of the P300 signature component.

The second problem is the amplitude variation of the P300. We propose a solution by standardizing the signal, as shown in Equation (2). This normalizes the peak-to-peak amplitude, moderating its variation. It also has the advantage of reducing noise that was not reduced by the averaging procedure. It is important to remark that the variance of the averaged signal depends on the number of segments used to compute it (Van Drongelen, 2006). The standardizing process converts the signal to unit variance, which makes it independent of the number ka of signals averaged. Although this is initially an advantageous approach, the standardizing process reduces the amplitude of any significant P300 complex, diminishing its automatic interpretation capability.

To further extend the capabilities of this method, it would be desirable to implement a multichannel version. The straightforward extension of concatenating the obtained descriptors results in a high-dimensional feature vector, while other variants that merge descriptors per channel may diminish the mutual information between different channels. So far, variants using color versions of SIFT (Van De Sande et al., 2010), where different color bands are mapped to electrode channels, have been explored without substantial success.

In our opinion, the main benefit of the presented method is that it can foster a closer collaboration between the field of BCI and physicians (Chavarriaga et al., 2017), since this procedure intends to imitate human visual observation. Automatic classification of EEG patterns that are specifically identified by their shapes, such as K-Complex, Vertex Waves and Positive Occipital Sharp Transient (Hartman, 2005), is prospective future work. We are currently working on unpublished material analyzing K-Complex components that could eventually help physicians locate these EEG patterns, especially in the long recording periods that are frequent in sleep research (Michel and Murray, 2012). Additionally, the method can be used for artifact removal, which is often performed by visually inspecting signals, because the descriptors are a direct representation of the shape of signal waveforms. In line with these applications, it can be used to build a database (Chavarriaga et al., 2017) of quantitative representations of waveforms and to improve atlases (Hartman, 2005), which are currently based on qualitative descriptions of signal shapes.

ETHICS STATEMENT

Participants are recruited voluntarily and the experiment is conducted anonymously in accordance with the declaration of Helsinki published by the World Health Organization. No monetary compensation is handed out and all participants agree and sign a written informed consent. This study is approved by the Departamento de Investigación y Doctorado, Instituto Tecnológico de Buenos Aires (ITBA).


AUTHOR CONTRIBUTIONS

This work was part of the Ph.D. thesis of RR which is directed by JS and co-directed by AV.

FUNDING

This project was supported by the ITBACyT-15 funding program issued by ITBA University from Buenos Aires, Argentina.

REFERENCES

Alvarado-González, M., Garduño, E., Bribiesca, E., Yáñez-Suárez, O., and Medina-Bañuelos, V. (2016). P300 detection based on EEG shape features. Comput. Math. Methods Med. 2016:2029791. doi: 10.1155/2016/2029791
Arandjelovic, R., and Zisserman, A. (2012). “Three things everyone should know to improve object retrieval,” in 2012 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (Providence, RI), 2911–2918. doi: 10.1109/CVPR.2012.6248018
Berger, S., Schneider, G., Kochs, E., and Jordan, D. (2017). Permutation entropy: too complex a measure for EEG time series? Entropy 19:692. doi: 10.3390/e19120692
Boiman, O., Shechtman, E., and Irani, M. (2008). “In defense of nearest-neighbor based image classification,” in 26th IEEE Conference on Computer Vision and Pattern Recognition, CVPR (Anchorage, AK). doi: 10.1109/CVPR.2008.4587598
Bresenham, J. E. (1965). Algorithm for computer control of a digital plotter. IBM Syst. J. 4, 25–30. doi: 10.1147/sj.41.0025
Brunner, C., Blankertz, B., Cincotti, F., Kübler, A., Mattia, D., Miralles, F., et al. (2014). BNCI Horizon 2020–towards a roadmap for brain/Neural computer interaction. Lect. Notes Comput. Sci. 8513, 475–486. doi: 10.1007/978-3-319-07437-5_45
Carlson, T., and del R. Millan, J. (2013). Brain-controlled wheelchairs: a robotic architecture. IEEE Robot. Autom. Mag. 20, 65–73. doi: 10.1109/MRA.2012.2229936
Casarotto, S., Bianchi, A., Cerutti, S., and Chiarenza, G. (2005). Dynamic time warping in the analysis of event-related potentials. IEEE Eng. Med. Biol. Mag. 24, 68–77. doi: 10.1109/MEMB.2005.1384103
Chavarriaga, R., Fried-Oken, M., Kleih, S., Lotte, F., and Scherer, R. (2017). Heading for new shores! Overcoming pitfalls in BCI design. Brain Comput. Interfaces 4, 60–73. doi: 10.1080/2326263X.2016.1263916
Clerc, M., Bougrain, L., and Lotte, F. (2016). Brain-Computer Interfaces, Technology and Applications 2 (Cognitive Science). London: ISTE Ltd.; Wiley.
De Vos, M., and Debener, S. (2014). Mobile EEG: towards brain activity monitoring during natural action and cognition. Int. J. Psychophysiol. 91, 1–2. doi: 10.1016/j.ijpsycho.2013.10.008
Farwell, L. A., and Donchin, E. (1988). Talking off the top of your head: toward a mental prosthesis utilizing event-related brain potentials. Electroencephalogr. Clin. Neurophysiol. 70, 510–523. doi: 10.1016/0013-4694(88)90149-6
Guger, C., Allison, B. Z., and Lebedev, M. A. (eds.). (2017). “Introduction,” in Brain Computer Interface Research: A State of the Art Summary 6 (Cham: Springer), 1–8.
Guger, C., Daban, S., Sellers, E., Holzner, C., Krausz, G., Carabalona, R., et al. (2009). How many people are able to control a P300-based brain-computer interface (BCI)? Neurosci. Lett. 462, 94–98. doi: 10.1016/j.neulet.2009.06.045
Hartman, A. L. (2005). Atlas of EEG Patterns, Vol 65. Philadelphia, PA: Lippincott Williams & Wilkins.
Hu, L., Mouraux, A., Hu, Y., and Iannetti, G. D. (2010). A novel approach for enhancing the signal-to-noise ratio and detecting automatically event-related potentials (ERPs) in single trials. NeuroImage 50, 99–111. doi: 10.1016/j.neuroimage.2009.12.010
Hubel, D. H., and Wiesel, T. N. (1962). Receptive fields, binocular interaction and functional architecture in the cat’s visual cortex. J. Physiol. 160, 106–154. doi: 10.1113/jphysiol.1962.sp006837
Huggins, J. E., Alcaide-Aguirre, R. E., and Hill, K. (2016). Effects of text generation on P300 brain-computer interface performance. Brain Comput. Interfaces 3, 112–120. doi: 10.1080/2326263X.2016.1203629
Jure, F., Carrere, L., Gentiletti, G., and Tabernig, C. (2016). BCI-FES system for neuro-rehabilitation of stroke patients. J. Phys. 705, 1–8. doi: 10.1088/1742-6596/705/1/012058
Knuth, K. H., Shah, A. S., Truccolo, W. A., Ding, M., Bressler, S. L., and Schroeder, C. E. (2006). Differentially variable component analysis: identifying multiple evoked components using trial-to-trial variability. J. Neurophysiol. 95, 3257–3276. doi: 10.1152/jn.00663.2005
Krusienski, D. J., Sellers, E. W., Cabestaing, F., Bayoudh, S., McFarland, D. J., Vaughan, T. M., et al. (2006). A comparison of classification techniques for the P300 Speller. J. Neural Eng. 3, 299–305. doi: 10.1088/1741-2560/3/4/007
Liang, N., and Bougrain, L. (2008). “Averaging techniques for single-trial analysis of oddball event-related potentials,” 4th International Brain Computer Interfaces Workshop (Graz), 1–6.
Lotte, F., Bougrain, L., Cichocki, A., Clerc, M., Congedo, M., Rakotomamonjy, A., et al. (2018). A review of classification algorithms for EEG-based brain–computer interfaces: a 10 year update. J. Neural Eng. 15:031005. doi: 10.1088/1741-2552/aab2f2
Lotte, F., Faller, J., Guger, C., Renard, Y., Pfurtscheller, G., Lécuyer, A., et al. (2013). Combining BCI with Virtual Reality: Towards New Applications and Improved BCI. Berlin; Heidelberg: Springer Berlin Heidelberg.
Lowe, G. (2004). SIFT - The Scale Invariant Feature Transform. Int. J. 2, 91–110. doi: 10.1023/B:VISI.0000029664.99615.94
Madarame, T., Tanaka, H., Inoue, T., Kamata, M., and Shino, M. (2008). “The development of a brain computer interface device for amyotrophic lateral sclerosis patients,” in Conference Proceedings - IEEE International Conference on Systems, Man and Cybernetics (Singapore: IEEE), 2401–2406.
Mak, J. N., McFarland, D. J., Vaughan, T. M., McCane, L. M., Tsui, P. Z., Zeitlin, D. J., et al. (2012). EEG correlates of P300-based brain-computer interface (BCI) performance in people with amyotrophic lateral sclerosis. J. Neural Eng. 9:026014. doi: 10.1088/1741-2560/9/2/026014
McCane, L. M., Heckman, S. M., McFarland, D. J., Townsend, G., Mak, J. N., Sellers, E. W., et al. (2015). P300-based brain-computer interface (BCI) event-related potentials (ERPs): people with amyotrophic lateral sclerosis (ALS) vs. age-matched controls. Clin. Neurophysiol. 126, 2124–2131. doi: 10.1016/j.clinph.2015.01.013
Michel, C. M., and Murray, M. M. (2012). Towards the utilization of EEG as a brain imaging tool. NeuroImage 61, 371–385. doi: 10.1016/j.neuroimage.2011.12.039
Mortensen, E. N., and Shapiro, L. (2005). “A sift descriptor with global context,” in 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’05), Vol. 1 (San Diego, CA), 184–190.
Nam, C. S., Li, Y., and Johnson, S. (2010). Evaluation of P300-based brain-computer interface in real-world contexts. Int. J. Hum. Comput. Interact. 26, 621–637. doi: 10.1080/10447311003781326
Nijboer, F., and Broermann, U. (2009). “Brain computer interfaces for communication and control in locked-in patients,” in Brain-Computer Interfaces. The Frontiers Collection, eds B. Graimann, G. Pfurtscheller, and B. Allison (Berlin; Heidelberg: Springer), 185–201.
Novak, D., Sigrist, R., Gerig, N. J., Wyss, D., Bauer, R., Gotz, U., et al. (2018). Benchmarking brain-computer interfaces outside the laboratory: the cybathlon 2016. Front. Neurosci. 11:756. doi: 10.3389/fnins.2017.00756
Polich, J. (2007). Updating P300: an integrative theory of P3a and P3b. Clin. Neurophysiol. 118, 2128–2148. doi: 10.1016/j.clinph.2007.04.019
Rakotomamonjy, A., and Guigue, V. (2008). BCI competition III: dataset II-ensemble of SVMs for BCI P300 speller. IEEE Trans. Biomed. Eng. 55, 1147–1154. doi: 10.1109/TBME.2008.915728
Ramele, R., Villar, A. J., and Santos, J. M. (2016). “BCI classification based on signal plots and SIFT descriptors,” in 4th International Winter Conference on Brain-Computer Interface, BCI 2016 (Yongpyong: IEEE), 1–4.
Ramele, R., Villar, A. J., and Santos, J. M. (2017). P300-dataset rrid scr_015977. Available online at: https://www.kaggle.com/rramele/p300samplingdataset
Rao, R. P. N. (2013). Brain-Computer Interfacing: An Introduction. New York, NY: Cambridge University Press.
Renard, Y., Lotte, F., Gibert, G., Congedo, M., Maby, E., Delannoy, V., et al. (2010). OpenViBE: an open-source software platform to design, test, and use brain-computer interfaces in real and virtual environments. Presence 19, 35–53. doi: 10.1162/pres.19.1.35
Rey-Otero, I., and Delbracio, M. (2014). Anatomy of the SIFT method. Image Process. Line 4, 370–396. doi: 10.5201/ipol.2014.82
Riccio, A., Simione, L., Schettini, F., Pizzimenti, A., Inghilleri, M., Belardinelli, M. O., et al. (2013). Attention and P300-based BCI performance in people with amyotrophic lateral sclerosis. Front. Hum. Neurosci. 7:732. doi: 10.3389/fnhum.2013.00732
Riener, R., and Seward, L. J. (2014). “Cybathlon 2016,” 2014 IEEE International Conference on Systems, Man, and Cybernetics (SMC) (San Diego, CA), 2792–2794.
Schalk, G., McFarland, D. J., Hinterberger, T., Birbaumer, N., and Wolpaw, J. R. (2004). BCI2000: a general-purpose brain-computer interface (BCI) system. IEEE Trans. Biomed. Eng. 51, 1034–1043. doi: 10.1109/TBME.2004.827072
Scholkopf, B., and Smola, A. J. (2001). Learning With Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. Cambridge, MA: MIT Press.
Schomer, D. L., and Silva, F. L. D. (2010). Niedermeyer’s Electroencephalography: Basic Principles, Clinical Applications, and Related Fields. Philadelphia, PA: Wolters Kluwer; Lippincott Williams & Wilkins.
Sellers, E. W., Kübler, A., and Donchin, E. (2006). Brain-computer interface research at the University of South Florida cognitive psychophysiology laboratory: The P300 speller. IEEE Trans. Neural Syst. Rehabil. Eng. 14, 221–224. doi: 10.1109/TNSRE.2006.875580
Tibon, R., and Levy, D. A. (2015). Striking a balance: analyzing unbalanced event-related potential data. Front. Psychol. 6:555. doi: 10.3389/fpsyg.2015.00555
Van De Sande, K., Gevers, T., and Snoek, C. (2010). Evaluating color descriptors for object and scene recognition. IEEE Trans. Pattern Anal. Mach. Intell. 32, 1582–1596. doi: 10.1109/TPAMI.2009.154
Van Drongelen, W. (2006). Signal Processing for Neuroscientists: An Introduction to the Analysis of Physiological Signals. London: Academic Press.
Vedaldi, A., and Fulkerson, B. (2010). VLFeat - An open and portable library of computer vision algorithms. Design 3, 1–4. doi: 10.1145/1873951.1874249
Wolpaw, J., and Wolpaw, E. W. (2012). Brain-Computer Interfaces: Principles and Practice. New York, NY: Oxford University Press.
Yamaguchi, T., Fujio, M., Inoue, K., and Pfurtscheller, G. (2009). “Design method of morphological structural function for pattern recognition of EEG signals during motor imagery and cognition,” in Fourth International Conference on Innovative Computing, Information and Control (ICICIC) (Kaohsiung), 1558–1561.

Conflict of Interest Statement: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Ramele, Villar and Santos. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.


APPENDIX

This section describes the differences between the HIST algorithm proposed in this work and the SIFT Descriptor (Vedaldi and Fulkerson, 2010).

The two most important modifications are:

• SIFT Detector and custom frame: The SIFT Detector provides the keypoint localization information in the standard SIFT method. This information is stored in a frame data structure composed of the keypoint center location (x_kp, y_kp), the patch scale s and the patch orientation φ: (x_kp, y_kp, s, φ). In the HIST proposal the keypoint location and patch parameters are directly specified over the plot image in order to detect the signal waveform (see section 2.2.3), and the SIFT Detector is not used.
• Patch scale: Whereas in the standard SIFT implementation the patch is a square region and there is only one SIFT scale parameter, in HIST a different scale parameter can be assigned to the horizontal and vertical axes. This is a very important modification because otherwise signal plots which extend only along the horizontal direction of the plot image could not be entirely covered. With a rectangular patch there is no constraint on its size, and it can be adjusted by neurophysiological priors to map any expected waveform based on Equations (10) and (11).

Additionally, other changes are implemented where specific steps of the SIFT Descriptor were not found to be useful to characterize signal waveforms:

• Patch orientation: We verified experimentally that the patch orientation φ does not provide any extra utility for the extraction of characteristic waveforms from plots. Hence, this patch orientation is fixed to zero (vertical, pointing upwards in Figure 4).
• Rotations: SIFT was designed to allow affine invariance, i.e., to be robust to rotations and scale modifications of patterns in images. So far, rotating the patch has not been found to be of any utility for capturing the signal waveform.
• Octave selection: A gradient image is used to obtain the oriented gradients and calculate the histogram of gradient orientations. In SIFT, these gradient images are downsampled and smoothed by a Gaussian filter. The SIFT Descriptor calls each downsampling level an octave (Lowe, 2004; Rey-Otero and Delbracio, 2014). The standard SIFT Descriptor estimates the octave to use on the gradient image based on the image size and patch parameters. The HIST method uses only the zero octave, which means that the gradient image has the same size as the original image, without any downsampling operation.
• Gradient image smoothing: Additionally, the SIFT Descriptor performs an initial smoothing operation by applying a Gaussian filter on the gradient image regardless of the octave. In the HIST method, this operation is not implemented.
• Descriptor Gaussian weighting: In the standard SIFT Descriptor, a Gaussian weighting operation is performed on the calculated SIFT descriptor to increase the importance of gradients from pixels closer to the center of the patch. For the HIST method, this is found to be detrimental to the waveform characterization and is not used.
• SIFT descriptor codification: The SIFT descriptor d is a 128-dimension feature vector, as described in section 2.1.3. Histogram values are floating point numbers, all positive, and they are accumulated on each coordinate of this vector. Once all the gradients are accumulated, the following operations are performed:

– The descriptor is ℓ2-normalized (i.e., all the values are divided by the Euclidean norm of the descriptor).
– Each value is clamped to 0.2. This means that any value above 0.2 is set to 0.2.
– The descriptor is ℓ2-normalized again (Rey-Otero and Delbracio, 2014).

This generates a 128-vector of floating point numbers in [0, 1]. In the HIST implementation, these values are rescaled to [−1, 1] in order to use the cosine distance (Arandjelovic and Zisserman, 2012) in Equations (8) and (9). Finally, output values are cast to floating point numbers (i.e., floats), yielding an effective 128-vector of floats in [−1, 1].
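The codification steps listed in this appendix can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation or the VLFeat API: the function names are hypothetical, a short toy histogram stands in for the 128-dimension descriptor, and the linear map v -> 2v - 1 used for the rescaling to [-1, 1] is an assumption, since the text does not specify how the rescaling is done.

```python
import math


def l2_normalize(values):
    """Divide every value by the Euclidean norm (zero vectors are returned as-is)."""
    norm = math.sqrt(sum(v * v for v in values))
    if norm == 0.0:
        return list(values)
    return [v / norm for v in values]


def codify(histogram, clamp=0.2):
    """L2-normalize, clamp each value to `clamp`, then L2-normalize again."""
    d = l2_normalize(histogram)
    d = [min(v, clamp) for v in d]
    return l2_normalize(d)


def rescale_signed(descriptor):
    """Map values from [0, 1] to [-1, 1]; the linear map is an assumption."""
    return [2.0 * v - 1.0 for v in descriptor]


def cosine_distance(a, b):
    """One minus the cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


# Toy 4-bin histograms standing in for 128-dimension descriptors.
template = rescale_signed(codify([4.0, 3.0, 0.0, 0.0]))
candidate = rescale_signed(codify([3.0, 4.0, 1.0, 0.0]))
print(cosine_distance(template, candidate))
```

Clamping before the second normalization limits the influence of a few dominant gradient bins: in the toy template the two non-zero bins end up with equal weight even though their raw counts differ.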

