

# An Analog-to-Information VGA image sensor architecture for support vector machine on compressive measurements

Wissam Benjilali, William Guicquero, Laurent Jacques, Gilles Sicard

#### ▶ To cite this version:

Wissam Benjilali, William Guicquero, Laurent Jacques, Gilles Sicard. An Analog-to-Information VGA image sensor architecture for support vector machine on compressive measurements. ISCAS 2019 - IEEE International Symposium on Circuits and Systems, May 2019, Sapporo, Japan. 10.1109/ISCAS.2019.8702325. cea-04548791

### HAL Id: cea-04548791 https://cea.hal.science/cea-04548791v1

Submitted on 16 Apr 2024

**HAL** is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.


## An Analog-to-Information VGA Image Sensor Architecture for Support Vector Machine on Compressive Measurements

Wissam Benjilali<sup>1</sup>, William Guicquero<sup>1</sup>, Laurent Jacques<sup>2</sup> and Gilles Sicard<sup>1</sup>

<sup>1</sup>Univ. Grenoble Alpes, CEA, LETI, F-38000 Grenoble, France

<sup>2</sup>ISPGroup, ICTEAM/ELEN, UCLouvain, Louvain-la-Neuve, Belgium

Email: {wissam.benjilali, william.guicquero, gilles.sicard}@cea.fr; laurent.jacques@uclouvain.be

Abstract—This work presents a compact VGA  $(480 \times 640)$  CMOS Image Sensor (CIS) architecture with a dedicated end-of-column Compressive Sensing (CS) scheme allowing embedded object recognition. The architecture takes advantage of a low-footprint pseudo-random data mixing circuit and a first-order incremental Sigma-Delta  $(\Sigma\Delta)$  Analog-to-Digital Converter (ADC) to extract compressed features. The proposed CIS achieves an object recognition accuracy of  $\simeq 93\%$  on the Georgia Tech face recognition database (GIT, 10 classes out of 50) thanks to a linear Support Vector Machine (SVM) classifier implemented by an optimized Digital Signal Processing (DSP) block. We stress that the signal-independent dimensionality reduction performed by our dedicated CS scheme (1/480) dramatically reduces the memory requirements ( $\approx 32$  kbit), which in our case are tied to the ex-situ learned affine function of the linear SVM.

*Index Terms*—embedded object recognition, compressive sensing, pseudo-random permutations, Sigma-Delta, SVM

#### I. INTRODUCTION

The last decade has witnessed a deep theoretical study of CS [1] for both signal recovery and decision making problems [2]. In particular, CS has emerged as a hardware-friendly data acquisition scheme. However, in contrast to commonly used sensing schemes, reconstructing a signal from its CS measurements requires nonlinear operations, which makes signal recovery more complex. In fact, a wide variety of applications only require extracting the meaningful information. This observation can enable the design of smart sensors that are more compact and energy efficient.

Inspired by the potential of CS, the CIS community has focused on implementing on-chip sensing schemes to deal with either hardware or algorithm constraints for image rendering. Among the earlier CIS implementations, [3], [4] exploit random convolutions in the focal plane to extract CS measurements. The proposed sensing model gives priority to a fast and efficient image reconstruction, but involves high on-chip complexity. In [5], the concept of incremental  $\Sigma\Delta$  is introduced to perform a summation/averaging operation during A/D conversion. This sensor has the advantage of using an optimized 4T pixel architecture while performing end-of-column CS without major modification of a canonical sensor design. Finally, [6] describes a scalable and low-complexity column-based CS using a Cellular Automaton (CA) that shows a chaotic behavior to generate the sensing matrix on the fly.

Laurent Jacques is funded by the Belgian F.R.S.-FNRS and by the project AlterSense (MIS-FNRS).

Fig. 1: Image sensor top-level architecture.

To support decision making, analog pre-processing as well as dedicated Systems-on-Chip (SoC) have been explored to deal with inference problems in the context of low-power CIS. For example, [7], [8] propose an event-driven face recognition SoC based on Haar-like filtering and a Convolutional Neural Network (CNN) processor. On the other hand, [9] deals with memory requirements to implement face recognition using a Principal Component Analysis (PCA) to extract features combined with a nonlinear SVM. We note, however, that these works focus on optimizing circuit design to achieve low-power processing; they do not address design constraints related to image acquisition such as data dimensionality or ADC clock.

In this paper we propose a compact VGA CIS architecture to address the embedded object recognition task in the context of smart low power vision systems. Our contribution is twofold: First, an end-of-column dedicated CS sensing scheme is proposed based on independent per-row permutations allowing the reuse of a standard rolling shutter acquisition scheme as well as an array of optimized 4T pixels. Second, a DSP architecture is proposed to perform embedded decision making thanks to a linear SVM. Our architecture successfully recognizes VGA images from a single scan read-out (*i.e.*, **only** 640 measurements) while reducing memory requirements to the ex-situ learned patterns. This on-chip decision making scheme is thus an appealing approach for highly-constrained applications.

Decision making on CS measurements refers to solving an inference problem directly in the CS domain. Let us consider the random sensing matrix  $A \in \mathbb{R}^{m \times n}$  made of  $m \ll n$ measurement vectors. It allows, for a vector  $x \in \mathbb{R}^n$ , to acquire CS measurements using the sensing model  $\tilde{x} = Ax \in \mathbb{R}^m$ . In particular, if we aim to reconstruct $x$ from  $\tilde{x}$  we must ensure that the matrix $A$ satisfies the Restricted Isometry Property (RIP) with high probability [10]. Moreover, from a decision making point of view, the RIP guarantees that the Euclidean distance between low-complexity signals (e.g., $k$-sparse) is preserved in the CS domain [11]. This makes it possible to run decision algorithms in the CS domain, since pairwise distance is a primitive operation in numerous classification and machine learning algorithms. For instance, when dealing with linearly separable convex sets, the rare eclipse problem [12], [13] characterizes the minimal $m$ required to preserve the disjointness of two classes in the CS domain. These theoretical results, as well as general considerations on on-chip complexity and power consumption, lead us to propose a novel CS-driven CIS combined with a linear SVM to perform near-sensor inference in the CS domain.

To perform our supervised embedded inference, we propose a two-stage process. First, an SVM classifier is learned in an off-line system on a compressed training set  $\tilde{\boldsymbol{X}} \in \mathbb{R}^{m \times n_1 C}$ composed of $C$ classes, each with  $n_1$  samples, associated with labels  $l \in \{1, \dots, C\}$ . Second, the embedded inference is performed on a compressed test set  $\tilde{\boldsymbol{Y}} \in \mathbb{R}^{m \times n_2 C}$  made of  $Cn_2$  samples with unknown labels. Here, both the training and test sets are acquired by the proposed architecture using the specific sensing matrix $A$ described in Sec. III. In the following,  $\tilde{x}$  and  $\tilde{y}$  refer to an arbitrary column of $\tilde{\boldsymbol{X}}$ and $\tilde{\boldsymbol{Y}}$ respectively. In order to learn a multiclass SVM classifier [14], $C$ binary soft-margin classifiers are trained to construct a decision boundary for each class versus the others. Using this strategy, called one-vs-all, a positive label is assigned to the samples in a class and a negative label to the others, i.e., the label  $l_j^i$  equals 1 if the  $j^{th}$  sample belongs to class $i$, and $-1$ otherwise. Mathematically, for each  $1 \le i \le C$ , we estimate a normal vector  $\hat{\boldsymbol{d}}_i \in \mathbb{R}^m$ , an offset  $\hat{b}_i \in \mathbb{R}$  and penalties  $\hat{\boldsymbol{\xi}}_i \in \mathbb{R}^{n_1}$ , where  $\lambda$  is an inner regularization parameter:

$$\begin{split} \{\hat{\boldsymbol{d}}_i, \hat{b}_i, \hat{\boldsymbol{\xi}}_i\} &= \underset{\boldsymbol{d} \in \mathbb{R}^m, b, \boldsymbol{\xi} \in \mathbb{R}^{n_1}}{\arg\min} \ \left(\frac{1}{2} \left\| \boldsymbol{d} \right\|_2^2 + \lambda \left\| \boldsymbol{\xi} \right\|_1 \right) \\ \text{s.t.} \quad l_j^i(\boldsymbol{d}^\top \tilde{\boldsymbol{x}}_j^i + b) &\geq 1 - \xi_j, \ \xi_j \geq 0, 1 \leq j \leq n_1. \end{split} \tag{1}$$

Let us define the gain matrix  $\hat{D} := (\hat{d}_1, \cdots, \hat{d}_C)^\top$ , *i.e.*, the vertical concatenation of the  $\hat{d}_i^\top$ , and  $\hat{b} := (\hat{b}_1, \cdots, \hat{b}_C)^\top$  the offset vector. Once the $C$ classifiers are constructed, a winner-takes-all strategy assigns a compressed sample  $\tilde{y}$  to the class $c$ maximizing the margin, *i.e.*,

$$c = \arg\max_{1 \le i \le C} \ \hat{\boldsymbol{d}}_i^{\mathsf{T}} \tilde{\boldsymbol{y}} + \hat{b}_i. \tag{2}$$
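The decision rule (2) amounts to one matrix-vector product, one vector addition and an arg max. The following minimal Python sketch illustrates it on a toy problem; the function name `svm_predict` and the numerical values are made up for illustration, and classes are indexed from 0 rather than from 1.

```python
def svm_predict(D, b, y):
    """Return the 0-based class index maximizing d_i^T y + b_i, and all scores."""
    scores = [sum(d_ij * y_j for d_ij, y_j in zip(d_i, y)) + b_i
              for d_i, b_i in zip(D, b)]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores

# Toy example: C = 3 classes, m = 4 compressed measurements (illustrative values).
D = [[1, 0, -1, 2],
     [0, 2, 1, -1],
     [-1, 1, 0, 0]]
b = [0, -1, 3]
y = [2, 1, 0, 1]

predicted, scores = svm_predict(D, b, y)
print(predicted, scores)  # 0 [4, 0, 2]
```

Only $\hat{D}$ and $\hat{b}$ are needed at inference time, which is why the memory footprint scales with $C \times m$ rather than with the image size.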

In the next section, we present our proposed architecture to implement SVM inference on extracted CS features.

#### III. PROPOSED IMAGE SENSOR ARCHITECTURE

In this work, we propose an architecture for which the sensing matrix $A$ corresponds to applying a random modulation  $\varphi \in \{\pm 1\}^{n_v n_h}$  to the  $n_v \times n_h$  observed image, and performing a random column permutation which is different for each sequentially selected row (see Fig. 1). The purpose of the modulation is to center the expectation of the CS matrix, and thus the distribution of the measurements, while the permutations increase the information content (diversity) of each measurement. Therefore, for a per-row vectorized image  $\alpha \in \mathbb{R}^{n_v n_h}$:

$$\tilde{\boldsymbol{\alpha}} = \boldsymbol{P}(\boldsymbol{\varphi} \circ \boldsymbol{\alpha}) \in \mathbb{R}^{n_h}, \tag{3}$$

where  $P=(P_1,\ldots,P_{n_v})\in\{0,1\}^{n_h\times n_h n_v}$  is the horizontal concatenation of  $n_v$  random permutation matrices  $P_k$   $(1\leq k\leq n_v)$, and  $\circ$  is the Hadamard product. It is easy to show that, thanks to the  $\pm 1$  pre-modulation,  $\mathbb{E}\|P(\varphi\circ(\alpha-\alpha'))\|^2=\|\alpha-\alpha'\|^2$, i.e., the implemented CS scheme preserves in expectation the distance between two different images (showing that the resulting sensing matrix respects the RIP is postponed to a future work).

For the sake of simplicity, the permutations are implemented before modulation, which is equivalent in terms of functionality, *i.e.*, the complete matrix A such that  $\tilde{x} = Ax$  can be written using the Kronecker product  $\otimes$ , where  $\mathbb{1}_{n_h}$  is the vector of ones:

$$\boldsymbol{A} = (\mathbb{1}_{n_h} \otimes \bar{\boldsymbol{\varphi}}^{\top}) \circ \boldsymbol{P}, \text{ with } \bar{\boldsymbol{\varphi}}_k = \boldsymbol{P}_k \boldsymbol{\varphi}_k \in \{\pm 1\}^{n_h}, \tag{4}$$

where for any  $\boldsymbol{u} \in \mathbb{R}^{n_v n_h}$  (e.g.,  $\varphi$  or  $\bar{\varphi}$ ),  $\boldsymbol{u} := (\boldsymbol{u}_1^\top, \cdots, \boldsymbol{u}_{n_v}^\top)^\top$ , with  $\boldsymbol{u}_k \in \mathbb{R}^{n_h}$ . We note that the models (3) and (4) are easily extended to a multi-scan mode, *i.e.*, by collecting observations for different generations of  $\boldsymbol{P}$  and  $\varphi$ . Moreover, sub-scan mode is reached by randomly sub-sampling  $\tilde{\alpha}$ .
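As an illustration of the sensing model (3), the sketch below simulates a single scan on a toy image and checks empirically that the squared distance is preserved on average. It is a behavioural model only: it draws ideal uniform permutations and signs with Python's `random` module instead of the pseudo-random circuits described next, and the names `cs_single_scan` and `sq_norm` as well as the toy dimensions are assumptions made for the example.

```python
import random

def cs_single_scan(image, rng):
    """One scan of Eq. (3): per-row +/-1 modulation, per-row column
    permutation, then accumulation along each output column."""
    n_h = len(image[0])
    out = [0.0] * n_h
    for row in image:                      # rows are read in rolling-shutter order
        perm = list(range(n_h))
        rng.shuffle(perm)                  # P_k: ideal uniform random permutation
        for j, pixel in enumerate(row):
            sign = rng.choice((-1, 1))     # one entry of phi
            out[perm[j]] += sign * pixel   # pixel j lands in output column perm[j]
    return out

def sq_norm(v):
    return sum(x * x for x in v)

rng = random.Random(0)
n_v, n_h = 6, 8                            # toy size; the sensor itself is 480 x 640
# By linearity, sensing alpha - alpha' equals the difference of the measurements.
diff = [[rng.uniform(-1, 1) for _ in range(n_h)] for _ in range(n_v)]
true_sq_dist = sum(sq_norm(row) for row in diff)

trials = 2000
mean_sq = sum(sq_norm(cs_single_scan(diff, rng)) for _ in range(trials)) / trials
ratio = mean_sq / true_sq_dist
print(round(ratio, 2))                     # expected to be close to 1
```

The ratio approaches 1 as the number of trials grows because the cross terms between rows cancel in expectation under the $\pm 1$ modulation.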

The proposed architecture comprises a  $(n_v = 480) \times (n_h = 640)$ pixel array combined with a Shift Register (SR) for rolling shutter, a Pseudo-Random Permutations (PRP) circuit, a column-parallel dedicated pseudo-random modulation first-order incremental  $\Sigma\Delta$  (RM$\Sigma\Delta$), and an optimized DSP for embedded classification on CS measurements (using pseudo-random realizations of $P$ and  $\varphi$ ). In a rolling shutter acquisition mode, object recognition is achieved according to the following steps. First, a pseudo-random column permutation of the selected row is performed by the PRP circuit. As shown in Fig. 2(a), a pseudo-random permutation is accomplished using a multi-level permutation process composed of a fixed pseudo-random scrambling and a 6-stage butterfly network [15]. For each butterfly stage, voltage values are partitioned into blocks and swapped or not via a series of 2:1 mux-based circuits (*i.e.*, Btfly\_64 ... Btfly\_2 in Fig. 2(b)). The block size varies from 64 (Btfly\_64) to 2 (Btfly\_2). In addition, to generate independent permutations, an 18-bit CA Pseudo-Random Generator (PRG) following Wolfram's Rule 30 [16] is used to activate or not the different stages (Fig. 2(c)) via a single control bit per stage. In terms of circuit area, the proposed PRP dramatically reduces the number of connection lines  $(640+6\times1280\simeq8\text{k})$  compared to a pseudo-random 640-to-640 multiplexer  $(640\times640\simeq409\text{k}$ connections).

<sup>1</sup>The random model here corresponds to picking each  $P_k$  independently and uniformly at random amongst the  $n_h!$  possible permutations of  $\{1, \ldots, n_h\}$.

Fig. 2: The proposed architecture: (a) Pseudo-Random Permutations (PRP) circuit; (b) First and second stages of the butterfly network; (c) 18-bit cellular automaton pseudo-random bit generator following Wolfram's Rule 30; (d) Random modulation  $\Sigma\Delta$  converter (RM$\Sigma\Delta$); (e) 5-bit  $\uparrow\downarrow$  conditional counter; (f) Dedicated DSP; (g) arg max circuit.
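The PRP principle can be sketched in software as follows. This is a hedged behavioural model, not the circuit itself: the periodic boundary of the cellular automaton, the seed, the mapping of the 6 low state bits to the 6 stages, the interpretation of a stage as a swap of the two halves of each block, and the seeded shuffle standing in for the fixed scrambling are all assumptions made for illustration.

```python
import random

def rule30_step(state, width=18):
    """One Rule 30 update on a ring of `width` cells:
    new cell = left XOR (center OR right)."""
    mask = (1 << width) - 1
    left = ((state >> 1) | (state << (width - 1))) & mask
    right = ((state << 1) | (state >> (width - 1))) & mask
    return left ^ (state | right)

def butterfly_stage(values, block, enable):
    """Swap the two halves of every `block`-sized group when `enable` is set."""
    if not enable:
        return list(values)
    half = block // 2
    out = []
    for start in range(0, len(values), block):
        group = values[start:start + block]
        out.extend(group[half:] + group[:half])
    return out

def prp_row(values, scramble, control_bits, blocks=(64, 32, 16, 8, 4, 2)):
    """Fixed scrambling followed by the six butterfly stages."""
    values = [values[j] for j in scramble]
    for block, bit in zip(blocks, control_bits):
        values = butterfly_stage(values, block, bit)
    return values

n_h = 640
scramble = list(range(n_h))
random.Random(1).shuffle(scramble)      # stands in for the hard-wired scrambling
state = 1 << 9                          # arbitrary non-zero seed (assumption)
row = list(range(n_h))                  # column indices stand in for pixel voltages
for _ in range(3):                      # three successive rows
    state = rule30_step(state)
    control_bits = [(state >> k) & 1 for k in range(6)]  # assumed bit-to-stage mapping
    permuted = prp_row(row, scramble, control_bits)

prp_lines = n_h + 6 * 2 * n_h           # 640 + 6 x 1280
mux_lines = n_h * n_h
print(prp_lines, mux_lines)             # 8320 409600
```

Each stage only reroutes values, so the output of `prp_row` is always a permutation of its input whatever the control bits are.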

Second, inspired by the incremental  $\Sigma\Delta$  [17], [18], which can perform averaging and quantization simultaneously [5], [19], [20], a dedicated incremental  RM$\Sigma\Delta$  is proposed to perform pseudo-random modulation, per-column summation and A/D conversion (Fig. 2(d)). Each column of the PRP is connected to one RM$\Sigma\Delta$, allowing column-parallel processing. The main advantage of the proposed architecture is its ability to deal with pseudo-random  $\pm 1$  modulations, which are highly desirable in CS applications. This is achieved thanks to a double-path integration (one integrator for each sign) controlled by an  $n_v$-bit SR (each cell controls one RM$\Sigma\Delta$). Thus, for a column $i$, the voltage outputs  $V_{p_i}$  of the PRP are integrated sequentially by the integrator selected by the  $SR_i$  bit. The output of the comparator then enables the increment or decrement (for a $+$/$-$ modulation respectively) of the  $\uparrow\downarrow$  counter. After  $n_v$  cycles of the rolling shutter SR, 640 (*i.e.*, 1/480 compression ratio) 9-bit  $(\log_2(n_v))$  CS measurements are produced with **only one** clock cycle for each row, meaning that we can dramatically reduce the power consumption needed to perform the inference.
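A behavioural view of one RM$\Sigma\Delta$ column may help fix ideas. The sketch below is an idealized model under stated assumptions: inputs are normalized to $[0, V_{ref})$, each integrator is reset to zero before the scan, and the comparator and feedback are ideal. The function name `rm_sigma_delta` and the 8-bit toy pixel values are hypothetical.

```python
import random

def rm_sigma_delta(samples, signs, v_ref=1.0):
    """Idealized column model: two integrators (one per modulation sign)
    sharing one up/down counter, one input sample per clock cycle."""
    integ = {+1: 0.0, -1: 0.0}
    counter = 0
    for v, s in zip(samples, signs):
        integ[s] += v              # route the sample to the selected integrator
        if integ[s] >= v_ref:      # comparator decision
            integ[s] -= v_ref      # feedback subtraction
            counter += s           # count up for +1, down for -1
    return counter

rng = random.Random(0)
n_v = 480                                               # one sample per selected row
samples = [rng.randrange(256) / 256 for _ in range(n_v)]  # toy normalized pixel values
signs = [rng.choice((-1, 1)) for _ in range(n_v)]

counter = rm_sigma_delta(samples, signs)
signed_sum = sum(s * v for v, s in zip(samples, signs))
print(counter, signed_sum)
```

In this model each integrator contributes the integer part of the sum routed to it, so the counter differs from the ideal signed sum by less than one count in units of $V_{ref}$.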

Once the CS measurements are extracted, the SVM inference problem in (2) can be performed by the DSP (Fig. 2(f)). The CS measurement vector is first multiplied by the gain matrix  $\hat{D}$  and then added to the offset vector  $\hat{b}$  to produce a vector of length $C$. Finally, the arg max operation is implemented following a dichotomic approach [21] using a series of 2:1 multiplexers, each one controlled by a bitwise comparator. As depicted in Fig. 2(g), in the first stage we compare the values of the resulting vector (20 bits each) two by two and output the max value and its position (20 + 1 bits) based on the output of the comparator. This is repeated until the last stage to get a 24-bit value whose 4 MSBs represent the position of the max and thus the predicted class.
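The dichotomic arg max can be modelled as a pairwise tournament in which each stage forwards the larger value of every pair together with its original position. The sketch below is an illustration only; the pass-through of an unpaired element and the packing of the position above a 20-bit two's-complement value are assumptions, and the names `argmax_tree` and `pack` are hypothetical.

```python
def argmax_tree(values):
    """Pairwise tournament returning (max value, its position, number of stages)."""
    nodes = [(v, i) for i, v in enumerate(values)]
    stages = 0
    while len(nodes) > 1:
        nxt = []
        for k in range(0, len(nodes) - 1, 2):
            a, b = nodes[k], nodes[k + 1]
            nxt.append(a if a[0] >= b[0] else b)  # 2:1 mux driven by a comparator
        if len(nodes) % 2:                        # unpaired element passes through
            nxt.append(nodes[-1])
        nodes = nxt
        stages += 1
    value, position = nodes[0]
    return value, position, stages

def pack(value, position, value_bits=20):
    """Position in the MSBs, two's-complement value in the LSBs."""
    return (position << value_bits) | (value & ((1 << value_bits) - 1))

scores = [12, -7, 33, 5, 0, 41, -2, 18, 40, 9]    # C = 10 illustrative class scores
value, position, stages = argmax_tree(scores)
word = pack(value, position)
print(value, position, stages, word >> 20)        # 41 5 4 5
```

With $C = 10$ inputs the tournament needs 4 stages, which matches the 4 position bits appended to the 20-bit value.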

#### IV. SIMULATIONS AND PERFORMANCE OPTIMIZATION

To demonstrate the efficiency of the proposed architecture, the GIT database [22] is used to learn the SVM patterns (see Sec. II). To fit within the specifications of the proposed architecture, each image is resized to VGA resolution via bicubic interpolation and then subsampled using a simulated RGB Bayer filter. For high-level simulations, we randomly select $C=10$ classes, with $n_1=10$ samples per class to construct the training set and  $n_2=5$  samples per class for the test set. The training set is then used to train the SVM on compressively acquired images according to the CS model (3).
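The Bayer subsampling step of the simulation chain can be sketched as below. The RGGB tile layout is an assumption, since the text does not state which Bayer arrangement is simulated, and `bayer_mosaic` is a hypothetical helper name.

```python
def bayer_mosaic(rgb):
    """Keep one colour sample per pixel following an (assumed) RGGB tile.
    `rgb` is a list of rows, each row a list of (R, G, B) tuples."""
    pattern = ((0, 1), (1, 2))   # channel index by (row parity, column parity)
    return [[px[pattern[r % 2][c % 2]] for c, px in enumerate(row)]
            for r, row in enumerate(rgb)]

# 2 x 2 toy image where every pixel is (R, G, B) = (10, 20, 30).
toy = [[(10, 20, 30), (10, 20, 30)],
       [(10, 20, 30), (10, 20, 30)]]
print(bayer_mosaic(toy))  # [[10, 20], [20, 30]]
```

The output has one value per pixel, which is the single-channel image that the CS model then senses.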



Fig. 3: Extracted plots of the simulated architecture: (a) Data distribution at the output of the column-parallel Sigma-Delta; (b) The probability of error at the output of each Sigma-Delta; (c) The probability of classification error vs. accuracy; (d) The classification accuracy as a function of the number of measurements; (e) and (f) represent the distribution of the entries of  $\hat{D}$  and the components of  $\hat{b}$  respectively.

| RGB Bayer without CS | CS measurements, Bernoulli distribution | Our sensing scheme with Matlab randperm implem. | Our sensing scheme without quantization | Our sensing scheme without saturation | Our simulated architecture (with quant. & sat.) |
|---|---|---|---|---|---|
| 91.6 % (≈ 300k) | 87.8 % (≈ 600) | 93.4 % (≈ 600) | 94 % (≈ 600) | 93.2 % (≈ 600) | 93.2 % (≈ 600) |

TABLE I: Recognition accuracy for different simulations (levels of description of our architecture). The number of measurements is reported in parentheses.

As the proposed CIS is designed to meet the requirements of highly constrained hardware, its performance can be optimized thanks to prior knowledge of the distribution of the CS measurements (Fig. 3(a)), the entries of  $\hat{D}$  (Fig. 3(e)) and the components of  $\hat{b}$  (Fig. 3(f)). First, given the distribution of the CS measurements, the resolution of the RM$\Sigma\Delta$ can advantageously be reduced by saturating the  $\uparrow\downarrow$  counter in Fig. 2(d) to a lower number of bits, instead of the 9 bits given by the intrinsic resolution of the incremental  $\Sigma\Delta$  ($\log_2(n_v)$) [5]. As shown in Fig. 3(b), the probability of error at the output of the RM$\Sigma\Delta$ tends to 0 for a 5-bit resolution of the CS measurements. Moreover, the trade-off between the resolution of the CS measurements and the classification accuracy is also taken into account in Fig. 3(c). We clearly observe that the classification error reaches a floor of about 6% from a 5-bit resolution onward.
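Counter saturation can be illustrated in a couple of lines. The signed two's-complement range used here is an assumption (the text only specifies the number of bits), and `saturate` is a hypothetical helper.

```python
def saturate(count, bits=5):
    """Clamp a signed count to the two's-complement range of `bits` bits."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return max(lo, min(hi, count))

raw_counts = [-40, -16, -3, 0, 7, 15, 22]
print([saturate(c) for c in raw_counts])  # [-16, -16, -3, 0, 7, 15, 15]
```

Because the measurement distribution is concentrated around zero, only the rare large counts are altered by the clamp.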

On the other hand, to perform embedded inference, the matrix  $\hat{D}$  and the offset vector  $\hat{b}$  have to be stored in an on-chip memory. As the histogram of the matrix  $\hat{D}$  has a peaked distribution (cf. Fig. 3(e)), we have chosen a uniform quantizer using a dynamic range limited to 2/3 of the whole dynamic range of the matrix. However, as the offset vector  $\hat{b}$  has a flatter distribution, the uniform quantizer is applied over the whole range covered by the components of  $\hat{b}$. Given the distribution and the dynamic range of  $\hat{D}$  and  $\hat{b}$, we have empirically chosen a signed 4-bit resolution for the entries of  $\hat{D}$  and a signed 12-bit resolution for the components of  $\hat{b}$  (the memory count below uses 5 and 13 bits per value, which is consistent with one additional sign bit). Finally, the memory required to store the SVM affine function in order to perform object recognition on the GIT database (10 classes) is limited to  $10 \times 640 \times 5 + 10 \times 13$  bits  $\simeq 32$  kbit for a single-scan readout while achieving a satisfactory classification accuracy ( $\simeq 93\%$, Tab. I). Fig. 3(d) shows the classification accuracy as a function of the number of measurements (scans). The accuracy plateaus at  $\simeq 93\%$  from 640 measurements (*i.e.*, a single scan). For only 64 measurements (*i.e.*, randomly sub-sampling the one-scan measurements at a 1/10 sampling rate), the accuracy still reaches  $\simeq 70\%$  for the 10-class inference problem.
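The quantization choices and the resulting memory budget can be checked with a short sketch. The quantizer below is one plausible reading of the text, not the authors' exact implementation: it assumes a mid-tread uniform quantizer with `bits` magnitude bits plus a sign bit, and saturation beyond a fraction of the largest magnitude. The function name `quantize_uniform` and the sample values are illustrative.

```python
def quantize_uniform(values, bits, clip_fraction=1.0):
    """Uniform signed quantizer with `bits` magnitude bits plus a sign bit.
    Values beyond clip_fraction * max|value| saturate."""
    full_scale = clip_fraction * max(abs(v) for v in values)
    levels = (1 << bits) - 1                 # largest magnitude code
    step = full_scale / levels
    codes = [max(-levels, min(levels, round(v / step))) for v in values]
    return codes, step

# Entries of D: range limited to 2/3 of the full dynamic range (peaked histogram).
d_entries = [-1.0, -0.5, 0.0, 0.1, 0.9]      # illustrative values
d_codes, d_step = quantize_uniform(d_entries, bits=4, clip_fraction=2 / 3)
print(d_codes)                               # [-15, -11, 0, 2, 15]

# Memory needed for the affine function of a 10-class, 640-measurement SVM.
C, m = 10, 640
memory_bits = C * m * (4 + 1) + C * (12 + 1)
print(memory_bits)                           # 32130, i.e. about 32 kbit
```

Clipping the range of $\hat{D}$ spends the available codes on the densely populated centre of the histogram at the cost of saturating a few outliers.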

#### V. CONCLUSION

This paper presents a dedicated end-of-column CS sensing scheme to reduce data dimensionality using a low-footprint pseudo-random permutations circuit and a one-clock-cycle low-resolution (5-bit) RM$\Sigma\Delta$. The signal-independent dimensionality reduction of CS (1/480 compression ratio) reduces the memory required to perform SVM inference  $(\approx 32 \text{ kbit})$  while keeping an acceptable classification accuracy of  $\approx 93\%$. Interestingly, the proposed architecture can further reduce the number of extracted measurements by subsampling the one-scan measurements while still achieving a reasonable accuracy, for example for an ultra-low-power target application. In future work, a circuit implementation will be developed at schematic and layout level using well-optimized 4T pixels and a standard image sensor technology (e.g.,  $0.18 \ \mu m$). In this test chip we plan to embed several additional features such as CS on image descriptors.

#### REFERENCES

- [1] E. J. Candès and M. B. Wakin. An introduction to compressive sampling. *IEEE Signal Processing Magazine*, 25(2):21–30, March 2008.
- [2] M. A. Davenport, P. T. Boufounos, M. B. Wakin, and R. G. Baraniuk. Signal processing with compressive measurements. *IEEE Journal of Selected Topics in Signal Processing*, 4(2):445–460, April 2010.
- [3] L. Jacques, P. Vandergheynst, A. Bibet, V. Majidzadeh, A. Schmid, and Y. Leblebici. CMOS compressed imaging by random convolution. In 2009 IEEE International Conference on Acoustics, Speech and Signal Processing, pages 1113–1116, April 2009.
- [4] V. Majidzadeh, L. Jacques, A. Schmid, P. Vandergheynst, and Y. Leblebici. A (256×256) pixel 76.7 mW CMOS imager/compressor based on real-time in-pixel compressive sensing. In *Proceedings of 2010 IEEE International Symposium on Circuits and Systems*, pages 2956–2959, May 2010.
- [5] Y. Oike and A. El Gamal. CMOS image sensor with per-column ΣΔ ADC and programmable compressed sensing. *IEEE Journal of Solid-State Circuits*, 48(1):318–328, Jan 2013.
- [6] W. Guicquero, A. Dupret, and P. Vandergheynst. An algorithm architecture co-design for CMOS compressive high dynamic range imaging. *IEEE Transactions on Computational Imaging*, 2(3):190–203, Sept 2016.
- [7] C. Kim, K. Bong, I. Hong, K. Lee, S. Choi, and H. J. Yoo. An ultra-low-power and mixed-mode event-driven face detection SoC for always-on mobile applications. In ESSCIRC 2017 43rd IEEE European Solid State Circuits Conference, 2017.
- [8] K. Bong, S. Choi, C. Kim, D. Han, and H. J. Yoo. A low-power convolutional neural network face recognition processor and a CIS integrated with always-on face detector. *IEEE Journal of Solid-State Circuits*, 2018.
- [9] D. Jeon, Q. Dong, Y. Kim, X. Wang, S. Chen, H. Yu, D. Blaauw, and D. Sylvester. A 23-mW face recognition processor with mostly-read 5T memory in 40-nm CMOS. *IEEE Journal of Solid-State Circuits*, 2017.
- [10] E. J. Candès and T. Tao. Decoding by linear programming. IEEE Transactions on Information Theory, 2005.
- [11] M. A. Davenport, P. T. Boufounos, M. B. Wakin, and R. G. Baraniuk. Signal processing with compressive measurements. *IEEE Journal of Selected Topics in Signal Processing*, 2010.
- [12] A. S. Bandeira, D. G. Mixon, and B. Recht. Compressive classification and the rare eclipse problem. arXiv preprint arXiv:1404.3203, 2014.
- [13] V. Cambareri, C. Xu, and L. Jacques. The rare eclipse problem on tiles: Quantised embeddings of disjoint convex sets. In 2017 International Conference on Sampling Theory and Applications, 2017.
- [14] C. M. Bishop. Pattern recognition and machine learning. Information science and statistics. Springer, 2006.
- [15] Y. Hilewitz and R. B. Lee. A new basis for shifters in general-purpose processors for existing and advanced bit manipulations. *IEEE Transactions on Computers*, 58(8):1035–1048, Aug 2009.
- [16] S. Wolfram. Cellular automata and complexity: collected papers. CRC Press, 2018.
- [17] P. Boufounos and R.G. Baraniuk. Sigma delta quantization for compressive sensing. In SPIE the international society for optical engineering, volume 6701, page 6701. International Society for Optical Engineering; 1999, 2007.
- [18] H. Wang and W. D. Leon-Salas. An incremental sigma delta converter for compressive sensing applications. In 2011 IEEE International Symposium of Circuits and Systems (ISCAS), pages 522–525, May 2011.
- [19] W. Guicquero, A. Verdant, and A. Dupret. High-order incremental sigma-delta for compressive sensing and its application to image sensors. *Electronics Letters*, 51(19):1492–1494, 2015.
- [20] H. Lee, D. Seo, W. Kim, and B. Lee. A compressive sensing-based CMOS image sensor with second-order  $\Sigma\Delta$  ADCs. *IEEE Sensors Journal*, 18(6):2404–2410, March 2018.
- [21] B. Yuce, H. F. Ugurdag, S. Gören, and G. Dündar. Fast and efficient circuit topologies for finding the maximum of n k-bit numbers. *IEEE Transactions on Computers*, 63(8):1868–1881, Aug 2014.
- [22] Georgia tech face database. Available [online]: http://www.anefian.com/research/face\_reco.htm.