Analysing the robustness of finger vein recognition: cross-dataset reliability and vein utility

Arican, Tugce; Veldhuis, Raymond; Spreeuwers, Luuk

doi:10.1186/s13640-024-00643-2

Research
Open access
Published: 08 October 2024

EURASIP Journal on Image and Video Processing volume 2024, Article number: 35 (2024)

Abstract

Finger vein recognition is an emerging biometric trait known for its privacy features. Despite the remarkable performance of deep learning methods like convolutional neural networks on challenging finger vein datasets, their reliability and robustness need further examination. This study evaluates the robustness of three recognition methods (the traditional Miura Method, a supervised convolutional neural network, and an unsupervised convolutional auto-encoder) through the challenging and more realistic scenario of cross-dataset comparisons. We also analyse the reliability of these methods in terms of sample quality. We introduce a novel vein quality metric to measure vein clarity and complexity and compare it against an existing image quality metric, the natural image quality evaluator (NIQE). Our findings reveal differences in how these recognition methods utilise finger vein images for comparisons, highlighting the need for robust recognition techniques in more realistic scenarios. In addition, our vein quality metric effectively detects defective images, reducing the zero false-match rate from 34.98% to 8.18% on the SDUMLA-HMT dataset. These results indicate the need for quality metrics that focus more closely on the characteristics of finger vein images.

1 Introduction

Finger veins are intrinsic biometric traits that help enhance the security and privacy of biometric recognition systems, making them more resilient against forgery attacks. Owing to the privacy-centric nature of finger veins, they are particularly well-suited for applications demanding high levels of security and privacy, such as financial transactions and access control to secure facilities like data centres or power plants.

A finger vein recognition system typically involves three steps: data acquisition, feature extraction, and comparison. Data acquisition uses near-infrared (NIR) light with wavelengths ranging between 750 and 950 nm. While finger vein acquisition devices converge around a wavelength of 850 nm [1], the devices can vary in NIR module type and position, contact properties of the device, and image resolution. After acquisition, unique features are extracted from finger vein images and compared to assess the similarity or dissimilarity between finger vein images. Feature extraction and comparison methods can be tailored for the acquisition device, assuming the same device will be used for both enrolment and verification.

Generalisation and reliability are paramount for a finger vein recognition system, ensuring seamless operation under diverse conditions such as varying illumination, pose, and acquisition device properties. While it may seem feasible to tailor the recognition method to each device, this approach becomes prohibitively costly in systems where new devices are frequently introduced. The expenses associated with development, optimisation, and data collection can escalate rapidly in such environments. Moreover, in systems aiming for interoperability across devices, a recognition method tailored for a specific device would struggle to effectively compare images from different sources. Therefore, adopting a recognition method capable of robust generalisation across devices not only enhances the scalability of finger vein recognition systems but also has the potential to reduce development costs, thereby facilitating seamless deployment in large-scale applications. Existing literature on finger vein recognition [2,3,4,5] acknowledges generalisation and reliability issues, often demonstrating performance drops in cross-dataset evaluations. However, the literature does not explain the factors contributing to this disparity. A deeper analysis of these factors is crucial for better understanding and advancing the robustness of finger vein recognition systems in diverse operational environments.

Finger vein image quality is another critical aspect of reliability of a recognition system, as low-quality images could compromise system security and reliability. In the previous study of Arican et al. [5], it was observed that the P-CAE may falsely accept non-mated image pairs with minimal or no visible vein patterns due to the lack of distinct patterns. Moreover, in mated pairs where both images lack visible veins, the comparison score can be high owing to similar bone structures. Such comparisons are unreliable due to lack of clear and complex vein patterns. This discrepancy implies that the presence and structure of vein patterns are crucial for accurate and reliable finger vein recognition. Considering the vein quality while assessing finger vein sample quality could further improve the robustness and reliability of recognition models. However, the literature lacks robust quality metrics that focus exclusively on the quality of the vein patterns.

This study makes two significant contributions to the field of finger vein recognition.

First, it expands upon the disparities in the performance of the CNN as reported in Ref. [5]. We systematically analyse the impact of different input formats on the recognition performance of the CNN to investigate which aspects of the input domain are most influential for the model. Furthermore, we extend our investigation through a comprehensive cross-dataset analysis integrating a traditional method alongside the supervised CNN and unsupervised P-CAE proposed in Ref. [5]. These analyses aim to provide deeper insights into the robustness of recognition methods across varying conditions, essential for deploying effective and robust finger vein recognition methods in diverse real-world applications.
Second, we introduce a novel metric to quantify vein clarity and complexity in images to analyse the relation between clarity of vein patterns and comparison performances across various recognition methods. Our proposed metric distinguishes itself from existing approaches by exclusively evaluating local vein information. We compare our vein quality metric against a statistical image quality assessment metric, the natural image quality evaluator (NIQE) [6]. This comparison enables a nuanced exploration of different quality aspects of finger vein images in conjunction with diverse recognition methods. To the best of our knowledge, this study represents the first comprehensive comparison of different quality aspects, considering dataset and recognition method characteristics. We believe that the insights gained from this analysis, alongside the introduction of the vein quality metric, will inspire new research directions and enhance the robustness and reliability of finger vein recognition systems.

This paper is structured as follows: Sect. 2 reviews existing literature on finger vein recognition, focussing on generalisation and image quality aspects. Section 3 details the methodology employed in this study, describing the three recognition methods and the finger vein quality assessment approach. The datasets used in this research, along with an explanation of the conducted experiments, are presented in Sect. 4. Results of these experiments are discussed in Sect. 5. Section 6 delves into the implications of our findings. Finally, Sect. 7 concludes the study by summarising the key findings and outlining potential avenues for future research.

2 Related work

NIR light is primarily absorbed by haemoglobin in blood cells, which reveals veins as dark, ghost-like lines, while the other tissues create contrast by scattering and absorbing NIR light at different levels, forming a finger vein image [7]. Vein patterns are effectively observed in the 700–1000 nanometre (nm) range [8, 9], while the literature converges around 850 nm for finger vein acquisition [1]. Despite using similar NIR wavelengths, finger vein acquisition devices vary in resolution, type and placement of illumination modules, design (open vs. closed), and contact requirements. Research indicates that while similar vein patterns are acquired on different devices, slight variations in patterns can occur. Some recognition methods struggle with these variations, leading to lower comparison performances when comparing across different acquisition devices [5, 10].

Earlier studies on finger vein recognition rely on traditional feature extraction and comparison methods. One notable approach, proposed by Miura et al. [11, 12], leverages the curvature behaviour of vascular patterns to extract the vein patterns as binary templates. The similarity between two vein patterns is determined by correlating the two finger vein templates. The curvature behaviour has proven to be reliable under varying illumination conditions, and the correlation method introduced by Miura et al. [11] can effectively compensate for small translation errors. Owing to its simplicity, robustness against illumination and translations, and competitive performance on public datasets [13], this method has become one of the most commonly used baselines in finger vein recognition.

Recently, deep learning methods, particularly CNNs, have demonstrated strong generalisation over illumination and finger pose variations in recognition. Tang et al. [3] achieve impressive recognition performances on four public finger vein datasets using a Siamese CNN architecture, establishing the state-of-the-art for some of these databases. The contrastive loss employed in their study effectively separates inter- and intra-class distributions, demonstrating superior performance against illumination and pose variations. Another contrastive learning approach is introduced by Kuzu et al. [14, 15]. In their work, the authors train a custom embedding layer on top of a DenseNet backbone and achieve notable results on a challenging finger vein dataset. The authors highlight the effectiveness of transfer learning and the importance of choosing the appropriate loss function to improve recognition performance of vascular patterns.

While the recognition performances are boosted by deep learning methods, only a limited number of studies delve into the reliability of these methods across multiple datasets. Tang et al. [3] are among the first to demonstrate the cross-dataset performance of their proposed Siamese CNN. The Siamese architecture and contrastive loss struggle to generalise learned representations across different datasets, resulting in significantly varying comparison performances in cross-database experiments. Subsequently, Prommegger et al. [4] present cross-dataset segmentation results using a CNN model. The authors conclude that significant differences between training and evaluation datasets result in poorer segmentation results, with most experiments yielding unacceptable outcomes. Some researchers address this challenge through domain adaptation. Noh et al. [16] propose a generative model, CycleGAN, that learns to adapt the style of finger vein images from one dataset to another. While the approach demonstrates an acceptable comparison performance between two public finger vein datasets, it lacks validation on unseen datasets. Chen et al. [17] introduce an auto-encoder that maps greyscale finger vein images to single-modality binary templates, leveraging the universality of vein patterns. They utilise a U-Net to learn a mapping from greyscale images to binary templates, showing promising results on domain adaptation. However, the study does not conclusively address the reliability of the learned mapping across different finger vein datasets. Arican et al. [5] utilise a patch-based auto-encoder aiming to achieve better generalised vein representations. The local approach proves to be more effective in cross-dataset comparisons than a traditional baseline and a state-of-the-art supervised method, without requiring any fine-tuning.

Another factor influencing both the performance and reliability of finger vein recognition is the quality of captured finger vein samples. Peng et al. [18] argue that contrast, entropy, and luminance contribute to the quality, and propose a triangulation method to estimate a quality score for finger vein images. Ma et al. [19] incorporate metrics such as effective area, clarity, and finger shifting, along with contrast, entropy, and luminance. The authors propose a signal-to-ratio based human visual system index to assess finger vein image quality. Remy et al. [20] use natural scene statistics (NSS) to estimate high- and low-quality image distributions, assessing the quality of a new image based on these distributions. Although these studies claim that the proposed metrics are effective in detecting low-quality images, they often evaluate overall image quality rather than exclusively focussing on vein patterns. Some researchers define sample quality based on vein properties. Nguyen et al. [21] estimate vein depth information through cross-sectional profiles of finger vein images, arguing that more vein points found at higher depths indicate higher image quality. Qin et al. [22] exploit curvature information extracted through the Radon transform to estimate sample quality. Qin et al. [23] combine metrics such as connectivity, smoothness, and reliability on binary finger vein images with the curvature information calculated on the grey image through the Radon transform. Despite the inclusion of a separate vein quality measure, methods utilised for vein extraction in these studies can be sensitive to background noise, which can affect the accuracy of analysing vein patterns for quality estimation. Qin et al. [24] and Zeng et al. [25] use deep neural networks to identify low- and high-quality images. In these studies, high- and low-quality samples are labelled automatically through comparison scores, and a deep network is trained on these labels.
Unlike other sample quality research presented here, these two studies directly aim to label images as high or low quality instead of assigning a quality score to images. While deep learning methods show promise in generalising image quality assessment, they often lack detailed analysis of the factors contributing to image quality beyond classification.

This study extends the previous work of Arican et al. [5] by further investigating the factors that contribute to the performance of the CNN at the input level, aiming to gain more insights about these models in finger vein recognition. In addition, we incorporate the traditional Miura Method into the cross-dataset comparisons to examine the generalisation of model parameters across different datasets. Furthermore, we introduce a novel metric to measure vein clarity and complexity using entropy of local vein structures to analyse the relationship between vein clarity and complexity and comparison performances across different recognition methods. We compare our metric with a statistical approach using NSS to further analyse the contribution of different image properties across various recognition methods using different datasets.

3 Methodology

3.1 Maximum curvature and Miura match

The maximum curvature [12] method proposed by Miura et al. exploits the curvature attributes of finger vein patterns (Fig. 1a). This method searches for curvature information across horizontal, vertical, and diagonal directions. Local maxima of the curvatures are determined as vein points. Owing to the reliability of curvature behaviour under varying illumination conditions, the maximum curvature method enhances robustness of vein extraction against illumination variations. The granularity of the extracted vein patterns is closely tied to the standard deviation of the directional filters; smaller standard deviations reveal finer veins, and larger values capture more prominent ones. Optimising this parameter according to the dataset properties is essential for achieving optimal recognition performance. Miura match [11] is a correlation-based comparison method proposed by the same authors. It calculates point-wise correlations between reference and probe binary vein templates. A correlation score is computed for a pair of vein patterns by dividing the number of correlated vein points by the total number of vein points in both templates. Therefore, the correlation score ranges from 0.5 (perfect correlation) to 0.0 (no correlation). This method offers robustness against small translation errors by correlating only the central portion of the probe image (Fig. 1a). Owing to its simplicity and effectiveness in handling illumination and translation variations, the maximum curvature method with Miura match serves as a widely accepted baseline method for finger vein recognition.
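The scoring rule described above can be illustrated with a minimal sketch. It assumes binary templates of equal size, and the crop margins `ch` and `cw` (which bound the translation that can be compensated) are hypothetical parameter names with illustrative values; the tuned values used in this study are not stated here.

```python
import numpy as np

def miura_match(reference, probe, ch=2, cw=2):
    """Miura-style match score in [0, 0.5] for two equal-size binary templates.

    ch, cw are illustrative crop margins (rows, columns), not values from the paper.
    """
    ref = np.asarray(reference, dtype=float)
    prb = np.asarray(probe, dtype=float)
    h, w = ref.shape
    # Keep only the central portion of the probe template.
    crop = prb[ch:h - ch, cw:w - cw]
    kh, kw = crop.shape
    best_overlap, best_total = -1.0, 0.0
    # Slide the cropped probe over the reference and keep the shift
    # with the largest number of coinciding vein points.
    for dy in range(2 * ch + 1):
        for dx in range(2 * cw + 1):
            window = ref[dy:dy + kh, dx:dx + kw]
            overlap = float(np.sum(window * crop))
            if overlap > best_overlap:
                best_overlap = overlap
                best_total = float(np.sum(window) + np.sum(crop))
    # Normalise by the vein points in both (overlapping) templates.
    return best_overlap / best_total if best_total > 0 else 0.0
```

Because the denominator counts vein points in both templates, two identical templates give 0.5 rather than 1.0, which matches the score range stated above.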

3.2 Convolutional neural networks

Convolutional neural networks (CNNs) are a class of deep learning algorithms that have demonstrated exceptional performance in numerous computer vision tasks, including biometric recognition. These architectures are designed to capture spatial relationships between image pixels through successive layers of convolution kernels. Each layer processes features extracted by its predecessor, progressively refining them and generating a more abstract representation of the input data. This succession of kernels enables CNNs to extract complex and sophisticated features directly from raw data, significantly enhancing generalisation in vision tasks compared to traditional feature extraction methods. In the domain of finger vein recognition, CNNs have established state-of-the-art performance on multiple public datasets.

The CNN architecture used in this study, proposed by Kuzu et al. [15], demonstrates an exceptional recognition performance on a challenging finger vein dataset. The architecture (Table 1) extends the DenseNet-161 backbone by incorporating a Custom Embedder and a classifier. During training, only the Embedder and classifier are trained to learn tailored finger vein representations, while the backbone remains frozen throughout. The model is trained on the SDUMLA-HMT dataset [26], known for its significant translation and rotation errors, to learn robust representations of finger vein images. In the evaluation phase, the classification layer is removed, and the output of the Embedder is utilised to compare two finger vein images. Euclidean distance has yielded the best comparison performance in the original study [15]. Our objective is to explore how a recognition method performs under different dataset characteristics than it was trained on. Therefore, we maintain fidelity to the implementation described in the original paper [15], without introducing any modifications or optimisations to the architecture, training procedures, or comparison metrics.

Table 1 The CNN architecture

3.3 Patch-based convolutional auto-encoder

Auto-encoders are a class of neural networks designed for reconstructing their input data. The architecture comprises an encoder and a decoder sub-network that collaboratively compress the input into a lower dimensional latent space and then reconstruct it back to its original dimensions. The objective of minimising the reconstruction error encourages the encoder to capture the most salient features of the input in the latent representation. This unsupervised learning process enables the auto-encoder to learn a compressed representation of the input data, which can be utilised for comparing finger vein images.

This study employs the auto-encoder architecture introduced in the previous work [5]. The architecture (Table 2) incorporates convolution layers in the encoder and de-convolution layers in the decoder, complemented by ReLU activation and batch normalisation. The primary objective of this design is to learn local representations of finger vein patterns, aiming to achieve improved encoding and enhanced generalisation of the learned finger vein representations. By emphasising local information, the CAE reduces the impact of illumination patterns on the learned representations and focuses on capturing generic vein structures, such as branches or bifurcations, rather than the entire vein pattern. This approach facilitates seamless application of the learned representations across various datasets.

Table 2 CAE architecture

The patch-based convolutional auto-encoder (P-CAE) compresses 65 x 65 pixel patches into a 32-dimensional latent vector, which is subsequently used for comparing pairs of finger vein images. The model is trained on the UTFVP dataset [13], owing to its high-quality images, using mean absolute error (MAE) as the loss function, as proposed in the original study [5]. Given that the P-CAE focuses on local information, image pairs are also compared at the local level. After pairing finger vein images, both the reference and probe images are divided into overlapping patch pairs extracted from corresponding locations. Cosine similarity is employed to measure the similarity between patch pairs. The average cosine similarity of these patch pairs serves as the similarity measure for the image pair.
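The patch-level comparison can be sketched as follows. This is an illustration only: the stride of 32 pixels is an assumption (the text states only that patches overlap), and `toy_encode` is a hypothetical stand-in (a fixed random projection) for the trained P-CAE encoder, which is not reproduced here.

```python
import numpy as np

PATCH = 65    # patch size from the text
LATENT = 32   # latent dimension from the text

# Hypothetical stand-in for the trained encoder: a fixed random projection.
_PROJ = np.random.default_rng(0).standard_normal((LATENT, PATCH * PATCH))

def toy_encode(patch):
    """Map a 65 x 65 patch to a 32-dimensional vector (placeholder encoder)."""
    return _PROJ @ np.asarray(patch, dtype=float).ravel()

def extract_patches(image, size=PATCH, stride=32):
    """Yield overlapping square patches in row-major order (stride is assumed)."""
    h, w = image.shape
    for r in range(0, h - size + 1, stride):
        for c in range(0, w - size + 1, stride):
            yield image[r:r + size, c:c + size]

def cosine_similarity(a, b, eps=1e-12):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + eps))

def compare_images(reference, probe, encode, size=PATCH, stride=32):
    """Average cosine similarity over patches taken at corresponding locations."""
    sims = [
        cosine_similarity(encode(p_ref), encode(p_prb))
        for p_ref, p_prb in zip(
            extract_patches(reference, size, stride),
            extract_patches(probe, size, stride),
        )
    ]
    return float(np.mean(sims))
```

Passing the encoder as an argument keeps the comparison logic independent of the model, so a trained encoder could be substituted for the placeholder without changing the rest of the sketch.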

3.4 Finger vein pattern quality estimation

In this study, we propose a vein quality assessment approach to measure the clarity and complexity of vein patterns. We leverage the entropy of local vein patterns extracted using the Frangi vein enhancement method [27]. This method utilises Hessian filters to capture curvature structure in vascular images. By examining the relationships between eigenvalues of the filtered image, the method assigns each pixel a probability of belonging to a vascular structure. Unlike the gradient information proposed by Nguyen et al. [21], Hessian filters are more robust to noise and effectively suppress background information. Combined with eigenvalue analysis, this allows for an effective analysis of vein patterns. Furthermore, the Frangi method integrates multi-scale extraction of tubular structures, enabling analysis of veins with different widths. To compute a quality score for a finger vein image, we first extract the Region of Interest (ROI), completely eliminating the finger edges from the quality metric, and then resize it to 128 x 256 pixels. We compute the entropy of vein structures within 32 x 32 pixel non-overlapping blocks using Eq. 1, where $x_i$ represents the vein probability in the extracted vein image. It is assumed that complex vein structures will exhibit higher entropies (Fig. 2), thereby resulting in a higher vein quality score:

$$\begin{aligned} H(X) = - \sum _{i=0}^{{255}} p(x_i) \log p(x_i) \end{aligned}$$

(1)
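As an illustration of Eq. 1, the sketch below computes the entropy of each non-overlapping 32 x 32 block. It assumes the vein image holds probabilities in [0, 1] that are quantised to 256 levels, and it uses a base-2 logarithm; neither the quantisation scheme nor the logarithm base is specified in the text, so both are assumptions.

```python
import numpy as np

def block_entropies(vein_prob, block=32, levels=256):
    """Shannon entropy of each non-overlapping block of a vein probability map."""
    vein_prob = np.asarray(vein_prob, dtype=float)
    h, w = vein_prob.shape
    # Quantise probabilities in [0, 1] to integer levels 0..levels-1 (assumed).
    q = np.clip(np.round(vein_prob * (levels - 1)), 0, levels - 1).astype(int)
    rows, cols = h // block, w // block
    out = np.zeros((rows, cols))
    for r in range(rows):
        for c in range(cols):
            tile = q[r * block:(r + 1) * block, c * block:(c + 1) * block]
            counts = np.bincount(tile.ravel(), minlength=levels)
            # Empty bins contribute nothing to the sum, so drop them.
            p = counts[counts > 0] / tile.size
            out[r, c] = -np.sum(p * np.log2(p))  # base 2 is an assumption
    return out
```

For a 128 x 256 pixel input this yields a 4 x 8 grid, that is, 32 block entropies per image.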

The vein quality is calculated as a cumulative sum over local entropy values. This is done by counting the number of blocks that exceed varying entropy thresholds and then normalising the counts by the total number of blocks within the image (Eq. 2). N, the number of blocks in an image, is 32 in our experiments. The cumulative sum of normalised counts is utilised as the quality score for finger vein images (Eq. 3). i and j represent the lower and upper bounds of the quality thresholds, which are determined dynamically based on the dataset’s quality scores. In essence, the more complex the vein pattern, the greater the number of blocks with high entropy values, leading to a higher cumulative sum and a higher quality score for the image (Fig. 3):

$$\begin{aligned} R_{\text {thresh}}&= \frac{1}{N} \sum _{b=1}^{N} \left[ \text {BlockEntropy}_{b} > \text {thresh}\right] \end{aligned}$$

(2)

$$\begin{aligned} U&= \sum _{\text {thresh}=i}^{j} R_{\text {thresh}} \end{aligned}$$

(3)
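Equations 2 and 3 can be sketched as below. The function takes the block entropies of one image and a list of thresholds; how the thresholds between the bounds i and j are spaced is not stated in the text, so the caller-supplied grid is an assumption.

```python
import numpy as np

def vein_quality(block_entropy, thresholds):
    """Sum over thresholds of the fraction of blocks whose entropy exceeds each one."""
    ent = np.asarray(block_entropy, dtype=float).ravel()
    n_blocks = ent.size
    # Eq. 2: normalised count of blocks above each threshold.
    ratios = [np.count_nonzero(ent > t) / n_blocks for t in thresholds]
    # Eq. 3: cumulative sum over the threshold range.
    return float(np.sum(ratios))
```

For example, if half of the 32 blocks have entropy 3.0 and the rest 0.0, thresholds of 1, 2, 3 and 4 give ratios 0.5, 0.5, 0 and 0, so the score is 1.0.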

Although more complex blocks exhibit higher entropy values, strong and straight vein patterns, often found around finger edges, can also exhibit high entropies without significant complexity. To mitigate the influence of these structures on the quality assessment, we employ a spatial weight mask (Fig. 4). The mask, matching the dimensions of the finger region (128 x 256 pixels), is derived from a 2-dimensional generalised Gaussian distribution with a standard deviation of 13 and a scale of 0.05. Figure 4b illustrates a cross-section of the mask, showing weights concentrated around the finger centre and tapering towards zero at the upper and lower edges. The gradual reduction minimises contributions from edge regions. After multiplying the extracted veins with this mask, we compute local vein entropies.

Through our observations on image pairs and their comparison scores, we have identified four potential groups of finger vein images, each representing an observed quality level. These observations are primarily based on the observed clarity and complexity of vein patterns. The observed quality groups, namely Optimal, Adequate, Limited, and Inadequate, are explained below.

Optimal: veins with good contrast and complex structures. Images in this group result in the most reliable comparisons since they have enough vein complexity.
Adequate: veins with acceptable contrast and complexity, though patterns may be partially captured or slightly inferior to the Optimal group. Images in this group still provide sufficient confidence for comparisons.
Limited: images featuring weak structures, such as simple line-like veins, which fail to generate enough confidence when compared to similar images.
Inadequate: images lacking visible vein patterns, significantly reducing comparison performance and posing a security threat to the system.

Figure 5 shows samples from each category. It is important to note that these groups do not serve as quality labels for finger vein images. Instead, they provide a visual baseline for validating our vein quality metric, ensuring it can assess vein clarity and complexity in a manner that aligns with the observed image quality. Therefore, we utilise these groups exclusively to validate the effectiveness of our vein quality metric against the observed finger vein image quality.

We also compare our vein quality metric with a statistical image quality estimation approach, the natural image quality evaluator (NIQE) [6]. NIQE is a no-reference image quality assessment algorithm that estimates image quality by analysing NSS features extracted from high-quality and distortion-free images. A multivariate Gaussian model is fitted to these features during its training phase, defining the expected image quality distribution. In the evaluation phase, NIQE computes the distance of NSS features of new images from this learned distribution, with a lower score indicating higher image quality. Unlike our vein quality metric, which focuses on evaluating vein clarity and complexity, NIQE primarily exploits local contrast and variation in greyscale images. This means it includes background statistics in addition to vein information. In this study, we utilise MATLAB’s implementation of NIQE with default parameters. Following the recommendation by Remy et al. [20], we train the NIQE model on the Optimal images from each dataset, which represent the highest quality images. We specifically focus on the region between the finger edges to ensure that the model learns statistics relevant to the finger region.
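The distance step of NIQE can be illustrated with a short sketch. It shows only the comparison between two multivariate Gaussian models (a mean vector and a covariance matrix each), as defined in the NIQE formulation [6]; the NSS feature extraction is omitted, and this is not the MATLAB implementation used in this study. The use of a pseudo-inverse is an assumption made for numerical safety.

```python
import numpy as np

def niqe_distance(mu_ref, cov_ref, mu_test, cov_test):
    """Distance between a reference MVG model and the model of a test image.

    Lower values mean the test statistics are closer to the reference model.
    """
    diff = np.asarray(mu_ref, dtype=float) - np.asarray(mu_test, dtype=float)
    pooled = (np.asarray(cov_ref, dtype=float) + np.asarray(cov_test, dtype=float)) / 2.0
    # Pseudo-inverse guards against a singular pooled covariance (assumption).
    return float(np.sqrt(diff @ np.linalg.pinv(pooled) @ diff))
```

In this setting the reference model would be the one fitted on the Optimal images of a dataset, so the score expresses how far a new image's statistics lie from those of the best observed images.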

It is important to note that the aim of this study is not to directly address all observed quality issues in finger vein images, nor to establish a definitive finger vein image quality metric. Our goal in this study is to explore an aspect of finger vein sample quality related to vein information. We believe that this study introduces a different perspective in understanding and evaluating finger vein image quality.

4 Datasets and experimental setup

4.1 Datasets

This subsection introduces the finger vein datasets utilised in this study. Table 3 presents a summary of each database and Fig. 6 shows a few samples from each dataset.

4.1.1 University of Twente (UTFVP)

The University of Twente Finger Vein Pattern Dataset (UTFVP) [13] consists of high-quality finger vein images from 60 subjects. For each subject, index, middle, and ring fingers from both hands are captured four times, resulting in a total of 1440 finger vein images. The device captures 380 x 672 pixel images using 850 nm NIR top-side illumination. While the dataset generally exhibits good contrast, some images show slight translations and rotations.

4.1.2 Shandong University (SDUMLA-HMT)

The SDUMLA-HMT [26] is a multi-modal biometric dataset by Shandong University. The finger vein subset used in this study comprises images from 106 subjects, with the index, middle, and ring fingers of both hands captured six times for each subject. In total, the dataset includes 3816 finger vein images (106 subjects x 6 fingers x 6 captures). Images in this dataset have a resolution of 240 x 320 pixels, and the device utilises 890 nm NIR illumination modules. The quality of this dataset is considered low; a significant portion of the images exhibit low contrast, and there are noticeable translations and rotations between different captures of the same finger. In addition, it is observed that the second set of 53 subjects exhibits more severe translation and rotation errors than the first set of 53 subjects.

4.1.3 Peking University (PKU)

The finger vein dataset collected by Peking University comprises images from 200 subjects. In this dataset, only one finger per subject is captured eight times. However, some of the subjects have defective images that are unusable for recognition purposes. After filtering out defective ones, the dataset contains 1528 finger vein images. The resolution of the images is 384 x 512 pixels. Overall, the dataset features bright but low contrast finger vein images, along with noticeable translation and rotation errors.

Table 3 Finger vein datasets

4.2 Experiments and evaluation protocols

This section provides an overview of the experiments, evaluation protocols and metrics employed in this study. Detailed descriptions of the experiments are presented in Sect. 4.2.1, while the explanation of the evaluation protocol and metrics can be found in Sect. 4.2.2.

4.2.1 Experimental setup

The previous work [5] suggests that finger edges may contribute to the performance of the CNN model. To investigate this further, a set of experiments is conducted with various input formats. Figure 7 shows samples for each input format.

Original finger vein image: finger vein image with the edges. Image background is removed by setting anything around the finger to zero. This format serves as a baseline for subsequent experiments.

Enhanced finger vein image: vein patterns are enhanced using maximum curvature. The enhanced image is a weighted sum of the grey scale image and binary veins (see Eq. 4). Sigma value for vein extraction is set to 5.0, and alpha is determined as 0.2 through visual inspection:

$$\begin{aligned} image_{enhanced} = image_{grey} - \alpha \cdot image_{vein} \end{aligned}$$

(4)

Blurred finger vein image: finger vein image is blurred to reduce the appearance of veins using Gaussian blur with a standard deviation of 2.

ROI image: defined as the finger region between the maximum of the upper finger edge and the minimum of the lower edge. Sobel filter is utilised to detect finger edges.

Enhanced ROI image: the same vein enhancement (Eq. 4) is applied on the ROI image.

Blurred ROI image: the same Gaussian blur is applied on the ROI image.
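The enhancement (Eq. 4) and blurring steps used to build these input formats can be sketched as follows. The sketch assumes greyscale intensities scaled to [0, 1] and a binary vein mask with values 0 and 1; the clipping, the kernel truncation at four standard deviations, and the reflective padding are illustrative choices rather than details taken from the study.

```python
import numpy as np

def enhance(grey, vein_mask, alpha=0.2):
    """Eq. 4: darken vein pixels by subtracting the weighted binary vein mask."""
    out = np.asarray(grey, dtype=float) - alpha * np.asarray(vein_mask, dtype=float)
    return np.clip(out, 0.0, 1.0)  # clipping to [0, 1] is an assumption

def gaussian_kernel(sigma, truncate=4.0):
    """Normalised 1-D Gaussian kernel truncated at `truncate` standard deviations."""
    radius = int(truncate * sigma + 0.5)
    x = np.arange(-radius, radius + 1)
    k = np.exp(-0.5 * (x / sigma) ** 2)
    return k / k.sum()

def gaussian_blur(image, sigma=2.0):
    """Separable Gaussian blur with reflective padding; output keeps the input shape."""
    k = gaussian_kernel(sigma)
    pad = len(k) // 2
    padded = np.pad(np.asarray(image, dtype=float), pad, mode="reflect")
    # Filter along rows, then along columns.
    rows = np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), 0, rows)
```

The same two functions would be applied to either the full finger image or the ROI image to obtain the enhanced and blurred variants of each.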

The CNN architecture introduced in Sect. 3.2 is trained separately on six different versions of the input images, for each dataset, with each model trained for 120 epochs, as suggested in the original study [15]. This results in six trained models for each dataset, making a total of 18 models across all datasets. Input images are resized to 224 x 224 pixels to align with the input dimensions of the DenseNet architecture.

In this study, we repeat the cross-dataset comparison experiments of the previous work [5] and additionally include the traditional method. Although the traditional method is not trainable, both maximum curvature and Miura match require hyper-parameter tuning for optimal recognition performance on each dataset. Once the parameters are optimised for one dataset, they are applied to a different evaluation dataset in the cross-dataset experiments. This approach aims to emulate the training and evaluation steps of the CNN and P-CAE models.
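The protocol described above, in which each method is fitted (trained or tuned) on one dataset and then evaluated on every dataset, can be sketched generically. The `fit` and `evaluate` callables and the dataset layout are hypothetical placeholders, and the toy numbers carry no meaning.

```python
# Illustrative sketch; `fit`, `evaluate` and the data layout are assumptions.

def cross_dataset_results(datasets, fit, evaluate):
    """Fit once per training dataset, then evaluate on every dataset.

    datasets: mapping name -> {"train": ..., "eval": ...}
    fit: callable that trains a model or tunes hyper-parameters
    evaluate: callable returning a performance figure for fitted parameters
    Returns a mapping (train_name, eval_name) -> performance.
    """
    results = {}
    for train_name, train_data in datasets.items():
        params = fit(train_data["train"])
        for eval_name, eval_data in datasets.items():
            results[(train_name, eval_name)] = evaluate(params, eval_data["eval"])
    return results


# Toy stand-ins: "fitting" takes a mean, "evaluating" measures the mean deviation.
toy = {
    "A": {"train": [1.0, 3.0], "eval": [2.0]},
    "B": {"train": [4.0, 6.0], "eval": [5.0]},
    "C": {"train": [9.0, 11.0], "eval": [10.0]},
}
results = cross_dataset_results(
    toy,
    fit=lambda train: sum(train) / len(train),
    evaluate=lambda p, ev: sum(abs(x - p) for x in ev) / len(ev),
)
```

Entries with identical training and evaluation names correspond to the single-dataset setup; all others are cross-dataset comparisons.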

The proposed vein quality assessment approach is validated on each dataset using the quality groups described in Sect. 3.4. During validation, vein quality scores of selected images are computed, and we observe how well our quality metric aligns with observed image quality through vein quality score histograms. Based on these observations, we establish quality thresholds to generate evaluation sets with quality constraints. Similarly, NIQE is evaluated using the same quality groups and protocol to determine its thresholds and generate quality-constrained evaluation sets with this metric. Subsequently, all recognition models are evaluated on these quality-constrained evaluation sets to assess the effectiveness of both our vein quality metric and NIQE. Due to time constraints, evaluations are conducted using only two thresholds.

4.2.2 Evaluation protocols and metrics

All recognition models are trained/optimised and evaluated separately on the exact same subset of images for each dataset. Each dataset is used independently for training and evaluation; models are not trained on combined datasets. A summary of the training and evaluation sets is provided in Table 4. This study follows an open-set evaluation protocol, meaning the evaluation sets consist of subjects different from those in the training sets. For the SDUMLA-HMT and PKU datasets, half of the subjects are allocated to the training set, while the remaining half are used to generate the evaluation set. Due to the smaller size of the UTFVP dataset, one-third of the subjects are reserved for training, and the remaining two-thirds are used for evaluation.
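A subject-disjoint split of this kind can be sketched as follows. The random selection, the seed, and the subject identifiers are assumptions; the text does not state how subjects are assigned to each set.

```python
import random


def open_set_split(subject_ids, train_fraction, seed=0):
    """Split subjects into disjoint training and evaluation sets.

    No subject appears in both sets, which is what makes the protocol
    open-set. Random assignment is an assumption of this sketch.
    """
    ids = sorted(subject_ids)
    random.Random(seed).shuffle(ids)
    n_train = round(len(ids) * train_fraction)
    return set(ids[:n_train]), set(ids[n_train:])


# Hypothetical subject identifiers, for illustration only.
subjects = range(12)
half_train, half_eval = open_set_split(subjects, 1 / 2)    # SDUMLA-HMT / PKU style
third_train, third_eval = open_set_split(subjects, 1 / 3)  # UTFVP style
```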

Table 4 Summary of training and evaluation data


The performance of the models is assessed using the false non-match rate at a false-match rate of 1% or less (FMR100), at a false-match rate of 0.1% or less (FMR1000), and at zero false-match rate (ZeroFMR). FMR100 is used to represent more typical industrial standards, whereas FMR1000 reflects a higher security level for the system. ZeroFMR measures the false non-match rate at the operating point where the false-match rate is zero, that is, where no non-mated comparison is falsely accepted.
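The three operating points can be computed from lists of mated and non-mated comparison scores. The sketch below assumes similarity scores (higher means more similar) and an exhaustive search over observed scores as thresholds; the function names and toy scores are illustrative assumptions.

```python
# Illustrative sketch; function names and toy scores are assumptions.

def fnmr_at_fmr(mated, non_mated, max_fmr):
    """False non-match rate at the lowest threshold whose FMR is <= max_fmr.

    Scores are similarities: a pair is accepted when score >= threshold.
    For distance scores (e.g. Euclidean distances), negate them first.
    """
    candidates = sorted(set(mated) | set(non_mated))
    # An infinite threshold rejects every pair, so FMR = 0 is always reachable.
    candidates.append(float("inf"))
    for threshold in candidates:  # FMR can only fall as the threshold rises
        fmr = sum(s >= threshold for s in non_mated) / len(non_mated)
        if fmr <= max_fmr:
            return sum(s < threshold for s in mated) / len(mated)


def fmr100(mated, non_mated):
    return fnmr_at_fmr(mated, non_mated, 0.01)


def fmr1000(mated, non_mated):
    return fnmr_at_fmr(mated, non_mated, 0.001)


def zero_fmr(mated, non_mated):
    return fnmr_at_fmr(mated, non_mated, 0.0)


# Toy scores, made up for illustration only.
mated_scores = [0.9, 0.8, 0.7, 0.4]
non_mated_scores = [0.1, 0.2, 0.3, 0.5]
print(zero_fmr(mated_scores, non_mated_scores))  # 0.25
```

With only four non-mated scores, the three metrics coincide in this toy case; they separate only when the non-mated set is large enough to resolve false-match rates of 1% and 0.1%.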

In addition to these performance metrics, comparison score histograms are used to examine how different setups affect comparison scores on each model. Detection error trade-off (DET) curves are employed to visualise the impact of the quality assessment approach on the false non-match and false-match errors.

5 Results

The previous study by Arican et al. [5] highlights significant performance discrepancies with the CNN on cross-dataset comparisons, suggesting a potential correlation with the input format. Section 5.1 presents the findings on the influence of different input formats on the performance of the CNN model across datasets. Section 5.2 examines the impact of the input format on the generalisation of the CNN model and then compares the generalisation capabilities of the three recognition methods in cross-dataset comparisons. Finally, Sect. 5.3 examines the impact of the vein quality assessment on the performance of these recognition methods.

5.1 Impact of the input format on the performance of the CNN

Table 5 displays the performance of the CNN model across different image formats, measured in terms of FMR100 and FMR1000 (%). The increase in FMR100 indicates the significant impact of finger edges on the UTFVP and PKU datasets. For the UTFVP dataset, FMR100 increases from 3.75% to 11.91% when using the ROI image. Similarly, for the PKU dataset, FMR100 rises from 6.95% to 43.32%, demonstrating the importance of the finger edges in image comparisons for this dataset. Histogram plots (Figs. 8, 9, 10) reveal that the tails of both mated and non-mated score histograms become denser with the omission of finger edges across all three datasets. This implies that without edges, mated pairs are slightly less similar, and non-mated pairs appear slightly more similar.

Table 5 Performance comparison of the CNN with different input formats in FMR100 and FMR1000 (%)


Vein enhancement positively impacts the CNN’s performance when the ROI images are used as input. Table 5 shows improved performance across all datasets with the enhanced ROI images compared to ROI images, in terms of FMR100. Specifically, the performance significantly improves on the PKU dataset, where the FMR100 decreases from 43.32% to 17.16% with the enhanced ROI images. Although the difference is minor, Fig. 10a, b reveals slightly longer and denser tails in the mated pair histograms when the vein information is enhanced on the finger images.

Unlike vein enhancement, blurring the images adversely affects the recognition performance across all datasets when ROI images are used. On the PKU dataset, the FMR100 nearly doubles when the blurred ROI images are used as input. Histogram plots (Figs. 8, 9, 10) reveal a slight decrease in both mated and non-mated distances with blurred input images. This is primarily observed in the tails of the mated and non-mated histograms, suggesting that mated and non-mated pairs become more similar with deteriorated vein information.

However, when evaluated using the stricter FMR1000 metric, only the SDUMLA-HMT dataset demonstrates a relatively acceptable performance, whereas the other datasets fall significantly short of the higher security standard. For example, on the UTFVP dataset, using the Enhanced ROI format results in a 23.06% FMR1000. Despite the improvement compared to the Enhanced Image format, the performance falls short of meeting stringent security requirements. The CNN fails to demonstrate a competitive performance when higher security is a concern, regardless of the input format.

5.2 Generalisation capability of the methods on the cross-dataset comparisons

Before assessing the cross-dataset performances of the three models, a series of experiments is conducted to explore the compatibility of different input formats of the CNN in the cross-dataset comparisons. Table 6 presents the performance of various input formats on both single and cross-dataset evaluations across three finger vein datasets in terms of FMR1000 (%). The rows present the evaluation dataset, while the first column of the table indicates the training dataset. Upon initial inspection, neither finger images nor ROI images effectively handle variations across datasets with the CNN model. In almost all cases, the recognition performance notably deteriorates in cross-dataset setups.

Table 6 Cross-dataset comparison performances of input formats on UTFVP, SDUMLA-HMT, and PKU datasets with the CNN in FMR1000 (%)


On the SDUMLA-HMT dataset, despite its impressive single-dataset performance, recognition performance significantly deteriorates in cross-dataset comparisons, irrespective of the presence of finger edges. The FMR1000 increases from 4.53% to 45.73% and 33.07% when the training sets are chosen as UTFVP and PKU, respectively. While the ROI input format narrows this performance gap between single- and cross-dataset evaluations, achieving 25.85% and 20.99% FMR1000 on UTFVP and PKU datasets, it remains incompatible with the 10.85% FMR1000 presented with the single-dataset setup. A similar trend is observed on the PKU dataset as well. The performance of the CNN deteriorates from 26.08% FMR1000 to 81.87% and 66.74% on UTFVP and SDUMLA-HMT datasets. However, unlike the SDUMLA-HMT, the ROI input format does not lead to any improvement in performance on the PKU dataset. Conversely, on the UTFVP dataset, the ROI format proves to be the most compatible for cross-dataset comparisons, particularly when the CNN is trained on the PKU dataset. The FMR1000 decreases from 23.06% to 17.15% when the PKU is the training set.

The cross-dataset experiments in the previous work [5] are extended to include the traditional Miura method. The CNN is evaluated using the enhanced ROI format for cross-dataset comparisons, as both the Miura method and P-CAE exclude finger edges and background information when comparing finger vein images. Table 7 presents the cross-dataset comparison performances of all three models in terms of FMR1000 (%). The rows of the table represent the training dataset, while the first column indicates the evaluation set.

Table 7 Comparison performances of the models for single and cross-dataset evaluations in FMR1000 (%)


The Miura method demonstrates similar FMR1000 performance on the UTFVP dataset in cross-dataset comparisons, even though the model parameters vary across experiments. Figure 11a also demonstrates almost identical behaviour with the PKU dataset. However, when using the SDUMLA-HMT parameters for evaluation, image pair correlation scores increase, most prominently for the non-mated pairs. This behaviour is similarly observed in the PKU dataset (Fig. 11c). While the performance is more similar with the UTFVP dataset, using SDUMLA-HMT parameters increases the correlation scores, particularly on the non-mated pairs. Conversely, SDUMLA-HMT demonstrates notable changes in the performance with the varying model parameters. The FMR1000 rises from 16.9% to 34.3% and 72.5% when using the parameters from UTFVP and PKU datasets for evaluation, respectively. In addition, a significant change in the non-mated correlation scores is observed, especially when using the PKU dataset parameters for evaluation (Fig. 11b).

The cross-dataset comparison performances of the trainable models align with the results presented in the previous study by Arican et al. [5]. The performance of the CNN fluctuates significantly with differences between training and evaluation set characteristics. For instance, the FMR1000 of the UTFVP dataset decreases from 23.06% to 17.15% when the CNN is trained using the PKU dataset, despite the PKU being of lower quality compared to UTFVP. Conversely, training with higher quality data worsens the performance on the PKU, increasing the FMR1000 from 56.40% to 77.92% when the model is trained using the UTFVP. Furthermore, the histogram plots (Fig. 12) reveal that in the cross-dataset setup, both mated and non-mated pair distances decrease. This suggests that, in cross-dataset comparisons, the CNN tends to increase the similarity between all image pairs, regardless of the training–evaluation set pairs.

On the other hand, the P-CAE exhibits remarkably consistent comparison performance across all three datasets, regardless of the variations between training and evaluation set pairs. Although non-mated pair similarities vary slightly with the change of training dataset, histogram plots (Fig. 13) demonstrate a stable behaviour of the model across different training datasets. Specifically, on the SDUMLA-HMT dataset, the histogram plot (Fig. 13a) shows nearly identical behaviour regardless of the training dataset. Moreover, it is observed that the non-mated similarity scores tend to be lower when using the SDUMLA-HMT as the training data.

5.3 Finger vein image quality assessment

5.3.1 Validation of the proposed vein quality metric

We validate our vein quality metric on the quality groups introduced in Sect. 3.4 to determine if the metric can effectively measure the observed vein quality of the images. The box plots (Fig. 14) show the quality scores of each group across each dataset. Our quality metric clearly separates Optimal and Adequate images from Limited and Inadequate images. In addition, Inadequate images, which lack clear vein patterns, score lower than the other groups, indicating the metric’s ability to distinguish between images with and without vein patterns. Furthermore, our vein quality metric consistently distinguishes between quality groups across diverse datasets, regardless of their varied characteristics.

When comparing our quality metric with NIQE, notable differences emerge. NIQE struggles to efficiently distinguish between vein quality groups in both the SDUMLA-HMT (Fig. 15b) and PKU (Fig. 15c) images. While NIQE can distinguish Inadequate images in the UTFVP dataset, Fig. 15a also reveals the limitation of the model in discerning between different levels of vein clarity, as the score ranges for Optimal and Limited images completely overlap.

5.3.2 Effectiveness of the vein quality metric in the comparison performance

We evaluated three recognition methods on quality-constrained evaluation sets to assess the effectiveness of our vein quality and NIQE metrics. Three quality thresholds are defined for the evaluation sets. T0 denotes the scenario without quality assessment, while T1 and T2 correspond to thresholds of 0.20 and 0.25 for the vein quality metric, and 5.0 and 4.0 for NIQE. Thresholds are determined based on the separation observed among vein quality groups for both metrics (Figs. 14, 15). NIQE achieves a sub-optimal separation of quality classes, effectively distinguishing Inadequate images from the other quality groups only on the UTFVP dataset. Although our goal is also to distinguish Limited images, the separation achieved on the UTFVP is more acceptable compared to the other datasets. Therefore, the thresholds for NIQE are based on this dataset.
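The construction of the quality-constrained evaluation sets can be sketched as a filter over image pairs. The threshold values are those stated above; the strict inequalities, the requirement that both images of a pair pass, and all image names and scores are assumptions of this sketch.

```python
# Thresholds as stated in the text; T0 means no quality constraint.
VEIN_QUALITY_THRESHOLDS = {"T1": 0.20, "T2": 0.25}  # higher score = better
NIQE_THRESHOLDS = {"T1": 5.0, "T2": 4.0}            # lower score = better


def passes(score, threshold, higher_is_better):
    """Strict comparison; how boundary values are handled is an assumption."""
    return score > threshold if higher_is_better else score < threshold


def constrain_pairs(pairs, quality, threshold, higher_is_better):
    """Keep only comparison pairs in which both images pass the threshold."""
    return [
        (a, b)
        for a, b in pairs
        if passes(quality[a], threshold, higher_is_better)
        and passes(quality[b], threshold, higher_is_better)
    ]


# Hypothetical image names and scores, for illustration only.
vein_quality = {"img1": 0.31, "img2": 0.22, "img3": 0.12}
niqe = {"img1": 3.5, "img2": 4.6, "img3": 6.2}
pairs = [("img1", "img2"), ("img1", "img3"), ("img2", "img3")]

vq_t1 = constrain_pairs(pairs, vein_quality, VEIN_QUALITY_THRESHOLDS["T1"], True)
vq_t2 = constrain_pairs(pairs, vein_quality, VEIN_QUALITY_THRESHOLDS["T2"], True)
niqe_t1 = constrain_pairs(pairs, niqe, NIQE_THRESHOLDS["T1"], False)
niqe_t2 = constrain_pairs(pairs, niqe, NIQE_THRESHOLDS["T2"], False)
```

Tightening either threshold shrinks the evaluation set, which is one reason error rates measured under T2 rest on fewer comparisons than those under T0.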

Table 8 Performance comparison with vein quality assessment for two thresholds


Table 8 presents the FMR1000 and ZeroFMR performances across three datasets using three recognition methods under varying quality constraints on the input images. Overall, our quality metric demonstrates improved performance as the vein quality increases. Notably, both the CNN and P-CAE models show significant enhancement in ZeroFMR when the images exhibit vein quality above 0.25. For instance, on the SDUMLA-HMT dataset, ZeroFMR decreases markedly from 34.98% to 8.18% with the higher vein quality constraint. Similarly, on the PKU dataset, ZeroFMR nearly halves under similar conditions. Conversely, the UTFVP dataset shows more modest improvements with increasing vein quality constraints. It is worth noting that the image quality in this dataset is already high, and the evaluation set includes fewer image pairs compared to the other two datasets.

Compared to our vein quality metric, NIQE shows less improvement in comparison performance. In some instances, applying a higher quality constraint with NIQE even results in worse performance than using no constraint at all. For example, on the UTFVP and PKU datasets using the CNN method, the FMR1000 metric indicates poorer performance with higher quality constraints. On the UTFVP dataset using the CNN, FMR1000 is 13.68% without any quality constraint. However, when selecting images with a NIQE score below 4 for evaluation, the FMR1000 increases to 20.69%. Similarly, on the PKU dataset, the FMR1000 increases from 29.71% to 38.14% when higher quality images are used for evaluation.

DET curves (Fig. 16) visualise the findings presented in Table 8. Generally, both quality metrics improve the recognition performance, as evidenced by reduced error rates. However, for some dataset-recognition method pairs, performance worsens under higher quality constraints. For example, when evaluating the PKU dataset using the Miura method (Fig. 16g), both metrics result in overall worse performance compared to the scenario without any quality constraints.

6 Discussion

6.1 Impact of the input format on the performance of the CNN

The experiments show that excluding finger edges from the input results in higher dissimilarity scores for mated pairs, leading to decreased performance on the UTFVP and PKU datasets. When analysing Finger image and ROI image pairs, it becomes evident that finger edges often overshadow other features such as vein patterns. If the edges are similar, variations in vein patterns tend to be overlooked. Figure 17 demonstrates such a mated pair from the PKU dataset. In the presence of finger edges, the pair is considered a true match (Fig. 17a) with respect to the equal error rate (EER) threshold. In contrast, the ROI image pair (Fig. 17b) is misclassified as a false non-match due to the emphasised rotation in the vein pattern.

Although finger edges contain identity information, they can lead to inaccuracies in certain situations. For example, if one image lacks vein structures yet the edges still preserve similar features, potentially due to a capture error, the CNN tends to consider such pairs more similar than they actually are. Figure 18 exemplifies such a mated pair from the UTFVP dataset. The reference image exhibits far fewer vein patterns than the probe image, and in the presence of the edges (Fig. 18a), the similarity in strong edge information conceals the differences in vein patterns, leading the pair to be classified as a true match under the EER threshold, despite insufficient overall vein similarity. Conversely, in the ROI image (Fig. 18b), the disparities are highlighted, resulting in a Euclidean distance of 1.22, which exceeds the EER threshold.
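The decision rule used in these examples can be sketched as follows, assuming the CNN compares fixed-length embeddings by Euclidean distance and that the EER threshold is taken as the observed distance at which the false-match and false non-match rates are closest. The function names and all distances below are made-up illustrations; the actual EER threshold is not reported in this section.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def eer_threshold(mated, non_mated):
    """Distance threshold at which FMR and FNMR are closest.

    A pair is accepted when its distance is <= threshold.
    """
    best_t, best_gap = None, float("inf")
    for t in sorted(set(mated) | set(non_mated)):
        fnmr = sum(d > t for d in mated) / len(mated)
        fmr = sum(d <= t for d in non_mated) / len(non_mated)
        gap = abs(fnmr - fmr)
        if gap < best_gap:
            best_t, best_gap = t, gap
    return best_t


def is_match(distance, threshold):
    return distance <= threshold


# Toy distances, made up for illustration only.
mated_distances = [0.2, 0.4, 0.5, 1.3]
non_mated_distances = [0.9, 1.1, 1.4, 1.5]
threshold = eer_threshold(mated_distances, non_mated_distances)
```

In this toy setting the mated pair with distance 1.3 lies above the threshold and is counted as a false non-match, which mirrors the situation described for the ROI image pair.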

Although vein enhancement improves recognition performance in ROI images, it slightly degrades performance on the SDUMLA-HMT and PKU datasets when the finger edges are present. Figure 19 showcases a mated pair from the SDUMLA-HMT dataset. The vein enhancement step effectively strengthens the appearance of veins (Fig. 19b), revealing the rotated vein pattern that is not prominent in the unprocessed image (Fig. 19a). With the enhancement, the CNN begins to focus more on the differences in vein patterns, resulting in an increase in the Euclidean distance from 0.52 to 0.70.

6.2 Generalisation capabilities of the models

The CNN experiments suggest that employing different input formats does not enhance the generalisation abilities of the model in cross-dataset comparisons. All training–evaluation set pairs with all input formats, except the UTFVP–PKU pair with ROI images, demonstrate highly incompatible performances compared to the single-dataset setup. Furthermore, the quality of the training sets appears to have an inverse impact on cross-dataset performance: training the model on high-quality data does not ensure compatible performance when evaluated on low-quality data. This could be related to the strength of the features in high- and low-quality images. For example, high-quality images like those in the UTFVP dataset exhibit stronger veins with higher contrast, allowing the model to learn more pronounced features that might not be triggered by the subtler features of low-quality images. In addition, histogram plots (Fig. 12) reveal that the CNN tends to reduce dissimilarity scores, particularly on non-mated pairs, in the cross-dataset comparisons, regardless of the training–evaluation pairs. This suggests that the CNN may be overly reliant on generic features from the finger vein dataset, which perform well in single-dataset comparisons owing to the similar distribution, but fail to capture more specific attributes of the finger vein images necessary for generalisation across other datasets. The relatively small size of finger vein datasets may hinder the CNN from learning more specific features, and the architecture may not be suitable to capture such nuanced attributes. Furthermore, the Euclidean distance, while effective in single-dataset comparisons, may not be the most suitable when distributions vary across datasets. Addressing these concerns requires further investigation.

With the Miura Method, the UTFVP and PKU datasets demonstrate greater compatibility in cross-dataset comparisons, whereas SDUMLA-HMT behaves differently. Observations on this method reveal a strong relationship between model parameters and dataset characteristics. For example, the sigma parameter for vein extraction should be set higher for higher resolution images to achieve optimal performance, and lower for lower resolution data. In addition, the window size of the comparison method is optimised based on the overall translation error in the dataset. While the UTFVP and PKU datasets share similar resolutions, SDUMLA-HMT comprises lower resolution images. Both SDUMLA-HMT and PKU datasets exhibit higher translation errors compared to UTFVP, requiring a smaller window for comparison on these datasets. When the sigma value optimised for PKU is applied to SDUMLA-HMT, the maximum curvature tends to overlook finer veins, resulting in information loss. Furthermore, the smaller comparison window of PKU limits vein information, leading to increased correlation scores for the SDUMLA-HMT dataset. Figure 20 exemplifies the impact of PKU parameters on a non-mated pair from SDUMLA-HMT.

6.3 Vein quality assessment

In this study, we propose a novel vein quality metric designed to measure the observed clarity and complexity of finger vein images based on visual observations and comparison scores with different recognition methods. In that sense, our approach represents a traditional method for assessing finger vein sample quality, which may be critiqued for focussing only on certain aspects, such as vein clarity and complexity, while potentially missing other facets of sample quality. However, traditional methods can offer a deeper understanding and better explainability. For example, our vein quality assessment approach can pinpoint regions posing challenges in vein clarity and complexity (Fig. 3). Moreover, it is observed that it can effectively represent observed image quality across varying datasets (Fig. 14). Conversely, such nuanced analysis is less straightforward with NIQE, a trainable model capable of exploring various dimensions of image quality.

Despite its demonstrated effectiveness, our vein quality assessment approach exhibits limitations in detecting Limited images. These images often feature strong but simplistic vein patterns. Due to their strong appearance, the entropy measure sometimes fails to adequately capture complexity. The primary risk with Limited images lies in their simple vein patterns, which increase the chance of accidental similarities. Due to the absence of complex vein structures that typically contribute to dissimilarities, such incidental similarities result in higher similarity scores than expected, particularly on non-mated pairs. Figure 21 provides an illustration of such a non-mated pair from the PKU dataset using the Miura method.
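One generic way in which an entropy value can miss structural complexity is shown below. The entropy measure of the proposed metric is defined in Sect. 3.4 and is not reproduced here; this sketch assumes a plain histogram-based Shannon entropy, which may differ from the authors' measure, and the two binary patterns are made up.

```python
import math
from collections import Counter


def shannon_entropy(image):
    """Shannon entropy (in bits) of the pixel-value histogram of an image."""
    counts = Counter(pixel for row in image for pixel in row)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Two toy binary vein maps with the same number of vein pixels:
# a single straight vein and a crossing pattern.
straight = [
    [0, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
crossing = [
    [1, 0, 0, 1],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
entropy_straight = shannon_entropy(straight)
entropy_crossing = shannon_entropy(crossing)
```

Both patterns yield the same value because a histogram-based entropy depends only on how many pixels take each value, not on how those pixels are arranged.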

Unlike our vein quality metric, NIQE was primarily proposed for natural images and has demonstrated effectiveness in that context [6]. However, in our experiments, despite being trained on each dataset, NIQE struggles to accurately assess finger vein properties and distinguish between high- and low-quality images, particularly on low-quality datasets (Fig. 15). The NIQE score is computed based on local contrast and variance, to which vein structures are likely to contribute minimally due to their low contrast and high sparsity. Therefore, the metric appears to favour overall brightness and contrast in an image over vein clarity. Figure 22 shows some images from the SDUMLA-HMT and PKU datasets where NIQE fails to assess image quality. With the selected quality thresholds, both images on the left, lacking clear vein structures, are retained in the evaluation set, whereas the images on the right, which clearly exhibit veins, are omitted because they receive higher scores from NIQE. This misinterpretation of finger image quality especially affects models like the Miura Method that rely exclusively on vein structures, leading to poorer comparison results even with higher quality constraints applied to the evaluation set. While NIQE may suggest a higher quality on the evaluation set, the actual quality is likely to be much lower, especially with low-quality datasets. In such cases, our quality metric successfully measures vein quality in the images, providing more accurate assessments.

7 Conclusion and future work

In this study, we analyse the disparities reported by Arican et al. [5] regarding the CNN performance, and introduce a novel vein quality metric which exclusively quantifies vein clarity and complexity in finger vein images. Our findings suggest that the CNN may overly rely on features other than vein patterns for finger vein comparison. In addition, our vein quality metric effectively estimates observed image quality properties such as vein clarity and complexity, demonstrating its effectiveness by reducing ZeroFMR on a low-quality dataset, SDUMLA-HMT, from 34.98% to 8.18%.

Cross-dataset comparisons conducted in this study support the findings reported in Ref. [5], highlighting the P-CAE as the most adaptable model among the three recognition methods. We also observed that the parameters of the traditional Miura method can be transferred across datasets that share similar characteristics such as resolution or displacement. Otherwise, the method exhibits limitations when datasets vary significantly. The CNN consistently underperforms relative to the other two recognition methods. Our findings suggest that the CNN may rely on generic features, which allow single-dataset comparisons but limit its ability to generalise to cross-dataset comparisons.

Comparison between our vein quality metric and NIQE demonstrates that metrics designed to analyse vein patterns provide a promising approach to understanding the quality of finger vein images. The observed improvements with our vein quality metric over NIQE across diverse datasets and recognition methods highlight the effectiveness of focussing on vein characteristics. NIQE, originally developed for assessing the quality of natural images, reveals limitations when applied to finger vein images, emphasising the need for quality metrics that consider the specific characteristics of finger vein images.

As a future direction of research, the unpredictable behaviour of the CNN in cross-dataset comparisons warrants further investigation. One potential factor influencing the behaviour of the CNN is the limited variability in small finger vein datasets. Increasing dataset variability by integrating multiple finger vein datasets can be a promising approach to gaining deeper insights into these models for finger vein recognition. In addition, investigating domain adaptation techniques to normalise dataset distributions could improve generalisation across datasets. Techniques such as variational auto-encoders can help align diverse dataset distributions into a cohesive, normalised distribution, potentially enhancing the CNN’s performance in cross-dataset comparisons.

The proposed vein quality measure has limitations in detecting images in the Limited group, because the entropy measure does not always capture the complexity of vein structures adequately. Integrating additional complexity-related measures, such as vein direction, dominant direction, or vein branching, could enhance the effectiveness of the vein quality metric. Moreover, the proposed approach does not fully exploit the multi-scale nature of the Frangi vein enhancement method. The current implementation combines information from multiple scales, leading to information loss on fine veins. Future research could address this by processing these scales separately and analysing the quality measure at multiple scales, allowing finer veins to contribute more significantly to the assessment.

Overall, this study underlines the differences in how the CNN interprets input images compared to the other two methods for finger vein recognition. It also demonstrates that existing recognition methods may struggle to demonstrate robustness in more challenging and realistic scenarios, such as cross-dataset comparisons. Furthermore, preliminary findings on finger vein quality assessment indicate that existing image quality metrics may fail to capture the nuances in finger vein images, highlighting the need for metrics that consider specific characteristics of these images. We believe the findings of this study will stimulate further research in finger vein recognition and pave the way for more robust and reliable recognition systems.

Availability of data and materials

Finger vein datasets are publicly available.

Code availability

The code will be publicly available.

Abbreviations

CAE: Convolutional auto-encoder
CNN: Convolutional neural network
DET: Detection error trade-off
FMR: False match rate
FMR100: False non-match rate at false-match rate below 1%
FMR1000: False non-match rate at false-match rate below 0.1%
MAE: Mean absolute error
NIQE: Natural image quality evaluator
NSS: Natural scene statistics
P-CAE: Patch-based convolutional auto-encoder
PKU: Peking University Finger Vein dataset
ReLU: Rectified linear unit
ROI: Region of Interest
SDUMLA-HMT: Shandong University Multi-Modal Biometric dataset
thresh: Threshold
UTFVP: University of Twente Finger Vein Pattern dataset
ZeroFMR: Zero false-match rate

References

C. Kauba, B. Prommegger, A. Uhl, Focussing the beam-a new laser illumination based data set providing insights to finger-vein recognition. In: 2018 IEEE 9th International Conference on Biometrics Theory, Applications and Systems (BTAS). IEEE. 1–9. (2018)
L. Yang, G. Yang, L. Zhou, Y. Yin, Superpixel based finger vein ROI extraction with sensor interoperability. In: ICB’15 (IEEE, Piscataway, 2015), pp. 444–451
S. Tang, S. Zhou, W. Kang, Q. Wu, F. Deng, Finger vein verification using a Siamese CNN. IET Biom. 8(5), 306–315 (2019)
B. Prommegger, D. Söllinger, G. Wimmer, A. Uhl, CNN based finger region segmentation for finger vein recognition. In: 2022 International Workshop on Biometrics and Forensics (IWBF). IEEE. 1–6 (2022)
T. Arican, R. Veldhuis, L. Spreeuwers, Exploring the untapped potential of unsupervised representation learning for training set agnostic finger vein recognition. In: 2023 International Conference of the Biometrics Special Interest Group (BIOSIG), IEEE. 1–6 (2023)
A. Mittal, R. Soundararajan, A.C. Bovik, Making a “completely blind” image quality analyzer. IEEE Signal Process. Lett. 20(3), 209–212 (2012)
P. Normakristagaluh, G.J. Laanstra, L. Spreeuwers, R. Veldhuis, Understanding and modeling finger vascular pattern imaging. IET Image Process. 16(5), 1280–1292 (2022)
R. Veldhuis, S. Marcel, C. Busch, A. Uhl, Handbook of vascular biometrics (Springer Nature, Berlin, 2020)
B. Hou, H. Zhang, R. Yan, Finger-vein biometric recognition: A review. IEEE Trans. Instrum. Meas. 71, 1–26 (2022)
T. Arican, R. Veldhuis, L. Spreeuwers, Fingers crossed: an analysis of cross-device finger vein recognition. In: 2022 International Conference of the Biometrics Special Interest Group (BIOSIG). IEEE. 1–5 (2022)
N. Miura, A. Nagasaka, T. Miyatake, Feature extraction of finger-vein patterns based on repeated line tracking and its application to personal identification. Mach. Vision Appl. 15(4), 194–203 (2004)
N. Miura, A. Nagasaka, T. Miyatake, Extraction of finger-vein patterns using maximum curvature points in image profiles. IEICE Trans. Inf. Syst. 90(8), 1185–1194 (2007)
B.T. Ton, R.N. Veldhuis, A high quality finger vascular pattern dataset collected using a custom designed capturing device. In: ICB 13 (IEEE, Piscataway, 2013), pp. 1–5
R.S. Kuzu, E. Maiorana, P. Campisi, Vein-based biometric verification using transfer learning. In: 2020 43rd International Conference on Telecommunications and Signal Processing (TSP), IEEE. 403–409 (2020)
R.S. Kuzu, E. Maiorana, P. Campisi, Loss functions for CNN-based biometric vein recognition. In: 2020 28th European Signal Processing Conference (EUSIPCO), IEEE. 750–754 (2021)
K.J. Noh, J. Choi, J.S. Hong, K.R. Park, Finger-vein recognition using heterogeneous databases by domain adaption based on a cycle-consistent adversarial network. Sensors 21(2), 524 (2021)
Z. Chen, J. Liu, C. Cao, C. Jin, H. Kim, FV-UPatches: Enhancing universality in finger vein recognition. arXiv (2022). https://doi.org/10.48550/arXiv.2206.01061
J. Peng, Q. Li, X. Niu, A novel finger vein image quality evaluation method based on triangular norm. In: 2014 Tenth International Conference on Intelligent Information Hiding and Multimedia Signal Processing, IEEE. 239–24 (2014)
H. Ma, K. Wang, L. Fan, F. Cui, A finger vein image quality assessment method using object and human visual system index. In: Intelligent Science and Intelligent Data Engineering: Third Sino-foreign-interchange Workshop, IScIDE 2012, Nanjing, China, October 15-17, 2012, Revised Selected Papers 3. Springer. 498–506 (2013)
O. Remy, J. Hämmerle-Uhl, A. Uhl, Fingervein sample image quality assessment using natural scene statistics. In: 2022 International Conference of the Biometrics Special Interest Group (BIOSIG), IEEE. 1–6 (2022)
D.T. Nguyen, Y.H. Park, K.Y. Shin, K.R. Park, New finger-vein recognition method based on image quality assessment. KSII Trans Internet Inform Syst (TIIS) 7(2), 347–365 (2013)
H. Qin, S. Li, A.C. Kot, L. Qin, Quality assessment of finger-vein image. In: Proceedings of the 2012 Asia Pacific Signal and Information Processing Association Annual Summit and Conference, IEEE. 1–4 (2012)
H. Qin, Z. Chen, X. He, Finger-vein image quality evaluation based on the representation of grayscale and binary image. Multimed Tools Appl 77, 2505–2527 (2018)
H. Qin, M.A. El-Yacoubi, Deep representation for finger-vein image-quality assessment. IEEE Trans Circuits Syst Video Technol 28(8), 1677–1693 (2017)
J. Zeng, Y. Chen, C. Qin, Finger-vein image quality assessment based on light-CNN. In: 2018 14th IEEE International Conference on Signal Processing (ICSP), IEEE. 768–773 (2018)
Y. Yin, L. Liu, X. Sun, SDUMLA-HMT: a multimodal biometric database. In: Biometric Recognition: 6th Chinese Conference, CCBR 2011, Beijing, China. Springer. 260–268 (2011)
A.F. Frangi, W.J. Niessen, K.L. Vincken, M.A. Viergever, Multiscale vessel enhancement filtering. In: Medical Image Computing and Computer-Assisted Intervention-MICCAI’98: First International Conference Cambridge, MA, USA. Springer. 130–137 (1998)

Funding

This research was funded by the Republic of Turkey Ministry of National Education.

Author information

Authors and Affiliations

Data Management and Biometrics, University of Twente, 7522NB, Enschede, The Netherlands
Tugce Arican, Raymond Veldhuis & Luuk Spreeuwers
Norwegian Biometrics Laboratory, Norwegian University of Science and Technology, Gjøvik, Norway
Raymond Veldhuis

Authors

Tugce Arican
Raymond Veldhuis
Luuk Spreeuwers

Contributions

All authors contributed equally.

Corresponding author

Correspondence to Tugce Arican.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Arican, T., Veldhuis, R. & Spreeuwers, L. Analysing the robustness of finger vein recognition: cross-dataset reliability and vein utility. J Image Video Proc. 2024, 35 (2024). https://doi.org/10.1186/s13640-024-00643-2

Received: 02 May 2024
Accepted: 21 August 2024
Published: 08 October 2024
DOI: https://doi.org/10.1186/s13640-024-00643-2
