Soluble Solids Content and pH Prediction and Maturity Discrimination of Lychee Fruits Using Visible and Near Infrared Hyperspectral Imaging

DOI 10.1007/s12161-015-0186-7

Hongbin Pu 1 & Dan Liu 1 & Lu Wang 1 & Da-Wen Sun 1,2

Received: 9 November 2014 / Accepted: 23 April 2015

© Springer Science+Business Media New York 2015

Abstract Hyperspectral imaging (HSI) technique has shown promise as a rapid and nondestructive tool to evaluate various internal quality attributes of fruits and vegetables. The objective of this study was to investigate the nondestructive prediction of soluble solids content (SSC) and pH of lychees and maturity discrimination. Two hyperspectral imaging systems of visible/short-wave near infrared range (600–1000 nm, Spectral Set I) and long-wave near infrared range (1000–2500 nm, Spectral Set II) were employed. Results showed that Spectral Set II (SSC: rp = 0.877, RMSEP = 0.911 °Brix; pH: rp = 0.745, RMSEP = 0.291) performed better than Spectral Set I (SSC: rp = 0.790, RMSEP = 1.279 °Brix; pH: rp = 0.701, RMSEP = 0.308) for the internal quality prediction of lychee and maturity discrimination. The partial least square discriminant analysis (PLS-DA) model had a discrimination rate of 90.63 % for Spectral Set I and 96.88 % for Spectral Set II. β-Coefficients of partial least squares regression (PLSR) models were used to choose optimal wavelengths for quality predictions. The performance of optimized PLSR in both spectral sets was comparable to the models developed using the whole spectral range.

Keywords Nondestructive . Hyperspectral imaging . Lychee . Maturity discrimination . Soluble solids content . pH

* Da-Wen Sun
[email protected]; www.ucd.ie/refrig; www.ucd.ie/sun

1 College of Light Industry and Food Sciences, South China University of Technology, Guangzhou 510641, People’s Republic of China
2 Food Refrigeration and Computerised Food Technology (FRCFT), Agriculture and Food Science Centre, University College Dublin, National University of Ireland, Belfield, Dublin 4, Ireland

Introduction

Fruits are one of the main components of the human diet. They provide abundant nutritional elements for the human body. Therefore, both buyers and consumers attach great importance to the quality of fruits. Accordingly, preservation techniques such as refrigeration (Sun 1997; Sun and Eames 1996; McDonald et al. 2001; Wang and Sun 2004; Kiani and Sun 2011; Zheng and Sun 2004) and drying (Cui et al. 2008; Delgado and Sun 2002) are commonly employed to maintain their qualities. The quality of fruits can be determined by many physical, physiological, nutritional, and pathological attributes that affect fruit shelf life. These factors not only affect the taste and color of fruit but also act as a prerequisite for the synthesis of fruit vitamins. Lychee or litchi (Litchi chinensis Sonn.), a member of the family Sapindaceae, is a subtropical to tropical fruit with high commercial value in international trade, and its consumption has increased significantly in recent years due to its flavor and nutritive components. China is the leading lychee producer followed by India and Taiwan (Ghosh 2001). In recent years, other countries have recognized the market potential of lychees internationally and have increased fruit import from both southern and northern hemisphere production areas (e.g., China, Israel, Australia, Thailand, India, Vietnam, and Africa) by taking advantage of seasonal differences in production. However, the highly perishable nature of the fruit makes it difficult for the lychees to withstand long-distance shipment. Therefore, higher quality and more consistent fresh lychees at the origin country are required in order to meet the quality standards upon arrival at the destination.

The quality attributes of a fresh lychee include external appearance (size, shape, color, gloss, and freedom from defects and decay) and internal quality aspects (firmness, organic acid, soluble solids content, pH, and vitamins). While external appearance can be assessed by modern imaging or computer vision technology (Sun and Brosnan 2003; Jackman et al.
2008; Costa et al. 2011; Sun 2004; Wang and Sun 2002), internal quality aspects are normally evaluated by traditional analytical methods. Among the internal quality attributes of a lychee, soluble solids content (SSC) and pH are probably the most important indicators of fruit maturity and postharvest quality. Traditional analytical methods used for these quality measurements are based on complex preparation of samples, using expensive chemical reagents and involving a considerable amount of manual work. Furthermore, traditional methods are also destructive. To meet the quality requirements and to improve competitiveness of the fruit production industry, rapid, nondestructive, and automated inspection techniques and grading systems are essential. Among them, spectroscopic and hyperspectral imaging systems have many advantages compared to classical chemical and physical analytical methods in the assessment of fruit internal qualities (Nicolaï et al. 2007; Lorente et al. 2012; Magwaza et al. 2012; Cen et al. 2014; Yu et al. 2014); in particular, hyperspectral imaging has recently been extensively investigated for food quality evaluation (ElMasry et al. 2011; Barbin et al. 2012a; Wu and Sun 2013; Wu et al. 2012; ElMasry et al. 2012; Barbin et al. 2012b). These advantages include short measuring time with limited sample preparation and the ability to evaluate several constituents at the same time. A number of independent research groups are exploring nondestructive visible/near infrared (Vis/NIR) spectroscopy technology for fruit analysis (McGlone and Kawano 1998; McGlone et al. 2002, 2003). Additionally, many other newly published papers have also investigated the feasibility of Vis/NIR and NIR spectroscopy for predicting the SSC and pH of various fruits, such as apple (Bobelyn et al. 2010; Bertone et al. 2012; Mendoza et al. 2012), pear (Sun et al. 2009; Paz et al. 2009), grape (Cao et al. 2010; Parpinello et al. 2013), pineapple (Seng Chia et al. 2012), kiwifruit (Moghimi et al. 2010), etc., showing the capability of the spectroscopic technique for the prediction of fruit internal quality characteristics. However, the spectroscopic method has a drawback compared with hyperspectral imaging because it acquires the spectral data from a single point or from a small portion of the tested fruit. Hyperspectral imaging, on the contrary, has the advantage of being able to receive distributed spectral responses from the whole surface of a fruit image (Sun 2010). Attempts at using hyperspectral imaging as a nondestructive method for assessing internal quality attributes of fruits have been made by some authors, the majority of which have focused on soluble solids content prediction (ElMasry et al. 2007; Peng and Lu 2008; Mendoza et al. 2011; Rajkumar et al. 2012; Leiva-Valenzuela et al. 2013). (ElMasry et al. 2007) investigated the possibility of using hyperspectral imaging (400–1000 nm) for nondestructive determination of total soluble solids (TSS) and acidity (expressed as pH) in strawberry. The optimized MLR models for TSS (SEP=0.211, r=0.80, and six optimal wavelengths) and pH (r=0.94, SEP=0.091, and eight optimal wavelengths) demonstrated good prediction performance. On the other hand, (Rajkumar et al. 2012) studied banana fruit quality and maturity stages using hyperspectral imaging technique, and with the MLR models based on the optimal wavelengths a coefficient of determination of 0.85 was achieved for predicting TSS of the banana fruits. Recently, relatively lower correlations (rp =0.69–0.79) were obtained for SSC prediction of blueberries using NIR hyperspectral imaging in the spectral range of 500–1000 nm (Leiva-Valenzuela et al. 2013). In addition, some researchers have also investigated the possibility of using fluorescence mode for fruit SSC evaluations; the prediction accuracy of SSC obtained by (Noh and Lu 2007) using “Golden Delicious” apple was r=0.66 and RMSEP=1.19, and that obtained by (Liu et al. 2008) in navel orange was r=0.96 and SEP=0.28. Furthermore, hyperspectral imaging was used to assess SSC of Golden Delicious apples (Peng and Lu 2008; Mendoza et al. 2011), with r=0.883 and SEP=0.73 %, and r=0.88 and SEP=0.7 being obtained by (Peng and Lu 2008) and (Mendoza et al. 2011), respectively. All these studies demonstrated the feasibility of HSI to measure internal quality attributes of fruit.

Despite the increased application of HSI technique in fruit quality assessment, few studies have sought to compare HSI instruments in different spectral ranges; in particular, no hitherto-published research has dealt with the prediction of SSC and pH in lychees and their maturity discrimination by HSI. Therefore, this study aimed to compare the performance of two commercially available HSI instruments in different spectral ranges and the corresponding models established in predicting pH and SSC of lychees as well as the possibility of maturity discrimination.

Materials and Methods

Fruit Samples and Reference Measurements

Fresh lychee fruits (Litchi chinensis Sonn. “Guiwei”) at commercial maturation season were obtained from commercial orchards in Guangdong Province, China, from June to July 2013. It is essential to have sound appearance of the tested fruits for the experiments and therefore all abnormal fruits were discarded. Ninety-six fruits free from any abnormal features such as blemishes, bruises, diseases, and contaminations were selected. Before the analysis of the samples, all fruits were manually classified into two groups representing immature and mature stages, with each group consisting of 48 lychee fruits; fruits with a green area of 30 % or more were classified as immature fruits (Fig. 1b). The average size of the lychee fruit was 3–3.5 cm in diameter. The group was selected randomly for hyperspectral imaging and subsequent quality parameters measurement. The tested fruits in each maturity stage were randomly divided
into two subgroups. Subgroup 1 consisted of 64 fruits with 32 samples for each maturity stage, which were used as a training set for developing calibration models. Subgroup 2 consisted of 32 fruits with 16 samples for each maturity stage and was used for validation of the models.

Two quality attributes of each fruit (SSC and pH) were measured and used as indicators of fruit quality. After acquiring the spectral images, each fruit was peeled and the pulp portion was juiced to determine pH and SSC at room temperature of 25 °C. The pH of the lychee juice was determined with a pH meter (Model FE20, Mettler-Toledo Co., Zurich, Switzerland), and soluble solids content (°Brix) was determined using a digital refractometer with a range of 0–53 °Brix (Model PAL-1 3810, Atago Co., Tokyo, Japan). The SSC of the lychee fruit varied between 15.8 and 21.4 °Brix with a mean of 18.9 °Brix and a standard deviation (SD) of 1.44 °Brix. The pH of the lychee fruit varied between 4.43 and 5.69 with a mean of 5.12 and a SD of 0.34.

Fig. 1 Hyperspectral imaging systems and example of lychee RGB images. a Configuration of the hyperspectral imaging systems: 1, 1004×1002 CCD camera; 2, spectrograph (400–1000 nm); 3, camera lens (23-mm focal length); 4, fiber optic line light; 5, 320×256 CCD camera; 6, spectrograph (1000–2500 nm); 7, camera lens (30.7-mm focal length); 8, halogen lamp; 9, litchi sample; 10, translation belt; 11, motor; and 12, dark chamber. b RGB image of litchi fruit for immature (a) and mature (b)

Hyperspectral Imaging Systems

Two hyperspectral imaging systems were used in this study for the acquisition of hyperspectral images of lychee. The working spectral ranges of the two systems were 400–1000 nm (System I) and 1000–2500 nm (System II). Both systems used a line-scanning configuration. As shown in Fig. 1a, System I consisted of a line-scan spectrograph (Imspector V10E, Spectral Imaging Ltd., Oulu, Finland) covering the spectral range of 400–1000 nm, a high-resolution 1004×1002 camera (DL-604 M, Andor, Ireland), a camera lens with focal length of 23 mm (OLE23, Schneider, Germany) for the spectral range of 400–1000 nm, an illumination source (150-W halogen lamp source attached to a fiber optic line light positioned at an angle of 45° to the moving belt), a conveyer belt operated by a stepper motor (IRCP0076-1COMB, Isuzu Optics Corp., Taiwan, China), and a computer equipped with data acquisition software (Spectral Image software, Isuzu Optics Corp., Taiwan, China). The main components of System II consisted of a line-scan spectrograph (Specim V25E, Spectral Imaging Ltd., Oulu, Finland) covering the spectral range of 1000–2500 nm, a high-resolution 320×256 camera (XC403, Xenics Infrared Solutions, Leuven, Belgium), a camera lens with focal length of 30.7 mm (OLES30, Xenics Infrared Solutions, Leuven, Belgium) for the spectral range of 1000–2500 nm, two 150-W halogen lamps forming the illumination unit (2900-ER, Illumination Technologies Inc., New York, USA), a conveyer belt operated by a stepper motor (IRCP0076-1COMB, Isuzu Optics Corp., Taiwan, China), and a computer equipped with data acquisition software (Spectral Image software, Isuzu Optics Corp., Taiwan, China). In experiments, the systems were able to scan lychees placed on the translation belt. In order to obtain good-quality images, the exposure time was adjusted to 30 ms and the speed of the translation platform was 1.4 mm/s for System I, while the exposure time was adjusted to 2 ms and the speed of the translation platform was 22 mm/s for System II. Each collected spectral image was stored as a three-dimensional image (x, y, λ). For correction of light source effects, the original hyperspectral images (R0) were corrected into the reflectance mode (Rc) based on white reference images W for a standard Teflon white tile (~100 % reflectance) and black reference images B for dark current (~0 % reflectance). The formula applied was as follows:

$$R_c = \frac{R_0 - B}{W - B} \times 100\% \qquad (1)$$

The corrected images formed the basis for the subsequent image analysis to extract information about the spectral properties of each fruit for selection of effective wavelengths and multivariate analysis purposes. Due to the decreased CCD detector sensitivity and the resulting noise in the wavelength region of 308–598 nm, subsequent analysis of hyperspectral images acquired by System I was performed only on data in the
spectral range of 600–1000 nm. Finally, two sets of spectral data were obtained: one within the wavelength range of 600–1000 nm (Spectral Set I, 248 wavelength variables) and one within the wavelength range of 1000–2500 nm (Spectral Set II, 236 wavelength variables).

Extraction of Spectral Data

After the image acquisition, an algorithm was implemented to segment each lychee from the background for each hypercube of lychees and calculate their mean spectra (Fig. 2). For hyperspectral images in Spectral Set I, a binary mask was built to recognize the fruit from the background using threshold segmentation in the hypercube. This was accomplished in the spectral image at 710 nm, which gave the maximum contrast between the lychees and the background. The white pixels in the mask were used as a region of interest (ROI) to extract the spectral data from the calibrated hyperspectral image. In total, 96 average spectra (600–1000 nm) representing the 96 tested lychees were calculated and stored for developing regression models. A similar segmentation process was also conducted for hyperspectral images in Spectral Set II as shown in Fig. 2e–h. Background segmentation and extraction of reflectance spectra from the hyperspectral images were carried out using the software ENVI 4.8 (ITT Visual Information Solutions, Boulder, CO, USA).

Selection of Optimal Wavelengths and Regression Model Development for Predicting Quality Attributes

From a multivariate calibration perspective, the problem of multicollinearity among contiguous wavelengths makes wavelength selection necessary. Several wavelength selection approaches have been discussed in recently published reviews (Zou et al. 2010; Liu et al. 2013). Partial least square (PLS) regression is one of the commonly used procedures for selecting key wavelengths, particularly when a large number of variables are involved. In this study, β-coefficients resulting from the PLS calibration model were used for identifying the optimal wavelengths. The wavelengths corresponding to the highest absolute values (regardless of the sign) of β-coefficients were considered as optimal wavelengths. The PLS algorithm determines a set of orthogonal projection axes W, called PLS weights, and wavelength scores T. The regression coefficients (β) can be obtained by regressing Y onto the wavelength scores T as follows (Chong and Jun 2005):

$$\hat{y} = X W^{*} \beta = T \beta \qquad (2)$$

$$\beta = W^{*} q^{T} = W \left(P^{T} W\right)^{-1} q^{T} \qquad (3)$$

where ŷ is the predicted value of the attribute of interest, X is an n×m matrix of spectral data (n is the number of samples or number of spectra, m is the number of wavelengths), β is the vector containing the regression coefficients (m×1) obtained by the calibration model, T is the matrix of wavelength scores, W* is the m×k matrix of X-weights defining the common latent variable space T (n×k) relating X and y (k is the number of latent variables), q is the vector of y-loadings (1×k), P is the m×k matrix of X-loadings, and W is the matrix of PLS weights.

Prediction models between the spectral reflectance and the quality parameters (SSC and pH) of the fruits were developed by using PLS analysis. PLSR is a robust and reliable method for constructing empirical predictive models (Wold et al. 2001). The spectra utilized here included two categories, i.e., full-wavelength spectra and simplified spectra. Therefore, two categories of PLSR calibration models were built. All steps described for spectral analysis were carried out in multivariate analysis software (Unscrambler version 9.7, CAMO, Trondheim, Norway). The quality of calibration models was evaluated by root mean square error of calibration (RMSEC), root mean square error of prediction (RMSEP), and the

Fig. 2 Main steps involved in segmentation of hyperspectral image: a, selecting 710 nm image; b, binarization (defining the ROI); c, applying the mask I; d, extracting litchi spectra (600–1000 nm); e, selecting 1675 nm image; f, binarization; g, applying the mask II; and h, extracting litchi spectra (1000–2500 nm)
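The white/dark reference correction and the segmentation steps listed in Fig. 2 can be illustrated with a minimal NumPy sketch. This is not the ENVI workflow used in the study: the function names, the array layout (rows × columns × bands), the synthetic data, and the assumption that fruit pixels are brighter than the background at the chosen band are all hypothetical choices made for illustration.

```python
import numpy as np

def calibrate(raw, white, dark):
    """Convert raw intensities to percent reflectance using white and dark references."""
    raw, white, dark = (np.asarray(a, dtype=float) for a in (raw, white, dark))
    return (raw - dark) / (white - dark) * 100.0

def mean_fruit_spectrum(cube, wavelengths, band_nm, threshold):
    """Threshold one band to build a binary mask (ROI), then average the masked pixels."""
    band = int(np.argmin(np.abs(np.asarray(wavelengths, dtype=float) - band_nm)))
    mask = cube[:, :, band] > threshold      # assumption: fruit is brighter than background
    return cube[mask].mean(axis=0), mask     # mean spectrum over all ROI pixels

# Synthetic 4 x 4 image with 5 bands; a 2 x 2 "fruit" sits in the centre.
wavelengths = np.linspace(600, 1000, 5)      # hypothetical band centres in nm
dark = np.full((4, 4, 5), 10.0)
white = np.full((4, 4, 5), 210.0)
raw = np.full((4, 4, 5), 30.0)               # background pixels
raw[1:3, 1:3, :] = 110.0                     # fruit pixels

reflectance = calibrate(raw, white, dark)
spectrum, mask = mean_fruit_spectrum(reflectance, wavelengths, band_nm=710, threshold=30.0)
```

In practice the threshold and the segmentation band (710 nm and 1675 nm in Fig. 2) would be chosen from the image contrast; the values above only serve the synthetic example.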
correlation coefficient (r) between the predicted and measured value of the attribute. In addition, the ratio of performance deviation (RPD), i.e., the ratio of the standard deviation of the reference values (SD) over the root mean square error of cross-validation (RMSECV), was employed.

Spectral Analysis for Identifying Maturity Stage

Principal component analysis (PCA) models were generated in an effort to classify the immature and mature fruits into two distinct classes. The score plot of the first few principal components resulting from PCA is usually used to examine the common features among samples and their grouping (Abdi and Williams 2010).

Partial least square discriminant analysis (PLS-DA) was also performed to develop supervised classification models for the immature and mature lychee fruit. PLS-DA is a classification method based on modeling the differences between several classes with PLS by assigning the reference value (dummy variable) for each sample (Pholpho et al. 2011). The evaluation of model performance was achieved through the analysis of discrimination rate (%) by the following equation (Teye et al. 2013):

$$DR = \frac{N_1}{N_2} \times 100\% \qquad (4)$$

where DR is the discrimination rate (%), N1 is the number of samples correctly identified in either the training set or prediction set, and N2 is the total number of samples used in either the training set or prediction set.

Results and Discussion

Typical Characteristics of the Reflectance Spectra of Lychee

The typical spectral pattern of lychee samples in the two spectral sets is shown in Fig. 3. In the case of Spectral Set I, the reflectance curve had three absorption regions: broad bands around 680 and 960 nm and a small absorption region at 840 nm (Fig. 3a). The region around 680 nm reflects the maturity of lychees accompanied with the change of fruit color caused by anthocyanins and chlorophyll contents, and the peak was larger for immature fruit (McGlone et al. 2002; Peshlov et al. 2009). The absorption regions in the short wave NIR at 840 and 960 nm were likely attributed to the sugar absorption bands and the combination effect of OH groups from carbohydrates and water (McGlone and Kawano 1998), since lychees had a SSC of 15–25 % and an estimated water content of 80–90 %.

In the long wave near infrared spectral region (1000–2500 nm), the observed absorption peaks (Fig. 3b) were due to absorptions by hydrogen containing bonds (O–H, C–H, and N–H) (Osborne et al. 1993), which are basic constituents of lychee samples and are critical for the attribute prediction. The presence of water in lychee caused an absorption peak at 1050 nm (O–H stretching second overtones). Another local absorption maximum at 1250 nm was due to the C–H stretching second overtone. Besides, the peaks at 1661 nm and 1839 nm were associated with the C–H stretch and O–H stretch of sugar. It is clear that the main cause of spectral variation was changes in water, amino acid, and carbohydrate composition or contents during maturity stages.

Prediction Results for SSC and pH of Lychee Samples

PLSR Models Using the Whole Spectral Range

PLS prediction results for SSC and pH in the two spectral sets are presented in the scatter plots (Fig. 4a–d). In all figures, the ordinate and abscissa represent the predicted and measured values of the appropriate parameters, respectively. The correlation coefficient obtained for SSC in the Vis/NIR region (rp =0.79) was lower than that obtained by (ElMasry et al. 2007) with strawberries (rp =0.85 and RMSEP=0.184 °Brix), but similar to those obtained by (Leiva-Valenzuela et al. 2013) with blueberries, with the rp values ranging between 0.78 and 0.82 and the RMSEP values between 1.30 and 1.55 %. The limited prediction accuracy of the PLSR model with Vis/NIR spectra is possibly due to the uneven thickness and roughness of lychee pericarps as well as complex pulp texture and composition. In addition, previous studies (Lu 2004; Ruiz-Altisent et al. 2010) have also suggested that the visible spectral region was less useful for SSC prediction. This assumption was in line with the finding of the current research. Compared to the spectral region of 600–1000 nm, the use of 1000–2500 nm did result in significant improvements for the prediction of SSC (Fig. 4a, b). The rp and RMSEP for 1000–2500 nm were 0.877 and 0.911 °Brix, respectively, in comparison with 0.790 and 1.279 °Brix for 600–1000 nm. The choice of the spectral range 1000–2500 nm was made to focus on those wavelengths that proved more sensitive to soluble solids content.

The pH prediction results using the two spectral regions were not as good as those for SSC, with the rp values equaling 0.701 and 0.745 and the RMSEP of 0.308 and 0.291 for Spectral Set I and Spectral Set II, respectively (Fig. 4c, d). The predictability of pH obtained in this study was lower than those obtained by (ElMasry et al. 2007) using strawberry with rp =0.87 and RMSEP=0.129; by (Baiano et al. 2012) with rp =0.893 and 0.947 using red and black grapes; and by (Moghimi
Fig. 3 Mean spectra extracted from litchi samples in wavelength range of 600–1000 nm (a) and 1000–2500 nm (b)

et al. 2010) using kiwifruits with rp =0.943 and RMSEP=0.076. However, these differences could be due to the different structure of lychee from the other fruits and lower levels of organic acids in lychees with respect to soluble solids concentrations. The PLSR model exhibited a relatively good capability in predicting pH with the System II spectral set (Fig. 4c, d). Therefore, the spectral range of 1000–2500 nm was selected in order to focus on those wavelengths that proved more sensitive to pH. The results showed that for predicting both SSC and pH in lychees, the NIR spectral region was more useful.

The models shown in Fig. 4a, c were not good because the visible spectrum stayed relatively flat from 720 to 960 nm. Although the regions around 840 and 960 nm are likely related to the sugar absorption and the OH groups from carbohydrates and water (McGlone and Kawano 1998), the pH and SSC of lychee depend on both carbohydrate and water information contained in the spectra. It was clear that the visible spectrum contained little carbohydrate and water information, whereas in the near infrared spectral region (1000–2500 nm), the peaks at 1050 and 1250 nm were O–H stretch and C–H stretching second overtones of water. Besides, 1661 and 1839 nm were associated with the C–H stretch and O–H stretch of sugar. The main spectral variations were due to the changes in water, amino acid, and carbohydrate composition in lychee samples. Therefore, the prediction of the models for pH and SSC was the best for the region of long wavelengths (1000–2500 nm), which contain information on both sugar (carbohydrate) and water components. In addition, (Liu et al. 2004) have shown that sugar content and acidity in fruit could be more accurately determined using near infrared spectroscopy (800–2500 nm), and (Cozzolino et al. 2006) have confirmed that the regions around 1400, 1900, and 2170 nm were related to O–H and C–H bonds that would be linked to SSC in wine grapes.

PLSR Models Using the Selected Wavelengths

In many situations, wavelength selection can improve model performance and model characteristics (e.g., higher speed and greater cost-effectiveness) by identifying and removing useless, noisy, and redundant wavelengths. Appropriate wavelength selection can not only expedite data processing and improve model accuracy and robustness but also facilitate the establishment of consistent hyperspectral imaging systems with simple structure, short acquisition time, and low cost for real-time applications. In this study, β-coefficients resulting from PLSR models were employed to locate important wavelengths, aiming to establish simplified regression models. For predicting SSC, six feature wavelengths in the visible and short-wave near infrared region (641, 686, 742, 855, 939, 964 nm) and 12 optimal wavelengths in the long-wave near infrared region (1031, 1089,

Fig. 5 The score plots of PCA conducted using Spectral Set I (a) and Spectral Set II (b)
1357, 1477, 1744, 1890, 2023, 2218, 2275, 2351, 2383, 2445 nm) were identified, which corresponded to the highest absolute values of β-coefficients in the plot regardless of their negative or positive sign (figure not shown). It is clear that the regions around 680 nm reflected the change of fruit color caused by anthocyanins and chlorophyll. The major absorbance regions of water around 960 nm were found in the selected visible range. In addition, six optimal wavelengths of 681, 715, 761, 816, 907, and 960 nm in Spectral Set I and 11 wavelengths of 1089, 1165, 1274, 1357, 1478, 1763, 1890, 2023, 2276, 2326, and 2382 nm in Spectral Set II were found to be responsible for pH prediction of the fruit.

Once the optimal wavelengths were selected, these wavelengths were then used as effective wavelengths to replace the full range spectra for the prediction of SSC and pH of lychees. PLSR models were established based on the reduced spectral data and the results are shown in Table 1. It can be seen that the performance of optimized PLSR models in both spectral sets was similar to or better than that of the models developed using the whole spectral range. This is attributed to the fact that the problems of collinearity and overfitting were alleviated in optimized models that utilized only the essential wavelengths and neglected the useless wavelengths that do not carry much spectral information (Zou et al. 2010). For SSC, the correlation coefficients of the optimal PLSR models with six and 12 wavelengths were higher in the validation set (Spectral Set I: rp =0.777, RMSEP=1.227; Spectral Set II: rp =0.874, RMSEP=0.886) than in the training set (Spectral Set I: rc =0.737, RMSEC=0.849; Spectral Set II: rc =0.849, RMSEC=0.665), indicating a relatively good performance of the models for predicting SSC nondestructively. However, for pH, the results of the validation sets were not as good as those in the calibration sets. It was also shown that the optimal PLSR model performed better in Spectral Set II than in Spectral Set I, with the highest RPD of 1.877 and 1.741 for SSC and pH, respectively. Although the PLS models with Spectral Set II were slightly better than Spectral Set I, no satisfactory prediction result of SSC and pH of lychee fruit (rp <0.9) was achieved, possibly due to its unique structure and properties.

Prediction Results for Maturity Discrimination of Lychee Fruit

PCA Analysis

PCA is a technique used to interpret spectral data by identifying the most important directions of variability in the multivariate data space. This technique has been employed by (Reichel et al. 2010) for classification of harvest maturity of lychee fruit in terms of its quality changes during cold storage; using this analysis, four maturity classes, from premature to fully mature lychee fruit, could be differentiated. PCA was carried out in this study to visualize the trends or clusters among samples from their spectral profile. Figure 5 shows the score plots of the first two PCs from PCA conducted on all full spectral data in Spectral Set I (accounted for 63 % of the difference in maturity stages of lychee fruits, Fig. 5a) and Spectral Set II (accounted for 63 %, Fig. 5b). Two groups, or clusters, can be observed in the scatter plot, meaning that the profiles of objects in the same cluster are very similar and the profiles of objects in different clusters are distinct. Some overlap was also present in the score plot. The maturity discrimination of lychee fruit was not very satisfactory in the principal component space. The adoption of a supervised classification method has potential for further improvement of the classification accuracy.

PLS-DA Discrimination Model

PCA is an unsupervised method, and it cannot be used for building a predictive model, for instance, to classify samples into one or another category (Lin et al. 2011). In such situations, the supervised classification method PLS-DA was adopted to develop discrimination models (Pérez-Enciso and Tenenhaus 2003). The PLS-DA model was thus built considering the spectra as X variables, while y variables were associated

Table 1 Performance of PLSR (full and simplified) models for SSC and pH prediction based on different spectral sets

Spectral set  Attribute  No. of wavelengths  LVs  RPD    Calibration       Prediction
                                                         rc     RMSEC      rp     RMSEP
I             SSC        248                 6    1.492  0.713  0.881      0.790  1.279
I             SSC        6                   4    1.539  0.737  0.849      0.777  1.227
I             pH         248                 3    1.576  0.754  0.198      0.701  0.308
I             pH         6                   4    1.537  0.737  0.204      0.672  0.315
II            SSC        236                 6    1.873  0.850  0.659      0.877  0.911
II            SSC        12                  6    1.877  0.849  0.665      0.874  0.886
II            pH         236                 5    1.771  0.842  0.162      0.745  0.291
II            pH         11                  4    1.741  0.812  0.176      0.774  0.265
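The statistics reported in Table 1 (correlation coefficient, root mean square error, and RPD) can be computed as in the following minimal sketch. The function names and the demonstration values are hypothetical, and the use of the sample standard deviation (ddof=1) for the SD in RPD is an assumption, since the text does not state which estimator was used.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean square error (RMSEC, RMSECV, or RMSEP depending on the data passed in)."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def pearson_r(y_true, y_pred):
    """Correlation coefficient between measured and predicted values."""
    return float(np.corrcoef(y_true, y_pred)[0, 1])

def rpd(y_ref, rmsecv):
    """Ratio of performance deviation: SD of reference values over RMSECV."""
    return float(np.std(np.asarray(y_ref, float), ddof=1) / rmsecv)  # ddof=1 is an assumption

# Hypothetical measured and predicted values, for illustration only.
y_measured = np.array([1.0, 2.0, 3.0, 4.0])
y_predicted = np.array([1.5, 2.5, 3.5, 4.5])

error = rmse(y_measured, y_predicted)
r = pearson_r(y_measured, y_predicted)
ratio = rpd(y_measured, error)
```

Note that a constant offset between measured and predicted values, as in this toy example, leaves r at 1 while the RMSE is non-zero, which is one reason both statistics are reported together.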
Fig. 4 Prediction results of SSC and pH using full wavelength PLSR models in the two spectral sets: a SSC prediction based on Spectral Set I, b SSC prediction based on Spectral Set II, c pH prediction based on Spectral Set I, and d pH prediction based on Spectral Set II
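A PLSR model of the kind behind these prediction plots, together with the β-coefficient ranking used to pick optimal wavelengths, can be sketched with a small single-response PLS (NIPALS) routine. This is an illustrative NumPy implementation under stated assumptions, not the Unscrambler procedure used in the study: the function names, the mean-centering step, and the synthetic data are hypothetical. The coefficients are formed as β = W (PᵀW)⁻¹ qᵀ from the PLS weights W, X-loadings P, and y-loadings q.

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """Single-response PLS via NIPALS; returns (beta, intercept) for raw X."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = X.mean(axis=0), y.mean()
    E, f = X - x_mean, y - y_mean             # centered data, deflated in the loop
    m = X.shape[1]
    W = np.zeros((m, n_components))           # PLS weights
    P = np.zeros((m, n_components))           # X-loadings
    q = np.zeros(n_components)                # y-loadings
    for a in range(n_components):
        w = E.T @ f
        w /= np.linalg.norm(w)
        t = E @ w                             # scores of this latent variable
        tt = t @ t
        p = E.T @ t / tt
        q[a] = (f @ t) / tt
        E = E - np.outer(t, p)                # deflate X
        f = f - q[a] * t                      # deflate y
        W[:, a], P[:, a] = w, p
    beta = W @ np.linalg.solve(P.T @ W, q)    # beta = W (P^T W)^-1 q^T
    return beta, y_mean - x_mean @ beta

def top_wavelengths(beta, wavelengths, k):
    """Wavelengths with the k largest |beta|, regardless of sign, in ascending order."""
    order = np.argsort(np.abs(beta))[::-1][:k]
    return np.sort(np.asarray(wavelengths)[order])

# Synthetic example: only the 600 nm and 800 nm bands drive the response.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(30, 4))
wl_demo = np.array([600.0, 700.0, 800.0, 900.0])   # hypothetical band centres in nm
y_demo = 1.0 + X_demo @ np.array([2.0, 0.0, -3.0, 0.0])

beta_demo, intercept_demo = pls1_fit(X_demo, y_demo, n_components=4)
selected = top_wavelengths(beta_demo, wl_demo, k=2)
```

With as many latent variables as wavelengths, as in this toy case, PLS reduces to ordinary least squares; with real spectra the number of latent variables would be chosen by cross-validation, and a simplified model would then be refitted on the selected wavelengths only.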

with the lychee maturity (one different y variable with −1 or 1 for immature and mature stage, respectively, depending on whether the tested sample belonged or not to the considered data group). The classification results for unknown samples are shown in Table 2. Correct identification rates of 90.63 and 96.88 % for maturity discrimination were achieved for Spectral Set I and Spectral Set II, respectively. The discrimination of maturity stages among lychee fruits can be reflected by their specific spectral fingerprint coupled with a multivariate classification method.

Conclusions

Measurements of soluble solids content (SSC) and acidity (pH) and maturity discrimination of lychees were achieved using hyperspectral imaging. Reasonable models were obtained for SSC (rp =0.877, RMSEP=0.911 °Brix) and pH (rp =0.745, RMSEP=0.291) when the reflectance spectra in the range 1000–2500 nm were used, but lower correlation coefficients of 0.790 and 0.701 for SSC and pH, respectively, were obtained when the range was limited to that of the Vis/NIR equipment (600–1000 nm). β-Coefficients of PLSR models were used to choose optimal wavelengths for each quality parameter (SSC and pH). The performance of optimized PLSR in both spectral sets was similar to or better than that of the models developed using the whole spectral range. Spectral features were used to identify maturity stages of lychee samples. High classification accuracy of 90.63 % for Spectral Set I and 96.88 % for Spectral Set II for correctly identifying lychee ripeness stage was achieved using the PLS-DA. The results indicated that HSI is more useful to predict the state of maturity

Table 2 Confusion matrix for maturity stage discrimination using PLS-DA

Spectral set  No. of samples  Maturity stage  Prediction results
                                              Mature  Immature  DR
I             16              Mature          13      3         90.63 %
I             16              Immature        0       16
II            16              Mature          16      0         96.88 %
II            16              Immature        1       15

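The DR values in Table 2 follow directly from the confusion matrices. The minimal Python sketch below assumes that DR is the overall percentage of correctly classified samples per spectral set, and that a continuous PLS-DA output is assigned to the mature class (coded 1) or the immature class (coded −1) by thresholding at 0. The threshold, the function names, and the dictionary layout are assumptions made for illustration and are not stated in the text.

```python
def label_from_score(score, threshold=0.0):
    """Map a continuous PLS-DA output to a class label.

    The threshold of 0 (midway between the -1 and 1 codes) is an assumption.
    """
    return "mature" if score > threshold else "immature"


def overall_rate(confusion):
    """Percentage of correctly classified samples.

    `confusion` is a nested dict: outer key = actual stage,
    inner key = predicted stage, value = number of samples.
    """
    correct = sum(confusion[stage][stage] for stage in confusion)
    total = sum(sum(row.values()) for row in confusion.values())
    return 100.0 * correct / total


# Counts copied from Table 2.
TABLE_2 = {
    "I": {
        "mature": {"mature": 13, "immature": 3},
        "immature": {"mature": 0, "immature": 16},
    },
    "II": {
        "mature": {"mature": 16, "immature": 0},
        "immature": {"mature": 1, "immature": 15},
    },
}

# Overall rate per spectral set: 29/32 for Set I and 31/32 for Set II.
RATES = {name: overall_rate(matrix) for name, matrix in TABLE_2.items()}
```

With these counts, Spectral Set I gives 29 of 32 correct (90.625 %) and Spectral Set II gives 31 of 32 correct (96.875 %), which agrees with the rounded 90.63 % and 96.88 % in the DR column.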
of lychee fruit than to predict pH and SSC of lychee fruits. Further improvements in the spectral acquisition mode and use of more refined physical models may achieve better prediction of lychee internal qualities.

Acknowledgments The authors gratefully acknowledge the Guangdong Province Government (China) for its support through the program “Leading Talent of Guangdong Province (Da-Wen Sun).” This research was also supported by the National Key Technologies R&D Program (2014BAD08B09) and the International S&T Cooperation Projects of Guangdong Province (2013B051000010).

Conflict of Interest Hongbin Pu declares that he has no conflict of interest. Dan Liu declares that she has no conflict of interest. Lu Wang declares that she has no conflict of interest. Da-Wen Sun declares that he has no conflict of interest. This article does not contain any studies with human or animal subjects.

ElMasry G, Wang N, ElSayed A, Ngadi M (2007) Hyperspectral imaging for nondestructive determination of some quality attributes for strawberry. J Food Eng 81:98–107
ElMasry G, Iqbal A, Sun D-W, Allen P (2011) Quality classification of cooked, sliced turkey hams using NIR hyperspectral imaging system. J Food Eng 103(3):333–344
ElMasry G, Kamruzzaman M, Sun D-W, Allen P (2012) Principles and applications of hyperspectral imaging in quality evaluation of agro-food products: a review. Crit Rev Food Sci Nutr 52(11):999–1023
Ghosh SP (2001) World trade in litchi: past, present and future. Acta Horticult 558:23–30
Jackman P, Sun D-W, Du C-J, Allen P (2008) Prediction of beef eating quality from colour, marbling and wavelet texture features. Meat Sci
Kiani H, Sun D-W (2011) Water crystallization and its importance to freezing of foods: a review. Trends Food Sci Technol 22(8):407–426
Leiva-Valenzuela GA, Lu R, Aguilera JM (2013) Prediction of firmness and soluble solids content of blueberries using hyperspectral reflectance imaging. J Food Eng 115:91–98
Lin H, Zhao J, Sun L, Chen Q, Zhou F (2011) Freshness measurement of eggs using near infrared (NIR) spectroscopy and multivariate data analysis. Innovative Food Sci Emerg Technol 12(2):182–186
Food Anal. Methods

