The objective of this study was to examine the accuracy of water reflectance measurements acquired by citizens using the HydroColor application, with the aim of providing recommendations for extending the use of HydroColor to “fisher scientists”. Although other mobile sensing applications exist, HydroColor is the focus of this research. Water reflectance samples from HydroColor and the SAS solar tracker were collected daily on the BC Ferries vessel Queen of Oak Bay while crossing the Strait of Georgia from July to September 2016. The main findings show that the HydroColor citizen data are accurate compared with hyperspectral instrument data for most bands and band ratios; however, the citizens’ level of training and the environmental conditions affect data quality.
4.1. Citizen Participation and Data Quality
During the study period, which corresponds to the regional tourist season, over 200,000 passengers per month travelled along the ferry route from Departure Bay to Horseshoe Bay (traffic data provided by BC Ferries: http://www.bcferries.com/about/traffic.html). The total number of passengers participating in the data collection was lower than the number attending the educational talk just before the measurements. In addition, the total number of samples acquired by regular citizens (FP) was lower than that acquired by the trained citizen (CN) (Table 3, 446 versus 824). This result suggests that either: (1) volunteer participation may not be an effective approach for this type of data acquisition, similar to the findings of Kotovirta et al. [42] for citizen monitoring of surface algal blooms; or (2) the methodology used limited volunteer participation. The first can be addressed with an incentive mechanism, such as micro-payments [69]. The second, the most likely explanation in this case, could be addressed by having more trained citizens aboard the ferry to help passengers with data acquisition.
In our methodological framework, only one trained citizen, “the coastal naturalist” (CN), was allowed to guide passengers interested in participating in the experiment (a requirement from BC Ferries). The CN frequently had several people asking questions after the presentation, making it difficult to help all passengers who were willing to acquire data. It would have been useful to have more than one trained citizen to help with the data collection. In addition, only adults (over age 18) were allowed to participate in data collection. Interestingly, children showed great interest in using HydroColor for data acquisition, and it would be a valuable experiment to obtain data quality measures for various age groups in future studies.
The CN (trained citizen) collected more samples, and a higher proportion of high-quality samples, than the passengers (untrained citizens) (77% versus 35%, Table 3). This result was not a surprise, as the CN had an educational background in environmental science and received training on the use of HydroColor before working on this program. When classifying the untrained citizens’ samples, the main errors were associated with capturing the ship’s shadow and white foam in the water photos (Figure 5c). Given the required acquisition geometry (an azimuth angle of 135° from the Sun [41]) and the size of the ferry structure, the ship’s shadow and foam are likely to appear in some of the water photos if care is not taken. Focused training, such as that given to the CN, can help improve the skills needed to avoid these errors. The training effect is highly significant (p < 0.001), which implies that a certain degree of training is necessary for high-quality water reflectance sample acquisition using HydroColor. This finding is similar to those of other studies working with plant species/phenophase [26] and terrestrial invertebrate biodiversity [28]. Volunteers with around six hours of previous training could correctly identify 91% of the plant phenophases for a variety of species compared with experts (Fuccillo et al., 2015). The invertebrate sampling performance of volunteers trained in the field was similar to that of expert researchers (p > 0.05) [28]. Thiel et al. [34] indicated that appropriate training varies from study to study and should be considered before instructing volunteers.
Besides training, the patience of citizen scientists is also a concern for data quality; for instance, some volunteers did not finish the crab size measurements in a study by Delaney et al. [32] because they found the process tedious. In our case, passengers needed patience to find the correct direction for taking high-quality samples. Another approach to improving crowdsourced data quality, suggested by Rogstadius et al. [70], is to emphasize intrinsic motivation, such as helping other people.
4.2. Environmental Variables and Data Quality
Similar to many studies [53,57,60,71], our results demonstrated the importance of environmental variables (sun zenith and sky condition) for above-water reflectance (Rrs) measurement quality. HydroColor data quality was found to be better (p < 0.05) during the noon ferry run (12:50–2:30 p.m.; solar zenith: 17.4° to 39.3°) than during the morning run (8:30–10:10 a.m.; solar zenith: 40.5° to 67.6°) (Table 4). The higher percentage of bad samples acquired in the morning (21% versus 13%) was most often due to contamination of the water photos by the shadow of the ship. As the sun zenith angle increases, the shadow of the vessel becomes larger and is consequently more likely to be captured as part of a sample [71]. In addition to the shadowing effect, the water-leaving radiance is strongly related to the time of day, and hence to the solar zenith [57,72]. Atmospheric attenuation of solar irradiance in the visible and near-infrared spectra decreases as the solar zenith angle decreases [72]. At lower zenith angles, higher irradiance reaches the ocean surface, and therefore more water-leaving radiance is detected by the sensor [57]. In our study, morning runs in late August and the beginning of September took place when zenith angles were between 60.0° and 67.6°. At these angles, less irradiance reaches the ocean surface, and consequently less radiance from the water is available for detection by the camera. As a result, it is difficult to obtain satisfactory water reflectance samples in these runs. These findings concur with other studies showing that a lower sun zenith angle reduces the effects of sun glint, low solar irradiance, weak water-leaving radiance, and wave-shadowing [53,57,71,72].
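As a rough geometric illustration of the zenith-angle effects described above, the following Python sketch applies two textbook relations: the cosine law for direct irradiance on a horizontal surface, and the tangent relation for the shadow length of a vertical structure. It ignores atmospheric attenuation and diffuse skylight, so it understates the loss of irradiance at high zenith angles. The function names and the structure height are hypothetical values chosen only for illustration; they are not measured properties of the ferry.

```python
import math


def relative_direct_irradiance(zenith_deg):
    """Cosine-law factor for direct irradiance on a horizontal surface.

    Atmospheric attenuation is ignored (simplifying assumption).
    """
    return math.cos(math.radians(zenith_deg))


def shadow_length(height_m, zenith_deg):
    """Horizontal shadow length cast by a vertical structure of the given height."""
    return height_m * math.tan(math.radians(zenith_deg))


# Hypothetical structure height, for illustration only (not a ferry dimension).
HYPOTHETICAL_HEIGHT_M = 20.0

# Limits of the solar zenith ranges reported for the noon and morning runs.
for zenith in (17.4, 39.3, 40.5, 67.6):
    factor = relative_direct_irradiance(zenith)
    shadow = shadow_length(HYPOTHETICAL_HEIGHT_M, zenith)
    print(f"zenith {zenith:5.1f} deg: cos factor = {factor:.2f}, shadow = {shadow:.1f} m")
```

Under these assumptions, both quantities change monotonically with zenith angle, which is consistent with the larger share of shadow-contaminated samples in the morning runs.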
Sky condition also affected the quality of the data. Higher-quality HydroColor data were collected under clear sky conditions than under cloudy and overcast conditions, and the cloudiness effect was significant (p < 0.05, Table 5). This finding is consistent with existing studies recommending clear sky conditions for Rrs measurement [53,60,71]. In principle, atmospheric light attenuation in clear sky conditions, mostly due to Rayleigh scattering, is more constant than in cloudy or overcast conditions [72]. Further, the reflection of clouds in the water is brighter than the reflection of blue sky; therefore, more skylight reflected from the water surface is detected by the water-viewing sensor under cloudy conditions [53,71]. Hooker and Morel [60] show that the cloudiness effect is not systematically detected if cloud cover is under a certain threshold. Larger and lower-level cloud cover has a greater effect on Rrs measurements by increasing the magnitude of radiance reflected into the sensor [53].
In addition to affirming the importance of collecting data under a clear sky, we found that HydroColor data acquired under overcast conditions were overall of the poorest quality; in particular, a relatively large percentage of bad samples (89%) showed water reflectance values equal to zero. This is likely due to the lower sensitivity of the iPad mini 4 camera under low irradiance conditions. This result corroborates the work of Salisbury [72], which illustrated that less radiance signal can be detected under a hazy sky because of blurry shadows, and the work of Garaba and Zielinski [71], which showed that light is more diffuse under hazy skies than under clear skies.
Beyond assessing HydroColor data quality, we compared HydroColor with the gold-standard SAS solar tracker to determine how accurate HydroColor can be as an optical instrument for measuring Rrs. The evaluation of the accuracy of qualified (classified as perfect and good quality) HydroColor citizen data revealed a larger mean and higher variability (Table 6) for the three observed bands in relation to the SAS solar tracker data. A possible reason for this result is that different surface reflectance factor (ρ) values were used to calculate the HydroColor and SAS Rrs. The factor ρ is used to remove the proportion of the sky radiance that is reflected off the water surface and detected by the water radiance sensor [53]. For a flat sea surface, the reflected sky radiance can be prescribed by viewing geometry alone, according to the Fresnel reflectance [6]. However, the sea surface is usually roughened by wind-driven waves, and therefore the surface reflects sunlight from a range of directions other than only the viewing angle [53]. The ρ value used to calculate the HydroColor Rrs was set to a constant 0.028 in the HydroColor application [41], while the value used for the SAS Rrs was computed by Equations 2 and 3, which depend on wind speed and sky condition. The comparison of HydroColor data and calibrated Water Insight Spectrometer (WISP) data by Leeuw and Boss [41] does not show that HydroColor tends to overestimate Rrs when the same ρ is applied to calculate the above-water reflectance.
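To make the role of the sky-reflectance correction factor concrete, the following Python sketch assumes the commonly used above-water formulation, in which the reflected sky radiance is subtracted from the water-viewing radiance and the result is divided by a downwelling irradiance estimated from a grey card. The function name, the 18% card reflectance, the relative radiance values, and the alternative factor of 0.035 are all assumptions for illustration; only the constant 0.028 comes from the text above. Equations 2 and 3 are not reproduced here.

```python
import math


def remote_sensing_reflectance(l_water, l_sky, l_card, rho, card_reflectance=0.18):
    """Above-water reflectance sketch: (L_water - rho * L_sky) / E_d.

    E_d is estimated from the grey card as pi * L_card / card_reflectance.
    The 18% card reflectance is an assumption of this sketch.
    """
    downwelling_irradiance = math.pi * l_card / card_reflectance
    return (l_water - rho * l_sky) / downwelling_irradiance


RHO_FIXED = 0.028        # constant reported for the HydroColor application
RHO_ALTERNATIVE = 0.035  # hypothetical placeholder for a wind- and sky-dependent value

# Hypothetical relative radiances (arbitrary units), for illustration only.
L_WATER, L_SKY, L_CARD = 0.020, 0.250, 0.300

for rho in (RHO_FIXED, RHO_ALTERNATIVE):
    rrs = remote_sensing_reflectance(L_WATER, L_SKY, L_CARD, rho)
    print(f"rho = {rho:.3f} -> reflectance = {rrs:.5f} sr^-1")
```

With all radiances held equal, a smaller factor removes less reflected skylight and therefore yields a larger reflectance, so two instruments processed with different factors can disagree even when they view the same water.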
According to the results, the observed differences between the HydroColor and SAS Rrs for the red and green bands were not statistically significant (p > 0.05); however, the difference was significant for the blue band (p < 0.05, Table 7). The lower accuracy in the blue band is also shown in the work of Leeuw and Boss [41], which indicates that HydroColor Rrs in the blue band has the highest median percent error compared to the WISP among the three bands. Interestingly, when only perfect quality HydroColor Rrs was evaluated, the statistical analysis showed that the difference for the blue band was no longer significant (p > 0.05, Table 8). These results imply that minor contaminations in the HydroColor photos still introduce errors in the Rrs for the blue band. The high sensitivity of the blue band to minor contaminations may be caused by: (1) comparably weak blue signals, which are subject to more variability due to noise (low signal-to-noise ratio) [73]; (2) a relatively large difference and variability between the true spectral sensitivity function and the substitute one; or (3) the fixed ρ programmed in HydroColor, which makes it difficult to precisely correct the effect of skylight, given that skylight is mostly contributed by blue band radiance due to Rayleigh scattering [74]. As such, for accurate retrievals of above-water reflectance in the blue band using HydroColor, perfect data quality is required.
The average values and variability (Table 7) of the difference between the HydroColor and SAS Rrs were higher for band ratios than for individual bands. The results of permutation F tests for band ratios (Table 7) showed that the difference between the HydroColor and SAS blue/green ratio was significant (p < 0.05), while the differences for the other two ratios were not. The significant difference in the blue/green ratio was first considered an effect of the poor performance of the blue band with minor contaminations. However, we note that, even after discarding samples with contaminations, the significance was not eliminated (p < 0.05, Table 8). Thus, a linear model with environmental factors as variables was developed to correct the bias (Equation 6). Although this correction model was significant (p < 0.05), the low adjusted R² (0.26) and the high percentage of correction error (124%) showed that it was not satisfactory. Therefore, the magnitudes of individual bands, rather than band ratios (especially the blue/green ratio), are recommended for scientific purposes. The most likely explanation for this negative finding is that the blue/green band ratio is highly sensitive to the differences between the HydroColor and SAS measurement protocols, which may result in a slight difference in measuring location (different field of view, i.e., angle of the lens, and different footprint on the water) and time. To clarify, the SAS solar tracker was installed on the same side of the ship where the citizens acquired data, but approximately 15 m away horizontally. In addition, a HydroColor sample usually takes one minute to complete, covering the grey card, sky, and water photos, whereas the SAS solar tracker collected the data with three sensors simultaneously. Atmospheric properties and water optical constituents are continually changing, and therefore measured reflectance might be slightly different depending on location and time; for instance, under clear blue skies, fast cirrus clouds passing in front of the Sun may suddenly change the downwelling irradiance. The studies of Toole et al. [6] and Leeuw and Boss [41] have also mentioned that the deviations between two instruments are partly caused by imperfect matching in time and location.
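For readers unfamiliar with permutation F tests, the following Python sketch shows a generic one-way version: the F statistic is computed for the observed groups and then compared with its distribution under random reassignment of the pooled values to groups. This is an illustration under stated assumptions, not the exact test design used for Tables 7 and 8, which is not described in this section. The function names and the synthetic band-ratio values are hypothetical.

```python
import numpy as np


def f_statistic(groups):
    """One-way ANOVA F statistic for a list of 1-D NumPy arrays."""
    pooled = np.concatenate(groups)
    grand_mean = pooled.mean()
    n_groups, n_total = len(groups), pooled.size
    ss_between = sum(g.size * (g.mean() - grand_mean) ** 2 for g in groups)
    ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)
    return (ss_between / (n_groups - 1)) / (ss_within / (n_total - n_groups))


def permutation_f_test(groups, n_permutations=2000, seed=0):
    """Return the observed F and a permutation p-value.

    Group labels are shuffled by permuting the pooled values and
    splitting them back into groups of the original sizes.
    """
    rng = np.random.default_rng(seed)
    groups = [np.asarray(g, dtype=float) for g in groups]
    observed = f_statistic(groups)
    pooled = np.concatenate(groups)
    split_points = np.cumsum([g.size for g in groups])[:-1]
    exceed = 0
    for _ in range(n_permutations):
        shuffled = rng.permutation(pooled)
        if f_statistic(np.split(shuffled, split_points)) >= observed:
            exceed += 1
    # Add one to numerator and denominator so the p-value is never exactly zero.
    return observed, (exceed + 1) / (n_permutations + 1)


# Synthetic, hypothetical band-ratio values for two instruments.
instrument_a = [1.00, 1.10, 0.90, 1.05, 0.95, 1.00, 1.02, 0.98]
instrument_b = [value + 1.0 for value in instrument_a]
f_value, p_value = permutation_f_test([instrument_a, instrument_b])
print(f"F = {f_value:.1f}, permutation p = {p_value:.4f}")
```

A permutation approach avoids assuming normally distributed residuals, which is convenient for reflectance ratios whose distributions may be skewed.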