Optica Publishing Group

Memory colors and the assessment of color quality in lighting applications

Open Access

Abstract

Due to their potential use as an internal reference, memory colors may provide an excellent approach for the color rendition evaluation of white light sources in terms of predicting visual appreciation. Because of certain limitations in the design of existing memory-related color quality measures, a new metric based on the outcome of a series of recently conducted memory color appearance rating experiments is proposed in this work. In order to validate its predictive performance, a meta-correlation analysis on a comprehensive set of preference rating data collected from literature is performed. Results indicate that the new metric proposal outperforms established color quality measures and is capable of correctly predicting the rank order of light sources in different lighting scenarios. The future inclusion of this new metric into a comprehensive lighting quality model may serve as a valuable tool for the lighting designer to create optimally lit environments that not only support visual task fulfillment but also increase the users’ well-being and emotional comfort by rendering the perceived space in such a way that it complies with people’s inherent memory references.

© 2021 Optical Society of America under the terms of the OSA Open Access Publishing Agreement

1. Introduction

For many lighting applications, e.g., in home and office lighting, museum lighting, or the shopping and retail industry, achieving high user acceptability and visual appreciation is of major importance when developing new state-of-the-art lighting solutions. However, such a real-life optimization problem can only be solved satisfactorily when developers and manufacturers have suitable tools and algorithms at hand that are capable of adequately modeling visual perception and observer preference.

To assess, for example, the perceived color quality of conventional and solid-state light sources illuminating a given environment, many different colorimetric approaches and color quality metrics have been proposed. When applied to modeling the more subjective aspects of color quality, such as color preference and scene attractiveness, the most promising of these methods are those based on internal references such as preferred [1,2] or memory colors [3–7]. However, certain limitations can still be identified in the design of these measures. For instance, none of the preference- or memory-based color quality metrics known from literature were derived from experiments conducted under immersive viewing and adaptation conditions. Both aspects, though, are supposed to be crucial components for predicting observer preference in real-world lighting scenarios, in particular since color perception in lighting has been shown to vary substantially with the observers being adapted to different viewing environments and contexts [8–11]. Besides the lack of realistic viewing and adaptation conditions in the experiments used for their development, the existing metrics also do not consider the differences in importance of certain object colors for the assessment of visual appreciation in the illuminated scene. Instead, they simply assume an equal weighting for all test objects or colors used for calculating preferred color shifts and, thus, the respective metric values. However, as recently shown by Bodrogi et al. [12], groups of different object colors are assigned different weights of importance for subjective color preference judgments in lighting. In addition, Huang et al. [13] found that familiarity with the presented test objects, including their color features, generally influences color preference judgments. Thus, for an appropriate modeling of observer preferences in lighting, it seems expedient to develop a metric that explicitly accounts for these variations in the subjective weighting of different test objects and object colors for an overall rating of visual appreciation.

Based on a series of previously conducted color appearance rating experiments of familiar objects [14,15], this work therefore proposes a new memory-related preference metric for the color rendition evaluation of white light sources that, in contrast to its predecessors, was developed from subject ratings collected for different light settings by ensuring realistic viewing and adaptation conditions for an appropriate, more application-related experimental design. Its basic idea is to evaluate the degree of similarity between the color appearance of the familiar test objects rendered by an arbitrary test light source and their respective memory color representations derived from these experiments. Variations in the importance of the considered test objects and their respective colors are additionally modelled in a data-driven manner by introducing corresponding sets of weighting factors as an essential feature of the new color quality metric.

The research question to be answered in this context is whether such a more sophisticated and better founded metric definition is capable of providing an improved prediction of the observers’ preference ratings collected for various lighting scenarios. If this is the case, the new metric proposal may serve as a valuable tool for the optimization of multi-channel LED lighting systems intended to create modern living and working conditions that not only support visual task fulfillment but also increase the users’ well-being and emotional comfort in the lit environment, as these outcomes are both directly related to the degree of preference, attractiveness, and visual appreciation of the lighting conditions experienced by the users [16–20]. Lighting quality models intended for a human-centric lighting design [21] may therefore benefit from the inclusion of such improved memory-related approaches, because for the majority of indoor lighting applications the lit environment is usually judged in relation to an inherent reference recalled from memory that basically defines how people expect the perceived objects and the space to look. Thus, matching people’s expectations by providing correspondingly optimized lighting conditions can eventually result in high ratings of visual appreciation for an increased well-being and emotional comfort. Note that in this context, following the argumentation of Smet et al. [5,22], visual appreciation is considered as an umbrella term encompassing both preference and attractiveness as two similar, well-correlating aspects of perceived lighting quality. This assumption is further supported by the fact that, when comparing these descriptors on real subject data, they have been shown to yield the same light source rank order over a broad range of lighting conditions, suggesting that observers generally prefer what they find most attractive [5]. In the present work, we will therefore use the terms "visual appreciation", "preference", and "attractiveness" in lighting as interchangeable synonyms.

The current paper is organized as follows: Sec. 2 presents the mathematical formalism of the new metric proposal as well as a brief overview of the experimental design used for its derivation. The general metric framework was developed as part of S.B.’s doctoral thesis [23], but has completely been re-worked and extended accordingly to come up with the final metric proposal presented in this work. To validate its predictive performance, a meta-correlation method making use of a large database of visual preference rating data extracted from various published sources will be discussed. In Sec. 3, the results of this analysis will be presented and compared to those obtained for other known color quality metrics. Sec. 4 eventually concludes with a summary of the main findings and provides an outlook on future research intentions.

2. Materials and methods

2.1 Previous experiments and memory color representations for metric proposal

Twelve different real familiar test objects had been selected to perform a series of color appearance rating experiments, in which their apparent color, as defined by CAM02-UCS chromaticities, was to be rated with respect to the observers’ intrinsic memory reference [14,15]. These test objects were Asian skin, banana, blueberry, blue jeans, broccoli, butternut squash, carrot, Caucasian skin, concrete flowerpot, green salad, red cabbage, and red rose, which were selected based on the outcome of an online survey intended to find representative memory color objects for certain pre-defined hue regions [14]. Their characteristic spectral reflectances, which were measured on real samples, are available for download from our institutional website [24]. Note that for the object samples of Asian and Caucasian skin, real human models of the respective ethnicity (here: from China and Germany) were recruited to conduct the corresponding color appearance rating experiments, where, in each case, the back of the model’s right hand was assessed by the subjects [14].

Color appearance ratings were collected for two different adaptation conditions at 3200 K and 5600 K ambient illumination, depicted in Fig. 1, and two different cultural observer groups of Chinese and German participants. To guarantee realistic viewing and adaptation conditions, the experiments were performed in a real-sized room, where each of the twelve different familiar test objects was individually presented to the participants in a tabletop setting. In each case, the horizontal illuminance of the ambient illumination at the test object’s position was 2000 lx. An LCD projector, which was mounted to the ceiling of the experimental room and concealed from the observers’ view, was used to shift the perceived chromaticities of the test objects without changing the participants’ state of chromatic adaptation defined by the ambient illumination. This was ensured by using individually designed, object-specific RGB projection masks to additionally illuminate only the test objects and not their respective surroundings. Note that, for each test object, the CAM02-UCS lightness component was fixed to a value that maximized the corresponding chromaticity gamut but was still smaller than the background lightness to avoid the impression of a self-luminous object. Within this maximally feasible chromaticity gamut, a sample grid of chromaticity coordinates was defined for each test object and assigned to the respective RGB code values of the LCD projector, yielding a total number of ~65 chromaticity variations per test object and test condition. During the experiments, the participants were then asked to rate, for each projector setting, the resulting test object’s color appearance according to how they thought the respective object should look in reality using a semi-semantic five-level rating scale. For both adaptation conditions, each test object was eventually assessed by 15 Chinese and 15 German observers.


Fig. 1. Relative spectral power distributions of the 3200 K and 5600 K ambient illumination settings defining the two different adaptation conditions used in the memory color rating experiments of Babilon and Khanh [14,15]


Following the argumentation of Yendrikhovskij et al. [25], who investigated memory color representations for a single memory color object (a ripe banana) in a display-based setting, similarity ratings, which in their work and also in the present case were collected for chromaticities selected from an approximately uniformly spaced grid, sample the underlying similarity distribution. This distribution basically describes the likelihood that the perceived color belongs to a certain object category and, consequently, is proportional to the function value of the corresponding category distribution. From general recognition theory [26], it is known that the structure of such natural categories can effectively be modeled using multivariate Gaussian functions. Thus, it follows that the similarity ratings of memory colors can be approximated by (bivariate) Gaussian distributions of the form

$$\begin{aligned} f_{i}(\boldsymbol{x}_{i}) &=a_{i,1} + a_{i,2} \cdot \mathrm{exp}\left( - \frac{1}{2} \left( \left(\boldsymbol{x}_{i}-\boldsymbol{\mu}_{i}\right)^{\intercal}\boldsymbol{\Sigma}^{{-}1}_{i}\left(\boldsymbol{x}_{i}-\boldsymbol{\mu}_{i}\right) \right) \right) \\ &= a_{i,1} + a_{i,2} \cdot \mathrm{exp}\left( - \frac{1}{2} \left( \left(\boldsymbol{x}_{i}-\begin{pmatrix} a_{i,3} \\ a_{i,4} \end{pmatrix}\right)^{\intercal}\begin{pmatrix} a_{i,5} & a_{i,7} \\ a_{i,7} & a_{i,6} \end{pmatrix}\left(\boldsymbol{x}_{i}-\begin{pmatrix} a_{i,3} \\ a_{i,4} \end{pmatrix}\right) \right) \right) \\ &= a_{i,1} + a_{i,2} \cdot S_{i}(\boldsymbol{x}_{i}), \end{aligned} \tag{1}$$
where $a_{i,3}$ to $a_{i,7}$ define the similarity distribution function $S_{i}(\boldsymbol {x}_{i})$ of the $i^{\mathrm {th}}$ test object, while $a_{i,1}$ and $a_{i,2}$ are only used to adjust the Gaussians to the specific rating scale and, therefore, are not required for similarity evaluation. The functions’ centroids $\boldsymbol {\mu }_{i}$ give the most likely representations of the test objects’ memory colors, whereas the size, shape, and orientation of the respective similarity distributions are determined by the inverse of the covariance matrices $\boldsymbol {\Sigma }_{i}$, i.e., by the parameters $a_{i,5}$ to $a_{i,7}$. Tables 1 and 2 summarize the relevant fitting parameters for all twelve familiar test objects as assessed by the Chinese and German observers under 3200 K and 5600 K ambient illumination, respectively. As an example, Figure 5 in Appendix A visualizes the model fit results for the German observer group adapted to the 5600 K condition, including various goodness-of-fit statistics that further emphasize the appropriateness of the Gaussian approach. For additional details on the Gaussian modeling of memory color representations and the original experiments, including image representations of the experimental conditions and discussions on how they differed from other studies, the interested reader is referred to the previously published works of Babilon and Khanh [14,15] as well as to S.B.’s doctoral thesis [23].
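The similarity function $S_{i}$ defined above can be evaluated directly from the five parameters $a_{i,3}$ to $a_{i,7}$. The following minimal Python sketch illustrates this; the function name and the numerical parameter values are hypothetical and chosen for illustration only (they are not taken from Tables 1 to 3).

```python
import math


def similarity(x, a3, a4, a5, a6, a7):
    """Evaluate the Gaussian similarity S_i at chromaticity x = (a'_M, b'_M).

    (a3, a4) is the centroid; a5, a6, a7 are the entries of the
    symmetric inverse covariance matrix [[a5, a7], [a7, a6]].
    """
    da = x[0] - a3
    db = x[1] - a4
    # Quadratic form d^T A d written out for the symmetric 2x2 matrix A.
    q = a5 * da * da + 2.0 * a7 * da * db + a6 * db * db
    return math.exp(-0.5 * q)


# Hypothetical parameters for illustration only (NOT from the paper's tables).
params = dict(a3=10.0, a4=20.0, a5=0.02, a6=0.03, a7=0.005)

s_centroid = similarity((10.0, 20.0), **params)  # exactly at the centroid
s_shifted = similarity((15.0, 18.0), **params)   # shifted away from it
```

At the centroid the quadratic form vanishes, so the similarity reaches its maximum of one; any chromaticity shift away from the memory color lowers the value toward zero.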


Table 1. Summary of the relevant fitting parameters describing the similarity distribution functions of the twelve familiar test objects assessed by the Chinese and German observers at 3200 K ambient illumination. Parameters $a_{3}$ and $a_{4}$ give the locations of the centroids of the distribution functions which are defined to be the most likely representations of the objects’ memory colors in CAM02-UCS chromaticity space. The size, shape, and orientation of the similarity distribution functions are determined by the parameters $a_{5}$ to $a_{7}$.


Table 2. Summary of the relevant fitting parameters describing the similarity distribution functions of the twelve familiar test objects assessed by the Chinese and German observers at 5600 K ambient illumination. Parameters $a_{3}$ and $a_{4}$ give the locations of the centroids of the distribution functions which are defined to be the most likely representations of the objects’ memory colors in CAM02-UCS chromaticity space. The size, shape, and orientation of the similarity distribution functions are determined by the parameters $a_{5}$ to $a_{7}$.

2.2 Computing the memory color preference index (MCPI)

From the previous section, we have seen that memory color representations of familiar test objects can be obtained from corresponding color appearance ratings by applying bivariate Gaussian modeling. The assessment of the color quality of a certain test light source can thus be based on the evaluation of the degree of similarity between these memory color representations and the color appearance of the respective test objects under the test light source. Note that a similar approach was chosen by Smet et al. [5] for the development of their memory color rendition index (MCRI). For the similarity evaluation in the present case, the CAM02-UCS chromaticities $\boldsymbol {x}_{i}=\left (a^{\prime }_{\mathrm {M},i},b^{\prime }_{\mathrm {M},i}\right )^{\intercal }$ of the twelve familiar test objects are first calculated for the test light source by using their characteristic spectral reflectances and the CIE 10° standard observer. The spectral power distribution (SPD) of the test light source must be known or measured radiometrically. Here, the index "M" of the CAM02-UCS chromaticity coordinates indicates that the CIECAM02 attribute of colorfulness was used to construct the underlying color space rather than the chroma "C" or saturation "s" attributes, both of which would have been valid options, too. Reasons for this specific choice to define CAM02-UCS as well as further details on the mathematical formalism are given by Luo et al. [27,28].

Next, for each test object $i$, the function value $S_{i}$ of the corresponding similarity distribution $S_{i}(\boldsymbol {x}_{i})$ needs to be computed. The closer this value is to one, the better the agreement between the test object’s color appearance and its respective memory color. The object-specific memory color preference indices $R_{\mathrm {MCPI},i}$ are then obtained by rescaling these similarity measures to a 0–100 range by using

$$R_{\mathrm{MCPI},i}=100\cdot S_{i}. \tag{2}$$
The general memory color preference index $R_{\mathrm {MCPI}}$, in the following also denoted simply as MCPI, is eventually defined as the weighted geometric mean of the twelve individual $R_{\mathrm {MCPI},i}$ values,
$$R_{\mathrm{MCPI}}=\prod_{i=1}^{12} \left( R_{\mathrm{MCPI},i} \right)^{p_{i}}, \tag{3}$$
where the different $p_{i}$’s with $i=1,\ldots ,12$ denote their individual weighting factors (see also Sec. 3.1). As stated in the introduction section of this paper, these weights are considered as an essential feature of the new metric proposal. Their introduction basically accounts for the fact that certain object colors are found to be more important than others when it comes to the assessment of visual appreciation in an illuminated scene [12,13]. For an adequate prediction of observer preferences in lighting, the inclusion of these differences in the subjective weighting of the importance of certain memory colors therefore is a crucial aspect to be explicitly considered by the new metric proposal. The geometric mean is chosen over the arithmetic mean in Eq. (3) because the former is less susceptible to outliers than the latter, which in general makes the color quality metric more robust [5]. In addition, it better suits the exponential nature of the function values of the similarity distributions.
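The rescaling of the similarity values and the weighted geometric mean of Eq. (3) can be sketched in a few lines of Python. This is an illustrative sketch only: the function name is hypothetical, the example uses three instead of twelve objects for brevity, and it assumes that the weights $p_{i}$ sum to one (which the text does not state explicitly, but which keeps the result on the 0–100 scale).

```python
import math


def mcpi(similarities, weights):
    """Weighted geometric mean of R_i = 100 * S_i.

    Assumption: the weights are non-negative and sum to one.
    """
    if len(similarities) != len(weights):
        raise ValueError("need one weight per test object")
    # Work in log space: prod(R_i ** p_i) == exp(sum(p_i * log(R_i))).
    log_sum = 0.0
    for s, p in zip(similarities, weights):
        if p == 0.0:
            continue  # an object with zero weight does not contribute
        if s <= 0.0:
            return 0.0  # a zero similarity with positive weight gives zero
        log_sum += p * math.log(100.0 * s)
    return math.exp(log_sum)


# Hypothetical similarity values and weights, for illustration only.
example_similarities = [1.0, 0.5, 0.25]
example_weights = [0.5, 0.25, 0.25]
example_score = mcpi(example_similarities, example_weights)
```

Because the mean is geometric, a single object with a very low similarity pulls the overall score down more strongly than it would in an arithmetic mean, in proportion to its weight.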

To account for the impact of different adaptation conditions potentially provided by different test light sources, a decision algorithm based on the calculation of correlated color temperatures (CCTs) according to Li et al. [29] has additionally been implemented, extending the formalism of Eq. (3). If the corresponding CCT is smaller than 4000 K, the test light source is considered to be rather warm white, which implies the selection of the parameters of Table 1 to define the test objects’ similarity distribution functions. If, on the other hand, the calculated CCT is greater than or equal to 4000 K, the test light source is assumed to be rather cool white and the parameters given in Table 2 are chosen. In both cases, an additional distinction can be made between Chinese and German observers, which basically results in two different cultural-specific metric versions, $\mathrm {MCPI}_{\mathrm {Chinese}}$ (or $\mathrm {MCPI}_{\mathrm {Ch.}}$) and $\mathrm {MCPI}_{\mathrm {German}}$ (or $\mathrm {MCPI}_{\mathrm {Ger.}}$), intended to consider inter-cultural differences in the memory-based evaluation of color rendition.

In addition to this more cultural-specific approach, it may also be useful to define a global average observer as an approximation for a universally valid model, $\mathrm {MCPI}_{\mathrm {Global}}$ (or $\mathrm {MCPI}_{\mathrm {Gl.}}$). This kind of observer was determined by pooling for each test object and adaptation condition the rating data of both cultural observer groups, again followed by subsequent Gaussian fitting. The corresponding fit parameters are summarized in Table 3. This global average observer should serve as a reference to investigate whether a more universally valid color quality metric can be proposed or if the impact of the observed inter-cultural variations on the color appearance ratings of familiar objects reported elsewhere [15] may have a significant effect on the metric’s predictive performance.
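The CCT-based selection of the fit parameters described above amounts to a simple lookup. The sketch below illustrates the decision logic only; the function name and the returned labels are hypothetical, and the CCT itself is assumed to have been computed beforehand (e.g., with the method of Li et al. [29]).

```python
def select_fit_parameters(cct_kelvin, observer):
    """Return (table, adaptation condition) for a given CCT and observer.

    observer must be "Chinese", "German", or "Global".
    """
    if observer not in ("Chinese", "German", "Global"):
        raise ValueError("unknown observer group: %r" % (observer,))
    # Below 4000 K: warm white -> parameters fitted at 3200 K adaptation.
    # At or above 4000 K: cool white -> parameters fitted at 5600 K.
    adaptation = "3200 K" if cct_kelvin < 4000.0 else "5600 K"
    if observer == "Global":
        table = "Table 3"  # pooled data, covers both adaptation conditions
    elif adaptation == "3200 K":
        table = "Table 1"
    else:
        table = "Table 2"
    return table, adaptation


warm_choice = select_fit_parameters(2700.0, "German")
cool_choice = select_fit_parameters(4000.0, "Chinese")
```

Note that a light source at exactly 4000 K falls into the cool-white branch, in line with the "greater than or equal to" condition in the text.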


Table 3. Summary of the relevant fit parameters describing the similarity distribution functions of the twelve familiar test objects assessed by an assumed global average observer at both adaptation conditions. Parameters $a_{3}$ and $a_{4}$ give the locations of the centroids of the distribution functions which are defined to be the most likely representations of the objects’ memory colors in CAM02-UCS chromaticity space. The size, shape, and orientation of the similarity distribution functions are determined by the parameters $a_{5}$ to $a_{7}$.

Compared to Smet et al.’s MCRI, the new MCPI proposal shows some significant improvements. First, in contrast to the former, the MCPI development was based on experimental data collected under more realistic, immersive viewing and adaptation conditions instead of using a simple viewing-booth-like approach. Second, the familiar test objects used for the MCPI definition exhibit a better hue coverage than those selected by Smet et al. in their original work. In Fig. 2, the respective CAM02-UCS chromaticities of both sets of test objects are determined and compared for an assumed reference illumination of D56, where in each case the test objects’ measured reflectance spectra were used to perform the calculations (reflectance spectra of the MCRI objects were extracted from the LuxPy toolbox [30]). As can be seen, the MCPI test objects not only provide a better hue coverage but also a somewhat higher overall degree of saturation, in particular for reddish hues, which following the argumentation of Davis and Ohno [31,32] may enhance the metric’s predictive performance especially for peaked LED spectra. In addition, the MCPI metric, in contrast to Smet et al.’s MCRI, explicitly considers the varying degree of importance of different memory colors as well as the impact of different adaptation conditions for an adequate prediction of visual appreciation in lighting. Finally, with regard to their corresponding mathematical formalism, the MCPI makes use of a uniform color space based on the CIECAM02 color appearance model, while Smet et al.’s MCRI uses a modified version of the IPT color space developed by Ebner and Fairchild [33].


Fig. 2. Comparison between the CAM02-UCS chromaticities of the twelve familiar test objects used for the new MCPI metric and those nine test objects selected by Smet et al. for their MCRI definition.


2.3 Performance validation based on meta-correlation analysis

For validating the predictive performance of the proposed models, experimental data of several psychophysical studies on color preference and visual appreciation under different lighting conditions were collected in order to eventually perform a corresponding meta-correlation analysis. Studies were included in the analysis if they had i) explicitly asked subjects to rate either (color) preference, attractiveness, or visual appreciation of real objects or the lit environment and ii) provided access to both the SPDs of the tested light sources and the subjects’ (mean) ratings. In total, experimental data of 16 different studies could be collected. Depending on their principal experimental design, a distinction is made between studies of metameric lighting scenarios [5,34–44] and studies that deal with multi-CCT comparisons [13,18,44,45], where the corresponding nomenclature was adopted from existing literature [13,46–49]. In this context, ‘metameric’ denotes psychophysical studies in which participants were asked to rate their color preference for a certain number of illumination conditions with almost the same CCTs and, therefore, similar white points. In all metameric studies included in this work, the differences between the white points of the illumination conditions to be compared for a given CCT were smaller than $\Delta u^{\prime }v^{\prime }_{\mathrm {max}} = 0.0419$ with an average of $\Delta u^{\prime }v^{\prime }_{\mathrm {mean}} = 0.0221$. The term ‘multi-CCT’, on the other hand, summarizes those studies in which the light sources to be compared and rated by the participants were not limited to a specific CCT. The reason for making this distinction between metameric and multi-CCT lighting scenarios is that the literature provides clear evidence that a color quality metric can correlate reasonably well with metameric color preference data but eventually fail when applied to predict the light sources’ rank order for multi-CCT comparisons [45,47,48]. However, a good color quality metric must be able to predict color preference and visual appreciation for both cases, in particular with regard to practical applications where a certain variety in lighting solutions of different CCTs may occur.

Thus, the two final databases for running the corresponding meta-analyses consisted of 13 088 observer ratings of 655 different subjects collected on 341 test conditions from 12 metameric studies and of 3454 observer ratings of 284 different subjects collected on 91 test conditions from 4 multi-CCT studies. If not received from the authors themselves, the corresponding SPDs were digitized using the WebPlotDigitizer written by Rohatgi [50]. This was the case for two of the metameric studies with 22 test conditions in total and for one of the multi-CCT studies with 7 test conditions in total. For further details on the collected databases of preference rating data including a summary of the experimental parameters of each study the interested reader is referred to Ref. [23].

For each of the two different subgroups – metameric and multi-CCT – half of the collected data was randomly selected and used to determine the corresponding weighting factors of Eq. (3) for all three metric proposals, i.e., the two cultural-specific and the global versions, to account for the varying degree of importance of different colors in the assessment of visual appreciation. In each case, the global search algorithm proposed by Ugray et al. [51] was applied to find a set of weighting factors that maximizes the metric’s correlation with the selected data (see Table 4 for results). The remaining half of the data in the metameric and multi-CCT subgroups are then used to evaluate the predictive performance of the different MCPI metrics. For this purpose, considering each of the two subgroups separately, the respective metric scores are calculated for each test condition and correlated with the corresponding observer preference ratings. These final correlation results are eventually compared to those obtained for other known color quality measures on the same set of test data.


Table 4. Summary of the optimized weighting coefficients introduced in Eq. (3) for the definition of the global and both cultural-specific MCPI color quality metrics. The individual $p_{i}$ values represent from left to right the weighting coefficients for the test objects of Asian skin (AS), banana (BA), blueberry (BB), blue jeans (BJ), broccoli (BR), butternut (BU), carrot (CA), Caucasian skin (CS), concrete (CO), green salad (GS), red cabbage (RC), and red rose (RR). For each kind of lighting scenario (metameric vs. multi-CCT) the appropriate set of weighting coefficients has been determined by running a global search algorithm intended to maximize the resulting correlations on a corresponding subset of randomly selected preference rating data.

The meta-correlation calculations underlying this evaluation procedure are performed by adopting the method of Hunter and Schmidt [52]. This method tries to estimate the true correlation between two variables by calculating a weighted average correlation based on data collected from several different studies while taking into account and correcting for differences in their experimental designs and conditions. In this context, the meta-analysis according to Hunter and Schmidt offers a range of different statistical approaches to rectify the sample data by explicitly considering resulting study artifacts that in general attenuate the true correlation. Based on the information that could be extracted from the collected studies, the following artifacts were included in the analysis presented in this work: i) bare-bones sampling error due to variations in sample size between different studies, ii) study heterogeneity due to different experimental protocols used for data acquisition, iii) different range restrictions on the independent variable, i.e., the metric scores, due to variations in the study design and the experimental conditions, iv) observer idiosyncrasy affecting their preference ratings, and v) statistical bias of the sample correlation. Further details on the mathematical structure of these corrections and how they were incorporated into the meta-correlation analysis can be found in Appendix B.
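To give an idea of the first correction step, the following Python sketch implements the textbook bare-bones estimate of Hunter and Schmidt: a sample-size-weighted mean correlation, together with the part of the observed variance that is attributable to sampling error. It is a simplified illustration under stated assumptions and not the full procedure of Appendix B; the function name and the input values are hypothetical, and none of the further corrections (ii to v) are included.

```python
def bare_bones(correlations, sample_sizes):
    """Bare-bones meta-analysis of k study correlations.

    Returns (weighted mean r, observed variance, sampling-error variance,
    residual variance). Assumes at least two observations per study on
    average, so that the sampling-error formula is defined.
    """
    k = len(correlations)
    total_n = sum(sample_sizes)
    # Sample-size-weighted mean correlation across studies.
    r_bar = sum(n * r for r, n in zip(correlations, sample_sizes)) / total_n
    # Sample-size-weighted variance of the observed correlations.
    var_obs = sum(
        n * (r - r_bar) ** 2 for r, n in zip(correlations, sample_sizes)
    ) / total_n
    # Variance expected from sampling error alone (average sample size).
    n_mean = total_n / k
    var_err = (1.0 - r_bar ** 2) ** 2 / (n_mean - 1.0)
    # Residual variance after removing sampling error, floored at zero.
    var_res = max(var_obs - var_err, 0.0)
    return r_bar, var_obs, var_err, var_res


# Hypothetical study correlations and sample sizes, for illustration only.
r_bar, var_obs, var_err, var_res = bare_bones([0.6, 0.8], [20, 60])
```

Larger studies dominate the weighted mean, so in this example the pooled estimate lies closer to the correlation of the study with 60 observations.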

Within this meta-analysis framework, the correlation between the predictions of a certain color quality metric and the corresponding observer preference ratings will be expressed in terms of Spearman correlation coefficients, which provide a measure for the metric’s ability to rank the assessed light sources correctly and in accordance with the observers’ ratings. Other color quality metrics to be included in the analysis are Sanders’ preferred color index $R_{\mathrm {p}}$ [3], Judd’s flattery index $R_{\mathrm {flatt.}}$ [1], Thornton’s color preference index (CPI) [2], Smet et al.’s MCRI [4–7], the general color rendition index $R_{\mathrm {a}}$ [53], the general color fidelity index $R_{\mathrm {f}}$ [54], the gamut area index (GAI) [55], the arithmetic mean of GAI and $R_{\mathrm {a}}$ [22,35], the color quality scale ($Q_{\mathrm {a}}$, $Q_{\mathrm {f}}$, $Q_{\mathrm {p}}$, $Q_{\mathrm {g}}$) [32], the CRI2012 [56], the feeling of contrast index (FCI) [57], and the IES TM-30 $R_{\mathrm {g}}$ measure [58]. This selection is assumed to represent a balanced cross-section of the three basic categories of color rendition which, according to Houser et al. [59], consist of fidelity-, preference-/memory-, and gamut-based color quality metrics, respectively.
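For a single study, the Spearman coefficient between metric scores and mean preference ratings can be computed as the Pearson correlation of the rank-transformed data. The sketch below is a minimal standard-library illustration; the function names and the example numbers are hypothetical, ties receive average ranks, and both inputs are assumed to be non-constant.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next sorted value is tied with this one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2.0 + 1.0
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# Hypothetical metric scores and mean preference ratings of four light sources.
metric_scores = [62.0, 75.0, 81.0, 90.0]
mean_ratings = [2.1, 3.4, 3.0, 4.5]
rho = spearman(metric_scores, mean_ratings)
```

In this example the metric swaps the rank order of the second and third light source, which reduces the coefficient from 1 to 0.8; the absolute size of the score differences plays no role.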

3. Results and discussion

3.1 Cross-comparison of metric predictions

Table 5 summarizes the results of the meta-correlation analysis, differentiating between the two different lighting scenarios, i.e., metameric and multi-CCT. For each tested color quality metric, the weighted average artifact-corrected Spearman correlation coefficient $\hat {\overline {r}}_{\mathrm {c}}$ is indicated and provides a direct measure for the metric’s predictive performance. The intermediate results of the individual correction steps are also shown for each case, indicated by a dedicated labeling in the table’s first column. As can be seen, correcting for artifacts generally has a de-attenuating effect. Thus, when correcting for all study artifacts discussed in the previous section, the corresponding Spearman coefficients reported in the last row of each sub-table are assumed to give a good approximation of the true correlations.


Table 5. Comparison of the various predictive metric performances in terms of weighted average artifact-corrected Spearman correlation coefficients obtained from meta-analysis. The correlation coefficients were calculated for both metameric and multi-CCT lighting scenarios. The intermediate results of the subsequently applied artifact correction steps are also tabulated (top to bottom), where the term "bare-bones" and the numbers in the first column represent the application of the correction formulae for sampling error, study heterogeneity (# 2), range restriction (# 3), and observer idiosyncrasy (# 4), respectively.

For a better overview, these values obtained for each of the two different kinds of lighting scenarios were plotted in Fig. 3 together with their respective 95 % confidence intervals (CIs). By comparing the extent and the range of the various CIs, it is apparent that the different color quality metrics vary considerably in their predictive power. To determine which of them is best suited for evaluating color rendering properties of white light sources with respect to visual appreciation, the observed performance differences are examined for significance in a series of cross-comparisons following the procedure described by Zou [60]: Let us consider two different artifact-corrected Spearman correlation coefficients $\hat {\overline {r}}_{\mathrm {c},i}$ and $\hat {\overline {r}}_{\mathrm {c},j}$ supposed to give the best estimate of the true correlation with the observers’ preference ratings for two different color quality metrics $i$ and $j$, which were arbitrarily chosen from Table 5. For this constellation, the null hypothesis of equal correlations, i.e., $H_{0}:\; \hat {\overline {r}}_{\mathrm {c},i} - \hat {\overline {r}}_{\mathrm {c},j} = 0$, must be rejected at a 5 % significance level if the 95 % CI of the difference $\hat {\overline {r}}_{\mathrm {c},i} - \hat {\overline {r}}_{\mathrm {c},j}$ does not contain zero. Table 6 summarizes the corresponding results for both kinds of lighting scenarios. Numbers with an asterisk indicate statistically significant differences.
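The decision rule can be sketched as follows. This is a simplified illustration, assuming the variant of Zou's interval construction for two independent correlations, in which the CI of the difference is assembled from the individual CI limits. Since all metrics in this work are correlated with the same preference ratings, the actual analysis may include an additional dependence term, so the numbers produced here need not reproduce the entries of Table 6. The function names are hypothetical; the input values are the coefficients and asymmetric CI half-widths quoted in Sec. 3.1 for metameric scenarios.

```python
import math


def zou_difference_ci(r_i, lower_i, upper_i, r_j, lower_j, upper_j):
    """CI of r_i - r_j built from the individual CI limits.

    Assumption: the two correlations are treated as independent,
    which simplifies the interval used for dependent correlations.
    """
    diff = r_i - r_j
    lower = diff - math.sqrt((r_i - lower_i) ** 2 + (upper_j - r_j) ** 2)
    upper = diff + math.sqrt((upper_i - r_i) ** 2 + (r_j - lower_j) ** 2)
    return lower, upper


def is_significant(lower, upper):
    """Reject H0 (equal correlations) if the CI does not contain zero."""
    return lower > 0.0 or upper < 0.0


# German MCPI (0.914, +0.039/-0.069) vs. Sanders' R_p (-0.387, +0.291/-0.230)
ci_german_vs_sanders = zou_difference_ci(
    0.914, 0.914 - 0.069, 0.914 + 0.039,
    -0.387, -0.387 - 0.230, -0.387 + 0.291,
)

# Global MCPI (0.899, +0.045/-0.079) vs. MCRI (0.865, +0.060/-0.103)
ci_global_vs_mcri = zou_difference_ci(
    0.899, 0.899 - 0.079, 0.899 + 0.045,
    0.865, 0.865 - 0.103, 0.865 + 0.060,
)

print(ci_german_vs_sanders, is_significant(*ci_german_vs_sanders))
print(ci_global_vs_mcri, is_significant(*ci_global_vs_mcri))
```

Under this simplified assumption, the first interval lies entirely above zero, whereas the second one contains zero, so only the first difference would be flagged as significant.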

Fig. 3. Comparison of the predictive performance of various color quality metrics expressed by artifact-corrected Spearman correlation coefficients obtained from meta-analysis which describe the individual metric’s ability of correctly ranking light sources in terms of visual appreciation based on human observers’ preference ratings. The indicated errorbars represent corresponding 95 % confidence intervals. Metameric as well as multi-CCT lighting scenarios have been analyzed separately.

Table 6. Overview of the results of the cross-comparison confidence interval test of Zou [60] intended to examine the predictive performance of the various color quality metrics for significant differences adopting a 5 % significance level. The null hypothesis $H_{0}:\; \hat {\overline {r}}_{\mathrm {c},i} - \hat {\overline {r}}_{\mathrm {c},j} = 0$ tests whether or not the observed correlations of two distinct metrics $i$ and $j$ are equal. The tabulated values are the confidence interval bounds of the correlation differences closest to zero. Significant correlation differences are indicated by asterisks (i.e., in case that zero was not within the corresponding confidence interval bounds). Results for the metrics’ correlation with observers’ preference ratings in metameric and multi-CCT lighting scenarios are shown in the lower and upper triangle of the table, respectively.

Metameric lighting scenarios: From Fig. 3 it seems clear that the new $\mathrm {MCPI}$ proposals perform as well as or even better than the best performing known color quality metrics for metameric lighting scenarios. The corresponding artifact-corrected correlation coefficients are $\hat {\overline {r}}_{\mathrm {c},\mathrm {Global}} = 0.899^{+0.045}_{-0.079}$, $\hat {\overline {r}}_{\mathrm {c},\mathrm {Chinese}} = 0.869^{+0.059}_{-0.101}$, and $\hat {\overline {r}}_{\mathrm {c},\mathrm {German}} = 0.914^{+0.039}_{-0.069}$ for the global, the Chinese, and the German observer, respectively. Note that the latter even shows the highest predictive performance among all color quality metrics and significantly outperforms most of them (see CI hypothesis test results summarized in the lower grey-shaded triangle of Table 6).

In addition, with the cultural-specific and the global MCPI definitions showing comparably high correlations with no significant differences between each other, indication is given that the impact of the cultural background (at least for the two different observer groups discussed in this paper) on the metrics’ predictive performance seems to be negligibly small. This is in accordance with the findings of Refs. [15] and [61], which both reported the effect size of the inter-cultural variations in the color appearance ratings of familiar objects to be quite small, usually even smaller than the inter-observer variability within a single geographical region.

Thus, for most applications it will likely be sufficient to propose a single memory-based color quality measure. From the present data it seems clear that, for metameric lighting scenarios, the global MCPI version can be considered as a good approximation to such a universally valid metric that is capable of well predicting the rank order of the visual appreciation of different light sources across cultures. Potential (minor) errors in the absolute levels of prediction for different cultural observer groups are assumed to be of little or no practical relevance. The small effect sizes of the inter-cultural variations reported for the color appearance ratings of familiar objects as well as the observed non-significant differences in the predictive performance between the different MCPI versions emphasize these conclusions.

Comparing the MCPI results to Smet et al.’s MCRI ($\hat {\overline {r}}_{\mathrm {c},\mathrm {MCRI}} = 0.865^{+0.060}_{-0.103}$), which is also a metric based on the evaluation of memory colors, no significant differences in correlation are found between these models. Other well-performing metrics, such as Thornton’s CPI with $\hat {\overline {r}}_{\mathrm {c},\mathrm {CPI}} = 0.902^{+0.044}_{-0.077}$ and the $Q_{\mathrm {p}}$ metric with $\hat {\overline {r}}_{\mathrm {c},Q_{\mathrm {p}}} = 0.879^{+0.054}_{-0.093}$, have been defined to place additional weight on a preferred object color appearance by rewarding (to some extent) increases in chroma or saturation. However, not all of the memory- or preference-based color quality metrics considered in this work exhibit a comparably good predictive power. Indeed, Sanders’ $R_{\mathrm {p}}$ ($\hat {\overline {r}}_{\mathrm {c},\mathrm {Sanders}} = -0.387^{+0.291}_{-0.230}$) and Judd’s $R_{\mathrm {flatt.}}$ ($\hat {\overline {r}}_{\mathrm {c},\mathrm {Judd}} = 0.766^{+0.101}_{-0.162}$) show rather poor-to-moderate correlations. Whereas the latter at least succeeds in predicting a positive trend in the rank order so that a more preferred light source also tends to result in a larger metric value, the former must be concluded to fail completely for metameric lighting scenarios because of its non-intuitive, opposing light source ranking.

Apart from some of the explicitly memory- and preference-related color quality measures, good-to-excellent performance in terms of predicting visual appreciation for metameric lighting scenarios is also observed for those metrics that can be summarized as either gamut- ($Q_{\mathrm {g}}$, $R_{\mathrm {g}}$, GAI, FCI) or chroma-enhancement-based ($Q_{\mathrm {p}}$). Both categories show correlation values comparable to those observed for the best performing memory- and preference-based color quality metrics. This is confirmed by the corresponding CI test results summarized in Table 6, where no or only barely significant differences in predictive performance are found between these metrics.

The overall lowest group performance for metameric lighting scenarios, on the other hand, can be noticed for the fidelity (CRI $R_{\mathrm {a}}$, $R_{\mathrm {f}}$, $Q_{\mathrm {f}}$, CRI$2012$) and the fidelity-oriented ($Q_{\mathrm {a}}$) metrics, all of which perform significantly worse than the memory- and preference-based approaches (with the exception of Sanders’ $R_{\mathrm {p}}$) and in most cases also worse than the gamut- and chroma-enhancement-based alternatives. Obviously, color fidelity metrics are not intended to measure color rendering properties of light sources in terms of visual appreciation. Indeed, they were originally developed to evaluate how similar an arbitrary test light source is to a certain reference standard and, therefore, should only be used for this purpose. This is further emphasized when looking at Table 5 from which it can be noticed that both the CIE $R_{\mathrm {a}}$ and the CQS $Q_{\mathrm {f}}$ exhibit correlation values close to zero indicating that both metrics completely fail to model the observers’ preference ratings properly. Consequently, such metrics should not be used for the evaluation and optimization of light sources and luminaires for achieving high user preference and visual appreciation. Unfortunately, the application of fidelity measures, such as the CIE $R_{\mathrm {a}}$, is still common practice for these cases in the industry, emphasizing the necessity of establishing a new and better standard in the near future.

As reported by Smet and Hanselaer [7] – and as can also be seen from the present data – the predictive performance of visual appreciation tends to increase the more weight a color quality metric puts on chroma enhancement or gamut expansion, which both are commonly considered to increase preference up to a certain limit [32]. Regarding for example the role of chroma enhancement, a clear tendency can be derived from the four different CQS indices which show a considerable gain in performance as the reward for chroma enhancement in the respective calculation scheme increases. With $Q_{\mathrm {f}}$ explicitly excluding the saturation factor originally introduced in the $Q_{\mathrm {a}}$ calculation for not penalizing moderate increases in chroma caused by a test light source, it is not surprising that the former generally performs worse than the latter in modeling visual appreciation for metameric lighting scenarios. A further improvement is obtained when applying the $Q_{\mathrm {p}}$ and $Q_{\mathrm {g}}$ metrics, which both additionally reward light sources for increasing object chroma. As expected, this leads to a significantly better correlation with the preference rating results than the other two more fidelity-oriented CQS indices.

Here, it should be noted that the $Q_{\mathrm {g}}$ index actually falls into the category of gamut-based color quality metrics rather than offering pure chroma enhancement. As mentioned earlier, comparably good predictive performance for metameric lighting scenarios can be found for the related gamut-based alternatives $R_{\mathrm {g}}$, GAI, and FCI. However, in contrast to $Q_{\mathrm {p}}$ or the preference- and memory-based approaches, the gamut expansion metrics discussed here do not contain an upper limit for visually allowed chroma augmentation. It is commonly known and has been proven in various studies [62–66] that the oversaturation of object colors beyond a certain degree may also have a negative impact on the perceived color quality and visual appreciation. Hence, the fact that the above-mentioned gamut expansion approaches lack the possibility of setting a limit to the potential chroma enhancement caused by a test light source might be an explanation for their on average slightly worse predictive performance observed from Fig. 3 compared to the performance of the non-fidelity metrics that do include such a limit, either by defining certain reference chromaticities with more or less fixed values of increased saturation (e.g., CPI, MCRI $R_{\mathrm {m}}$, MCPI) or by explicitly giving a limit for the maximally allowed chroma enhancement (CQS $Q_{\mathrm {p}}$). A special case is the combined GAI/$R_{\mathrm {a}}$ metric which basically tries to counterbalance too large increases in chroma by penalizing induced deviations from the unsaturated reference chromaticities. As can be seen from Fig. 3, this special concept leads to a predictive performance for metameric lighting scenarios ranging between the performance of those color quality metrics that reward chroma enhancement and of those that are only fidelity-based or -oriented.

Multi-CCT lighting scenarios: In contrast to the evaluation of the metameric visual data where, with the exception of the fidelity-based approaches and Sanders’ $R_{\mathrm {p}}$, most of the color quality metrics performed quite well in properly ranking the test light sources in terms of observers’ visual appreciation indicated by relatively large positive Spearman correlation coefficients, the situation is completely different in case of considering multi-CCT lighting scenarios. Similar to Sanders’ $R_{\mathrm {p}}$ in the metameric case, most of the investigated color quality metrics confound the rank order of the light sources resulting in mostly negative correlations. With the exception of the FCI, which uses a constant reference illuminant (i.e., CIE D$65$) for calculating the relative gamut area spanned by the apparent chromaticities of four test samples, and Smet et al.’s MCRI, which uses reference chromaticities based on memory colors, all of these poorly performing metrics (i.e., CRI $R_{\mathrm {a}}$, $R_{\mathrm {f}}$, all four CQS indices, $R_{\mathrm {g}}$, $R_{\mathrm {CPI}}$, and $R_{\mathrm {flatt.}}$) adopt a reference illuminant of the same CCT as the test light source. However, in a multi-CCT scenario the values of such measures calculated for different CCTs are not really comparable because by definition they are correlated to different reference light sources [13,45] and, therefore, completely fail to model the general trend of visual preference ratings of observers assessing a series of multi-CCT illuminants. The only exception thereto is the CRI$2012$ metric which still shows a very poor but at least non-negative correlation of $\hat {\overline {r}}_{\mathrm {c},\mathrm {CRI}2012} = 0.142^{+0.293}_{-0.320}$. This slightly better performance than the rest of the color quality metrics whose calculation schemes are based on a CCT-dependent reference illuminant might be explained by the adoption of the CAM$02$-UCS including the application of a more state-of-the-art CAT$02$ chromatic adaptation transform, the usage of the CIE 10° standard observer, and a sophisticated test sample selection providing a much better hue coverage in the definition of the CRI$2012$ metric.

Compared to the color quality metrics discussed in the last paragraph, significantly better results are observed for both variations of the GAI measure, which similar to the FCI approach make use of a fixed reference illuminant for the calculation of the relative gamut area. With correlations of $\hat {\overline {r}}_{\mathrm {c},\mathrm {GAI}} = 0.606^{+0.166}_{-0.244}$ and $\hat {\overline {r}}_{\mathrm {c},\mathrm {GAI}/R_{\mathrm {a}}} = 0.807^{+0.087}_{-0.146}$ obtained for the GAI and the $\mathrm {GAI}/R_{\mathrm {a}}$ metric, respectively, cross-comparison CI tests summarized in the upper triangle of Table 6 revealed that, compared to most of the other color quality metrics, these two measures show a significantly enhanced predictive power for multi-CCT lighting scenarios.

Good results are also observed for Sanders’ $R_{\mathrm {p}}$ ($\hat {\overline {r}}_{\mathrm {c},\mathrm {Sanders}} = 0.615^{+0.163}_{-0.240}$), which is somewhat remarkable given that it performed rather poorly for metameric lighting scenarios. This significant performance improvement might be explained by the fact that the assessment of the color appearance of certain test objects in a multi-CCT scenario always involves the adaptation to a changing illumination white point. Even though observers of such studies are usually asked to wait some time before giving their ratings in order to be fully adapted or to close their eyes during the change of the light source to be assessed, there is, indisputably, some bias of white point appreciation caused by the first glance at which a new lighting situation is perceived, which has a non-negligible impact on the observers’ subsequent color preference ratings. Hence, with the impact of the perceived white point increasing for multi-CCT comparisons, the lack in hue coverage of Sanders’ test samples used for the $R_{\mathrm {p}}$ calculation might become less severe for predicting visual appreciation in multi-CCT than in metameric lighting scenarios.

Poor performance, on the other hand, is observed for Smet et al.’s MCRI showing a Spearman correlation coefficient of $\hat {\overline {r}}_{\mathrm {c},\mathrm {MCRI}} = -0.447^{+0.290}_{-0.219}$, which is comparable to those of the worst performing color quality metrics for multi-CCT lighting scenarios. A possible explanation might be that in cases where chromatic re-adaptation between two subsequent ratings becomes necessary due to a change in the white point of the illumination, some additional cognitive process could be triggered which might cause a considerable shift in the recalled chromaticities of the memory color centers when the familiar test objects used for constructing the memory-based metric would be assessed under such a consecutively changing illumination. This potentially induced bias becomes increasingly more severe the larger the CCT of the test light source in a multi-CCT scenario deviates from the reference illumination under which the memory color centers have originally been determined. For metameric lighting scenarios, on the other hand, this kind of bias may also exist, however, it can be assumed to be approximately constant among the different test light sources as they exhibit more or less the same white points. Thus, despite such a potential bias, the MCRI metric is still capable of getting the rank order correctly in these cases, but clearly fails for multi-CCT comparisons.

To address this problem, the new MCPI proposals were devised in such a way (see Sec. 2.2) that, similar to Sanders’ $R_{\mathrm {p}}$, the impact of the adapted white point on the assessment of memory colors is approximated by the implementation of a decision algorithm which, depending on the CCT of the test light source, chooses a more appropriate set of similarity distribution functions for those specific adaptation conditions. In addition to this conceptual improvement, the introduction of dedicated multi-CCT weighting factors in the MCPI calculations (see Table 4) further counterbalances the discussed chromatic adaptation bias by giving less weight to those test objects that would cause larger errors.

As can be seen from Fig. 3, the different MCPI versions exhibit superior correlations with the observers’ preference ratings. The corresponding artifact-corrected Spearman correlation coefficients read $\hat {\overline {r}}_{\mathrm {c},\mathrm {Global}} = 0.842^{+0.072}_{-0.124}$, $\hat {\overline {r}}_{\mathrm {c},\mathrm {German}} = 0.832^{+0.076}_{-0.130}$, and $\hat {\overline {r}}_{\mathrm {c},\mathrm {Chinese}} = 0.817^{+0.083}_{-0.139}$ for the global and the two cultural-specific MCPIs, respectively. Furthermore, analyzing the results of the CI cross-comparisons of Table 6 reveals that the different MCPI measures offer significantly better predictive performance for multi-CCT lighting scenarios than most of the other color quality metrics under inspection. Note that a comparably good performance is only observed for the GAI/$R_{\mathrm {a}}$ measure. Again, no significant differences in the predictive performance between the global and the cultural-specific MCPIs can be found. Like in the metameric case and by following the same argumentation as given above, it can therefore be concluded that also for multi-CCT lighting scenarios the global version of the MCPI can be assumed as a good approximation for a universally valid memory-based color quality metric that is expected to be applicable in most of the cases encountered in practice. Based on these findings and in order to ease the implementation of the MCPI metric for different applications, a corresponding Matlab script has been prepared for download [24], providing an example of the MCPI calculation for a selection of different measured light spectra.

3.2 MCPI evaluation of conventional and solid-state light sources

To demonstrate the MCPI metric’s performance for a range of real illuminants, Fig. 4(a) illustrates the relationship between the standard CIE $R_{\mathrm {a}}$ measure and the MCPI metric for a total number of 418 conventional and solid-state light sources. In addition, results for Smet et al.’s MCRI are shown in Fig. 4(b). The corresponding light source spectra for calculating the metrics’ outputs were collected from Refs. [59] and [67] and comprise measured SPDs of 31 fluorescent broadband (FB) and 34 fluorescent narrowband (FN) light sources, 32 high-intensity discharge lamps, 182 phosphor-converted (LED-P) and 79 multi-band (LED-M) white LEDs, 23 Tungsten filament lamps, and 37 natural daylight scenes. Here, the global MCPI version for multi-CCT comparisons was used to perform the calculations.

Fig. 4. Scatter plots of (a) $\mathrm {MCPI}_{\mathrm {Global}}$ vs. $R_{\mathrm {a}}$ and (b) $\mathrm {MCRI}$ vs. $R_{\mathrm {a}}$ for a selection of 418 different SPDs of conventional and solid-state light sources.

From Fig. 4(a) it can be seen that many light sources – most of them LED based – can have a high MCPI value ($\buildrel \wedge \over =$ high overall agreement between the test objects’ color rendition under the test illuminant and their respective memory color representations), while exhibiting only moderate $R_{\mathrm {a}}$ values. A similar behavior is observed for the MCRI metric. This represents an important feature of a memory color quality index as it reflects the experimental fact that a light source may score higher in terms of visual appreciation than its corresponding CIE reference illuminant. This suggests that the use of memory-based color quality metrics for describing the color rendition properties of white light sources may terminate the underestimation of the visual appreciation of certain, mainly LED-based light sources by standard fidelity measures. In addition, it should be noted that all of the daylight spectra cluster in a region of high $R_{\mathrm {a}}$ but only moderate-to-low MCPI values. Such a behavior complies with the findings of Refs. [14] and [68], which both showed that memory colors of familiar objects assessed at a given CCT tend to be shifted towards higher chroma values compared to their color appearance under a corresponding daylight reference. With the measured daylight spectra showing high $R_{\mathrm {a}}$ values, it is clear that no such favored chroma enhancement can be achieved with natural daylight sources as this would require a significant deviation from the reference SPDs, which in turn would again reduce fidelity. The fact that the MCPI naturally favors chroma enhancement thus is an essential feature for achieving high performance in predicting observers’ preference ratings.

Finally, the most prominent difference between both memory-based metric definitions, aside from the fact that the MCPI performs significantly better in predicting observers’ preference ratings especially for multi-CCT lighting scenarios (see Sec. 3.1), is the way of scaling the metric score values. In the present case, with the exception of the linear scaling of the object-specific indices to a 0–100 range, see Eq. (2), no further rescaling was applied for the MCPI calculation in order to preserve its possible interpretation as a measure for the degree of memory color similarity. On the contrary, a sigmoidal rescaling function was chosen for Smet et al.’s MCRI definition with the intention to better reflect saturation effects in human responses [6]. The corresponding rescaling parameters were optimized such that the MCRI shows a value of 90 and 50 for the CIE illuminants D65 and F4, respectively, and approaches a value of zero for an overall memory similarity (i.e., the geometric mean of the respective MCRI similarity indices) smaller than 50 % of its maximally possible value. As can be seen from the comparisons of Fig. 4(a) and (b), this leads to distinct shapes of the distribution patterns of both metric approaches, where the MCRI shows a somewhat stronger compression of metric scores towards the $R_{\mathrm {a}} = \mathrm {MCRI}$ line for most of the conventional and phosphor-converted LED light sources.

3.3 Future improvements of the MCPI

Despite the promising results reported in this work with regard to predicting visual appreciation in different lighting scenarios, the MCPI in its current definition is based on relatively limited experimental data (see Sec. 2.2). An extension of the original experiments on memory color representations of real objects described by Babilon and Khanh [14,15] to a greater number of different lighting and adaptation conditions would certainly be required to properly model the huge variety of lighting situations encountered in real-world applications. Recently, it has for example been shown that the degree of adaptation and, thus, the perceived color appearance highly depends on the chromaticity of the illumination [69–72]. A similar effect can be expected for memory color assessments under realistic lighting conditions. This effect is likely to become more prevalent as one increasingly departs from the two specific adaptation conditions that, so far, have been used for defining the MCPI. Collecting data of memory color assessments for a larger number of distinct CCTs may thus contribute to a significant improvement of the MCPI.

In addition, the MCPI does not consider the impact of light levels yet. However, as illuminance changes, it generally influences the objects’ color appearance [73,74], which is known as the Hunt effect [75]. Even though the Hunt effect is considered in the predictions of the CIECAM02 color appearance model, which underlies the specific color space used for the MCPI definition, it still remains unclear to which extent and in what manner a light level dependent shift in the objects’ perceived chromaticities may affect their memory color representations. Corresponding dependencies and interaction effects, if they exist, must be included in a comprehensive model of memory-based color rendition evaluation. Dedicated experiments are therefore required to investigate how the memory color representations used for the MCPI definition may change across a wide range of different illuminance levels.

Further studies may also be required to come up with a better founded and more comprehensible definition of the test-object-specific weighting factors introduced for the MCPI calculation to account primarily for the varying importance of certain object colors in the subjective assessment of visual appreciation and preference in lighting. So far, a purely mathematical optimization strategy to find suitable sets of weighting factors for the different cultural observer groups has been applied in this work to maximize the respective MCPI’s correlation with the selected training data. Despite the metrics’ resulting excellent predictive performances validated on a representative sample of adequate test data, this procedure has the general disadvantage that the optimization differences between the cultural-specific weighting factors observed for example from Table 4 cannot be explained satisfyingly on an individual level. For that reason, it might be expedient to further investigate the relevance of different object colors for the subjective assessments of color preference in a cultural-dependent manner. This could for example be done in a similar way as recently proposed by Bodrogi et al. [12], who collected subjects’ relevance ratings on a number of different colored objects and scene contexts. The results of this kind of experiment may potentially serve as suitable input for a more conclusive definition of the MCPI weighting factors.

Finally, it would be of interest to examine how the context of a viewing environment may impact memory color assessments. Previous research outcomes suggest that color preference in general exhibits a context-dependent component [11,76–78]. The question that arises from this finding is whether the lighting context may also modulate memory color assessments up to a significant degree or, which is the hypothesis, memory color representations of familiar objects remain stable across different contexts. In any case, further experimental insights are required.

4. Conclusion and outlook

In this paper, a new memory-based color quality metric, denoted as memory color preference index, MCPI, has been proposed for the color rendition evaluation of white light sources. It is based on the evaluation of the degree of similarity between the color appearance of certain familiar test objects rendered by an arbitrary test light source and their respective memory color representations. The degree of similarity is calculated from Gaussian similarity distribution functions fitted to the results of the color appearance rating experiments which have recently been conducted by Babilon and Khanh [14,15] for a set of twelve familiar test objects assessed under two different adaptation conditions and by two different cultural observer groups. The key features of the new MCPI proposal are the adoption of the perceptually uniform CAM$02$-UCS as the working chromaticity space, the implementation of a CCT-based decision algorithm to choose a suitable set of similarity functions for approximating the impact of the adapted white point on the memory color assessments, and the introduction of some additional test-object-specific weighting factors in the MCPI calculation to allow for i) modeling the varying importance of certain test colors in the evaluation of light sources with respect to color preference and ii) counterbalancing the metric errors in the memory color assessments introduced by chromatic adaptation bias. In addition, a global and two cultural-specific MCPI versions have been defined.

A meta-correlation analysis performed on a comprehensive collection of observer rating data from various published studies on color preference and visual appreciation revealed that the different MCPI measures show an excellent performance in predicting visual appreciation in different lighting scenarios. Compared to other color quality metrics, the MCPI – in most cases – yields significantly better correlations with the observers’ preference ratings and, in contrast to many others, has proven to be capable of correctly predicting the light sources’ rank order. With no significant differences being found between the different MCPI measures, its global version can be considered as a sufficiently good approximation for a single, universally valid memory-based color quality metric suitable for use in most applications.

The excellent performance of the new metric proposal across a variety of different lighting conditions can partly be attributed to the use of CAM02-UCS for modeling the metric predictions. Being based on the CIECAM02 color appearance model, CAM02-UCS is not only a uniform color space but also includes a chromatic adaptation transform (CAT02) that, in combination with the CCT-based decision algorithm for choosing a suitable set of similarity functions, seems to approximate the impact of the adapted white point on the memory color assessments reasonably well. With regard to these promising results and in light of the ongoing research and future improvements discussed in Sec. 3.3, the MCPI, after some further iterations, may serve as a valuable tool for the optimization of multi-channel LED lighting systems intended to create modern living and working conditions that do not only support the visual task fulfillment but also increase the users’ well-being and emotional comfort in the lit environment. Corresponding models that consider various aspects of lighting quality in the architectural context might benefit from such memory-related approaches, because in the absence of an external reference, which is the case for the majority of lighting applications, the lit environment is often judged in relation to what people expect the perceived objects and the space to look like. This kind of an inherent reference is usually recalled from memory so that a proper understanding of this process as part of a more sophisticated lighting quality model might be very beneficial in the future for the lighting designer to create optimal environments for humans. Another important field of application of memory colors can be identified in the imaging domain [79–81], where an adequate modeling of people’s preferences may help to develop dedicated color and image enhancement strategies.

Appendix A Gaussian representation of memory colors

Fig. 5. Gaussian memory color representations of the familiar test objects for German observers at 5600 K ambient illumination. Indicated goodness-of-fit measures emphasize the excellent model performance and the appropriateness of the Gaussian approach.

Appendix B Meta-correlation analysis

In this work, the method of Hunter and Schmidt (HS) [52] is applied for running the meta-correlation analysis. To account and correct for differences in the experimental designs and conditions between the different studies included in the analysis, the HS method offers statistical approaches to rectify the sample data by explicitly considering potential study artifacts that in general attenuate the true correlation. Based on the information that could be extracted from the collected studies on color preference in lighting, the following artifacts were included for the analysis presented in this work: i) bare-bones sampling error, ii) heterogeneity between different studies, iii) restriction or enhancement of range, iv) attenuation due to idiosyncrasy in the perception of the observers (e.g., inter-observer variability), and v) statistical bias in the sample correlation.

According to Hunter and Schmidt, the basic sampling error corrected, weighted average correlation coefficient $\overline {r}$ is given by

$$\overline{r}=\frac{\sum_{i=1}^{k} N_{i} r_{i}}{\sum_{i=1}^{k} N_{i}},\tag{4}$$
where $r_{i}$ represents the individual Spearman correlation coefficient and $N_{i}$ the respective observer number of the $i^{\mathrm {th}}$ lighting scenario ($\buildrel \wedge \over =$ test condition) with $i=1,\ldots ,k$, where $k$ gives the total number of lighting scenarios extracted from the various color preference studies. This simple correction for sampling error is also referred to as bare-bones meta-analysis.
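As a minimal sketch of the bare-bones step, the weighted average above can be computed as follows. The function name `bare_bones_mean` and the example correlations and observer numbers are hypothetical and chosen only for illustration; they are not data from the studies analyzed in this work.

```python
def bare_bones_mean(r, n):
    """Sample-size-weighted mean correlation (bare-bones meta-analysis).

    r: list of per-scenario Spearman correlations r_i
    n: list of per-scenario observer numbers N_i
    """
    if len(r) != len(n) or len(r) == 0:
        raise ValueError("r and n must be non-empty and of equal length")
    return sum(ni * ri for ri, ni in zip(r, n)) / sum(n)


# Hypothetical example: three lighting scenarios (values are illustrative only).
r = [0.80, 0.60, 0.90]
n = [20, 40, 40]
r_bar = bare_bones_mean(r, n)
print(round(r_bar, 4))
```

Scenarios with more observers pull the average more strongly toward their own correlation, which is the only correction applied at this stage.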

Starting from this bare-bones correction, the adjustment for study heterogeneity can be included by changing the study weights in Eq. (4) from $N_{i}$ to the true or optimal weights given by [82]

$$N_{i}^{\mathrm{opt}}=\frac{1}{\hat{\tau}^2 + N_{i}^{{-}1}},\tag{5}$$
where $\hat {\tau }^2$ is the so-called heterogeneity estimator. Several such estimators can be found in the literature [83,84]. The one adopted here was originally defined by Hunter and Schmidt [52] and reads
$$\hat{\tau}^{2}_{\mathrm{HS}} = \frac{Q-k}{\sum_{i=1}^{k} N_{i}},\tag{6}$$
where
$$Q=\sum_{i=1}^{k} N_{i} \left( r_{i} - \overline{r} \right)^{2},\tag{7}$$
so that Eq. (4) changes to
$$\hat{\overline{r}} = \frac{\sum_{i=1}^{k} N_{i}^{\mathrm{opt}} r_{i}}{\sum_{i=1}^{k} N_{i}^{\mathrm{opt}}}.\tag{8}$$
In cases where $Q<k$, the heterogeneity estimator $\hat {\tau }^{2}_{\mathrm {HS}}$ would be negative and, therefore, must be truncated to zero.
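The heterogeneity adjustment described above might be sketched as follows, including the truncation of a negative estimator to zero. All function names and the example values are assumptions made for illustration only.

```python
def weighted_mean(values, weights):
    """Weighted arithmetic mean."""
    return sum(w * v for v, w in zip(values, weights)) / sum(weights)


def hs_tau_squared(r, n):
    """Hunter-Schmidt heterogeneity estimator, truncated to zero when Q < k."""
    k = len(r)
    r_bar = weighted_mean(r, n)  # bare-bones mean correlation
    q = sum(ni * (ri - r_bar) ** 2 for ri, ni in zip(r, n))
    return max(0.0, (q - k) / sum(n))


def optimal_weights(n, tau2):
    """Optimal study weights N_i^opt = 1 / (tau^2 + 1 / N_i)."""
    return [1.0 / (tau2 + 1.0 / ni) for ni in n]


# Hypothetical example (illustrative values only).
r = [0.80, 0.60, 0.90]
n = [20, 40, 40]
tau2 = hs_tau_squared(r, n)
weights = optimal_weights(n, tau2)
r_hat = weighted_mean(r, weights)
```

In this example $Q<k$, so the estimator is truncated to zero, the optimal weights reduce to $N_{i}$, and the result coincides with the bare-bones average.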

In order to eliminate the effect of potential range variations between different studies on the independent variable, which in the present case is the respective metric score, a range correction formula is used which projects all correlations onto the same reference standard deviation $\sigma _{0}$. This correction for restriction in range for the $i^{\mathrm {th}}$ lighting scenario is given by

$$r_{0,i} = \frac{U_{x} r_{i}}{\sqrt{\left( U_{x}^{2} - 1 \right) r_{i}^{2} + 1}},\tag{9}$$
where
$$U_{x}=\frac{1}{u_{x}}=\frac{\sigma_{0}}{\sigma_{i}},\tag{10}$$
is the inverse of the restriction parameter $u_{x}$ with $\sigma _{0}$ being the standard deviation of the pooled metric scores of all lighting scenarios and $\sigma _{i}$ representing the standard deviation of the range restricted metric scores of the $i^{\mathrm {th}}$ lighting scenario.
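A short sketch of the range correction for a single lighting scenario is given below. The function name `correct_range` and the numerical values are hypothetical; the formula itself follows the two equations above.

```python
import math


def correct_range(r_i, sigma_0, sigma_i):
    """Project a correlation onto the reference standard deviation sigma_0.

    r_i:     observed correlation of the i-th lighting scenario
    sigma_0: standard deviation of the pooled metric scores
    sigma_i: standard deviation of the metric scores in the i-th scenario
    """
    if sigma_0 <= 0 or sigma_i <= 0:
        raise ValueError("standard deviations must be positive")
    u_x = sigma_0 / sigma_i  # inverse of the restriction parameter
    return u_x * r_i / math.sqrt((u_x ** 2 - 1.0) * r_i ** 2 + 1.0)


# Hypothetical example: the scenario covers only half the pooled score range.
r_restricted = correct_range(0.5, sigma_0=2.0, sigma_i=1.0)
# Range enhancement (sigma_i > sigma_0) lowers the correlation instead.
r_enhanced = correct_range(0.5, sigma_0=1.0, sigma_i=2.0)
```

With $\sigma_{i}=\sigma_{0}$ the correlation is returned unchanged, a restricted range ($\sigma_{i}<\sigma_{0}$) raises it, and an enhanced range lowers it.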

To correct for observer idiosyncrasy in the preference ratings, the reliability $r_{yy,i}$ in the observer ratings and, therefore, the degree of the resulting attenuation $\sqrt {r_{yy,i}}$ of the true correlation of the $i^{\mathrm {th}}$ lighting scenario can be estimated from the respective inter-observer variability measured in terms of the standardized residual sum of squares (STRESS) index [7,85] by assuming that

$$r_{yy,i}=1-\frac{\mathrm{STRESS}_{\mathrm{inter},i}}{100}\tag{11}$$
can be treated as a systematic measurement error so that the correction for inter-observer idiosyncrasy giving a correlation value $r_{i}^{\mathrm {corr}}$ closer to the true correlation is simply obtained by
$$r_{i}^{\mathrm{corr}}=\frac{r_{i}}{\sqrt{r_{yy,i}}}.\tag{12}$$
Unfortunately, for most of the color preference studies included in this work, the individual observer ratings and, consequently, the corresponding inter-observer variability measures were not accessible. For this reason, a $\mathrm {STRESS}_{\mathrm {inter}}$ value of $35$ was assumed here, which is the typical value of inter-observer variability obtained in various color discrimination studies [86,87] and, therefore, assumed to be a good approximation to be applied in the meta-analysis.
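To make the size of this correction concrete, the following sketch applies it with the assumed $\mathrm {STRESS}_{\mathrm {inter}}$ value of $35$. The function names and the example correlation are hypothetical.

```python
import math


def reliability_from_stress(stress_inter):
    """Reliability r_yy estimated from the inter-observer STRESS index (0..100)."""
    return 1.0 - stress_inter / 100.0


def correct_idiosyncrasy(r_i, stress_inter=35.0):
    """Divide the observed correlation by the attenuation sqrt(r_yy)."""
    r_yy = reliability_from_stress(stress_inter)
    if not 0.0 < r_yy <= 1.0:
        raise ValueError("STRESS must lie in [0, 100)")
    return r_i / math.sqrt(r_yy)


# With STRESS = 35 the reliability is 0.65 and the attenuation is about 0.806,
# so every correlation is scaled up by a factor of roughly 1.24.
r_corr = correct_idiosyncrasy(0.6)  # hypothetical observed correlation
```

Because the same reliability is assumed for all scenarios, this step rescales every correlation by a common factor; for observed correlations above about $0.806$ the corrected value would exceed one.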

Finally, the sample correlation $r_{i}$ of the $i^{\mathrm {th}}$ lighting scenario should be corrected for statistical bias. With the sample correlation being a statistical estimator of the true population correlation, the impact of the observed bias is systematic and can consequently be captured to a good approximation by an attenuation multiplier. As proposed by Hunter and Schmidt [52], the best attenuation multiplier in a meta-analysis for absolute (estimated) population correlations smaller than $0.7$ is a linear attenuation factor given by

$$a_{i}=1-\frac{1}{2N_{i} - 1},\tag{13}$$
which is independent of the population correlation. For absolute values larger than $0.7$ the more accurate, nonlinear attenuation factor
$$a_{i}=1-\frac{1-r_{i}^2}{2N_{i} - 1}\tag{14}$$
should be used. Note that in the above equation the actual population correlation was replaced by its best estimator, i.e., the sample correlation $r_{i}$. The final correction for statistical bias is thus given by
$$r_{i}^{\mathrm{bias}}=\frac{r_{i}}{a_{i}}\tag{15}$$
and should be the last artifact to be corrected for in a meta-analysis (at least for correlations $>0.7$) since the attenuation multiplier $a_{i}$ is directly related to the population correlation estimator $r_{i}$.
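The bias correction can be sketched as below. The function names and example values are hypothetical, and treating a correlation of exactly $0.7$ with the nonlinear factor is an assumption, since the text only distinguishes values smaller and larger than $0.7$.

```python
def attenuation_factor(r_i, n_i, threshold=0.7):
    """Attenuation multiplier a_i for the statistical bias of a sample correlation.

    Uses the linear factor for |r_i| < threshold and the nonlinear one otherwise.
    """
    if n_i < 2:
        raise ValueError("at least two observers are required")
    if abs(r_i) < threshold:
        return 1.0 - 1.0 / (2.0 * n_i - 1.0)
    return 1.0 - (1.0 - r_i ** 2) / (2.0 * n_i - 1.0)


def correct_bias(r_i, n_i):
    """Bias-corrected correlation r_i / a_i."""
    return r_i / attenuation_factor(r_i, n_i)


# Hypothetical examples with 20 observers each.
r_low = correct_bias(0.5, 20)   # linear factor, a_i = 1 - 1/39
r_high = correct_bias(0.9, 20)  # nonlinear factor, a_i = 1 - 0.19/39
```

For typical observer numbers the multiplier is close to one, so this correction is small compared with the range and idiosyncrasy corrections.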

Applying all attenuation correction steps in consecutive order finally leads to the best estimate $\hat {\overline {r}}_{\mathrm {c}}$ of the true correlation between the respective metric’s predictions and the observers’ preference ratings. Hence, the complete correction formula is given by

$$\begin{aligned} \hat{\overline{r}}_{\mathrm{c}} &= \left( \sum_{i=1}^{k} N_{i}^{\mathrm{opt}} \frac{U_{x} r_{i}}{a_{i}^{*}\cdot \sqrt{r_{yy,i} \left( \left( U_{x}^{2} - 1 \right) r_{i}^{2} + 1 \right)}} \right) \left( \sum_{i=1}^{k} N_{i}^{\mathrm{opt}} \right)^{{-}1} \\ &= \left( \sum_{i=1}^{k} N_{i}^{\mathrm{opt}} \frac{r_{i}^{*}}{a_{i}^{*}} \right) \left( \sum_{i=1}^{k} N_{i}^{\mathrm{opt}} \right)^{{-}1}, \end{aligned}\tag{16}$$
where $r_{i}^{*} = U_{x} r_{i} / \sqrt{r_{yy,i} \left( \left( U_{x}^{2} - 1 \right) r_{i}^{2} + 1 \right)}$ denotes the sample correlation corrected for range restriction and observer idiosyncrasy, and
$$a_{i}^{*} = \begin{cases} 1-\left(2N_{i} - 1\right)^{{-}1} & \textrm{if } \vert r_{i}^{*} \vert < 0.7 \\ 1 - \left( 1 - \left( r_{i}^{*} \right)^{2} \right) \left( 2N_{i} - 1 \right)^{{-}1} & \textrm{otherwise} \end{cases}\tag{17}$$
is the modified version of Eqs. (13) and (14). Furthermore, the heterogeneity estimator for the calculation of $N_{i}^{\mathrm {opt}}$ must also be adjusted accordingly. The formula for the $Q$ value consequently changes to
$$Q=\sum_{i=1}^{k} N_{i} \left( \frac{r_{i}^{*}}{a_{i}^{*}} - \overline{r}^{*} \right)^{2},\tag{18}$$
with
$$\overline{r}^{*} = \left( \sum_{i=1}^{k} N_{i} \frac{r_{i}^{*}}{a_{i}^{*}} \right) \left( \sum_{i=1}^{k} N_{i} \right)^{{-}1}\tag{19}$$
being the modification of Eq. (4) corrected for range restriction, idiosyncrasy, and statistical bias artifacts, respectively.
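Putting the individual steps together, the complete correction might be sketched as follows. This is an illustrative reading of the formulas above and not the authors' implementation: the function name, its parameters, and the example data are hypothetical, a single STRESS value is assumed for all scenarios, and the heterogeneity estimator is assumed to keep the form $(Q-k)/\sum N_{i}$ with the modified $Q$.

```python
import math


def corrected_meta_correlation(r, n, sigma_i, sigma_0, stress_inter=35.0):
    """Artifact-corrected, weighted average correlation (illustrative sketch).

    r:            per-scenario Spearman correlations r_i
    n:            per-scenario observer numbers N_i
    sigma_i:      per-scenario standard deviations of the metric scores
    sigma_0:      standard deviation of the pooled metric scores
    stress_inter: assumed inter-observer STRESS value (same for all scenarios)
    """
    k = len(r)
    r_yy = 1.0 - stress_inter / 100.0
    adjusted = []
    for ri, ni, si in zip(r, n, sigma_i):
        u_x = sigma_0 / si
        # Range restriction and observer idiosyncrasy in one step (r_i^*).
        r_star = u_x * ri / math.sqrt(r_yy * ((u_x ** 2 - 1.0) * ri ** 2 + 1.0))
        # Statistical bias is corrected last, using r_i^* to pick the factor.
        if abs(r_star) < 0.7:
            a_star = 1.0 - 1.0 / (2.0 * ni - 1.0)
        else:
            a_star = 1.0 - (1.0 - r_star ** 2) / (2.0 * ni - 1.0)
        adjusted.append(r_star / a_star)

    # Heterogeneity estimator based on the corrected correlations.
    n_total = sum(n)
    r_bar_star = sum(ni * x for ni, x in zip(n, adjusted)) / n_total
    q = sum(ni * (x - r_bar_star) ** 2 for ni, x in zip(n, adjusted))
    tau2 = max(0.0, (q - k) / n_total)

    # Optimal weights and final weighted average.
    w = [1.0 / (tau2 + 1.0 / ni) for ni in n]
    return sum(wi * x for wi, x in zip(w, adjusted)) / sum(w)


# Hypothetical example: three scenarios with identical score ranges.
r_c = corrected_meta_correlation(
    r=[0.5, 0.6, 0.4], n=[20, 30, 25], sigma_i=[1.0, 1.0, 1.0], sigma_0=1.0
)
```

Since every correction step removes an attenuating artifact, the result in this example is larger than the bare-bones average of the same data.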

Funding

Technische Universität Darmstadt (Ernst Ludwig Mobility Grant); Bundesministerium für Bildung und Forschung (13N13394 (UNILED II)).

Disclosures

The authors declare no conflicts of interest.

Data availability

Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

References

1. D. B. Judd, “A flattery index for artificial illuminants,” Illum. Eng. 62, 593–598 (1967).

2. W. A. Thornton, “A validation of the color-preference index,” J. Illum. Eng. Soc. 4(1), 48–52 (1974).

3. C. L. Sanders, “Assessment of color rendition under an illuminant using color tolerances for natural objects,” Illum. Eng. 54, 640–646 (1959).

4. K. Smet, S. Jost-Boissard, W. R. Ryckaert, G. Deconinck, and P. Hanselaer, “Validation of a colour rendering index based on memory colours,” in Proceedings of the CIE 2010 Conference: Lighting Quality and Energy Efficiency, (International Commission on Illumination CIE, Vienna, Austria, 2010), pp. 136–142.

5. K. A. G. Smet, W. R. Ryckaert, M. R. Pointer, G. Deconinck, and P. Hanselaer, “Memory colours and colour quality evaluation of conventional and solid-state lamps,” Opt. Express 18(25), 26229–26244 (2010).

6. K. A. G. Smet, W. R. Ryckaert, M. R. Pointer, G. Deconinck, and P. Hanselaer, “A memory colour quality metric for white light sources,” Energy Build. 49, 216–225 (2012).

7. K. Smet and P. Hanselaer, “Memory and preferred colours and the colour rendition of white light sources,” Light. Res. & Technol. 48(4), 393–411 (2016).

8. J. W. Jenness and S. K. Shevell, “Color Appearance with Sparse Chromatic Context,” Vision Res. 35(6), 797–805 (1995).

9. M. A. Webster and J. D. Mollon, “Adaptation and the Color Statistics of Natural Images,” Vision Res. 37(23), 3283–3298 (1997).

10. I. Juricevic and M. A. Webster, “Normal Variations in Color Vision V: Simulations of Adaptation to Natural Color Environments,” Vis. Neurosci. 26(1), 133–145 (2009).

11. Y. Lin, M. Wei, K. A. G. Smet, A. Tsukitani, P. Bodrogi, and T. Q. Khanh, “Colour Preference Varies with Lighting Application,” Light. Res. & Technol. 49(3), 316–328 (2017).

12. P. Bodrogi, D. Carella, and T. Q. Khanh, “Weighting the relevance of the different colours in subjective assessments of colour preference,” Light. & Eng. 28(03-2020), 37–46 (2020).

13. Z. Huang, Q. Liu, S. Westland, M. R. Pointer, M. R. Luo, and K. Xiao, “Light dominates colour preference when correlated colour temperature differs,” Light. Res. & Technol. 50(7), 995–1012 (2018).

14. S. Babilon and T. Q. Khanh, “Color appearance rating of familiar real objects under immersive viewing conditions,” Color Res. Appl. 43(4), 551–568 (2018).

15. S. Babilon and T. Q. Khanh, “Impact of the adapted white point and the cultural background on memory color assessments,” Color Res. Appl. 45(5), 803–824 (2020).

16. J. A. Veitch, G. R. Newsham, P. R. Boyce, and C. C. Jones, “Lighting appraisal, well-being and performance in open-plan offices: A linked mechanisms approach,” Light. Res. & Technol. 40(2), 133–151 (2008).

17. J. A. Veitch, M. G. M. Stokkermans, and G. R. Newsham, “Linking lighting appraisals to work behaviors,” Environ. Behav. 45(2), 198–214 (2013).

18. Q. Wang, H. Xu, F. Zhang, and Z. Wang, “Influence of color temperature on comfort and preference for LED indoor lighting,” Optik 129, 21–29 (2017).

19. T. Q. Khanh, P. Bodrogi, X. Guo, and P. Q. Anh, “Towards a user preference model for interior lighting Part 1: Concept of the user preference model and experimental method,” Light. Res. & Technol. 51(7), 1014–1029 (2019).

20. K. W. Houser and T. Esposito, “Human-centric lighting: Foundational considerations and a five-step design process,” Front. Neurol. 12, 25 (2021).

21. K. W. Houser, P. R. Boyce, J. M. Zeitzer, and M. Herf, “Human-centric lighting: Myth, magic or metaphor?” Light. Res. & Technol. 53(2), 97–118 (2021).

22. K. Smet, W. R. Ryckaert, M. R. Pointer, G. Deconinck, and P. Hanselaer, “Correlation between colour quality metric predictions and visual appreciation of light sources,” Opt. Express 19(9), 8151–8166 (2011).

23. S. Babilon, “On the color rendition of white light sources in relation to memory preference. Doctoral thesis: Technische Universität Darmstadt,” https://tuprints.ulb.tu-darmstadt.de/7799/ (2018). [Online; accessed 07-January-2021].

24. S. Babilon and T. Q. Khanh, “Reflectance spectra and Matlab implementation for MCPI calculation,” hessenbox (2020) [Online; accessed 09-June-2021], https://hessenbox.tu-darmstadt.de/getlink/fiALVsnP6T2HrmHen5cUcSTb/.

25. S. N. Yendrikhovskij, F. J. J. Blommaert, and H. de Ridder, “Representation of Memory Prototype for an Object Color,” Color Res. Appl. 24(6), 393–410 (1999).

26. F. G. Ashby, Multidimensional models of categorization (Lawrence Erlbaum Associates, Inc., Hillsdale, NJ, USA, 1992), pp. 449–483, Scientific Psychology Series.

27. C. Li, M. R. Luo, and G. Cui, “Colour-differences evaluation using colour appearance models,” in Proceedings of the 11th Color and Imaging Conference, (Society for Imaging Science and Technology (IS&T), Scottsdale, AZ, USA, 2003), pp. 127–131.

28. M. R. Luo, G. Cui, and C. Li, “Uniform colour spaces based on CIECAM02 colour appearance model,” Color Res. Appl. 31(4), 320–330 (2006).

29. C. Li, G. Cui, M. Melgosa, X. Ruan, Y. Zhang, L. Ma, K. Xiao, and M. R. Luo, “Accurate method for computing correlated color temperature,” Opt. Express 24(13), 14066–14078 (2016).

30. K. A. G. Smet, “Tutorial: The LuxPy Python toolbox for lighting and color science,” Leukos 16(3), 179–201 (2020).

31. W. Davis and Y. Ohno, “Toward an improved color rendering metric,” in Proceedings of the SPIE 5941, 5th International Conference on Solid State Lighting, (International Society for Optics and Photonics, Bellingham, WA, USA, 2005), pp. 59411G–1–59411G–8.

32. W. Davis and Y. Ohno, “Color quality scale,” Opt. Eng. 49(3), 033602 (2010).

33. F. Ebner and M. D. Fairchild, “Development and testing of a color space (IPT) with improved hue uniformity,” in Proceedings of the 6th Color and Imaging Conference: Color Science, Systems and Applications, (Society for Imaging Science and Technology (IS&T), Scottsdale, AZ, USA, 1998), pp. 8–13.

34. S. Jost-Boissard, M. Fontoynont, and J. Blanc-Gonnet, “Perceived lighting quality of LED sources for the presentation of fruit and vegetables,” J. Mod. Opt. 56(13), 1420–1432 (2009).

35. S. Jost-Boissard, P. Avouac, and M. Fontoynont, “Assessing the colour quality of LED sources: Naturalness, attractiveness, colourfulness and colour difference,” Light. Res. & Technol. 47(7), 769–794 (2015).

36. M. Wei, K. W. Houser, G. R. Allen, and W. W. Beers, “Color preference under LEDs with diminished yellow emission,” J. Illum. Eng. Soc. 10(3), 119–131 (2014).

37. F. Szabó, R. Kéri, J. Schanda, P. Csuti, and E. Mihálykó-Orbán, “A study of preferred colour rendering of light sources: Home lighting,” Light. Res. & Technol. 48(2), 103–125 (2016).

38. J. VanRie, “The effect of the spectral composition of a light source on the visual appreciation of a composite object set,” in Technical Report to the User Committee of the IWT-TETRA Project (80163), Appendix 4, (Diepenbeek, Belgium, 2009).

39. Y. Imai, T. Kotani, and T. Fuchida, “A study of colour rendering properties based on colour preference in adaptation to LED lighting,” in Proceedings of the CIE 2012 Conference: Lighting Quality and Energy Efficiency, (International Commission on Illumination CIE, Vienna, Austria, 2012), pp. 369–374.

40. Y. Imai, T. Kotani, and T. Fuchida, “A study of colour rendering properties based on colour preference of objects in adaptation to LED lighting,” in CIE Centenary Conference: Towards a new Century of Light, (International Commission on Illumination CIE, Vienna, Austria, 2013), pp. 62–67.

41. S. Jost and M. Fontoynont, “Colour rendering of face complexion and hair under LED sources,” in CIE Centenary Conference: Towards a new Century of Light, (International Commission on Illumination CIE, Vienna, Austria, 2013), pp. 53–61.

42. A. Tsukitani, “Optimization of colour quality for landscape lighting based on feeling of contrast index,” in CIE Centenary Conference: Towards a new Century of Light, (International Commission on Illumination CIE, Vienna, Austria, 2013), pp. 68–71.

43. Y. Lin, J. He, A. Tsukitani, and H. Noguchi, “Colour quality evaluation of natural objects based on the feeling of contrast index,” Light. Res. & Technol. 48(3), 323–339 (2016).

44. K. W. Houser, D. K. Tiller, and X. Hu, “Tuning the fluorescent spectrum for the trichromatic visual response: A pilot study,” J. Illum. Eng. Soc. 1(1), 7–23 (2005).

45. N. Narendran and L. Deng, “Color rendering properties of LED light sources,” Proc. SPIE 4776, 61–67 (2002).

46. Q. Liu, Z. Huang, K. Xiao, M. R. Pointer, S. Westland, and M. R. Luo, “Gamut volume index: A color preference metric based on meta-analysis and optimized color samples,” Opt. Express 25(14), 16378–16391 (2017).

47. D. Durmus, “Optimising light source spectrum to reduce the energy absorbed by objects. Doctoral thesis: The University of Sydney,” http://hdl.handle.net/2123/17844 (2018). [Online; accessed 12-June-2021].

48. W. Wang, S. Gao, H. Lin, Y. Liu, and Q. Liu, Objective colour quality assessment for lighting (Springer, Singapore, 2019), vol. 543, pp. 93–101.

49. Z. Huang, H. Lin, B. Wu, W. Wang, and Q. Liu, Spectral power distributions with high gamut-based metric values (Springer, Singapore, 2019), vol. 543, pp. 9–15.

50. A. Rohatgi, “WebPlotDigitizer 4.0,” https://automeris.io/WebPlotDigitizer/ (2017). [Online; accessed 28-March-2018].

51. Z. Ugray, L. Lasdon, J. Plummer, F. Glover, J. Kelly, and R. Martí, “Scatter search and local NLP solvers: A multistart framework for global optimization,” INFORMS J. on Comput. 19(3), 328–340 (2007).

52. J. E. Hunter and F. L. Schmidt, Methods of Meta-analysis: Correcting Error and Bias in Research Findings (SAGE Publications, Thousand Oaks, CA, USA, 2004), 2nd ed.

53. Commission Internationale de l’Éclairage, “Method of measuring and specifying colour rendering properties of light sources,” CIE Technical Report 13.3 (1995).

54. Commission Internationale de l’Éclairage, “Colour fidelity index for accurate scientific use,” CIE Technical Report 224 (2017).

55. J. P. Freyssinier and M. Rea, “A two-metric proposal to specify the color-rendering properties of light sources for retail lighting,” in Proceedings of the SPIE 7784, 10th International Conference on Solid State Lighting, (International Society for Optics and Photonics, San Diego, CA, USA, 2010), p. 77840V.

56. K. A. G. Smet, J. Schanda, and L. Whitehead, “CRI2012: A proposal for updating the CIE colour rendering index,” Light. Res. & Technol. 45(6), 689–709 (2013).

57. K. Hashimoto, T. Yano, M. Shimizu, and Y. Nayatani, “New method of specifying color rendering properties of light sources based on feeling of contrast,” Color Res. Appl. 32(5), 361–371 (2007).

58. A. David, P. T. Fini, K. W. Houser, Y. Ohno, M. P. Royer, K. A. G. Smet, M. Wei, and L. Whitehead, “Development of the IES method for evaluating the color rendition of light sources,” Opt. Express 23(12), 15888–15906 (2015).

59. K. W. Houser, M. Wei, A. David, M. R. Krames, and X. S. Shen, “Review of measures for light-source color rendition and considerations for a two-measure system for characterizing color rendition,” Opt. Express 21(8), 10393–10411 (2013).

60. G. Y. Zou, “Toward using confidence intervals to compare correlations,” Psychol. Methods 12(4), 399–413 (2007).

61. K. A. G. Smet and P. Hanselaer, “Impact of cross-regional differences on color rendition evaluation of white light sources,” Opt. Express 23(23), 30216–30226 (2015).

62. T. Q. Khanh, P. Bodrogi, Q. T. Vinh, and D. Stojanovic, “Colour preference, naturalness, vividness and colour quality metrics, Part 1: Experiments in a room,” Light. Res. & Technol. 49(6), 697–713 (2017).

63. T. Q. Khanh, P. Bodrogi, Q. T. Vinh, and D. Stojanovic, “Colour preference, naturalness, vividness and colour quality metrics, Part 2: Experiments in a viewing booth and analysis of the combined dataset,” Light. Res. & Technol. 49(6), 714–726 (2017).

64. T. Q. Khanh and P. Bodrogi, “Colour preference, naturalness, vividness and colour quality metrics, Part 3: Experiments with makeup products and analysis of the complete warm white dataset,” Light. Res. & Technol. 50(2), 218–236 (2018).

65. T. Q. Khanh, P. Bodrogi, Q. T. Vinh, X. Guo, and T. T. Anh, “Colour preference, naturalness, vividness and colour quality metrics, Part 4: Experiments with still life arrangements at different correlated colour temperatures,” Light. Res. & Technol. 50(6), 862–879 (2018).

66. T. Q. Khanh, P. Bodrogi, X. Guo, Q. T. Vinh, and S. Fischer, “Colour preference, naturalness, vividness and colour quality metrics, Part 5: A colour preference experiment at 2000 lx in a real room,” Light. Res. & Technol. 51(2), 262–279 (2019).

67. J. Parkkinen, P. Silfsten, and M. Hauta-Kasari, “Daylight spectra,” https://sites.uef.fi/spectral/daylight-spectra/ (1995). [Online; accessed 05-January-2021].

68. K. Smet, W. R. Ryckaert, M. R. Pointer, G. Deconinck, and P. Hanselaer, “Colour appearance rating of familiar real objects,” Color Res. Appl. 36(3), 192–200 (2011).

69. K. A. G. Smet, Q. Zhai, M. R. Luo, and P. Hanselaer, “Study of chromatic adaptation using memory color matches, Part I: neutral illuminants,” Opt. Express 25(7), 7732–7748 (2017).

70. K. A. G. Smet, Q. Zhai, M. R. Luo, and P. Hanselaer, “Study of chromatic adaptation using memory color matches, Part II: colored illuminants,” Opt. Express 25(7), 8350–8365 (2017).

71. M. Wei and S. Chen, “Effects of adapting luminance and CCT on appearance of white and degree of chromatic adaptation,” Opt. Express 27(6), 9276–9286 (2019).

72. Q. Zhai and M. R. Luo, “Study of chromatic adaptation via neutral white matches on different viewing media,” Opt. Express 26(6), 7724–7739 (2018).

73. M. Wei, W. Bao, and H.-P. Huang, “Consideration of light level in specifying light source color rendition,” Leukos 16(1), 55–65 (2020).

74. W. Bao and M. Wei, “Change of gamut size for producing preferred color appearance from 20 to 15000 lux,” Leukos 17(1), 21–42 (2021).

75. R. W. G. Hunt, “Light and Dark Adaptation and Perception of Color,” J. Opt. Soc. Am. 42(3), 190–199 (1952).

76. P. Bodrogi and T. Tarczali, “Colour Memory for Various Sky, Skin, and Plant Colours: Effect of the Image context,” Color Res. Appl. 26(4), 278–289 (2001).

77. D. Jonauskaite, C. Mohr, J.-P. Antonietti, P. M. Spiers, B. Althaus, S. Anil, and N. Dael, “Most and least preferred colours differ according to object context: New insights from an unrestricted colour range,” PLoS One 11(3), e0152194 (2016).

78. K. B. Schloss, E. D. Strauss, and S. E. Palmer, “Object Color Preferences,” Color Res. Appl. 38(6), 393–411 (2012).

79. Y. Zhu, M. R. Luo, L. Xu, X. Liu, G. Cui, S. Fischer, P. Bodrogi, and T. Q. Khanh, “Investigation of memory colours across cultures,” in Proceedings of the 23rd Color and Imaging Conference, (Society for Imaging Science and Technology (IS&T), Springfield, VA, USA, 2015), pp. 133–136.

80. Y. Zhu, M. R. Luo, S. Fischer, P. Bodrogi, and T. Q. Khanh, “The effectiveness of colour appearance attributes for enhancing image preference and naturalness,” in Proceedings of the 24th Color and Imaging Conference, (Society for Imaging Science and Technology (IS&T), Springfield, VA, USA, 2016), pp. 231–236.

81. Y. Zhu, M. R. Luo, S. Fischer, P. Bodrogi, and T. Q. Khanh, “Long-term memory color investigation: Culture effect and experimental setting factors,” J. Opt. Soc. Am. A 34(10), 1757–1768 (2017).

82. J. Sánchez-Meca and F. Marín-Martínez, “Confidence Intervals for the Overall Effect Size in Random-Effects Meta-Analysis,” Psychol. Methods 13(1), 31–48 (2008).

83. K. Sidik and J. N. Jonkman, “A Comparison of Heterogeneity Variance Estimators in Combining Results of Studies,” Statist. Med. 26(9), 1964–1981 (2007).

84. W. Viechtbauer, “Bias and Efficiency of Meta-analytic Variance Estimators in the Random-Effects Model,” J. Educ. Behav. Stat. 30(3), 261–293 (2005).

85. P. A. García, R. Huertas, M. Melgosa, and G. Cui, “Measurement of the relationship between perceived and computed color differences,” J. Opt. Soc. Am. A 24(7), 1823–1829 (2007).

86. H. Wang, G. Cui, M. R. Luo, and H. Xu, “Evaluation of Colour-Difference Formulae for Different Colour-Difference Magnitudes,” Color Res. Appl. 37(5), 316–325 (2012).

87. M. Melgosa, P. A. García, L. Gómez-Robledo, R. Shamey, D. Hinks, G. Cui, and M. R. Luo, “Notes on the Application of the Standardized Residual Sum of Squares Index for the Assessment of Intra- and Inter-observer Variability in Color-Difference Experiments,” J. Opt. Soc. Am. A 28(5), 949–953 (2011).
