Perception of human skin conditions and image statistics

Hitomi Otaka; Hitomi Shimakura; Isamu Motoyoshi

doi:10.1364/JOSAA.36.001609

1. INTRODUCTION

The appearance of facial skin plays an important role in human social interactions. Much psychophysical evidence has shown that skin appearance has significant influence on the perception of a person’s age [1,2], emotional state [3], health [1,2,4], and attractiveness [1,2,5]. It has been suggested that skin color as determined by blood flow and oxyhemoglobin is helpful in determining emotional state and health condition ([3], but see also [6]). Skin homogeneity is also one of the key properties in the perception of attractiveness [5], and wrinkles and dark spots are used as cues to evaluate age and health [7,8].

An image of skin is the product of complicated interactions between light and multiple biological structures. Human skin is composed of three partially transparent layers: the stratum corneum, the epidermis, and the dermis, which consists of papillary and reticular parts [9]. The stratum corneum includes a fine texture; the epidermis contains melanin; and the papillary dermis contains blood with hemoglobin. Thus, the image of skin is determined by specular and diffuse light reflections on the stratum corneum as well as scattering and absorption through the deeper layers [10]. From these image data alone, the visual system can perceive various qualities of the skin surface.

Recent psychophysical evidence shows that the visual system effectively utilizes low-level image features to perceive qualities of natural surfaces such as glossiness and translucency [11–18], whereas later studies also point out that the system makes better use of higher-level perceptual cues [19,20]. Several studies have investigated key image features associated with the perception of certain skin properties. Some of them reveal that the variance and skew of pixel-luminance histograms are correlated with the perceived age of skin [21,22]. Another study, which analyzed subsurface and surface reflection images of the skin, suggests that the variance and skew of luminance in the specular component of skin images can be used to predict the perceived shininess of skin [23].

The present study carried out a systematic and comprehensive investigation of the relationships among various apparent properties of skin and image features, though it only investigated females of certain ethnicities (Chinese Americans, Caucasians, Thais, and Japanese) [24]. Using a large set of skin images (289 patches and 176 whole faces), we collected rating data from a total of 34 Japanese females for nine different attributes concerning various physical and biological properties as well as attractiveness and healthiness, and we analyzed the relationships among them and their correlations with pixel and subband image statistics. As a result, we found that (1) skin appearance is well described by only two cardinal dimensions, (2) each cardinal dimension of skin appearance is highly correlated with a relatively small set of simple image statistics, and (3) skin appearance can be controlled by manipulating these diagnostic image statistics. Although the data were obtained for only a limited group of people, both in terms of face images and observers, the results suggest the possibility that, in contrast to the rich vocabulary regularly used to describe skin appearance, the visual system relies primarily on a small set of textural representations that are correlated with simple image features.

Fig. 1. Examples of skin images used in the experiment.

2. EXPERIMENT 1

A. Methods

1. Observers

Twenty-eight native Japanese females [28.4 (SD = 5.7) years old] participated in the experiment. All had normal or corrected-to-normal vision. The experiment was conducted in accordance with the Declaration of Helsinki and approved by the Ethical Committee of Shiseido Global Innovation Center. All participants provided written informed consent prior to participation.

2. Apparatus

All visual stimuli were presented on an LCD monitor (ColorEdge CG277, EIZO, Japan). The monitor had a spatial resolution of 0.072 min/pixel at the viewing distance of 0.82 m. The luminance of each pixel (8 bits) was calibrated with a spectroradiometer (CS-2000A, KONICA MINOLTA, INC., Japan) and gamma corrected. The CIE $xy$ coordinates of the monitor primaries were $[x,y]=[0.104,0.057] $ for ${\rm{R}}$, [0.257, 0.481] for ${\rm{G}}$, and [0.136, 0.068] for ${\rm{B}}$.

3. Stimuli

The visual stimuli were 289 patches of skin images (Fig. 1). The images were parts of female faces. To reflect skin conditions in daily life, the faces were either wearing makeup or were without makeup and photographed after the application of skincare products. The images were collected from women of several ethnic backgrounds (Chinese American, Caucasian, Thai, and Japanese) ranging from 20 to 50 years old. Each image was photographed from a distance of 1 m by a digital camera (D5300, Nikon, Japan) in a light-controlled environment. The models were seated, facing the camera directly against the background of a uniform gray wall. In every image, the face of a woman with closed eyes and a neutral facial expression was illuminated diagonally from the top by a strobe light (D65) equipped with a diffuser. The image patch was a square region of ${256} \times {256}$ pixels (${4.2} \times {{4.2}^\circ} $) taken from an area around the cheek that did not contain obviously visible black hair or moles. The camera’s gamma was assumed to be 0.5. The scale of the skin images was 1.4 times larger than actual size.

4. Procedure

For each skin image, we collected rating data for nine attributes concerning the apparent properties of skin: uniformity, glossiness, tightness, smoothness, translucency, healthiness, moisture, oiliness, and attractiveness [note that these words are translations of the Japanese words we used (Hadairo no Kinitsu-sa, Koutaku, Hari, Nameraka-sa, Toumeido, Kenko, Suibun, Yubun, and Miryoku, respectively)]. These attributes were chosen on the basis of extensive pilot observations in which we collected ratings from 29 observers for 24 candidate attributes including brightness, stickiness, softness, warmness, thickness, sex, age, cleanliness, liveliness, artificiality, amount of makeup, complexion, desire to approach, dullness, and shininess, in addition to the above nine attributes. After analyzing the cross correlation of the rating data across attributes and observers and considering conceptual similarities across attributes, the difficulty of the rating task, absolute variations of the rating data, and so on, we finally adopted these nine attributes. Prior to the experiment, detailed instructions were given to observers regarding the definition of each attribute. Observers were instructed to rate the given attribute on a five-point scale, in which 5 stands for the positive extreme, 1 stands for the negative extreme, and 3 stands for a neutral or undecided opinion. For each attribute, the two extremes were explained to observers as follows: uniformity = [uniform or no texture (5), non-uniform or variegated (1)], glossiness = [very glossy (5), matte (1)], tightness = [very tight (5), wrinkled (1)], smoothness = [very smooth (5), bumpy (1)], translucency = [very translucent (5), opaque (1)], healthiness = [very healthy (5), unhealthy (1)], moisture = [very moist or wet (5), dry (1)], oiliness = [very oily (5), not oily at all (1)], and attractiveness = [very attractive (5), not attractive at all (1)]. Note that all instructions were given in Japanese, and the above words are translations that do not always capture the precise meaning of the Japanese words employed in the instructions.

The rating data were collected for each of the nine attributes in separate blocks, run in random order. In each block, the 289 stimuli were presented on a dark background of $\sim{0.1}\,\,{{\rm{cd/m}}^2}$, also in random order. At the beginning of each block, observers were told which attribute to assess (e.g., translucency) and were shown all stimuli in succession so that they could establish their criteria. On each trial, observers rated the given attribute by pressing buttons, using a five-point scale that was shown 1.6° below the skin image. The observers were allowed to view the stimulus freely until giving their response. They were informed that the skin patches were parts of a person’s cheek. When observers finished a block, they either initiated the next block or took a rest. In total, 2601 ratings (${9}\,{\rm{attributes}} \times {289}\,{\rm{images}}$) were collected for each observer.

B. Results

Figure 2A shows a histogram of the average rating for all stimuli. We found that for all attributes, the rating (average across observers) is distributed widely in accordance with variations of skin images. Figure 2B shows the Pearson product-moment correlation coefficients of the ratings between different attributes (${p} \lt {.01}$), in which red represents positive correlation and blue represents negative correlation. Ratings are highly correlated with one another, implying that they are governed by a few underlying factors.

Fig. 2. (A) Histogram of distribution of ratings obtained for 289 skin images. Each panel shows the results for different attributes. (B) Correlation matrix of the rating across attributes. Red represents positive correlation, and blue represents negative correlation.

1. Factor Analysis

It is well known that materials can cluster in the space of perceptual qualities [25]. To examine the underlying factors behind the rating data for the nine attributes, we next performed a factor analysis of those data using the principal factor method with promax rotation (note that the data were normalized in the analysis). The analysis extracted two factors based on the Kaiser-Guttman criterion. The eigenvalues of the third and later factors were less than one, indicating that they were not effective. These two factors accounted for 83% of the variance in the data. Figure 3A plots the factor loading of each rating item on a two-dimensional space consisting of Factor 1 ($x$ axis) and Factor 2 ($y$ axis). Many items, including healthiness, translucency, uniformity, tightness, smoothness, and attractiveness, are concentrated at higher values along the $x$ axis. This result indicates that skin that appears more translucent, uniform, tight, smooth, or attractive is commonly evaluated as high along this axis, whereas skin that appears more opaque, inhomogeneous, bumpy, and unattractive is evaluated as low. Therefore, this axis (Factor 1) may be named “pleasantness” (pleasant-less pleasant). On the other hand, only two items, glossiness and oiliness, are concentrated along the $y$ axis, indicating that skin that appears glossy or oily is evaluated as high along this axis. Accordingly, this axis (Factor 2) may simply be named “glossiness” (glossy-matte). The item “moisture” is located in between the two axes, indicating that moist skin appears glossier and more pleasant than dry skin. Figure 3B plots several samples of skin patch images on the same 2D space. The impressions of these images are well arranged according to the two factors. Although the names of the factors are subjective, these results suggest that ratings for various aspects of skin appearance are largely determined by the two cardinal dimensions, pleasantness and glossiness.
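As an illustration of the factor-retention step only, the Kaiser-Guttman criterion can be sketched as counting the eigenvalues of the attribute correlation matrix that exceed one. The sketch below assumes NumPy and uses synthetic ratings generated from two latent variables; the function name `kaiser_guttman_count` and the synthetic loadings are hypothetical, and the principal factor extraction and promax rotation used in the study are not reproduced here.

```python
import numpy as np

def kaiser_guttman_count(ratings):
    """Count factors whose correlation-matrix eigenvalue exceeds 1.

    ratings: array of shape (n_images, n_attributes), e.g. mean rating per image.
    """
    # The correlation matrix implicitly normalizes (z-scores) each attribute.
    corr = np.corrcoef(ratings, rowvar=False)
    eigvals = np.linalg.eigvalsh(corr)[::-1]   # eigvalsh is ascending; reverse it
    n_factors = int(np.sum(eigvals > 1.0))
    explained = eigvals / eigvals.sum()        # share of total variance per component
    return n_factors, eigvals, explained

# Synthetic stand-in data (NOT the study's ratings): 289 images, 9 attributes,
# driven by two hypothetical latent dimensions plus noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(289, 2))
loadings = np.array([[0.9, 0.0]] * 6 + [[0.0, 0.9]] * 2 + [[0.6, 0.6]])
ratings = latent @ loadings.T + 0.3 * rng.normal(size=(289, 9))

n_factors, eigvals, explained = kaiser_guttman_count(ratings)
print(n_factors, explained[:n_factors].sum())
```

With real data, `ratings` would hold the observer-averaged rating of each image on each attribute; the eigenvalues here come from the full correlation matrix, which is a common simplification rather than the exact quantity reported by a principal factor analysis.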

Fig. 3. (A) Nine rating attributes of skin appearance arranged on a plane consisting of two cardinal axes derived by a factor analysis. The horizontal axis is Factor 1, and the vertical axis is Factor 2. (B) Sample images of skin patch arranged on the same plane.

2. Image Analysis

To examine if and how the rating data and the two underlying factors are related to the simple features of the skin image, we analyzed simple statistics of each of the 289 images used in the experiment. In the analysis, the RGB image was converted to luminance, red–green (RG, L-M), and yellow–blue (YB, B-Lum) in the DKL color space by using a method proposed by Brainard [26,27]. The adaptation level was defined as ${30}\,\,{\rm{cd/m}^{2}}$, and the scale of three axes was normalized on the basis of the monitor’s chromatic property (see Appendix A for the spatial distribution of chromatic signals for example images in Fig. 1). Each image was decomposed into subband images of eight different spatial scales (1, 2, 4, 8, 16, 32, 64, 128 c/image, 0.24, 0.48, 0.95, 1.9, 3.8, 7.6, 15, 31 c/deg) by using Gaussian bandpass filters with a bandwidth of two octaves. For each color channel, we calculated the second to fourth moment statistics of the image (s.d., skew, and kurtosis) for each subband and for pixel data. Note that s.d. and kurtosis were log-scaled in the analysis. Figure 4A illustrates the distribution of several image statistics obtained for all stimuli: mean, and subband s.d. (log), skew, and kurtosis (log) at 1, 8, and 32 c/image, respectively.
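The subband analysis can be sketched roughly as follows for a single channel of a square image. This is a minimal illustration assuming NumPy, a filter that is Gaussian on a log-frequency axis with a full width at half height of two octaves, and natural-log scaling; the exact filter shape and log base are not specified above, and the function names are hypothetical.

```python
import numpy as np

def log_gaussian_bandpass(size, center, bandwidth_octaves=2.0):
    """Isotropic bandpass gain, Gaussian on a log2-frequency axis (assumed shape).

    center is in cycles/image; bandwidth_octaves is the full width at half height.
    """
    f = np.fft.fftfreq(size) * size                 # frequencies in c/image
    fx, fy = np.meshgrid(f, f)
    radius = np.hypot(fx, fy)
    radius[0, 0] = 1.0                              # placeholder to avoid log2(0)
    sigma = bandwidth_octaves / (2.0 * np.sqrt(2.0 * np.log(2.0)))
    gain = np.exp(-0.5 * (np.log2(radius / center) / sigma) ** 2)
    gain[0, 0] = 0.0                                # drop DC: subbands have zero mean
    return gain

def moments(x):
    """Return s.d., skew, and (non-excess) kurtosis of an array."""
    z = x.ravel() - x.mean()
    sd = z.std()
    return sd, np.mean(z ** 3) / sd ** 3, np.mean(z ** 4) / sd ** 4

def subband_statistics(channel, centers=(1, 2, 4, 8, 16, 32, 64, 128)):
    """Map each center frequency (c/image) to (log s.d., skew, log kurtosis)."""
    size = channel.shape[0]                         # assumes a square image
    spectrum = np.fft.fft2(channel)
    stats = {}
    for c in centers:
        band = np.real(np.fft.ifft2(spectrum * log_gaussian_bandpass(size, c)))
        sd, skew, kurt = moments(band)
        stats[c] = (np.log(sd), skew, np.log(kurt))
    return stats

# Random stand-in for a 256 x 256 luminance channel (NOT a skin image).
rng = np.random.default_rng(0)
luminance = rng.random((256, 256))
stats = subband_statistics(luminance)
```

For a patch subtending 4.2°, a center frequency in c/image converts to c/deg by dividing by 4.2, so 8 c/image corresponds to about 1.9 c/deg, in agreement with the values listed above.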

Fig. 4. (A) Distribution of image statistics obtained for all stimuli. (B) Correlation coefficients of skin image statistics with scores of Factor 1 (pleasantness) and of Factor 2 (glossiness). (C) Correlation coefficients of skin image statistics with the rating for each attribute. Red represents positive correlation, while blue represents negative correlation.

To quantify how these statistics are related to the skin appearance, we calculated the correlation coefficients of each image statistic with the rating data as well as with the factor scores. Figure 4B shows the correlation coefficients obtained for the Factor 1 (pleasantness) score and the Factor 2 (glossiness) score. The horizontal axis shows the spatial frequency band, and each row shows the correlations for each color channel and class of statistics (s.d., skew, and kurtosis). Red cells represent the positive correlation and blue cells the negative correlation, respectively; only statistically significant correlations (${p} \lt {.01}$) are colored. We employed this color map instead of the line plots because it allowed us to grasp the general tendency of correlation data at a glance.
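A correlation map of this kind can be sketched as below: each image statistic is correlated with a factor score across images, and non-significant cells are masked. The sketch assumes NumPy, and it approximates the two-sided p-value with the Fisher z transform rather than an exact t-test, which is our substitution; the function name `correlation_map` and the synthetic data are hypothetical.

```python
import math
import numpy as np

def correlation_map(stats, scores, alpha=0.01):
    """Pearson r of each statistic with the scores, NaN where p >= alpha.

    stats: (n_images, n_statistics); scores: (n_images,).
    """
    n = len(scores)
    zs = (stats - stats.mean(axis=0)) / stats.std(axis=0)
    zy = (scores - scores.mean()) / scores.std()
    r = zs.T @ zy / n                               # Pearson correlation per column
    # Fisher z approximation to the two-sided p-value (assumption, not an exact test).
    z = np.arctanh(np.clip(r, -0.999999, 0.999999)) * math.sqrt(n - 3)
    p = np.array([math.erfc(abs(v) / math.sqrt(2.0)) for v in z])
    return np.where(p < alpha, r, np.nan)

# Synthetic stand-in: the score decreases with the first statistic only.
rng = np.random.default_rng(1)
stats = rng.normal(size=(289, 3))
scores = -stats[:, 0] + 0.5 * rng.normal(size=289)
r_map = correlation_map(stats, scores)
```

Arranging the resulting values with spatial frequency on one axis and channel and statistic class on the other would give a color map of the kind described above.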

The data show that Factors 1 and 2 each evince very strong correlations with a particular pattern of image statistics. Factor 1’s score is negatively correlated with the s.d. in both luminance and colors over a wide range of spatial frequencies. Factor 1’s score is also correlated positively with the mean luminance and with the mean RG, and negatively with the mean YB, although the correlations for the chromatic channels are low. These results are consistent with the impression in Fig. 3B, where skin samples appear lighter, somewhat more pinkish, and smoother as Factor 1’s score increases [4,5]. Factor 2’s score is positively correlated with a wide range of subband skew for luminance and with the s.d., skew, and kurtosis of pixel luminance. These results are consistent with the impression in Fig. 3B, in which skin samples appear glossier as Factor 2’s score increases [12,13]. Each panel in Fig. 4C shows the correlation coefficients obtained for each attribute. As expected from the high correlations in the ratings among attributes (Fig. 2B), some of the correlation maps with image statistics are similar to one another. The maps can be classified into two groups. One group consists of the results for oiliness and glossiness, and the other consists of the results for the other seven attributes. Again, as expected from the results of the factor analysis (Fig. 3A), the former two maps look quite similar to the map for Factor 2 in Fig. 4B, and the latter seven maps look similar to the map for Factor 1. However, we can also find partial differences among attributes. For example, the results for attractiveness and surface uniformity appear to be determined almost exclusively by the s.d. of luminance, whereas the color statistics also appear to be effective for healthiness and smoothness. We obtained similar correlation maps when we computed image statistics in another color space (CIELAB).

As many statistics in our skin images are mutually correlated, we further conducted a multiple regression analysis (stepwise, ${p} \lt {.05}$) to identify which image statistics were the most indicative of the scores of Factors 1 and 2, respectively. The results showed that Factor 1 was determined by the subband’s s.d. at mid-spatial frequency (8 c/image) in luminance (${{\rm{R}}^2} \gt {0.41}$) and that Factor 2 was determined by s.d. and skew in the pixel luminance (${{\rm{R}}^2} \gt{0.65}$). Namely, skin images that appear “pleasant” tend to have low power (s.d.) at mid-spatial frequencies around 8 c/image (1.9 c/deg), and those appearing “glossy” tend to have higher s.d. and skewness in pixel luminance patterns, although it is unclear whether the mild spatial frequency dependency of pleasantness is relevant in the retinal scale (c/deg) or in the relative scale (c/image). The results may imply that the perceptual appearance of various properties of human skin is associated with a relatively small set of simple image statistics such as the s.d. and skew of the luminance-color subband histogram.
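To illustrate how a small set of indicative statistics can be picked out of many correlated candidates, the sketch below runs a simple forward selection. It assumes NumPy and, as a simplification of ours, stops when the gain in ${{\rm{R}}^2}$ falls below a fixed threshold instead of applying the ${p} \lt {.05}$ entry test of the stepwise procedure; the names `r_squared`, `forward_select`, and `min_gain` are hypothetical.

```python
import numpy as np

def r_squared(X, y):
    """R^2 of an ordinary least-squares fit of y on X with an intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return 1.0 - resid @ resid / np.sum((y - y.mean()) ** 2)

def forward_select(X, y, min_gain=0.02):
    """Greedily add the predictor that raises R^2 most, until the gain is small."""
    chosen, best = [], 0.0
    remaining = list(range(X.shape[1]))
    while remaining:
        r2, j = max((r_squared(X[:, chosen + [k]], y), k) for k in remaining)
        if r2 - best < min_gain:                    # threshold rule: our assumption
            break
        chosen.append(j)
        remaining.remove(j)
        best = r2
    return chosen, best

# Synthetic stand-in: only column 3 actually drives the score.
rng = np.random.default_rng(2)
X = rng.normal(size=(289, 5))
y = 2.0 * X[:, 3] + 0.5 * rng.normal(size=289)
chosen, r2 = forward_select(X, y)
```

In the setting described above, the columns of `X` would be the image statistics and `y` the score of one factor; the returned indices identify the statistics that carry most of the explained variance.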

3. EXPERIMENT 2

In Experiment 1, we employed small patches of skin images so that the observers would make judgments only by using image information within a fixed and limited part of the face. Could a similar result pattern be obtained even when the observers freely viewed the whole face, which is more relevant to our daily communications? Thus, we carried out the same rating experiment and image statistics analysis by using whole face images.

A. Methods

The visual stimuli were 176 images of the entire female face used in Experiment 1. The stimuli were presented within an oval window [${19.1}({\rm{H}}) \times {13.5}({\rm{W}})$ deg] in the center of the dark background. The size and position of all face images were adjusted in advance so that the contours of the faces did not appear. The scale of skin images was 1.2 times larger than the actual size, so as to be comparable with those in Experiment 1. For each face image, we collected rating data for the nine attributes, using the same procedure as in Experiment 1. Other procedures were also the same as in Experiment 1. Six native Japanese female observers [28.8 (SD = 3.7) years old], who did not participate in Experiment 1, participated in the experiment. All had normal or corrected-to-normal vision. The experiment was conducted in accordance with the Declaration of Helsinki and approved by the Ethical Committee of Shiseido Global Innovation Center. All participants provided written informed consent prior to participation.

Fig. 5. (A) Nine rating attributes of skin appearance arranged on a plane consisting of two cardinal axes derived by a factor analysis in face images. The horizontal axis is Factor 1, and the vertical axis is Factor 2. (B) Correlation coefficients of image statistics for the face image, Factor 1 (pleasantness), Factor 2 (glossiness), and nine attributes. Red represents positive correlation; blue represents negative correlation.

B. Results

The results of the ratings for the whole face were similar to those for the skin patch. The correlation coefficients between the ratings of the patch and the whole face were 0.65 or more (${p} \lt {\rm{.01}}$; mean 0.75). Figure 5A shows the result of the factor analysis, which reveals two factors with effective eigenvalues, as in Experiment 1. These two factors accounted for 81% of the variance in the data. This result indicates that the whole face’s skin appearance is determined by the same two cardinal dimensions as the skin patch. We analyzed simple statistics of each of the 176 images used in the experiment. The area inside the oval window, excluding the eyes, eyebrows, and mouth, was used for the analysis, which otherwise followed that of Experiment 1. The correlation coefficients of image statistics with the rating values and factor scores were calculated. Figure 5B shows the correlation coefficients obtained for the Factor 1 (pleasantness) score, the Factor 2 (glossiness) score, and each of the attributes. These maps are very similar to the maps obtained for skin patches (Figs. 4B and 4C). These results indicate that the two cardinal dimensions and the corresponding image statistics are also relevant to the perception of the appearance of the whole face’s skin. There are also several discrepancies in the results between the two types of stimuli. A notable difference concerns the effect of pixel-mean color on apparent healthiness, which has been demonstrated by some previous studies [4], but not by others [28]. The present results show no significant correlation for skin patches (Fig. 4C), but a significant correlation for the whole face (Fig. 5B), whose skin appears healthier when it is more reddish (or less greenish) and more yellowish (or less bluish). This may indicate that the whole face, and/or parts other than the cheek, could play an important role in apparent healthiness.

4. EXPERIMENT 3

The results of the previous two experiments support the notion that the perceptual appearance of skin is determined by two cardinal factors, each of which is correlated with a small set of simple image statistics. Concretely, Factor 1 was related to luminance s.d. at middle spatial frequency bands around 8 c/image, and Factor 2 was related to luminance s.d. and skew at broad spatial frequency bands. If this is the case, manipulation of these critical image statistics should alter the appearance of skin. Thus, decreasing the s.d. at around 8 c/image bands would enhance the skin’s pleasantness. Similarly, decreasing the s.d. and skew of a subband image of glossy skin would reduce the apparent glossiness, as has been demonstrated in previous studies with natural surfaces [12,13]. We therefore examined whether manipulations of image statistics related to Factor 1 and Factor 2, as revealed by regression analyses in Experiment 1, significantly affected the ratings for the nine attributes as well as the factor scores.

A. Methods

We manipulated parts of critical subband statistics related to Factor 1 and 2 for six patch images that were selected from stimuli used in Experiment 1. Regarding image statistics for Factor 1, the s.d. of luminance subbands at middle spatial frequencies around 8 c/image was decreased by four levels from the original relatively negatively scored image (Fig. 6A). We used three original images with different levels of Factor 2 (glossiness): glossy, a little glossy, and matte. We did not increase the s.d. of the original images that appeared positive (i.e., smooth and uniform) because it was difficult to make a smooth surface image look bumpy. Regarding image statistics for Factor 2 (glossiness), the s.d. and skew of luminance subbands over all spatial frequencies were decreased by four levels from the original image, which was scored as relatively glossy (Fig. 6B). We used three original images with different levels of Factor 1 (pleasantness): pleasant, intermediate, and less pleasant. We did not increase these statistics from the original images that looked matte because it was impossible to make a perfectly matte surface image look glossy. These manipulations could alter many other image features, but we left them intact because further image rendering could produce unnatural spurious features in the resulting image at least with our current techniques; we also confirmed this by using Portilla–Simoncelli texture synthesis.

Fig. 6. Examples of skin images used in the third experiment. (A) Factor 1 (pleasantness) manipulated images. The images for which the s.d. of luminance subbands at middle spatial frequencies around 8 c/image were multiplicatively decreased by four levels from the original image on the right. (B) Factor 2 (glossiness) manipulated images. The images for which the s.d. and skew of luminance subbands over all spatial frequencies were decreased by four levels from the original image on the right.

The s.d. and skew of subband images were manipulated by means of the following algorithm, similar to that used in a previous study [29]. (1) To manipulate skew, the target subband image, whose mean is zero by definition, was first made positive by adding ${\rm{M}}$, the magnitude of the image minimum; it was then raised to the power ${\rm{K}}$ and returned to a near-zero mean by subtracting ${\rm{M}}$. (2) The s.d. was normalized to the original value if necessary, and the image was multiplied by ${\rm{S}}$ to manipulate the s.d. To control the appearance along Factor 1, ${\rm{S}}$ was set to 0.33, 0.44, 0.57, and 1 (Fig. 6A); to control the appearance along Factor 2, ${\rm{S}}$ was set to 0.44, 0.57, 0.76, and 1, and ${\rm{K}}$ was set to 0.44, 0.57, 0.76, and 1 (Fig. 6B). We found that, at least for the sample images shown in Fig. 6, modulations of these image statistics successfully and naturally altered the appearance of skin.
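The two manipulation steps can be sketched as follows for one subband image. This is a minimal illustration assuming NumPy and taking ${\rm{M}}$ as the magnitude of the subband minimum, which is our reading of the description; the function names and the synthetic subband are hypothetical, and any residual mean left after subtracting ${\rm{M}}$ is kept as is.

```python
import numpy as np

def skewness(x):
    """Third standardized moment of an array."""
    z = x - x.mean()
    return np.mean(z ** 3) / z.std() ** 3

def manipulate_subband(band, K=1.0, S=1.0):
    """Apply the power (K) and gain (S) steps to a zero-mean subband image."""
    original_sd = band.std()
    M = -band.min()                       # offset that makes the subband non-negative
    powered = (band + M) ** K - M         # step (1): shift, raise to K, shift back
    powered = powered * (original_sd / powered.std())   # restore the original s.d.
    return powered * S                    # step (2): scale the s.d. by S

# Synthetic, positively skewed zero-mean "subband" (NOT derived from a skin image).
rng = np.random.default_rng(3)
band = rng.exponential(size=(64, 64))
band = band - band.mean()

out = manipulate_subband(band, K=0.44, S=0.57)
print(skewness(band), skewness(out), out.std() / band.std())
```

With ${\rm{K}} = 1$ and ${\rm{S}} = 1$ the function returns the input unchanged, which corresponds to the original image; values of ${\rm{K}}$ below 1 compress the bright tail and thus lower the skew, while ${\rm{S}}$ sets the s.d. relative to the original.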

The visual stimuli were 12 images in which the appearance along Factor 1 was controlled and 12 other images in which the appearance along Factor 2 was controlled. Each image was rated on the nine attributes as in Experiment 1. In an experimental session of 108 trials, stimuli were presented in random order. Each observer repeated the session three times. Other procedures were the same as in Experiment 1. Twelve native Japanese female observers [28.9 (SD = 3.2) years old] served as participants for the first stimulus set (Factor 1), and 10 paid native Japanese female observers [25.5 (SD = 2.4) years old] served as participants for the second stimulus set (Factor 2). None of them had participated in Experiment 1 or 2. All had normal or corrected-to-normal vision. The experiment was conducted in accordance with the Declaration of Helsinki and approved by the Ethical Committee of Shiseido Global Innovation Center. All participants provided written informed consent prior to participation. All participants reported that all images looked very realistic.

Fig. 7. (A) Ratings as a function of the modulation amount of the luminance s.d. of the subband, which are related to Factor 1 (pleasantness). Each panel shows the results for different attributes. Different colors represent the results obtained for three original skin images with different levels of Factor 2 (glossiness): gray represents the results for glossy skin, orange for neutral skin, and blue for matte skin. (B) The ratings as a function of the modulation amount of the luminance s.d. and skew, which are related to Factor 2 (glossiness). Each panel shows the results for different attributes. Different colors represent the results obtained for three original skin images with different levels of Factor 1 (pleasantness): red represents the results for relatively pleasant skin, green for neutral skin, and black for less pleasant skin. The error bar represents $ \pm{1}$ SE across observers.

B. Results

Figure 7A shows the mean ratings as a function of the modulation amount of the luminance s.d. of the subband (Factor 1). Different colors represent the results obtained for three original skin images with different apparent glossiness. The rating values were submitted to a one-way ANOVA with repeated measures. The effect of the luminance s.d. of the 8 c/image subband was significant [smoothness, ${\rm{F}}({{3,99}})= {83.72}$, ${\rm{p}} \lt {0.0001}$; healthiness, ${\rm{F}}{(3,99)}= {40.58}$, ${\rm{p}} \lt {0.0001}$; translucency, ${\rm{F}}{(1.85,61.06)} ={23.12}$, ${\rm{p}} \lt{0.0001}$; uniformity, ${\rm{F}}{(2.31,76.21)} ={ 23.15}$, ${\rm{p}} \lt {0.0001}$; attractiveness, ${\rm{F}}{(2.11,69.81)} = {31.92}$, ${\rm{p}} \lt {0.0001}$; tightness, ${\rm{F}}{(3,99)} = {32.75}$, ${\rm{p}} \lt {0.0001}$; moisture, ${\rm{F}}{(3,99)} = {4.02}$, ${\rm{p}} \lt{0.01}$]. These results suggest that the luminance s.d. of the 8 c/image subband is effective for the apparent skin properties related to Factor 1 and ineffective for glossy skin appearance. Figure 7B shows the mean ratings as a function of the modulation amount of the s.d. and skew of subbands (Factor 2). Different colors represent the results obtained for three original skin images with different pleasantness (in terms of Factor 1’s score). The rating values were submitted to a one-way ANOVA with repeated measures. The effect of the luminance s.d. and skew was significant [glossiness, ${\rm{F}}{(3,81)} = {31.72}$, ${\rm{p}} \lt{0.0001}$; oiliness, ${\rm{F}}{(3,82)} = {35.47}$, ${\rm{p}} \lt{0.0001}$]. The s.d. and skew of luminance subbands over all spatial frequencies associated with Factor 2 include the s.d. of luminance subbands at middle spatial frequencies associated with Factor 1. Thus, the effect was also significant for some items related to Factor 1 [smoothness, ${\rm{F}}{(2.6,70.08)}= {47.63}$, ${\rm{p}}\lt {0.0001}$; healthiness, ${\rm{F}}{(3,81)} ={ 9.89}$, ${\rm{p}} \lt {0.0001}$; translucency, ${\rm{F}}{(3,81)}= {5.99}$, ${\rm{p}} \lt {0.001}$; uniformity, ${\rm{F}}{(2.34,63.15)}={59.97}$, ${\rm{p}} \lt {0.0001}$; attractiveness, ${\rm{F}}{(3,81)} = {51.57}$, ${\rm{p}} \lt {0.0001}$; tightness, ${\rm{F}}{(3,81)} = {14.36}$, ${\rm{p}} \lt {0.0001}$; moisture, ${\rm{F}}{(3,81)}= {1.46}$, n.s.]. The fact that the decrease in skew leads to higher attractiveness is partially consistent with the recent finding that humans generally prefer natural scenes with lower luminance skewness [30]. Overall, these results suggest that a small number of simple image statistics related to Factors 1 and 2 affect skin appearance and that skin appearance can be controlled by simple image manipulation using these statistics. We believe that the fact that skin appearance was successfully altered by manipulating subband statistics without changing the mean luminance and color (Experiment 3) indicates the important role of subband statistics, which had previously been shown for solid surfaces [13,16], but not for human skin.

5. DISCUSSION

The present study examined how humans perceive various properties of human skin by analyzing the relationship between multidimensional rating data and low-level image statistics. The results showed that perceptual properties of skin can be summarized into two fundamental dimensions: pleasantness and glossiness. Moreover, the evaluation of each dimension is correlated with moment image statistics such as s.d. and skew at particular spatial frequency bands. The same result pattern was obtained for both small image patches and images of whole faces. The subsequent experiment further demonstrated that manipulation of a small set of critical image statistics significantly altered the appearance of skin in the expected direction. These results led us to the notion that human skin appearance can be described as functions of two cardinal perceptual axes, each of which is well predicted by low-level image statistics.

The role of image statistics has been illustrated for various aspects of natural surfaces, including glossiness [11–13,17], wetness [30], translucency [14], bumpiness [16,30], and so on. However, it has been claimed that such low-level visual information is insufficient to explain the perception perfectly [19]. Several studies with large natural image datasets show that the apparent glossiness of 70% to 80% of daily objects can be classified based solely on pixel image statistics [18]. Other lines of study suggest that such diverse material perception must be described by “appearance cues,” which are presumably associated with more complex information such as 3D structure, rather than by simple image statistics [19,20], although these studies provide no evidence that these cues are detected by the biological visual system [i.e., they explain one subjective appearance (e.g., gloss) by another subjective appearance (3D shape)].

Nevertheless, the present data suggest that human skin perception is predicted well (correlation of $ \gt\!{0.6}$) by a small set of low-level image statistics. One reason for this may be that, in contrast to the aforementioned studies, which employed images of objects containing much contour information, the present study employed images of human skin that were relatively uniform textures with no explicit contour information. It is reasonable that the perception of such textural images is predicted by the statistical properties of the image. Another potentially important factor is that the perception data were collected only for a single material category (skin). Indeed, previous studies also show that simple image statistics predict changes in the perceived glossiness [12,13] and wetness [31] of the same surface but not the diversity of perceived glossiness across different materials [18].

In cosmetics and other cultural activities, we tend to describe human skin with a large variety of words. Despite these rich expressions used in everyday language, the present results reveal that the visual system discriminates and evaluates skin appearance based on a small number of fundamental dimensions (probably two). This finding is analogous to the discovery that only a few dimensions of fundamental signals underlie our rich and varied language regarding colors, faces, and emotions. One should of course be cautious about this claim because the present results are derived from only nine rated attributes. If we had collected rating data for more attributes, we might have found additional cardinal dimensions. However, judging from our pilot analyses, in which we examined 24 more rating items, we find it difficult to expect a third axis that is clearly orthogonal to the two dimensions shown in Fig. 3A.

It should be noted that all visual stimuli in the present study were images of female skin and all observers were female. Thus, the present findings concern the perception of female skin by females, and it is possible that a different pattern of results would be obtained with male skin or male observers. It is well known that skin appearance differs significantly between females and males and that skin can be used to determine a person’s physical sex [32]. It has also been suggested that the effect of skin appearance on a person’s attractiveness is greater for female skin than for male skin [33].

The present results revealed a robust relationship between human skin perception and low-level image statistics, providing a simple and useful framework for the analysis and control of skin appearance. However, it is still unclear which physical and physiological characteristics of skin are related to the critical image statistics we found. Quantitative analyses of such relationships may help us gain deeper insight into human skin perception. Also, given that the present results were obtained only for images of a limited range of races (Caucasians and Asians) evaluated by a limited group of observers (Japanese females), one should be cautious about generalizing the conclusions.

APPENDIX A

Fig. 8. Isoluminant chromatic images of skin patches in Fig. 1.

Figure 8 shows isoluminant images that illustrate spatial distribution of chromatic information in the skin patch images shown in Fig. 1.

Funding

Shiseido Group.

REFERENCES

1. B. Fink, K. Grammer, and J. P. Matts, “Visible skin color distribution plays a role in the perception of age, attractiveness, and health in female faces,” Evol. Hum. Behav. 27, 433–442 (2006).

2. J. P. Matts, B. Fink, K. Grammer, and M. Barquest, “Color homogeneity and visual perception of age, health, and attractiveness of female facial skin,” J. Am. Acad. Dermatol. 57, 977–984 (2007).

3. M. A. Changizi, Q. Zhang, and S. Shimojo, “Bare skin, blood and the evolution of primate colour vision,” Biol. Lett. 2, 217–221 (2006).

4. I. D. Stephen, J. L. Smith, M. R. Stirrat, and D. I. Perrett, “Facial skin coloration affects perceived health of human faces,” Int. J. Primatol. 30, 845–857 (2009).

5. B. Fink, K. Grammer, and R. Thornhill, “Human (Homo sapiens) facial attractiveness in relation to skin texture and color,” J. Comp. Psych. 115, 92–99 (2001).

6. T. Chauhan, K. Xiao, and S. Wuerger, “Chromatic and luminance sensitivity for skin and skinlike textures,” J. Vis. 19(1), 13 (2019).

7. B. Fink and P. J. Matts, “The effects of skin color distribution and topography cues on the perception of female facial age and health,” J. Eur. Acad. Dermatol. Venereol. 22, 493–498 (2008).

8. A. Nkengne, C. Bertin, G. N. Stamatas, A. Giron, A. Rossi, N. Issachar, and B. Ferti, “Influence of facial skin attributes on the perceived age of Caucasian women,” J. Eur. Acad. Dermatol. Venereol. 22, 982–991 (2008).

9. R. R. Anderson and J. A. Parrish, “The optics of human skin,” J. Invest. Dermatol. 77, 13–19 (1981).

10. Y. Masuda, T. Yamashita, T. Hirao, and M. Takahashi, “An innovative method to measure skin pigmentation,” Skin Res. Technol. 15, 224–229 (2009).

11. R. W. Fleming, R. O. Dror, and E. H. Adelson, “Real-world illumination and the perception of surface reflectance properties,” J. Vis. 3(5), 347–368 (2003).

12. I. Motoyoshi, S. Nishida, L. Sharan, and E. H. Adelson, “Image statistics and the perception of surface qualities,” Nature 447, 206–209 (2007).

13. L. Sharan, Y. Li, I. Motoyoshi, S. Nishida, and E. H. Adelson, “Image statistics for surface reflectance perception,” J. Opt. Soc. Am. A 25, 846–865 (2008).

14. I. Motoyoshi, “Highlight-shading relationship as a cue for the perception of translucent and transparent materials,” J. Vis. 10(9):6, 1–11 (2010).

15. I. Motoyoshi and H. Matoba, “Variability in constancy of the perceived surface reflectance across different illumination statistics,” Vision Res. 53, 30–39 (2012).

16. M. Giesel and Q. Zaidi, “Frequency-based heuristics for material perception,” J. Vis. 13(14):7, 1–19 (2013).

17. R. W. Fleming, “Visual perception of materials and their properties,” Vision Res. 94, 62–75 (2014).

18. C. B. Wiebel, M. Toscani, and K. R. Gegenfurtner, “Statistical correlates of perceived gloss in natural images,” Vision Res. 115, 175–187 (2015).

19. J. Kim and B. L. Anderson, “Image statistics and the perception of surface gloss and lightness,” J. Vis. 10(9):3, 1–17 (2010).

20. P. J. Marlow and B. L. Anderson, “Generative constraints on image cues for perceived gloss,” J. Vis. 13(14):2, 1–23 (2013).

21. C. Arce-Lopera, T. Igarashi, K. Nakao, and K. Okajima, “Effects of diffuse and specular reflections on the perceived age of facial skin,” Opt. Rev. 19, 167–173 (2012).

22. C. Arce-Lopera, T. Igarashi, K. Nakao, and K. Okajima, “Image statistics on the age perception of human skin,” Skin Res. Technol. 19, e273–e278 (2013).

23. A. Matsubara, Z. Liang, Y. Sato, and K. Uchikawa, “Analysis of human perception of facial skin radiance by means of image histogram parameters of surface and subsurface reflections from skin,” Skin Res. Technol. 18, 265–271 (2012).

24. K. Xiao, J. M. Yates, F. Zardawi, S. Sueeprasan, N. Liao, L. Gill, and S. Wuerger, “Characterising the variations in ethnic skin colours: a new calibrated data base for human skin,” Skin Res. Technol. 23, 21–29 (2017).

25. R. W. Fleming, C. Wiebel, and K. Gegenfurtner, “Perceptual qualities and material classes,” J. Vis. 13(8):9, 1–20 (2013).

26. K. P. Kaiser and M. R. Boynton, Human Color Vision (Optical Society of America, 1996).

27. K. Gegenfurtner and L. Sharpe, Color Vision: From Genes to Perception (Cambridge University, 2001).

28. K. W. Tan, B. Tiddeman, and I. D. Stephen, “Skin texture and colour predict perceived health in Asian faces,” Evol. Hum. Behav. 39, 320–335 (2018).

29. I. Boyadzhiev, K. Bala, S. Paris, and E. Adelson, “Band-sifting decomposition for image-based material editing,” ACM Trans. Graphics 34, 163 (2015).

30. D. Graham, B. Schwarz, A. Chatterjee, and H. Leder, “Preference for luminance histogram regularities in natural scenes,” Vision Res. 120, 11–21 (2016).

31. M. Sawayama and E. H. Adelson, “Visual wetness perception based on image color statistics,” J. Vis. 17(5):7, 1–24 (2017).

32. H. Hill, V. Bruce, and S. Akamatsu, “Perceiving the sex and race of faces: the role of shape and color,” Proc. R. Soc. London B 261, 367–373 (1995).

33. S. Samson, B. Fink, and P. J. Matts, “Visible skin condition and perception of human facial appearance,” Int. J. Cosm. Sci. 32, 167–184 (2010).

Journal of the Optical Society of America A