Multi-class GAN for generating multi-class images in object recognition

Bingxu Wang; Jinhui Lan; Jiangjiang Gao

doi:10.1364/JOSAA.454330

Journal of the Optical Society of America A, Vol. 39, Issue 5, pp. 897-906 (2022)
https://doi.org/10.1364/JOSAA.454330


Abstract

Current generative adversarial networks (GANs) are of limited use for data augmentation in object recognition: training is unstable, and the generated images are of poor quality. Methods such as progressive growing of GANs and the multi-scale gradient GAN (MSG-GAN) address these problems, and the packed GAN (PacGAN) addresses mode collapse during training. However, these methods can generate only one type of image at a time, and training takes a long time. To overcome these limitations, this paper proposes the multi-class GAN (Mc-GAN), which uses an augmented discriminator to train multiple generators simultaneously. Through iterative training, the discriminator learns to judge the output of each generator accurately and guides each generator to produce its corresponding image. The paper analyzes the optimization of the Mc-GAN objective function. Experiments show that the method generates high-quality images with reduced training time and can be used for data augmentation in object recognition, which improves the practicality of GANs.
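The abstract does not state the Mc-GAN objective, so the following is only a minimal sketch of one plausible reading of "one discriminator guiding several generators". It assumes, hypothetically, that the discriminator outputs logits over the K image classes plus one extra "generated" label, in the style of a K+1-way classifier. The names `FAKE`, `discriminator_loss`, and `generator_loss` are illustrative and do not come from the paper.

```python
import math

NUM_CLASSES = 3      # e.g. face, eagle, goldfish (the objects in the tables)
FAKE = NUM_CLASSES   # ASSUMPTION: one extra label meaning "generated image"


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target):
    """Negative log-probability that the logits assign to `target`."""
    return -math.log(softmax(logits)[target])


def discriminator_loss(real_logits, real_labels, fake_logits):
    """Real images should get their class label, generated ones FAKE."""
    losses = [cross_entropy(l, y) for l, y in zip(real_logits, real_labels)]
    losses += [cross_entropy(l, FAKE) for l in fake_logits]
    return sum(losses) / len(losses)


def generator_loss(fake_logits, k):
    """Generator k is rewarded when its samples are assigned to class k."""
    return sum(cross_entropy(l, k) for l in fake_logits) / len(fake_logits)


# Logits the (hypothetical) discriminator produced for one generated sample.
sample_logits = [[0.1, 2.0, 0.3, 0.5]]
loss_for_generator_1 = generator_loss(sample_logits, k=1)
loss_for_generator_0 = generator_loss(sample_logits, k=0)
```

Under this assumption, a sample that the discriminator assigns mostly to class 1 yields a lower loss for generator 1 than for generator 0, which is the sense in which a single shared discriminator could steer each generator toward its own class.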


Data availability

Data underlying the results presented in this paper are available in the CelebA dataset, Ref. [21], and the ImageNet dataset, Ref. [22].

21. Z. Liu, P. Luo, X. Wang, and X. Tang, “Deep learning face attributes in the wild,” in IEEE International Conference on Computer Vision (2015), pp. 3730–3738.

22. J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei, “ImageNet: a large-scale hierarchical image database,” in IEEE Conference on Computer Vision and Pattern Recognition (2009), pp. 248–255.

Tables (6)


Object	Metric	Original Image	MSG-GAN	Mc-MSGGAN	Mc-MSGGAN without Augmented Discriminator
Face	IS	3.6864	2.9651	3.0292	2.0102
Face	FID	–	97.84	98.61	146.23
Face	PSNR	9.8352	9.3104	9.6145	8.3798
Face	SSIM	0.3154	0.2273	0.2206	0.2033
Eagle	IS	4.4307	3.5872	3.488	3.0536
Eagle	FID	–	166.5	166.83	256.41
Eagle	PSNR	12.9634	11.464	12.3708	10.2633
Eagle	SSIM	0.6235	0.4592	0.4123	0.3223
Goldfish	IS	4.2621	3.1916	4.1816	2.1843
Goldfish	FID	–	199.03	197.98	264.09
Goldfish	PSNR	10.2953	8.8643	9.7849	7.6302
Goldfish	SSIM	0.4023	0.1864	0.2156	0.1164
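
The PSNR rows above are in decibels. The page does not say which reference image each score was computed against, so the sketch below only illustrates the standard PSNR definition, 10·log10(MAX² / MSE), on flat lists of pixel values. The function name `psnr` and the 8-bit default `max_value=255.0` are assumptions made for illustration, not details taken from the paper.

```python
import math


def psnr(reference, test, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized pixel lists."""
    if not reference or len(reference) != len(test):
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixels.
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)


# A test image that differs from the reference by one gray level everywhere.
score = psnr([10, 20, 30, 40], [11, 21, 31, 41])
```

A higher PSNR means the test image is closer to the reference, which is consistent with the table, where the variant without the augmented discriminator scores lowest for every object.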

Method	Object	Image Source	Top-1	Top-5
AlexNet	Eagle	Original image	83.65%	96.88%
AlexNet	Eagle	MSG-GAN	49.73%	68.27%
AlexNet	Eagle	Mc-MSGGAN	48.40%	69.33%
AlexNet	Goldfish	Original image	95.63%	98.85%
AlexNet	Goldfish	MSG-GAN	94.67%	98.40%
AlexNet	Goldfish	Mc-MSGGAN	94%	97.47%
VGGNet	Eagle	Original image	97.92%	99.69%
VGGNet	Eagle	MSG-GAN	60.07%	95.13%
VGGNet	Eagle	Mc-MSGGAN	59.47%	95.53%
VGGNet	Goldfish	Original image	90.47%	99.89%
VGGNet	Goldfish	MSG-GAN	33.33%	71.47%
VGGNet	Goldfish	Mc-MSGGAN	34%	69.60%
ResNet	Eagle	Original image	92.08%	96.56%
ResNet	Eagle	MSG-GAN	47.27%	76.87%
ResNet	Eagle	Mc-MSGGAN	47.47%	75.40%
ResNet	Goldfish	Original image	78.85%	96.04%
ResNet	Goldfish	MSG-GAN	14.00%	22.53%
ResNet	Goldfish	Mc-MSGGAN	12%	21.27%

Method	Original Image	MSG-GAN	Mc-MSGGAN
AlexNet	70.01%	56.1%	57.47%
VGGNet	66.54%	42.42%	36.69%
ResNet	61.09%	29.78%	33.96%

Object	Method	Original Image	MSG-GAN	Mc-MSGGAN
Eagle	AlexNet	83.65%	86.88%	88.02%
Eagle	VGGNet	97.92%	97.19%	96.25%
Eagle	ResNet	92.08%	88.23%	94.38%
Goldfish	AlexNet	95.63%	94.89%	95.42%
Goldfish	VGGNet	90.47%	86.04%	92.08%
Goldfish	ResNet	78.85%	87.5%	88.23%

Method	Original Image	MSG-GAN	Mc-MSGGAN
AlexNet	70.01%	70.18%	65.59%
VGGNet	66.54%	67.39%	71.17%
ResNet	61.09%	57.72%	63.47%

Training Set: Test Set	7:3	8:2	9:1
Face	54.33%	53.5%	48%
Eagle	46.67%	51%	46%
Goldfish	49.33%	48.5%	52%
