
Training a dynamically configurable classifier with deep Q-learning


Abstract

Subject of study. We studied dynamic networks whose computational structure is configured at run time based on the input data.

Aim. We investigated whether deep Q-learning can be used to construct dynamic computer vision networks.

Methods. In modern dynamically configurable systems, image analysis is typically performed using a policy-gradient algorithm. We propose a hybrid Q-learning method for an image classification agent that takes limitations on available computational resources into account. The agent is trained to recognize images using a set of pretrained classifiers, and the resulting dynamically configurable system constructs a computational graph that respects the limit on the number of operations while following the trajectory with the maximum expected accuracy. The agent receives a reward only when the image is correctly recognized within the limit on the number of actions the agent may take. Experiments were performed on the CIFAR-10 image database with a set of six external classifiers that the agent was trained to control. The experiments showed that the standard deep reinforcement learning method based on action values (Deep Q-Network) does not allow the agent to learn strategies better than random ones in terms of recognition accuracy. We therefore propose a Q-least-action classifier that approximates the desired classifier-selection function by reinforcement learning and the label-prediction function by supervised learning.

Main results. The trained agent exceeded the recognition accuracy of random strategies, reducing the error by 9.65%. We show that such an agent makes explicit use of information from several classifiers, since accuracy increases as the number of permitted actions increases.

Practical significance. Our research shows that deep Q-learning can extract information from sparse classifier responses as well as a least-action classifier trained by the policy-gradient method can. In addition, the proposed method did not require the development of special loss functions.
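The abstract does not give implementation details, but the decision process it describes can be sketched as follows: at each step the agent either queries one of the six pretrained classifiers or emits a final label, reward is sparse, and episodes are capped by an action budget. The sketch below is a hypothetical illustration of that setup, not the authors' code; the class name ClassifierSelectionEnv, the constant NUM_CLASSES, and the state encoding are all assumptions.

import numpy as np

NUM_CLASSES = 10  # CIFAR-10


class ClassifierSelectionEnv:
    """Hypothetical sketch of the MDP described in the abstract: the agent
    either queries one of the pretrained classifiers or emits a final label.
    Reward is sparse: 1 only if the final label is correct and is produced
    within the action budget."""

    def __init__(self, classifiers, max_actions=3):
        self.classifiers = classifiers  # callables: image -> class-probability vector
        self.max_actions = max_actions  # limit on the number of actions per episode

    def reset(self, image, label):
        self.image, self.label = image, label
        self.steps = 0
        # State: concatenated responses of queried classifiers (zeros until queried).
        self.state = np.zeros(len(self.classifiers) * NUM_CLASSES)
        return self.state

    def step(self, action):
        self.steps += 1
        if action < len(self.classifiers):  # action = "query classifier i"
            probs = self.classifiers[action](self.image)
            self.state[action * NUM_CLASSES:(action + 1) * NUM_CLASSES] = probs
            done = self.steps >= self.max_actions  # budget exhausted, no reward
            return self.state, 0.0, done
        # Remaining actions are interpreted as final label predictions.
        predicted = action - len(self.classifiers)
        reward = 1.0 if predicted == self.label else 0.0
        return self.state, reward, True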
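The methodological core, as described, is the hybrid objective: the classifier-selection policy is trained by reinforcement learning on the sparse reward, while the label prediction is trained by ordinary supervised learning on the true label. A minimal PyTorch-style sketch of one such update step is given below, assuming a two-headed network that returns selection Q-values and label logits; the function name, network interface, and loss weighting are illustrative assumptions, not the paper's specification.

import torch
import torch.nn.functional as F

def hybrid_update(net, target_net, optimizer, batch, gamma=0.99):
    """One hypothetical training step of the hybrid scheme: a DQN-style
    temporal-difference loss for the classifier-selection head plus a
    supervised cross-entropy loss for the label-prediction head."""
    state, action, reward, next_state, done, true_label = batch  # done: float mask

    q_values, label_logits = net(state)  # two heads: selection Q-values, label logits
    q_taken = q_values.gather(1, action.unsqueeze(1)).squeeze(1)

    with torch.no_grad():  # standard DQN bootstrap target from a frozen copy
        next_q, _ = target_net(next_state)
        target = reward + gamma * next_q.max(dim=1).values * (1.0 - done)

    td_loss = F.smooth_l1_loss(q_taken, target)          # reinforcement-learning part
    ce_loss = F.cross_entropy(label_logits, true_label)  # supervised part

    loss = td_loss + ce_loss  # plain sum; no special loss function
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

Consistent with the abstract's claim that no special loss functions were needed, the sketch simply sums two standard losses; the supervised head gives a dense learning signal on every transition, which plausibly compensates for the sparsity of the reward that defeated the plain Deep Q-Network baseline.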

© 2022 Optica Publishing Group

More Like This
Efficient automated high dynamic range 3D measurement via deep reinforcement learning

Pan Zhang, Kai Zhong, Zhongwei Li, and Yusheng Shi
Opt. Express 32(4) 4857-4875 (2024)

Deep learning image transmission through a multimode fiber based on a small training dataset

Binbin Song, Chang Jin, Jixuan Wu, Wei Lin, Bo Liu, Wei Huang, and Shengyong Chen
Opt. Express 30(4) 5657-5672 (2022)

