Optica Publishing Group

Dynamic termination of computations in computer vision systems

Abstract

Subject of study. Two classes of dynamically configurable computer vision systems trained using reinforcement learning algorithms were considered. The first class comprises models of visual attention that recognize images by successively viewing their fragments. The second class comprises least action classifiers that analyze images indirectly by successively calling pretrained convolutional neural networks.

Aim of study. This study investigated whether actions for terminating computations can be added to such systems so that the models spend more resources on the analysis of complex images than on the analysis of simpler images.

Method. A stop network was added to the investigated architectures; it receives the hidden state vector of the system as input and returns a signal to stop or continue computations. The individual network modules were trained using a three-stage curriculum, and the resulting strategies of image viewing and classifier selection were analyzed.

Main results. The proposed model of visual attention with dynamic termination of computations significantly surpassed existing solutions on the MNIST database in both recognition accuracy and the average number of image fragments viewed by the agent. The importance of curriculum learning was demonstrated. The agent was shown to use a similar attention control strategy across different images, with adaptations to specific images. A similar effect was observed for a common model of visual attention trained on ImageNet. For least action classifiers, dynamic termination of computations also reduced the average number of actions required for image analysis at a specified recognition accuracy, although the gain in effectiveness in this case was less pronounced.

Practical significance. The methods of visual attention developed in this study can be advantageous for designing optoelectronic systems with intelligent control of a camera with a narrow-field lens for target recognition. The technology used in the least action classifiers can be applied to reduce computations in solutions obtained by the bagging algorithm, which averages several models.
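
The stop mechanism described under Method can be illustrated with a minimal sketch. This is not the authors' architecture: the abstract does not specify the network layers, the recurrent update, or the training procedure, so everything below is an assumption made for illustration. The function names (`stop_probability`, `update_hidden`, `run_episode`), the logistic stop head, the moving-average state update, the fixed weights, and the 0.5 threshold are all hypothetical; in the study the modules are trained with reinforcement learning rather than set by hand.

```python
import math


def stop_probability(hidden, weights, bias):
    """Hypothetical stop head: logistic map from a hidden state vector to P(stop)."""
    z = sum(w * h for w, h in zip(weights, hidden)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def update_hidden(hidden, observation, decay=0.5):
    """Toy stand-in for a recurrent update: moving average of the observations.

    An actual system would use a learned recurrent network here.
    """
    return [decay * h + (1.0 - decay) * o for h, o in zip(hidden, observation)]


def run_episode(observations, weights, bias, threshold=0.5):
    """Consume observations (e.g. fragment features) until the stop head fires.

    Returns the final hidden state and the number of steps actually used.
    """
    hidden = [0.0] * len(weights)
    steps = 0
    for obs in observations:
        hidden = update_hidden(hidden, obs)
        steps += 1
        # Stop as soon as the stop head is confident enough.
        if stop_probability(hidden, weights, bias) >= threshold:
            break
    return hidden, steps


if __name__ == "__main__":
    weights, bias = [1.0, 1.0], -1.0          # hand-picked, not learned
    strong = [[2.0, 2.0]] * 5                 # informative observations
    weak = [[0.2, 0.2]] * 5                   # uninformative observations
    print(run_episode(strong, weights, bias)[1])
    print(run_episode(weak, weights, bias)[1])
```

With these hand-picked values, the informative sequence triggers the stop signal on the first step, while the uninformative sequence uses the full budget of five steps. That is the intended behavior in miniature: fewer computations on inputs that are easy to resolve and more on those that are not.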

© 2022 Optica Publishing Group
