Recently, a hologram calculation method involving sparse point spread functions in the short-time Fourier transform (STFT) domain was proposed. This paper describes a dedicated processor for the STFT-based algorithm, implemented on a field-programmable gate array (FPGA). All operations in the algorithm are implemented using fixed-point arithmetic. Because the algorithm includes a trigonometric function and an error function, lookup tables (LUTs) are used to reduce the calculation cost. We have devised a dedicated circuit architecture that allows parallel operations. In addition, on a central processing unit, the STFT-based algorithm with fixed-point arithmetic and LUTs generated holograms faster than the same algorithm with floating-point arithmetic.
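The idea of replacing a trigonometric function and an error function with fixed-point LUTs can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the Q1.14 output format, the 16-bit phase word, the 1024-entry table depth, the erf argument range, and the names `cos_fixed` and `erf_fixed`.

```python
import math

# All widths below are illustrative assumptions, not the paper's values.
FRAC_BITS = 14      # assumed Q1.14 output format (fits a signed 16-bit word)
PHASE_BITS = 16     # assumed: phase stored as a 16-bit fraction of one turn
LUT_BITS = 10       # assumed table depth: 2**10 = 1024 entries
X_FRAC_BITS = 12    # assumed number of fractional bits in the erf argument
ERF_RANGE_LOG2 = 2  # assumed: the erf table covers 0 <= x < 2**2 = 4

LUT_SIZE = 1 << LUT_BITS
ONE = 1 << FRAC_BITS  # fixed-point representation of 1.0

# Tables are filled once with floating-point math, then only integers are used.
COS_LUT = [round(math.cos(2.0 * math.pi * i / LUT_SIZE) * ONE)
           for i in range(LUT_SIZE)]
ERF_LUT = [round(math.erf(i * 2.0 ** (ERF_RANGE_LOG2 - LUT_BITS)) * ONE)
           for i in range(LUT_SIZE)]


def cos_fixed(phase: int) -> int:
    """Cosine of a phase given as a fixed-point fraction of one turn."""
    # Keep the top LUT_BITS bits of the phase; the mask wraps the period.
    index = (phase >> (PHASE_BITS - LUT_BITS)) & (LUT_SIZE - 1)
    return COS_LUT[index]


def erf_fixed(x: int) -> int:
    """Error function of a signed fixed-point argument."""
    # erf is odd, so only non-negative arguments are tabulated.
    index = abs(x) >> (X_FRAC_BITS - (LUT_BITS - ERF_RANGE_LOG2))
    # Beyond the tabulated range, erf is saturated to 1.0.
    value = ONE if index >= LUT_SIZE else ERF_LUT[index]
    return -value if x < 0 else value
```

For example, `cos_fixed(1 << 15)` (half a turn) returns `-16384`, which is -1.0 in the assumed Q1.14 format. A table lookup of this kind trades a small quantization error for the removal of every run-time transcendental evaluation, which is what makes it attractive for both FPGA and CPU implementations.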
Tables (11)
Table 1. Computer Specifications

| Item | Specification |
|---|---|
| OS | Windows 10 Home |
| CPU | Intel Core i5-4670 at 3.40 GHz |
| RAM | 8 GB |
| Compiler | Microsoft Visual C++ 2015 |
Table 2. Calculation Parameters

| Parameter | Value |
|---|---|
| CGH Resolution | |
| Pixel pitch () | 8.0 µm |
| Wavelength () | 520 nm |
| Block size () | 32 |
Table 3. CGH Calculation Time

| Calculation Method | Hayabusa: Calculation Time [s] | Hayabusa: Speedup Rate | Tyrannosaurus Skeleton: Calculation Time [s] | Tyrannosaurus Skeleton: Speedup Rate |
|---|---|---|---|---|
| Fresnel approximation | 1233 | 1.0 | 1782 | 1.0 |
| STFT-based algorithm () | 46 | 27 | 67 | 26 |
| STFT-based algorithm () | 202 | 6.1 | 296 | 6.0 |
| STFT-based algorithm () | 779 | 1.6 | 1145 | 1.6 |
| STFT-based algorithm () | 3154 | 0.4 | 4603 | 0.4 |
| -based algorithm () | 4.3 | 287 | 6.0 | 296 |
| -based algorithm () | 14 | 89 | 20 | 89 |
| -based algorithm () | 45 | 27 | 66 | 27 |
| -based algorithm () | 162 | 7.6 | 236 | 7.5 |
Table 4. Image Quality of the STFT-Based Algorithm Using the LUTs

| Subblock Size [] | Hayabusa: PSNR [dB] | Tyrannosaurus Skeleton: PSNR [dB] |
|---|---|---|
| | 30.53 | 26.84 |
| | 31.33 | 27.14 |
| | 31.55 | 27.17 |
| | 31.60 | 27.18 |
Table 5. Calculation Times and PSNRs of the WRP Method and Method

| Calculation Method | Calculation Time [s] | PSNR [dB] |
|---|---|---|
| WRP | 4.1 | 17.62 |
| (6.25%) | 0.34 | 25.07 |
| (12.5%) | 0.57 | 25.95 |
| (25.0%) | 0.60 | 25.88 |
Table 6. Input/Output Signals of the Subblock Index Calculation Unit

| Signal Name | I/O | Bit Width [bit] | Description |
|---|---|---|---|
| | In | 25 | coordinate of the object point |
| | In | 25 | coordinate of the object point |
| | – | 7 | th STFT block |
| | – | 7 | th STFT block |
| | – | 7 | coordinate of the subblock |
| | – | 7 | coordinate of the subblock |
| Block count | – | 8 | The number of blocks |
| | – | 7 | Block size |
| | – | 4 | Subblock size |
| | – | 25 | coordinate of the STFT block |
| | – | 25 | coordinate of the STFT block |
Table 7. Input/Output Signals of the STFT Calculation Unit

| Signal Name | I/O | Bit Width [bit] | Description |
|---|---|---|---|
| | In | 13 | Variable depending on the coordinate of the object point |
| | In | 16 | Variable depending on the coordinate of the object point |
| | – | 13 | is the STFT block size multiplied by the pixel pitch |
| | – | 16 | STFT coefficient |
Table 8. Input/Output Signals of the PHC Unit

| Signal Name | I/O | Bit Width [bit] | Description |
|---|---|---|---|
| | In | 16 | Superposed STFT coefficient |
| PHC | – | 11 | PHC values are stored in a LUT |
| | Out | 16 | STFT coefficient after PHC |
Table 9. Resources Required for the FPGA Implementation

| Resource | Used Resources | Available Resources | Utilization (%) |
|---|---|---|---|
| LUT | 116,528 | 230,400 | 50.58 |
| LUTRAM | 67,650 | 101,760 | 66.48 |
| FF | 10,387 | 460,800 | 2.25 |
| BRAM | 75 | 312 | 24.04 |
| DSP | 156 | 1,728 | 9.03 |
| BUFG | 7 | 544 | 1.29 |
Table 10. Calculation Parameters

| Parameter | Value |
|---|---|
| Hologram Size | |
| Wavelength () | 520 nm |
| Pixel pitch () | 8.0 µm |
| Block size () | 32 |
| Subblock size () | 4 |
Table 11. Comparison of the Software and Hardware Calculation Speeds (a, b)

| Step | Execution Time | Speedup Rate |
|---|---|---|
| Software [Intel Core i5-4670 at 3.40 GHz] | 1.12 s | 1.0 |
| Data transfer time of the object point (from the embedded CPU to the FPGA) | 0.28 ms | 1.53 |
| STFT coefficient calculation | 0.57 s | |
| Data transfer time of the superposed STFT coefficients (from the FPGA to the embedded CPU) | 0.13 ms | |
| Inverse STFT calculation on the embedded CPU | 0.16 s | |

a. The hologram size is $512\;\mathrm{px} \times 512\;\mathrm{px}$, and the number of object points is 11,646.
b. The software execution time includes the inverse STFT calculation.
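The single hardware speedup rate in Table 11 can be checked against the listed execution times. The short sketch below assumes, as an interpretation rather than a statement from the paper, that the four hardware steps run one after another and that the speedup is the software time divided by their sum. The variable names are illustrative.

```python
# Execution times from Table 11, converted to seconds.
software_s = 1.12
hardware_steps_s = {
    "object point transfer (CPU to FPGA)": 0.28e-3,
    "STFT coefficient calculation": 0.57,
    "coefficient transfer (FPGA to CPU)": 0.13e-3,
    "inverse STFT on the embedded CPU": 0.16,
}

# Assumption: the steps are sequential, so the total is a plain sum.
hardware_total_s = sum(hardware_steps_s.values())
speedup = software_s / hardware_total_s

print(f"hardware total: {hardware_total_s:.3f} s, speedup: {speedup:.2f}")
# prints "hardware total: 0.730 s, speedup: 1.53"
```

Under that assumption the result matches the tabulated value of 1.53. The two data transfers together account for well under 1 ms, so the STFT coefficient calculation and the inverse STFT dominate the hardware execution time.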