GPU Acceleration of Computational Electromagnetics...
GPU Acceleration of Computational Electromagnetics Methods
Vivek Venugopalan and Cagatay Tokgoz
• Computational electromagnetics (EM) is widely used in aerospace applications.
• Main challenges include: (a) developing a CAD model that captures the physics and material properties of the scene and is useful for simulations, (b) modeling RF propagation with sufficient accuracy, and (c) the cost of simulations in terms of computational complexity and algorithm latency.
• Several methods are widely used for prediction of antenna coverage; our focus is on iterative physical optics (IPO).
• Applications of EM include: (1) antenna coverage prediction for autonomous vehicles in urban environments, and (2) wireless propagation modeling in complex environments, e.g. modeling the strength and propagation of Wi-Fi signals or critical communication signals within an aircraft cabin.
[1] P. H. Pathak, “High frequency techniques for antenna analysis,” Proceedings of the IEEE, vol. 80, no. 1, pp. 44–65, 1992.
[2] R. J. Burkholder, C. Tokgoz, C. Reddy, and W. O. Coburn, “Iterative physical optics for radar scattering predictions,” Applied Computational Electromagnetics Society Journal, vol. 24, no. 2, pp. 241–258, 2009.
[3] C. Tokgoz and V. Venugopalan, “GPU accelerated iterative physical optics to predict RF propagation in urban environments,” IEEE Antennas & Propagation Society International Symposium (APSURSI), 2013.
[4] V. Venugopalan and C. Tokgoz, “GPU acceleration of iterative physical optics-based electromagnetic simulations,” IEEE High Performance Extreme Computing Conference (HPEC), 2015.
Implementation and Results
Introduction
Conclusion
References
Iterative Physical Optics
• CPU results - Intel Xeon CPU clocked at 2.67 GHz
Email:{venugov, tokgozc}@utrc.utc.com
• IPO methods were targeted at two different use cases and accelerated using GPUs.
• Near real-time results were obtained using the mobile GT650M, which provides a transition path to Jetson GPUs.
• IPO is better suited for wireless propagation modeling in complex environments: it does not require a ray tracing or shadowing algorithm.
• IPO is also faster than MoM-based full-wave techniques: it needs no impedance matrix and runs on a coarser mesh.
• IPO does not account for surface waves and diffractions, and it is not guaranteed to converge.
• Computational cost increases when modeling higher frequencies.
[Figure labels: initial position of Tx antenna; initial position of Rx antenna; final position of Rx antenna]
Complex path planning requires retaining communication while navigating urban environments.
Wi-Fi wireless antenna coverage requires optimal placement of antennae for link budget prediction and less EM interference inside the aircraft cabin.
[Figure: flowchart of the GPU implementation of IPO]
CPU: Copy all values from CPU to GPU device memory
GPU: calculate triangle parameters and accumulated surface currents
GPU: calculate error tolerance
Convergence value DMA fetched by CPU
Branch labels: "Not within range or less than max # of iterations"; "converged"
GPU: compute impedance for each antenna segment
GPU: compute receiver antenna impedances
CPU: Copy impedance values to CPU and plot results
Thread-count labels: # of triangles (twice); single thread; # of antenna segments; # of receiving antenna locations
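The control flow named in the flowchart labels above (copy data to the device, update the surface currents, compute an error value, then loop until it is within range or the iteration cap is reached) can be sketched in plain Python. This is a minimal sketch, not the authors' implementation: the update rule `J <- J0 + C J`, the dense `coupling` matrix, the function name `ipo_iterate`, and the default `tol` and `max_iters` are all assumptions made for illustration.

```python
# Sketch of the flowchart's control flow only. The update rule, the coupling
# matrix and every name below are illustrative assumptions, not the poster's code.

def ipo_iterate(j0, coupling, tol=1e-6, max_iters=100):
    """Iterate J <- J0 + C @ J until the relative change drops below tol.

    j0:       initial PO currents, one complex value per triangle
    coupling: hypothetical triangle-to-triangle coupling, as a list of rows
    Returns (currents, iterations_used, converged).
    """
    n = len(j0)
    j = list(j0)
    for it in range(1, max_iters + 1):
        # Per-triangle work: each triangle accumulates the currents
        # reradiated onto it by all triangles.
        j_new = [j0[m] + sum(coupling[m][k] * j[k] for k in range(n))
                 for m in range(n)]
        # Error value: one scalar reduced over all triangles.
        change = sum(abs(a - b) ** 2 for a, b in zip(j_new, j)) ** 0.5
        norm = sum(abs(a) ** 2 for a in j_new) ** 0.5
        j = j_new
        # Only this scalar is needed to decide whether to loop again.
        if norm == 0.0 or change / norm < tol:
            return j, it, True
    return j, max_iters, False


# Toy scene: three "triangles" with weak mutual coupling.
j0 = [1 + 0j, 0.5j, -0.25 + 0j]
coupling = [[0, 0.1, 0.05],
            [0.1, 0, 0.1j],
            [0.05, 0.1j, 0]]
currents, iters, converged = ipo_iterate(j0, coupling)
print(iters, converged)
```

Returning the `converged` flag separately from the iteration count reflects the point made elsewhere on the poster that IPO is not guaranteed to converge, so a caller has to handle the case where the cap is hit.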
[Figure: passenger cabin with transmit antenna and receive antenna; field labels ELOS and Escat]
• Assuming a hemispherical antenna radiation pattern, define field points where the antenna coverage is evaluated.
• The scene/location is meshed using triangles, and the triangle facets are assigned material properties. Meshes are refined based on the wavelength at the desired frequency of operation.
• The PO currents are determined and then reradiated to account for multiple bounces until they converge.
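Because the mesh is refined with wavelength, the triangle count grows roughly with the square of the frequency. The sketch below illustrates that scaling under stated assumptions: the surface area of 100 m², the `edges_per_wavelength` value of 3, and the equilateral-triangle approximation are hypothetical and do not come from the poster.

```python
import math

C0 = 299_792_458.0  # speed of light in vacuum, m/s


def wavelength_m(freq_mhz):
    """Free-space wavelength in metres for a frequency given in MHz."""
    return C0 / (freq_mhz * 1e6)


def estimate_triangles(surface_area_m2, freq_mhz, edges_per_wavelength=3.0):
    """Rough triangle count for a mesh whose edge is wavelength / edges_per_wavelength.

    Assumes equilateral triangles; edges_per_wavelength is an illustrative
    choice, not a value taken from the poster.
    """
    edge = wavelength_m(freq_mhz) / edges_per_wavelength
    triangle_area = math.sqrt(3.0) / 4.0 * edge ** 2
    return math.ceil(surface_area_m2 / triangle_area)


# Hypothetical 100 m^2 of meshed surface at the frequencies of one speedup chart.
for f in (400, 800, 1200, 1600, 2000, 2400):
    print(f, "MHz:", estimate_triangles(100.0, f), "triangles")
```

Going from 400 MHz to 2400 MHz multiplies the estimated triangle count by about 36, which is one way to see why computational cost and device memory use rise with frequency.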
[Figure: GPU speedup vs. frequency (MHz); series: K40c Speedup, GT650M Speedup]
Frequency (MHz): 400 800 1200 1600 2000 2400
GT650M Speedup: 3227 2512 2082 1734 882 465
K40c Speedup: 7366 5880 4339 3495 1619 784
[Figure: GPU speedup vs. frequency (MHz); series: K40c Speedup, GT650M Speedup]
Frequency (MHz): 100 200 400 600 800 1600
GT650M Speedup: 301 221 215 224 113 110
K40c Speedup: 973 1159 1107 997 682 644
[Figures: four plots of |S21| (dB) vs. antenna separation (m), 0 to 3 m]
Plots 1 and 2: MLFMM (FEKO) vs. IPO; panel labels: No iteration; 31 iterations (converged)
Plots 3 and 4: IPO (CPU version) vs. IPO (GPU version)
Convergence of fields - no iteration
Convergence of fields - 3 iterations
Convergence IPO currents
Coverage with IPO currents
f = 400 MHz
f = 2400 MHz
• A grid search was run on the number of iterations needed for simulations to converge, comparing the IPO and MLFMM methods.
• The number of iterations differed between CPU and GPU as frequency increased.
Characteristics GT650M K40c
Clock rate (MHz) 700 745
# of cores 384 2880
Global memory (GB) 1 12
L2 cache (KB) 256 1536
Use case                             Speedup (GT650M)   Speedup (K40c)
RF propagation urban environments    110-301            644-973
Antenna placement                    465-3227           784-7366
• The GPU implementation of IPO is either compute bound or memory bound: denser scenes with more triangles, and scenes at higher frequencies, do not fit in the GT650M's RAM.
• The K40c is 2-5x faster than the GT650M, with a 7.5x increase in the number of cores and a 12x increase in device RAM.
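The hardware ratios quoted above follow directly from the characteristics table, as the short calculation below shows. The last quantity, cores multiplied by clock rate, is only a crude throughput proxy added here as an assumption; it ignores memory bandwidth and architecture differences.

```python
# Figures copied from the GT650M / K40c characteristics table.
gpus = {
    "GT650M": {"clock_mhz": 700, "cores": 384, "global_mem_gb": 1},
    "K40c": {"clock_mhz": 745, "cores": 2880, "global_mem_gb": 12},
}

small, large = gpus["GT650M"], gpus["K40c"]

core_ratio = large["cores"] / small["cores"]
mem_ratio = large["global_mem_gb"] / small["global_mem_gb"]

# Assumed proxy: cores x clock, i.e. raw issue capacity only.
proxy_ratio = (large["cores"] * large["clock_mhz"]) / (small["cores"] * small["clock_mhz"])

print(core_ratio)             # 7.5
print(mem_ratio)              # 12.0
print(round(proxy_ratio, 2))  # 7.98
```

The reported 2-5x gap between the two GPUs is smaller than either raw ratio, which fits the observation that the implementation is not limited by core count alone.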
contact name
Vivek Venugopalan: [email protected]
P6316
category: robotics & autonomous machines - ram02