GPU Acceleration of Computational Electromagnetics...


Source: on-demand.gputechconf.com/gtc/2016/posters/GTC_2016... · 2016-03-15

GPU Acceleration of Computational Electromagnetics Methods
Vivek Venugopalan and Cagatay Tokgoz

Introduction

• Computational electromagnetics (EM) is widely used in aerospace applications.
• The main challenges are: (a) developing a CAD model that captures the physics and material properties of the scene and is usable for simulation, (b) modeling RF propagation with sufficient accuracy, and (c) the cost of the simulations in computational complexity and algorithm latency.
• Several methods are widely used to predict antenna coverage; our focus is on iterative physical optics (IPO).
• Applications of EM include: (1) antenna coverage prediction for autonomous vehicles in urban environments, and (2) wireless propagation modeling in complex environments, e.g. modeling the strength and propagation of Wi-Fi signals or critical communication signals within an aircraft cabin.

References

[1] P. H. Pathak, “High frequency techniques for antenna analysis,” Proceedings of the IEEE, vol. 80, no. 1, pp. 44–65, 1992.
[2] R. J. Burkholder, C. Tokgoz, C. Reddy, and W. O. Coburn, “Iterative physical optics for radar scattering predictions,” Applied Computational Electromagnetics Society Journal, vol. 24, no. 2, pp. 241–258, 2009.
[3] C. Tokgoz and V. Venugopalan, “GPU accelerated iterative physical optics to predict RF propagation in urban environments,” IEEE Antennas & Propagation Society International Symposium (APSURSI), 2013.
[4] V. Venugopalan and C. Tokgoz, “GPU acceleration of iterative physical optics-based electromagnetic simulations,” IEEE High Performance Extreme Computing Conference (HPEC), 2015.


• CPU results - Intel Xeon CPU clocked at 2.67 GHz

Email:{venugov, tokgozc}@utrc.utc.com

Conclusion

• IPO methods were targeted at two different use cases and accelerated using GPUs.
• Near real-time results were obtained on the mobile GT650M, which provides a transition path to Jetson GPUs.

Iterative Physical Optics

• IPO is better suited to wireless propagation modeling in complex environments because it does not require a ray tracing or shadowing algorithm.
• IPO is also faster than MoM-based full-wave techniques: it needs no impedance matrix and runs on a coarser mesh.
• IPO does not account for surface waves or diffraction and is not guaranteed to converge.
• Computational cost increases at higher frequencies.
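As a rough illustration of why IPO needs no impedance matrix and why convergence is not guaranteed, the iteration can be written in the form commonly used for perfectly conducting surfaces. This is a sketch under assumptions and not taken from the poster: the surface is assumed to be a perfect conductor, $\hat{\mathbf{n}}$ is the facet normal, $\mathbf{H}^{\mathrm{inc}}$ is the incident magnetic field, and $\mathbf{H}[\mathbf{J}]$ denotes the magnetic field radiated at $\mathbf{r}$ by the currents on the other facets.

$$\mathbf{J}^{(0)}(\mathbf{r}) = 2\,\hat{\mathbf{n}}(\mathbf{r}) \times \mathbf{H}^{\mathrm{inc}}(\mathbf{r})$$

$$\mathbf{J}^{(k)}(\mathbf{r}) = \mathbf{J}^{(0)}(\mathbf{r}) + 2\,\hat{\mathbf{n}}(\mathbf{r}) \times \mathbf{H}\big[\mathbf{J}^{(k-1)}\big](\mathbf{r})$$

Each pass only evaluates fields radiated by the previous currents, so no matrix has to be assembled or factored. The update is a fixed-point iteration, which converges only when the coupling between facets is weak enough.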

[Figure labels: initial position of Tx antenna; initial position of Rx antenna; final position of Rx antenna]

Complex path planning requires retaining communication while navigating urban environments.

Wi-Fi wireless antenna coverage requires optimal placement of antennas for link budget prediction and less EM interference inside the aircraft cabin.

Implementation and Results

[Flowchart: GPU implementation of IPO]
• CPU: copy all values from CPU to GPU device memory.
• GPU: calculate triangle parameters and accumulated surface currents.
• GPU: calculate error tolerance; the convergence value is fetched by the CPU via DMA.
• Loop branch: "not within range or less than max # of iterations". Exit branch: "converged".
• GPU: compute impedance for each antenna segment.
• GPU: compute receiver antenna impedances.
• CPU: copy impedance values to the CPU and plot results.
• Launch sizes labeled on the flowchart: # of triangles, # of antenna segments, # of receiving antenna locations, single thread.
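The control flow in the flowchart above (copy inputs to the device, update the currents, read back a single convergence value, and stop either on convergence or at a maximum number of iterations) can be sketched in plain Python. This is a toy emulation and not the authors' code: the function names, the scalar "currents", the small coupling matrix, and the relative-change convergence measure are all assumptions made for illustration.

```python
# Toy host-side loop mirroring the flowchart. The "device" work is emulated
# with Python lists; all names and the convergence measure are hypothetical.

def update_currents(j0, coupling, j_prev):
    """One reradiation pass: J_new[i] = J0[i] + sum_j coupling[i][j] * J_prev[j].

    On a GPU this would be one thread per triangle i.
    """
    n = len(j0)
    return [j0[i] + sum(coupling[i][j] * j_prev[j] for j in range(n))
            for i in range(n)]


def relative_change(j_new, j_old):
    """Scalar convergence value: max |J_new - J_old| / max |J_new|."""
    num = max(abs(a - b) for a, b in zip(j_new, j_old))
    den = max(abs(a) for a in j_new)
    return num / den if den > 0 else 0.0


def run_ipo_loop(j0, coupling, tol=1e-6, max_iters=100):
    """Iterate until the convergence value drops below tol or max_iters is hit.

    Returns (currents, iterations_used, converged_flag).
    """
    j = list(j0)  # stands in for "copy all values to GPU device memory"
    for k in range(1, max_iters + 1):
        j_next = update_currents(j0, coupling, j)
        err = relative_change(j_next, j)  # the value the host reads back
        j = j_next
        if err < tol:
            return j, k, True
    return j, max_iters, False


if __name__ == "__main__":
    # Weak coupling between two facets: the loop converges.
    print(run_ipo_loop([1 + 0j, 0j], [[0, 0.5], [0.5, 0]]))
    # Strong coupling: the loop stops at max_iters without converging.
    print(run_ipo_loop([1 + 0j, 0j], [[0, 1.5], [1.5, 0]], max_iters=20)[1:])
```

The second call illustrates the caveat that IPO is not guaranteed to converge, which is why the loop also needs an iteration cap.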

[Figure: passenger cabin with transmit and receive antennas, showing the line-of-sight field E_LOS and the scattered fields E_scat]

• Assuming a hemispherical antenna radiation pattern, define the field points where antenna coverage is evaluated.
• The scene is meshed with triangles, and the triangle facets are assigned material properties. The mesh is refined according to the wavelength at the desired operating frequency.
• The PO currents are computed and then reradiated to account for multiple bounces, repeating until the currents converge.
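To make the link between wavelength-based mesh refinement and cost concrete, the sketch below estimates how the triangle count grows with frequency. It is only an illustration: the mesh density `edges_per_wavelength`, the equilateral-triangle approximation, and the 100 m² surface area are assumptions, not values from the poster.

```python
import math

C0 = 299_792_458.0  # speed of light in vacuum, m/s


def wavelength_m(freq_hz):
    """Free-space wavelength in metres."""
    return C0 / freq_hz


def target_edge_length_m(freq_hz, edges_per_wavelength=3.0):
    """Triangle edge length for an assumed mesh density (hypothetical default)."""
    return wavelength_m(freq_hz) / edges_per_wavelength


def estimate_triangle_count(area_m2, freq_hz, edges_per_wavelength=3.0):
    """Approximate number of equilateral triangles needed to cover area_m2."""
    edge = target_edge_length_m(freq_hz, edges_per_wavelength)
    tri_area = (math.sqrt(3) / 4.0) * edge ** 2
    return math.ceil(area_m2 / tri_area)


if __name__ == "__main__":
    area = 100.0  # assumed total surface area in square metres
    for f_mhz in (400, 800, 1200, 1600, 2000, 2400):
        n = estimate_triangle_count(area, f_mhz * 1e6)
        print(f"{f_mhz} MHz: ~{n} triangles")
```

Because the edge length scales with the wavelength, the triangle count grows with the square of the frequency. If every facet interacts with every other facet, the work per iteration grows faster still.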

[Figure: GPU speedup vs. frequency, antenna placement use case]

Frequency (MHz)   GT650M Speedup   K40c Speedup
400               3227             7366
800               2512             5880
1200              2082             4339
1600              1734             3495
2000              882              1619
2400              465              784

[Figure: GPU speedup vs. frequency, RF propagation in urban environments use case]

Frequency (MHz)   GT650M Speedup   K40c Speedup
100               301              973
200               221              1159
400               215              1107
600               224              997
800               113              682
1600              110              644

[Figure: |S21| (dB) vs. antenna separation (0 to 3 m), MLFMM (FEKO) compared with IPO; panels labeled "No iteration" and "31 iterations (converged)"]

[Figure: |S21| (dB) vs. antenna separation (0 to 3 m), IPO (CPU version) compared with IPO (GPU version); two panels]

[Figure panels: Convergence of fields, no iteration; Convergence of fields, 3 iterations; Convergence IPO currents; Coverage with IPO currents]

[Figure labels: f = 400 MHz; f = 2400 MHz]

• A grid search was run on the number of iterations needed for the simulations to converge, comparing IPO against the MLFMM method.
• The number of iterations differed between the CPU and GPU versions as frequency increased.

Characteristics      GT650M   K40c
Clock rate (MHz)     700      745
# of cores           384      2880
Global memory (GB)   1        12
L2 cache (KB)        256      1536

Use case                               GT650M Speedup   K40c Speedup
RF propagation in urban environments   110-301          644-973
Antenna placement                      465-3227         784-7366

• The GPU implementation of IPO is either compute bound or memory bound: denser scenes with more triangles, and simulations at higher frequencies, do not fit in the GT650M's RAM.
• The K40c is 2-5x faster than the GT650M, with a 7.5x increase in the number of cores and a 12x increase in device RAM.
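The hardware ratios and the K40c-to-GT650M comparison can be checked with a few lines of arithmetic. The sketch below uses the speedup values printed on the two bar charts, read in frequency order with the lower series taken as the GT650M. That pairing is an assumption about how the chart labels map to bars, although the endpoints do match the ranges in the speedup table. The dictionary layout and function name are hypothetical.

```python
# Speedup values as printed on the poster's bar charts (assumed pairing).
SPEEDUPS = {
    "antenna placement": {
        "freq_mhz": [400, 800, 1200, 1600, 2000, 2400],
        "GT650M": [3227, 2512, 2082, 1734, 882, 465],
        "K40c": [7366, 5880, 4339, 3495, 1619, 784],
    },
    "RF propagation, urban": {
        "freq_mhz": [100, 200, 400, 600, 800, 1600],
        "GT650M": [301, 221, 215, 224, 113, 110],
        "K40c": [973, 1159, 1107, 997, 682, 644],
    },
}

# Hardware ratios from the characteristics table.
CORE_RATIO = 2880 / 384  # number of cores, K40c over GT650M
MEMORY_RATIO = 12 / 1    # global memory in GB, K40c over GT650M


def k40c_over_gt650m(use_case):
    """Per-frequency ratio of K40c speedup to GT650M speedup."""
    data = SPEEDUPS[use_case]
    return [k / g for k, g in zip(data["K40c"], data["GT650M"])]


if __name__ == "__main__":
    print(f"cores: {CORE_RATIO}x, memory: {MEMORY_RATIO}x")
    for name, data in SPEEDUPS.items():
        for f, r in zip(data["freq_mhz"], k40c_over_gt650m(name)):
            print(f"{name}, {f} MHz: K40c is {r:.2f}x the GT650M")
```

Under this reading the per-frequency ratio runs from roughly 1.7 to 6.0, so the quoted "2-5x" is best taken as a rounded summary.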

Contact: Vivek Venugopalan
Poster: P6316
Category: Robotics & Autonomous Machines - RAM02