TensorFlow Parser Scope Fusion Rules Reference - HUAWEI CLOUD
CANN V100R020C20
TensorFlow Parser Scope Fusion Rules Reference
Issue 01
Date 2021-02-08
HUAWEI TECHNOLOGIES CO., LTD.
Copyright © Huawei Technologies Co., Ltd. 2021. All rights reserved.
No part of this document may be reproduced or transmitted in any form or by any means without prior written consent of Huawei Technologies Co., Ltd.
Trademarks and Permissions
The Huawei logo and other Huawei trademarks are trademarks of Huawei Technologies Co., Ltd. All other trademarks and trade names mentioned in this document are the property of their respective holders.
Notice
The purchased products, services and features are stipulated by the contract made between Huawei and the customer. All or part of the products, services and features described in this document may not be within the purchase scope or the usage scope. Unless otherwise specified in the contract, all statements, information, and recommendations in this document are provided "AS IS" without warranties, guarantees or representations of any kind, either express or implied.
The information in this document is subject to change without notice. Every effort has been made in the preparation of this document to ensure accuracy of the contents, but all statements, information, and recommendations in this document do not constitute a warranty of any kind, express or implied.
Issue 01 (2021-02-08) Copyright © Huawei Technologies Co., Ltd. i
Contents
1 Introduction
2 Fusion Rules
2.1 ScopeLayerNormPass
2.2 ScopeLayerNormGradPass
2.3 ScopeBasicLSTMCellPass
2.4 ScopeDynamicLSTMPass
2.5 ScopeClipBoxesPass
2.6 ScopeROIAlignPass
2.7 ScopeRpnProposalsPass
2.8 ScopeFastrcnnPredictionsPass
2.9 ScopeDecodeBboxPass
2.10 ScopeToAbsoluteBBoxPass
2.11 ScopeNormalizeBBoxPass
2.12 ScopeDecodeBboxV2Pass
2.13 ScopeBatchMultiClassNMSPass
2.14 ScopeKeepRatioResizeBilinearPass
2.15 ScopeBatchMultiClassNonMaxSuppressionPass
2.16 ScopeDynamicGRUPass
2.17 ScopeDynamicRNNPass
1 Introduction
Overview
Scope fusion is a scope-based fusion capability that replaces small operators in a scope with one larger operator or a combination of operators to improve efficiency.
This document describes the built-in scope fusion patterns. A set of APIs is also available for developers to customize scope fusion patterns. For details about these APIs, see TensorFlow Parser Scope Fusion Pattern Developer Guide.
Built-in Fusion Pattern Summary
Table 1-1 Built-in scope fusion patterns
No. | Fusion Pattern | Description | Applicable Network | Classification
1 | ScopeLayerNormPass | Fuses the layernorm/batchnorm and layernorm/moments scopes generated by tf.layernorm into a LayerNorm operator. | BERT | General fusion pattern
2 | ScopeLayerNormGradPass | Fuses the backward scope of tf.layernorm into a LayerNormGrad operator. | BERT | General fusion pattern
3 | ScopeBasicLSTMCellPass | Fuses the small operators within the scope generated by tf.nn.rnn_cell.BasicLSTMCell into a BasicLSTMCell operator. | Non-loop inference network that uses a single BasicLSTMCell, such as NMT | Non-general fusion pattern
4 | ScopeDynamicLSTMPass | Fuses small operators within the scope generated by tf.nn.dynamic_rnn or tf.nn.bidirectional_dynamic_rnn into a DynamicLSTM operator. Currently, only the loop scenario in which the cell is a BasicLSTMCell is supported, and only some shapes are supported. | Inference network that uses dynamic_rnn and a single BasicLSTMCell | Non-general fusion pattern
5 | ScopeClipBoxesPass | Fuses the clip_boxes scope into a ClipBoxes operator. The scope includes the tf.Maximum, tf.ReverseV2, tf.Tile, and tf.Minimum operators, and does not include the Gather_2, TopKV2, Reshape_2, Split, Greater, Squeeze, Gather, boolean_mask, and decode_bbox_target operators. | 2D-H1 | Non-general fusion pattern
6 | ScopeROIAlignPass | Fuses tf.AvgPool and tf.image.CropAndResize into an ROIAlign operator, excluding the Merge operator. | 2D-H1 | Non-general fusion pattern
7 | ScopeRpnProposalsPass | Fuses the generate_rpn_proposals scope into an RpnProposals operator. The scope includes tf.NonMaxSuppressionV2 operators, tf.TopKV2 operators, a multiple of four tf.Where operators, and a multiple of six tf.Gather operators. The ExpandDims, Switch, and transpose operators are not included. | 2D-H1 | Non-general fusion pattern
8 | ScopeFastrcnnPredictionsPass | Fuses the fastrcnn_predictions scope into a FastrcnnPredictions operator. The scope includes a multiple of two tf.TopKV2 operators, a multiple of three tf.Where operators, tf.NonMaxSuppressionV2 operators, tf.Less operators, and tf.LoopCond operators. The ExpandDims, clip_boxes, and decode_bbox_target operators are excluded. | 2D-H1 | Non-general fusion pattern
9 | ScopeDecodeBboxPass | Fuses a scope containing the following operators into a DecodeBbox operator: a multiple of three tf.Reshape operators, a multiple of two tf.Split operators, tf.Minimum operators, a multiple of three tf.Add operators, tf.ConcatV2 operators, and a multiple of two tf.Sub operators. The Greater, Squeeze, Gather_2, TopKV2, and boolean_mask operators are excluded. | 2D-H1 | Non-general fusion pattern
10 | ScopeToAbsoluteBBoxPass | Fuses a scope into a ToAbsoluteBBox operator. The scope contains a map that has a while node, ToAbsoluteCoordinates under the while node, Scale under ToAbsoluteCoordinates, and four Mul operators. | Fast R-CNN | Non-general fusion pattern
11 | ScopeNormalizeBBoxPass | Fuses a scope into a NormalizeBBox operator. The scope contains a map that has a while node, ToNormalizedCoordinates under the while node, Scale under ToNormalizedCoordinates, and four Mul operators. | Fast R-CNN | Non-general fusion pattern
12 | ScopeDecodeBboxV2Pass | Fuses the following two scopes into a DecodeBboxV2 operator. Scope 1 contains at least two Exp operators, four Mul operators, four Sub operators, a multiple of two RealDiv operators, two Unpack operators, one Pack operator, and three transpose operators, excluding the Softmax operator. Scope 2 contains at least two Exp operators, four Mul operators, 10 Sub operators, a multiple of two RealDiv operators, two Unpack operators, one Pack operator, three transpose operators, three Rank operators, and three Range operators, excluding the Sigmoid operator. | Fast R-CNN, SSD-Resnet34, SSD-Resnet50V1-FPN | Non-general fusion pattern
13 | ScopeBatchMultiClassNMSPass | Fuses the following scope into a BatchMultiClassNonMaxSuppression operator. The scope path is map/while/MultiClassNonMaxSuppression/. | Fast R-CNN, SSD-Resnet50V1-FPN, Mask R-CNN | Non-general fusion pattern
14 | ScopeKeepRatioResizeBilinearPass | Fuses a specific scope into a combination of KeepRatioResizeBilinear operators, Shape operators, a multiple of two Slice operators, ExpandDims operators, ConcatV2 operators, Tile operators, and a multiple of four Const operators. The scope includes the graph structure map/while/ResizeToRange/ and the operators Maximum, Minimum, Round, and ResizeBilinear. | Fast R-CNN | Non-general fusion pattern
15 | ScopeBatchMultiClassNonMaxSuppressionPass | Fuses the specified scope into a BatchMultiClassNonMaxSuppression operator, when any of the following four fusion patterns is matched: Scope 1 contains one NonMaxSuppressionV3 operator and excludes the transpose operator. Scope 2 contains one NonMaxSuppressionV3 operator, five Range operators, one ConcatV2 operator, and 80 Fill operators. | FaceBox, Retinanet | Non-general fusion pattern
16 | ScopeDynamicGRUPass | Fuses a scope into a DynamicGRU operator. The scope includes five AddV2 operators, three Mul operators, and one Tanh operator, and does not contain Transpose operators. | DeepSpeech2 | Non-general fusion pattern
17 | ScopeDynamicRNNPass | Fuses a scope into a DynamicRNN operator. The scope contains a while subscope and does not contain Transpose operators. The while subscope contains a multiple of 4 BiasAdd operators, a multiple of 2 Tanh operators, eight MatMul operators, and a Split operator. | TacoTron | Non-general fusion pattern
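Most descriptions in Table 1-1 have the same shape: a scope matches when certain operator types occur a required number of times (for example, in a multiple of some count) and certain other operator types are absent. The following Python sketch illustrates that idea only. The function scope_matches, its rule format, and the sample operator lists are hypothetical and are not part of the CANN scope fusion API; treating "a multiple of N" as a non-zero multiple is also an assumption.

```python
from collections import Counter

def scope_matches(op_types, multiples=None, required=(), excluded=()):
    """Return True if the operator types found in one scope satisfy a rule.

    op_types:  list of operator type names found in the scope
    multiples: {op_type: n}; the count must be a non-zero multiple of n
    required:  operator types that must appear at least once
    excluded:  operator types that must not appear at all
    """
    counts = Counter(op_types)
    for op, n in (multiples or {}).items():
        if counts[op] == 0 or counts[op] % n != 0:
            return False
    if any(counts[op] == 0 for op in required):
        return False
    return not any(counts[op] > 0 for op in excluded)

# Hypothetical rule modeled on the ScopeRpnProposalsPass description.
rpn_rule = dict(
    multiples={"Where": 4, "Gather": 6},
    required=("NonMaxSuppressionV2", "TopKV2"),
    excluded=("ExpandDims", "Switch", "Transpose"),
)

ops = ["NonMaxSuppressionV2", "TopKV2"] + ["Where"] * 4 + ["Gather"] * 6
matched = scope_matches(ops, **rpn_rule)                 # all conditions hold
rejected = scope_matches(ops + ["Switch"], **rpn_rule)   # an excluded type appears
```

Here the first list satisfies every condition, and adding a single excluded operator type is enough to reject the scope.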
General and Non-General Fusion Patterns
● General fusion patterns are applicable to all networks. They are enabled by default and cannot be manually disabled.
● Non-general fusion patterns are applicable to specific networks. By default, they are disabled. You can enable the non-general fusion patterns as required.
Table 1-2 Enabling a non-general fusion pattern
Use case: TensorFlow model building using ATC
Method: Use the model conversion command parameter enable_scope_fusion_passes to specify the fusion patterns that need to take effect. Separate fusion patterns with commas (,).
    --enable_scope_fusion_passes=DecodeBboxV2ScopeFusionPass
Use case: Execution in the TensorFlow framework
Method: Use the running configuration parameter enable_scope_fusion_passes of the TensorFlow framework to specify the fusion patterns that need to take effect. Separate fusion patterns with commas (,).
    import tensorflow as tf
    from npu_bridge.estimator import npu_ops
    from tensorflow.core.protobuf.rewriter_config_pb2 import RewriterConfig

    config = tf.ConfigProto()
    custom_op = config.graph_options.rewrite_options.custom_optimizers.add()
    custom_op.name = "NpuOptimizer"
    custom_op.parameter_map["use_off_line"].b = True
    # Comma-separated list of the fusion patterns to enable.
    custom_op.parameter_map["enable_scope_fusion_passes"].s = tf.compat.as_bytes("DecodeBboxV2ScopeFusionPass")
    config.graph_options.rewrite_options.remapping = RewriterConfig.OFF

    with tf.Session(config=config) as sess:
        sess.run()  # pass the fetches of your own graph here
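Both methods take the same comma-separated string. As a small convenience, the value can be assembled from a list of names, as in the sketch below. The helper scope_fusion_option is hypothetical and not part of ATC or npu_bridge. The first name is the one used in the examples above, and the second is taken from Table 1-1 purely for illustration; check the exact names your CANN version accepts.

```python
def scope_fusion_option(pass_names):
    """Join fusion pattern names into one comma-separated value."""
    cleaned = [name.strip() for name in pass_names if name.strip()]
    if not cleaned:
        raise ValueError("at least one fusion pattern name is required")
    return ",".join(cleaned)

value = scope_fusion_option(["DecodeBboxV2ScopeFusionPass", " ScopeDynamicRNNPass "])

# Form used on the ATC command line.
atc_flag = "--enable_scope_fusion_passes=" + value

# Form used in parameter_map[...].s, which holds bytes; for a str input,
# tf.compat.as_bytes performs the same UTF-8 encoding.
session_bytes = value.encode("utf-8")
```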
2 Fusion Rules
2.1 ScopeLayerNormPass
2.2 ScopeLayerNormGradPass
2.3 ScopeBasicLSTMCellPass
2.4 ScopeDynamicLSTMPass
2.5 ScopeClipBoxesPass
2.6 ScopeROIAlignPass
2.7 ScopeRpnProposalsPass
2.8 ScopeFastrcnnPredictionsPass
2.9 ScopeDecodeBboxPass
2.10 ScopeToAbsoluteBBoxPass
2.11 ScopeNormalizeBBoxPass
2.12 ScopeDecodeBboxV2Pass
2.13 ScopeBatchMultiClassNMSPass
2.14 ScopeKeepRatioResizeBilinearPass
2.15 ScopeBatchMultiClassNonMaxSuppressionPass
2.16 ScopeDynamicGRUPass
2.17 ScopeDynamicRNNPass
2.1 ScopeLayerNormPass
Description
Fuses the layernorm/batchnorm and layernorm/moments scopes generated by tf.layernorm into a LayerNorm operator.
Scope Details
batchnorm unfolded:
moments unfolded:
Result Operator Prototype
LayerNorm. For details, see CANN Operator List (Ascend 310).
Fusion Mapping
When there are Cast nodes, the input of the first Cast node is used as the first input x of the fused operator.
The input gamma of the Mul node is used as the second input gamma of the fused operator.
The input beta of the last Add node is used as the third input beta of the fused operator.
The fourth parameter, begin_norm_axis, uses the default value 1.
The fifth parameter, begin_param_axis, uses the default value -1.
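For orientation, the computation that the moments and batchnorm scopes carry out together can be written in a few lines. The following is a plain-Python sketch for a 2-D input, assuming that begin_norm_axis = 1 corresponds to normalizing each row and that gamma and beta span the last axis. The epsilon value is an assumption, and the code is not the LayerNorm operator implementation.

```python
import math

def layer_norm(x, gamma, beta, epsilon=1e-12):
    """Normalize each row of x over its last axis, then scale and shift."""
    out = []
    for row in x:
        mean = sum(row) / len(row)                               # moments: mean
        variance = sum((v - mean) ** 2 for v in row) / len(row)  # moments: variance
        inv_std = 1.0 / math.sqrt(variance + epsilon)            # batchnorm: reciprocal sqrt
        out.append([(v - mean) * inv_std * g + b
                    for v, g, b in zip(row, gamma, beta)])
    return out

y = layer_norm([[1.0, 2.0, 3.0], [2.0, 2.0, 2.0]],
               gamma=[1.0, 1.0, 1.0], beta=[0.0, 0.0, 0.0])
```

The three inputs x, gamma, and beta of the sketch correspond to the three inputs named in the fusion mapping.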
2.2 ScopeLayerNormGradPass
Description
Fuses the backward scope of tf.layernorm into a LayerNormGrad operator.
Scope Details
mul_grad unfolded:
sub_grad unfolded:
Result Operator Prototype
LayerNormGrad. For details, see CANN Operator List (Ascend 310).
Fusion Mapping
The backward input of LayerNorm is used as the first input dy of the fused operator.
The forward input of LayerNorm is used as the second input x of the fused operator.
The third forward output variance is used as the third backward input variance.
The second forward output mean is used as the fourth backward input mean.
The second forward input gamma is used as the fifth backward input gamma.
The first backward output connects to the output of the last AddN node in the backward graph.
The second backward output gamma_backprop connects to the output of the Mul node in mul_grad that connects to a Cast node.
The third backward output beta_backprop connects to the output of the Sum node in sub_grad that connects to a Cast node.
2.3 ScopeBasicLSTMCellPass
Description
Fuses the small operators within the scope generated by tf.nn.rnn_cell.BasicLSTMCell into a BasicLSTMCell operator.
Scope Details
Result Operator Prototype
BasicLSTMCell. For details, see CANN Operator List (Ascend 310).
Fusion Mapping
Input 1 of the concat operator is used as input 1 x after fusion.
Input 2 of the concat operator is used as input 2 h after fusion.
Input 1 of the Mul operator is used as input 3 c after fusion.
Input 2 of the MatMul operator is used as input 4 w after fusion.
Input 2 of the BiasAdd operator is used as input 5 b after fusion.
The output of Add_1 is used as output 0 ct after fusion.
The output of Mul_2 is used as output 1 ht after fusion.
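The mapping above follows the data flow of one LSTM step: concat, MatMul, BiasAdd, a split into four gates, and then the cell and hidden state updates. A plain-Python sketch of that step for a single sample is shown below. The gate order (i, j, f, o) and the forget_bias default of 1.0 follow the usual TensorFlow BasicLSTMCell behavior and are assumptions here; the function is an illustration, not the fused operator.

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def basic_lstm_cell(x, h, c, w, b, forget_bias=1.0):
    """One LSTM step for a single sample.

    x: input vector, h: previous hidden state, c: previous cell state,
    w: weights with len(x) + len(h) rows and 4 * len(h) columns,
    b: bias of length 4 * len(h). Returns (ct, ht).
    """
    xh = x + h                                            # concat of x and h
    n = len(h)
    gates = [sum(xh[k] * w[k][col] for k in range(len(xh))) + b[col]
             for col in range(4 * n)]                     # MatMul followed by BiasAdd
    gi, gj, gf, go = (gates[k * n:(k + 1) * n] for k in range(4))  # split into gates
    ct = [c[k] * sigmoid(gf[k] + forget_bias) + sigmoid(gi[k]) * math.tanh(gj[k])
          for k in range(n)]                              # new cell state
    ht = [math.tanh(ct[k]) * sigmoid(go[k]) for k in range(n)]     # new hidden state
    return ct, ht

# One input feature, one hidden unit, all-zero weights and bias.
ct, ht = basic_lstm_cell(x=[1.0], h=[0.0], c=[1.0],
                         w=[[0.0] * 4, [0.0] * 4], b=[0.0] * 4)
```

The five arguments x, h, c, w, and b and the two results ct and ht line up with the inputs and outputs listed in the fusion mapping.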
2.4 ScopeDynamicLSTMPass
Description
Fuses small operators within the scope generated by tf.nn.dynamic_rnn or tf.nn.bidirectional_dynamic_rnn into a DynamicLSTM operator. Currently, only the loop scenario in which the cell is a BasicLSTMCell is supported, and only some shapes are supported.
Scope Details
The scope structure corresponding to dynamic_rnn is as follows.
For bidirectional_dynamic_rnn, the two dynamic_rnn scopes are FW and BW, respectively.
Result Operator Prototype
DynamicLSTM. For details, see CANN Operator List (Ascend 310).
Fusion Mapping
When time_major is set to False:
Input 1 of the rnn/transpose node is used as input 1 x after fusion.
The input of the rnn/while/basic_lstm_cell/MatMul/Enter node is used as input 2 w after fusion.
The input of the rnn/while/basic_lstm_cell/BiasAdd/Enter node is used as input 3 b after fusion.
The output of the rnn/transpose_1 node is used as the output output_h after fusion.
When time_major is set to True:
Input 3 of the rnn/TensorArrayUnstack/TensorArrayScatter/TensorArrayScatterV3 node is used as input 1 x after fusion.
The input of the rnn/while/basic_lstm_cell/MatMul/Enter node is used as input 2 w after fusion.
The input of the rnn/while/basic_lstm_cell/BiasAdd/Enter node is used as input 3 b after fusion.
The output of the rnn/TensorArrayStack/TensorArrayGatherV3 node is used as the output output_h after fusion.
NOTE
In the preceding scope example, time_major is set to True.
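The two cases differ because tf.nn.dynamic_rnn works on time-major data internally. With time_major set to False, the input is laid out as [batch, time, ...] and is transposed on the way in and on the way out, which is why the rnn/transpose and rnn/transpose_1 nodes mark the scope boundary in that case. The sketch below shows the axis swap on nested Python lists; the helper name to_time_major is hypothetical.

```python
def to_time_major(batch_major):
    """Swap the first two axes: [batch, time, ...] -> [time, batch, ...]."""
    return [list(step) for step in zip(*batch_major)]

# Two sequences (batch = 2), three time steps each, one feature per step.
x_batch_major = [[[1], [2], [3]],
                 [[4], [5], [6]]]
x_time_major = to_time_major(x_batch_major)

# Applying the same swap again restores the batch-major layout.
round_trip = to_time_major(x_time_major)
```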
2.5 ScopeClipBoxesPass
Description
Fuses the clip_boxes scope into a ClipBoxes operator. The scope includes the tf.Maximum, tf.ReverseV2, tf.Tile, and tf.Minimum operators, and does not include the Gather_2, TopKV2, Reshape_2, Split, Greater, Squeeze, Gather, boolean_mask, and decode_bbox_target operators.
Scope Details
Result Operator Prototype
ClipBoxes. For details, see CANN Operator List (Ascend 310).
Fusion Mapping
The input of clip_boxes/Maximum is used as the first input boxes_input of the fused operator.
The input of clip_boxes/ReverseV2 is used as the second input im_info of the fused operator.
The output of clip_boxes/fastrcnn_all_boxes (Minimum) is used as the output boxes_output of the fused operator.
The output of clip_boxes/ReverseV2 is used as the input of clip_boxes/Tile.
The output of clip_boxes/Tile is used as the input of clip_boxes/ToFloat.
The input of clip_boxes/ToFloat is used as the second input of clip_boxes/fastrcnn_all_boxes.
The output of clip_boxes/Maximum is used as the first input of clip_boxes/fastrcnn_all_boxes.
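The chain Maximum, ReverseV2, Tile, ToFloat, Minimum amounts to clamping every box coordinate to the image area. The sketch below reproduces that arithmetic in plain Python, assuming boxes are laid out as [x1, y1, x2, y2] and im_info holds [height, width]. Both layouts are assumptions made for the example, not statements about the ClipBoxes operator.

```python
def clip_boxes(boxes, im_info):
    """Clip [x1, y1, x2, y2] boxes to an image of size im_info = [height, width]."""
    # Reverse [height, width] to [width, height], repeat it twice, convert to float.
    limits = [float(v) for v in (im_info[::-1] * 2)]
    clipped = []
    for box in boxes:
        lower = [max(v, 0.0) for v in box]                          # clamp at zero
        clipped.append([min(v, m) for v, m in zip(lower, limits)])  # clamp at image size
    return clipped

out = clip_boxes([[-5.0, 10.0, 700.0, 300.0]], im_info=[480, 640])
```

In this example the negative x1 is raised to 0 and the x2 value beyond the image width is lowered to 640, while the in-range coordinates are unchanged.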
2.6 ScopeROIAlignPass
Description
Fuses tf.AvgPool and tf.image.CropAndResize into an ROIAlign operator, excluding the Merge operator.
Scope Details
Result Operator Prototype
ROIAlign. For details, see CANN Operator List (Ascend 310).
Fusion Mapping
The input of crop_and_resize/Shape/Switch is used as the first input features of the fused operator.
The input of Shape is used as the second input rois of the fused operator.
The output of AvgPool is used as the output y of the fused operator.
The output of Shape is used as the input of StridedSlice.
The output of StridedSlice is used as the input of zeros.
The output of crop_and_resize/Shape/Switch is used as the input of crop_and_resize/Shape/Shape and crop_and_resize/transpose.
The output of crop_and_resize/Shape/Shape is used as the input of crop_and_resize/strided_slice.
The output of crop_and_resize/strided_slice is used as the input of crop_and_resize/transform_fpcoor_for_tf.
The outputs of crop_and_resize/transform_fpcoor_for_tf and crop_and_resize/transpose are used as the input of CropAndResize.
The output of crop_and_resize or CropAndResize is used as the input of crop_and_resize/transpose_1.
The output of crop_and_resize/transpose_1 is used as the input of AvgPool.
2.7 ScopeRpnProposalsPass
Description
Fuses the generate_rpn_proposals scope into an RpnProposals operator. The scope includes tf.NonMaxSuppressionV2 operators, tf.TopKV2 operators, a multiple of four tf.Where operators, and a multiple of six tf.Gather operators. The ExpandDims, Switch, and transpose operators are not included.
Scope Details
Result Operator Prototype
RpnProposals. For details, see CANN Operator List (Ascend 310).
Fusion Mapping
The input of transpose is used as the first input rois of the fused operator.
The inputs of filtered_boxes, filtered_scores, and Gather_1 are used as the second input cls_bg_prob of the fused operator.
The input of clip_boxes/ReverseV2 is used as the third input img_info of the fused operator.
The output of boxes is used as the output sorted_box of the fused operator.
The output of filtered_boxes is used as the input of Where.
The outputs of transpose and Where are used as the input of Gather.
The output of Gather is used as the input of Reshape.
The output of filtered_scores is used as the input of Where_1.
The output of Where_1 is used as the input of Gather_1.
The output of Gather_1 is used as the input of Reshape_1.
The output of Reshape_1 is used as the input of TopKV2 and Size.
The output of Size is used as the input of Minimum.
The output of Minimum is used as the input of TopKV2.
The output of TopKV2 is used as the input of Gather_2 and boolean_mask_1.
The output of Gather_2 is used as the input of clip_boxes/Maximum.
The output of clip_boxes/ReverseV2 is used as the input of clip_boxes/Tile.
The output of clip_boxes/Tile is used as the input of clip_boxes/ToFloat.
The outputs of clip_boxes/Maximum and clip_boxes/ToFloat are used as the input of clip_boxes/Minimum.
The output of clip_boxes/Minimum is used as the input of Reshape_2.
The output of Reshape_2 is used as the input of boolean_mask and Split.
The output of Split is used as the input of Sub.
The output of Sub is used as the input of Squeeze.
The output of Squeeze is used as the input of Greater.
The output of Greater is used as the input of All.
The output of All is used as the input of boolean_mask and boolean_mask_1.
The output of boolean_mask is used as the inputs of Reshape_3 and ReverseV2.
The output of ReverseV2 is used as the input of nms_input_boxes.
The output of nms_input_boxes is used as the input of non_max_suppression.
The output of boolean_mask_1 is used as the input of non_max_suppression.
The outputs of non_max_suppression and Reshape_3 are used as the input of boxes.
2.8 ScopeFastrcnnPredictionsPass
Description
Fuses the fastrcnn_predictions scope into a FastrcnnPredictions operator. The scope includes a multiple of two tf.TopKV2 operators, a multiple of three tf.Where operators, tf.NonMaxSuppressionV2 operators, tf.Less operators, and tf.LoopCond operators. The ExpandDims, clip_boxes, and decode_bbox_target operators are excluded.
Scope Details
Result Operator Prototype
FastrcnnPredictions. For details, see CANN Operator List (Ascend 310).
Fusion Mapping
The inputs of fastrcnn_predictions/transpose and fastrcnn_predictions/GatherNd are used as the input rois after fusion.
The input of fastrcnn_predictions/strided_slice is used as the input score after fusion.
The output of fastrcnn_predictions/TopKV2 is used as the output sorted_rois after fusion.
The output of fastrcnn_predictions/GatherNd is used as the output sorted_scores after fusion.
The output of fastrcnn_predictions/Add is used as the output sorted_classes after fusion.
The output of fastrcnn_predictions/strided_slice is used as the input of fastrcnn_predictions/transpose_1.
The output of fastrcnn_predictions/transpose_1 is used as the inputs of fastrcnn_predictions/map and fastrcnn_predictions/boolean_mask.
The output of fastrcnn_predictions/map is used as the input of fastrcnn_predictions/Where.
The output of fastrcnn_predictions/Where is used as the input of fastrcnn_predictions/Gather.
The output of fastrcnn_predictions/boolean_mask is used as the inputs of fastrcnn_predictions/Size and fastrcnn_predictions/TopKV2.
The output of fastrcnn_predictions/Size is used as the input of fastrcnn_predictions/Minimum.
The output of fastrcnn_predictions/Minimum is used as the input of fastrcnn_predictions/TopKV2.
The output of fastrcnn_predictions/TopKV2 is used as the output sorted_rois after fusion and as the input of fastrcnn_predictions/Gather.
The output of fastrcnn_predictions/Gather is used as the input of fastrcnn_predictions/filtered_indices.
The output of fastrcnn_predictions/filtered_indices is used as the inputs of fastrcnn_predictions/GatherNd and fastrcnn_predictions/ToFloat.
The output of fastrcnn_predictions/ToFloat is used as the input of fastrcnn_predictions/strided_slice_1.
The output of fastrcnn_predictions/strided_slice_1 is used as the input of fastrcnn_predictions/Add.
2.9 ScopeDecodeBboxPass
Description
Fuses a scope containing the following operators into a DecodeBbox operator: a multiple of three tf.Reshape operators, a multiple of two tf.Split operators, tf.Minimum operators, a multiple of three tf.Add operators, tf.ConcatV2 operators, and a multiple of two tf.Sub operators. The Greater, Squeeze, Gather_2, TopKV2, and boolean_mask operators are excluded.
Scope Details
There are two types of scopes based on whether the transpose operator is included.
The transpose operator not included:
The transpose operator included:
Result Operator Prototype
DecodeBbox. For details, see CANN Operator List (Ascend 310).
Fusion Mapping
● If the transpose operator is not included:
The input of Reshape is used as the input box_predictions after fusion.
The inputs of Shape and Reshape_1 are used as the input anchors after fusion.
The output of Reshape_2 is used as the output decoded_boxes after fusion.
The output of Reshape is used as the input of Split.
The output of Split is used as the inputs of Minimum and Mul.
The output of Minimum is used as the input of Exp.
The output of Exp is used as the input of mul.
The output of Reshape_1 is used as the input of split_1.
The output of split_1 is used as the inputs of Sub and Add.
The outputs of Sub and Add are used as the input of Mul.
The output of Mul is used as the inputs of Add_1, Sub_1, and Add_2.
The output of Add_1 is used as the inputs of Sub_1 and Add_2.
The outputs of Sub_1 and Add_2 are used as the input of concat.
The outputs of Shape and concat are used as the input of Reshape_2.
Operators such as Greater, Squeeze, Gather_2, TopKV2, and boolean_mask are excluded.
● If the transpose operator is included:
The input of transpose is used as the input box_predictions after fusion.
The input of transpose_1 is used as the input anchors after fusion.
The output of transpose_2 is used as the output decoded_boxes after fusion.
The output of transpose is used as the input of Reshape.
The output of Reshape is used as the input of Split.
The output of Split is used as the inputs of Minimum and Mul.
The output of Minimum is used as the input of Exp.
The output of Exp is used as the input of mul.
The output of transpose_1 is used as the inputs of Reshape_1 and Shape.
The output of Reshape_1 is used as the input of split_1.
The output of split_1 is used as the inputs of Sub and Add.
The outputs of Sub and Add are used as the input of Mul.
The output of Mul is used as the inputs of Add_1, Sub_1, and Add_2.
The output of Add_1 is used as the inputs of Sub_1 and Add_2.
The outputs of Sub_1 and Add_2 are used as the input of concat.
The outputs of Shape and concat are used as the input of Reshape_2.
The output of Reshape_2 is used as the input of transpose_2.
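The operator chain listed above (Split, Minimum, Exp, Sub, Add, Mul, concat) is typical of the box decoding step in Faster R-CNN style networks. The plain-Python sketch below shows that common formula, assuming predictions are laid out as (tx, ty, tw, th) and anchors as [x1, y1, x2, y2]. The layouts and the max_log_scale default are assumptions for illustration, and the sketch is not the DecodeBbox operator implementation.

```python
import math

def decode_bbox(box_predictions, anchors, max_log_scale=4.0):
    """Decode (tx, ty, tw, th) deltas against [x1, y1, x2, y2] anchors."""
    decoded = []
    for (tx, ty, tw, th), (ax1, ay1, ax2, ay2) in zip(box_predictions, anchors):
        wa, ha = ax2 - ax1, ay2 - ay1                    # anchor width and height
        xa, ya = (ax1 + ax2) * 0.5, (ay1 + ay2) * 0.5    # anchor center
        wb = math.exp(min(tw, max_log_scale)) * wa       # clipped log-scale, then exp
        hb = math.exp(min(th, max_log_scale)) * ha
        xb, yb = tx * wa + xa, ty * ha + ya              # shifted center
        decoded.append([xb - wb * 0.5, yb - hb * 0.5,
                        xb + wb * 0.5, yb + hb * 0.5])
    return decoded

# All-zero deltas reproduce the anchor itself.
same = decode_bbox([[0.0, 0.0, 0.0, 0.0]], [[0.0, 0.0, 10.0, 20.0]])
# A very large tw is clipped to max_log_scale before the exponential.
wide = decode_bbox([[0.0, 0.0, 100.0, 0.0]], [[0.0, 0.0, 10.0, 20.0]])
```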
2.10 ScopeToAbsoluteBBoxPass
Description
Fuses a scope into a ToAbsoluteBBox operator. The scope contains a map that has a while node, ToAbsoluteCoordinates under the while node, Scale under ToAbsoluteCoordinates, and four Mul operators.
Scope Details
while exists under map_1:
ToAbsoluteCoordinates exists under while:
Scale exists under ToAbsoluteCoordinates:
Result Operator Prototype
ToAbsoluteBBox. For details, see CANN Operator List (Ascend 310).
Fusion Mapping
The input of Shape or TensorArrayUnstack/Shape is used as the first input of the fused operator.
The input of while/strided_slice/Enter is used as the second input of the fused operator.
The output of TensorArrayStack/TensorArrayGatherV3 is used as the first output of the fused operator.
The fused scope contains the while/ToAbsoluteCoordinates/Scale subgraphstructure.
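The four Mul operators under Scale are consistent with a per-coordinate scaling of each box by the image size. The plain-Python sketch below illustrates that arithmetic, assuming boxes are laid out as [ymin, xmin, ymax, xmax] with values in the range [0, 1]. The function name and layout are assumptions for illustration, not the ToAbsoluteBBox operator implementation.

```python
def to_absolute_bbox(normalized_boxes, height, width):
    """Scale [ymin, xmin, ymax, xmax] boxes from the [0, 1] range to pixels."""
    # One multiplication per coordinate: four in total for each box.
    return [[ymin * height, xmin * width, ymax * height, xmax * width]
            for ymin, xmin, ymax, xmax in normalized_boxes]

abs_boxes = to_absolute_bbox([[0.25, 0.5, 0.75, 1.0]], height=480, width=640)
```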
2.11 ScopeNormalizeBBoxPass
Description
Fuses a scope into a NormalizeBBox operator. The scope contains a map that has a while node, ToNormalizedCoordinates under the while node, Scale under ToNormalizedCoordinates, and four Mul operators.
Scope Details
while exists under map:
Result Operator Prototype
NormalizeBBox. For details, see CANN Operator List (Ascend 310).
Fusion Mapping
The input of map/shape is used as the first input of the fused operator.
The input of map/TensorArrayUnstack_1/Shape is used as the second input of the fused operator.
The output of TensorArrayStack/TensorArrayGatherV3 is used as the first output of the fused operator.
The scope has the while/ToNormalizedCoordinates/Scale/ graph structure.
2.12 ScopeDecodeBboxV2Pass
Description
Fuses the following two scopes into a DecodeBboxV2 operator:
Scope 1 contains at least two Exp operators, four Mul operators, four Sub operators, a multiple of two RealDiv operators, two Unpack operators, one Pack operator, and three transpose operators, excluding the Softmax operator.
Scope 2 contains at least two Exp operators, four Mul operators, 10 Sub operators, a multiple of two RealDiv operators, two Unpack operators, one Pack operator, three transpose operators, three Rank operators, and three Range operators, excluding the Sigmoid operator.
Scope Details
Scope 1:
Scope 2:
Result Operator Prototype
DecodeBboxV2. For details, see CANN Operator List (Ascend 310).
Fusion Mapping
Scope 1:
The input of the transpose operator is used as the first input of the fused operator.
The input of get_center_coordinates_and_sizes/transpose is used as the second input of the fused operator.
The output of transpose_1 is used as the first output of the fused operator.
Scope 2:
The input of transpose/Rank is used as the first input of the fused operator.
The input of get_center_coordinates_and_sizes/transpose/Rank is used as the second input of the fused operator.
The output of transpose_1 is used as the first output of the fused operator.
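The operator-count conditions listed in the Description of this pass can be sketched as a simple check in Python. This is an illustration under stated assumptions, not the parser's actual matching logic: the function name match_decode_bbox_v2 is hypothetical, "at least" is assumed to apply only to Exp, the other counts are assumed to be exact, "a multiple of two" is assumed to exclude zero, and the transpose operators are assumed to have the op type Transpose.

```python
from collections import Counter


def match_decode_bbox_v2(op_types):
    """Classify a scope by op-type counts; returns 'scope1', 'scope2', or None."""
    c = Counter(op_types)
    # Conditions shared by both scopes (exact counts are an assumption).
    shared = (
        c["Exp"] >= 2
        and c["Mul"] == 4
        and c["RealDiv"] > 0 and c["RealDiv"] % 2 == 0
        and c["Unpack"] == 2
        and c["Pack"] == 1
        and c["Transpose"] == 3
    )
    if not shared:
        return None
    if c["Sub"] == 4 and c["Softmax"] == 0:
        return "scope1"
    if c["Sub"] == 10 and c["Rank"] == 3 and c["Range"] == 3 and c["Sigmoid"] == 0:
        return "scope2"
    return None


# Hypothetical op-type lists, for illustration only.
base = ["Exp"] * 2 + ["Mul"] * 4 + ["RealDiv"] * 2 + ["Unpack"] * 2 + ["Pack"] + ["Transpose"] * 3
scope1_ops = base + ["Sub"] * 4
scope2_ops = base + ["Sub"] * 10 + ["Rank"] * 3 + ["Range"] * 3
print(match_decode_bbox_v2(scope1_ops))
print(match_decode_bbox_v2(scope2_ops))
print(match_decode_bbox_v2(scope1_ops + ["Softmax"]))
```

Because the two scopes differ in their Sub counts, a scope can match at most one of them in this sketch.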
2.13 ScopeBatchMultiClassNMSPass
Description
Fuses the following scope into a BatchMultiClassNonMaxSuppression operator. The scope path is map/while/MultiClassNonMaxSuppression/.
Scope Details
Result Operator Prototype
BatchMultiClassNonMaxSuppression. For details, see CANN Operator List (Ascend 310).
Fusion Mapping
map/TensorArrayUnstack/Shape is used as the first input of the fused operator.
map/TensorArrayUnstack_1/Shape is used as the second input of the fused operator.
map/TensorArrayUnstack_3/Shape, if any, is used as the third input of the fused operator.
map/TensorArrayUnstack_4/Shape, if any, is used as the fourth input of the fused operator.
map/TensorArrayStack/TensorArrayGatherV3, if any, is used as the first output of the fused operator.
map/TensorArrayStack_1/TensorArrayGatherV3, if any, is used as the second output of the fused operator.
map/TensorArrayStack_2/TensorArrayGatherV3, if any, is used as the third output of the fused operator.
map/TensorArrayStack_4/TensorArrayGatherV3, if any, is used as the fourth output of the fused operator.
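The mapping above, including the entries marked "if any", can be sketched as a lookup table in Python. This is only an illustration: the helper build_io_mapping is hypothetical, and the source does not say whether indexes are renumbered when an optional node is absent, so the sketch simply keeps the documented positions.

```python
# Boundary node names as listed in the fusion mapping, in documented order.
INPUT_NODES = [
    "map/TensorArrayUnstack/Shape",
    "map/TensorArrayUnstack_1/Shape",
    "map/TensorArrayUnstack_3/Shape",
    "map/TensorArrayUnstack_4/Shape",
]
OUTPUT_NODES = [
    "map/TensorArrayStack/TensorArrayGatherV3",
    "map/TensorArrayStack_1/TensorArrayGatherV3",
    "map/TensorArrayStack_2/TensorArrayGatherV3",
    "map/TensorArrayStack_4/TensorArrayGatherV3",
]


def build_io_mapping(present_nodes):
    """Map each listed node found in the graph to its 1-based fused-operator index."""
    present = set(present_nodes)
    inputs = {i: n for i, n in enumerate(INPUT_NODES, start=1) if n in present}
    outputs = {i: n for i, n in enumerate(OUTPUT_NODES, start=1) if n in present}
    return inputs, outputs


# Hypothetical graph that has only the first two inputs and the first three outputs.
graph_nodes = INPUT_NODES[:2] + OUTPUT_NODES[:3]
inputs, outputs = build_io_mapping(graph_nodes)
print(inputs)
print(outputs)
```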
2.14 ScopeKeepRatioResizeBilinearPass
Description
Fuses a specific scope into a combination of KeepRatioResizeBilinear operators, Shape operators, a multiple of two Slice operators, ExpandDims operators, ConcatV2 operators, Tile operators, and a multiple of four Const operators. The scope includes the graph structure map/while/ResizeToRange/ and the operators Maximum, Minimum, Round, and ResizeBilinear.
Scope Details
Result Operator Prototype
KeepRatioResizeBilinear + Shape + Slice x 2 + ExpandDims + ConcatV2 + Tile + Const x 4. For details, see CANN Operator List (Ascend 310).
Fusion Mapping
The output of the scope name+/Shape node is used as the input of the subgraph after fusion.
The output of TensorArrayStack/TensorArrayStack/TensorArrayGatherV3 is used as the first output of the subgraph after fusion.
The output of TensorArrayStack_1/TensorArrayStack_1/TensorArrayGatherV3 is used as the second output of the subgraph after fusion.
After the fusion, the output of the KeepRatioResizeBilinear operator of the subgraph connects to the first output of the scope.
After the fusion, the output of the Tile operator of the subgraph connects to the second output of the scope.
2.15 ScopeBatchMultiClassNonMaxSuppressionPass
Description
Fuses a scope into a BatchMultiClassNonMaxSuppression operator. The fusion scope contains the scope path.
The fusion pattern contains two child patterns:
ScopeFaceBoxesBatchMultiClassNMSPattern: includes one NonMaxSuppressionV3 operator and excludes the transpose operator.
ScopeFilteredBatchMultiClassNMSPattern: includes one NonMaxSuppressionV3 operator, five Range operators, one ConcatV2 operator, and 80 Fill operators.
Scope Details
Result Operator Prototype
BatchMultiClassNonMaxSuppression. For details, see CANN Operator List (Ascend 310).
Fusion Mapping
map/TensorArrayUnstack/Shape is used as the first input of the fused operator.
map/TensorArrayUnstack_1/Shape is used as the second input of the fused operator.
map/TensorArrayUnstack_3/Shape, if any, is used as the third input of the fused operator.
map/TensorArrayUnstack_4/Shape, if any, is used as the fourth input of the fused operator.
map/TensorArrayStack/TensorArrayGatherV3, if any, is used as the first output of the fused operator.
map/TensorArrayStack_1/TensorArrayGatherV3, if any, is used as the second output of the fused operator.
map/TensorArrayStack_2/TensorArrayGatherV3, if any, is used as the third output of the fused operator.
map/TensorArrayStack_4/TensorArrayGatherV3, if any, is used as the fourth output of the fused operator.
2.16 ScopeDynamicGRUPass
Description
Fuses a scope containing the following operators into a DynamicGRU operator: The scope includes five AddV2 operators, three Mul operators, and one Tanh operator, and does not contain Transpose operators.
Scope Details
See the following scope example.
The while scope contains five AddV2 operators, three Mul operators, and one Tanh operator, and does not contain Transpose operators. After fusion, all operators in the red box are fused into one DynamicGRU operator.
Result Operator Prototype
DynamicGRUV2. For details, see Operator List.
Fusion Mapping
The input of TensorArrayUnstack/TensorArrayScatter/TensorArrayScatterV3 is used as the first input of the fused operator.
The input of while/ReadVariableOp_1/Enter is used as the second input of the fused operator.
The input of while/ReadVariableOp_4/Enter is used as the third input of the fused operator.
The input of while/ReadVariableOp_00/Enter is used as the fourth input of the fused operator.
The input of while/ReadVariableOp_01/Enter is used as the fifth input of the fused operator.
The output of TensorArrayStack/TensorArrayGatherV3 is used as the first output of the fused operator.
2.17 ScopeDynamicRNNPass
Description
Fuses a scope containing the following operators into a DynamicRNN operator: The scope contains a while subscope and does not contain Transpose operators. The while subscope contains a multiple of 4 BiasAdd operators, a multiple of 2 Tanh operators, eight MatMul operators, and a Split operator.
Scope Details
See the following scope example.
The while subscope contains a multiple of 4 BiasAdd operators, a multiple of 2 Tanh operators, eight MatMul operators, and a Split operator.
Result Operator Prototype
DynamicRNN. For details, see Operator List.
Fusion Mapping
The input of TensorArrayUnstack/TensorArrayScatter/TensorArrayScatterV3 is used as the first input of the fused operator.
The input of while/split/ReadVariableOp/Enter is used as the second input of the fused operator.
The input of while/split_1/ReadVariableOp/Enter is used as the third input of the fused operator.
The output of TensorArrayStack/TensorArrayGatherV3 is used as the first output of the fused operator.
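The DynamicRNN conditions combine a scope-wide exclusion (no Transpose) with counts that apply only inside the while subscope. The following Python sketch shows one way to express that split. It is not the parser's implementation: the function matches_dynamic_rnn and the node names are hypothetical, "a Split operator" is assumed to mean exactly one, and "a multiple of" is assumed to exclude zero.

```python
from collections import Counter


def matches_dynamic_rnn(nodes):
    """nodes: iterable of (node_name, op_type) pairs belonging to one scope."""
    nodes = list(nodes)
    # Scope-wide condition: no Transpose operator anywhere in the scope.
    if any(op == "Transpose" for _, op in nodes):
        return False
    # Count op types only for nodes that sit under a `while` path component.
    in_while = Counter(op for name, op in nodes if "while" in name.split("/"))
    if not in_while:
        return False
    return (
        in_while["BiasAdd"] > 0 and in_while["BiasAdd"] % 4 == 0
        and in_while["Tanh"] > 0 and in_while["Tanh"] % 2 == 0
        and in_while["MatMul"] == 8
        and in_while["Split"] == 1
    )


# Hypothetical scope contents, for illustration only.
counts = {"BiasAdd": 4, "Tanh": 2, "MatMul": 8, "Split": 1}
rnn_nodes = [
    (f"rnn/while/{op}_{i}", op) for op, n in counts.items() for i in range(n)
]
print(matches_dynamic_rnn(rnn_nodes))
print(matches_dynamic_rnn(rnn_nodes + [("rnn/transpose", "Transpose")]))
```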