Deep Learning in Practice: from TensorFlow to AI-Hub, 王顺, Google Cloud


Transcript

Page 1:

Deep Learning in Practice

from TensorFlow to AI-Hub

王顺, Google Cloud

Page 2:

CONTENTS

Starting from scratch

First modifications

Business upgrade

Practical guide

Page 3:

1 Starting from hello world

Using MNIST, the first case study in deep learning, as the example

Learn how to use the TensorFlow framework and its coding style

Page 4:

Understanding TF

Page 5:

Results on a Mac CPU

Page 6:

Results on a GPU

Page 7:

Results on a TPU

Page 8:

Creating and using a TPU

Page 9:

Changes needed to train MNIST on a TPU

Page 10:

Changes needed to train MNIST on a TPU

https://www.tensorflow.org/guide/distribute_strategy

import tensorflow as tf
resolver = tf.distribute.cluster_resolver.TPUClusterResolver()
tf.config.experimental_connect_to_cluster(resolver)  # connect before initializing the TPU system
tf.tpu.experimental.initialize_tpu_system(resolver)
# Newer TensorFlow releases expose this class as tf.distribute.TPUStrategy.
tpu_strategy = tf.distribute.experimental.TPUStrategy(resolver)

Page 11:

2 First modifications

Run the first Python code, reflect on it, and improve it

How can we do better?

Page 12:

TPU Pod

Page 13:

BERT: short training time

https://github.com/google-research/bert

Page 14:
Page 15:

Data

Page 16:

Imbalanced data

• Why are my tip predictions bad in the morning hours?

Chicago Taxi Cab Dataset
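
The question on this slide is the kind that an aggregate metric tends to hide: a model can look acceptable overall and still do poorly on one slice of the data. The following is a minimal sketch in plain Python of breaking accuracy down by time of day. The records, the slice boundaries, and the function names are all hypothetical; nothing here comes from the real Chicago Taxi data.

```python
from collections import defaultdict

# Hypothetical toy records: (trip_start_hour, true_label, predicted_label).
# The values are made up for illustration only.
RECORDS = [
    (7, 1, 0), (8, 0, 1), (8, 1, 1), (9, 1, 0),
    (13, 0, 0), (14, 1, 1), (18, 0, 0), (19, 1, 1),
    (20, 0, 0), (21, 1, 1), (22, 0, 0), (23, 0, 1),
]


def slice_name(hour):
    """Map an hour of the day to a coarse slice (boundaries are an assumption)."""
    return "morning" if 6 <= hour < 12 else "rest_of_day"


def accuracy_by_slice(records):
    """Return {slice: (accuracy, example_count)} for every slice in the data."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for hour, label, prediction in records:
        name = slice_name(hour)
        total[name] += 1
        correct[name] += int(label == prediction)
    return {name: (correct[name] / total[name], total[name]) for name in total}


overall = sum(int(label == pred) for _, label, pred in RECORDS) / len(RECORDS)
per_slice = accuracy_by_slice(RECORDS)

print(f"overall accuracy: {overall:.2f}")
for name, (acc, count) in sorted(per_slice.items()):
    print(f"{name}: accuracy={acc:.2f} over {count} examples")
```

The example count per slice is reported next to the accuracy on purpose: a slice with few examples is both harder to learn from and noisier to evaluate, which is one way imbalanced data shows up.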

Page 17:

Networks in TensorFlow/Keras

Page 18:

Custom training with TPUs

• https://www.tensorflow.org/tutorials/distribute/tpu_custom_training

Page 19:

3 Business upgrade

So far we have studied MNIST in some depth

Next, consider how to meet real business needs

Page 20:

LEGO bricks

Page 21:

Component: ExampleGen

Configuration

examples = csv_input(os.path.join(data_root, 'simple'))

example_gen = CsvExampleGen(input_base=examples)

Inputs and Outputs

Input: Raw Data (CSV, TF Record)

Output: Split TF Record Data (Training, Eval)
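
What ExampleGen does to the data can be sketched without TFX: read CSV rows, then assign each row to a training or an eval split. The sketch below is plain Python and is not the TFX implementation. The 2:1 train-to-eval ratio, the hash-based assignment, and the sample columns are assumptions made for illustration.

```python
import csv
import hashlib
import io

# Hypothetical CSV content standing in for the raw data.
RAW_CSV = """trip_id,fare,tips
a1,10.0,2.5
a2,8.0,0.0
a3,22.5,5.0
a4,5.5,1.0
a5,14.0,0.0
a6,31.0,7.0
"""


def bucket(key, num_buckets=3):
    """Deterministically map a string key to a bucket in [0, num_buckets)."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_buckets


def split_examples(rows, key_field="trip_id"):
    """Send buckets 0 and 1 to training and bucket 2 to eval (roughly 2:1)."""
    splits = {"train": [], "eval": []}
    for row in rows:
        name = "eval" if bucket(row[key_field]) == 2 else "train"
        splits[name].append(row)
    return splits


rows = list(csv.DictReader(io.StringIO(RAW_CSV)))
splits = split_examples(rows)
print({name: [row["trip_id"] for row in part] for name, part in splits.items()})
```

Hashing a stable key, rather than drawing random numbers, keeps the split reproducible from one run to the next.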

Page 22:

Component: StatisticsGen

Configuration

statistics_gen = StatisticsGen(input_data=example_gen.outputs.examples)

Visualization

Inputs and Outputs

Input: Data (from ExampleGen)

Output: Statistics
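
To make "Statistics" concrete, here is a minimal sketch in plain Python of the kind of per-feature summary a statistics step could produce. It is not the StatisticsGen implementation; the example rows, the feature names, and the chosen summary fields are assumptions.

```python
import math

# Hypothetical examples; None marks a missing value.
EXAMPLES = [
    {"fare": 10.0, "tips": 2.5, "payment_type": "Credit Card"},
    {"fare": 8.0, "tips": 0.0, "payment_type": "Cash"},
    {"fare": None, "tips": 5.0, "payment_type": "Credit Card"},
    {"fare": 6.0, "tips": None, "payment_type": "Cash"},
]


def numeric_stats(column):
    """Summarize a numeric column, ignoring missing values."""
    present = [v for v in column if v is not None]
    mean = sum(present) / len(present)
    variance = sum((v - mean) ** 2 for v in present) / len(present)
    return {
        "kind": "numeric",
        "count": len(present),
        "missing": len(column) - len(present),
        "mean": mean,
        "std": math.sqrt(variance),
        "min": min(present),
        "max": max(present),
    }


def categorical_stats(column):
    """Summarize a categorical column as value counts."""
    present = [v for v in column if v is not None]
    values = {}
    for v in present:
        values[v] = values.get(v, 0) + 1
    return {
        "kind": "categorical",
        "count": len(present),
        "missing": len(column) - len(present),
        "values": values,
    }


def compute_statistics(examples):
    """Return {feature_name: summary} for every feature in the examples."""
    statistics = {}
    for name in examples[0]:
        column = [example[name] for example in examples]
        present = [v for v in column if v is not None]
        is_numeric = all(isinstance(v, (int, float)) for v in present)
        statistics[name] = numeric_stats(column) if is_numeric else categorical_stats(column)
    return statistics


statistics = compute_statistics(EXAMPLES)
for name, summary in statistics.items():
    print(name, summary)
```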

Page 23:

Component: SchemaGen

Configuration

infer_schema = SchemaGen(stats=statistics_gen.outputs.output)

Inputs and Outputs

Input: Statistics (from StatisticsGen)

Output: Schema

Visualization
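
The idea of inferring a schema from statistics can be sketched in a few lines of plain Python. This is an illustration only, not how SchemaGen works internally. The shape of the statistics dictionary, the type names, and the rule "required means no missing values were seen" are all assumptions.

```python
# Hypothetical per-feature statistics, shaped like the output of a statistics step.
STATISTICS = {
    "fare": {"kind": "numeric", "count": 3, "missing": 1, "min": 6.0, "max": 10.0},
    "tips": {"kind": "numeric", "count": 4, "missing": 0, "min": 0.0, "max": 5.0},
    "payment_type": {
        "kind": "categorical",
        "count": 4,
        "missing": 0,
        "values": {"Cash": 2, "Credit Card": 2},
    },
}


def infer_schema(statistics):
    """Derive a type, a presence requirement, and a value domain per feature."""
    schema = {}
    for name, stats in statistics.items():
        entry = {"required": stats["missing"] == 0}
        if stats["kind"] == "numeric":
            entry["type"] = "FLOAT"
        else:
            entry["type"] = "STRING"
            # The domain is the set of values observed so far.
            entry["domain"] = sorted(stats["values"])
        schema[name] = entry
    return schema


schema = infer_schema(STATISTICS)
for name, entry in schema.items():
    print(name, entry)
```

An inferred schema is only a starting point: a feature that happened to have no missing values in one batch is not necessarily required, so a person would normally review the result before relying on it.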

Page 24:

Component: ExampleValidator

Configuration

validate_stats = ExampleValidator(
    stats=statistics_gen.outputs.output,
    schema=infer_schema.outputs.output)

Inputs and Outputs

Inputs: Statistics (from StatisticsGen), Schema (from SchemaGen)

Output: Anomalies Report

Visualization
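
An anomalies report is the result of checking fresh statistics against the schema. The sketch below shows the idea in plain Python. It is not the ExampleValidator implementation; the schema and statistics layouts, the feature names, and the three checks are assumptions chosen for illustration.

```python
# Hypothetical schema and statistics for a new batch of data.
SCHEMA = {
    "fare": {"type": "FLOAT", "required": True},
    "payment_type": {
        "type": "STRING",
        "required": True,
        "domain": ["Cash", "Credit Card"],
    },
    "tips": {"type": "FLOAT", "required": False},
}
NEW_STATISTICS = {
    "fare": {"missing": 2},
    "payment_type": {"missing": 0, "values": {"Cash": 5, "Prcard": 1}},
    "company": {"missing": 0, "values": {"Acme Cabs": 6}},
}


def validate_statistics(statistics, schema):
    """Return {feature_name: [anomaly descriptions]} for features that deviate."""
    anomalies = {}
    for name, stats in statistics.items():
        if name not in schema:
            anomalies.setdefault(name, []).append("feature not in schema")
            continue
        spec = schema[name]
        if spec["required"] and stats["missing"] > 0:
            anomalies.setdefault(name, []).append(
                f"{stats['missing']} missing values in required feature"
            )
        domain = spec.get("domain")
        if domain is not None:
            unexpected = sorted(set(stats.get("values", {})) - set(domain))
            if unexpected:
                anomalies.setdefault(name, []).append(f"unexpected values: {unexpected}")
    for name, spec in schema.items():
        if spec["required"] and name not in statistics:
            anomalies.setdefault(name, []).append("required feature absent from data")
    return anomalies


anomalies = validate_statistics(NEW_STATISTICS, SCHEMA)
for name, problems in sorted(anomalies.items()):
    print(name, problems)
```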

Page 25:

Component: Transform

Configuration

transform = Transform(
    input_data=example_gen.outputs.examples,
    schema=infer_schema.outputs.output,
    module_file=taxi_module_file)

Code

for key in _DENSE_FLOAT_FEATURE_KEYS:
    outputs[_transformed_name(key)] = transform.scale_to_z_score(
        _fill_in_missing(inputs[key]))
# ...
outputs[_transformed_name(_LABEL_KEY)] = tf.where(
    tf.is_nan(taxi_fare),
    tf.cast(tf.zeros_like(taxi_fare), tf.int64),
    # Test if the tip was > 20% of the fare.
    tf.cast(
        tf.greater(tips, tf.multiply(taxi_fare, tf.constant(0.2))), tf.int64))
# ...

Inputs and Outputs

Inputs: Data (from ExampleGen), Schema (from SchemaGen), Code

Outputs: Transform Graph, Transformed Data (to Trainer)
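
The two transformations in the slide code, z-score scaling of dense float features and a binary label for "tip greater than 20% of the fare", can be worked through in plain Python. This is a sketch of the arithmetic only, not of TensorFlow Transform. The rows, the feature list, the "_xf" suffix, the zero default for missing values, and the use of the population standard deviation are assumptions.

```python
import math

# Hypothetical raw rows; None marks a missing value.
RAW_ROWS = [
    {"trip_miles": 1.0, "fare": 10.0, "tips": 3.0},
    {"trip_miles": None, "fare": 20.0, "tips": 2.0},
    {"trip_miles": 3.0, "fare": None, "tips": 1.0},
    {"trip_miles": 5.0, "fare": 5.0, "tips": 2.0},
]
DENSE_FLOAT_KEYS = ["trip_miles"]  # assumed feature list
LABEL_KEY = "tips"  # assumed label column


def transformed_name(key):
    return key + "_xf"  # assumed naming convention


def fill_in_missing(value, default=0.0):
    return default if value is None else value


def mean_and_std(values):
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, math.sqrt(variance)


def preprocess(rows):
    # Analysis pass: the mean and std come from the whole dataset, not one row.
    params = {
        key: mean_and_std([fill_in_missing(row[key]) for row in rows])
        for key in DENSE_FLOAT_KEYS
    }
    outputs = []
    for row in rows:
        out = {}
        for key, (mean, std) in params.items():
            value = fill_in_missing(row[key])
            out[transformed_name(key)] = (value - mean) / std if std > 0 else 0.0
        fare = row["fare"]
        # Label is 0 when the fare is missing, otherwise 1 if tip > 20% of fare.
        out[transformed_name(LABEL_KEY)] = 0 if fare is None else int(row["tips"] > 0.2 * fare)
        outputs.append(out)
    return outputs


transformed = preprocess(RAW_ROWS)
for row in transformed:
    print(row)
```

The point of the separate analysis pass is that the same mean and standard deviation must be applied at training and at serving time, which is why the component emits a transform graph alongside the transformed data.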

Page 26:

Component: Trainer

Configuration

trainer = Trainer(
    module_file=taxi_module_file,
    transformed_examples=transform.outputs.transformed_examples,
    schema=infer_schema.outputs.output,
    transform_output=transform.outputs.transform_output,
    train_steps=10000,
    eval_steps=5000,
    warm_starting=True)

Code: Just TensorFlow :)

Inputs and Outputs

Inputs: Data and Transform Graph (from Transform), Schema (from SchemaGen), Code

Output: Model(s) (to Evaluator, Model Validator, Pusher)

Page 27:

Component: Evaluator

Configuration

model_analyzer = Evaluator(
    examples=example_gen.outputs.output,
    eval_spec=taxi_eval_spec,
    model_exports=trainer.outputs.output)

Inputs and Outputs

Inputs: Data (from ExampleGen), Model (from Trainer)

Output: Evaluation Metrics

Visualization

Page 28:

Component: ModelValidator

Configuration

model_validator = ModelValidator(
    examples=example_gen.outputs.output,
    model=trainer.outputs.output,
    eval_spec=taxi_mv_spec)

Inputs and Outputs

Inputs: Data (from ExampleGen), Model (x2) (from Trainer)

Output: Validation Outcome

● Configuration options
  ○ Validate using current eval data
  ○ “Next-day eval”, validate using unseen data
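
The "Model (x2)" input suggests a comparison between two models on the same eval data. A minimal sketch of that comparison in plain Python follows. It is not the ModelValidator implementation; the toy models, the eval examples, the choice of accuracy as the metric, and the "at least as good as the baseline" rule are assumptions.

```python
# Hypothetical eval examples: (fare, label).
EVAL_EXAMPLES = [(5.0, 0), (12.0, 1), (20.0, 1), (8.0, 0), (15.0, 1)]


def baseline_model(fare):
    """Stand-in for the previously deployed model."""
    return int(fare > 18)


def candidate_model(fare):
    """Stand-in for the newly trained model."""
    return int(fare > 10)


def accuracy(model, examples):
    return sum(int(model(x) == y) for x, y in examples) / len(examples)


def validate(candidate, baseline, examples):
    """Return (blessed, candidate_accuracy, baseline_accuracy)."""
    candidate_acc = accuracy(candidate, examples)
    if baseline is None:
        # Nothing to compare against, so the first model passes.
        return True, candidate_acc, None
    baseline_acc = accuracy(baseline, examples)
    return candidate_acc >= baseline_acc, candidate_acc, baseline_acc


blessed, candidate_acc, baseline_acc = validate(candidate_model, baseline_model, EVAL_EXAMPLES)
print(f"candidate={candidate_acc:.2f} baseline={baseline_acc:.2f} blessed={blessed}")
```

Passing in data the candidate has never seen, instead of the current eval split, turns the same function into the "next-day eval" option listed above.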

Page 29:

Component: Pusher

Configuration

pusher = Pusher(
    model_export=trainer.outputs.output,
    model_blessing=model_validator.outputs.blessing,
    serving_model_dir=serving_model_dir)

Inputs and Outputs

Input: Validation Outcome (from ModelValidator)

Output: Deployment Options

● Block push on validation outcome

● Push destinations supported today
  ○ Filesystem
  ○ TF Serving model server
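
"Block push on validation outcome" with a filesystem destination can be sketched as a guarded directory copy. The code below is plain Python, not the Pusher implementation. The function name, the directory names, the placeholder model file, and the numbered version subdirectory are assumptions.

```python
import shutil
import tempfile
from pathlib import Path


def push_model(model_dir, serving_model_dir, blessed, version):
    """Copy model_dir to serving_model_dir/<version>, but only if blessed."""
    if not blessed:
        return None
    destination = Path(serving_model_dir) / str(version)
    shutil.copytree(model_dir, destination)
    return destination


with tempfile.TemporaryDirectory() as workdir:
    # Hypothetical trainer output containing a placeholder model file.
    model_dir = Path(workdir) / "trainer_output"
    model_dir.mkdir()
    (model_dir / "saved_model.pb").write_bytes(b"placeholder")
    serving_dir = Path(workdir) / "serving_model"

    # A model that failed validation is not copied anywhere.
    rejected = push_model(model_dir, serving_dir, blessed=False, version=1)
    serving_dir_created = serving_dir.exists()

    # A blessed model lands in a versioned subdirectory.
    pushed = push_model(model_dir, serving_dir, blessed=True, version=1)
    pushed_files = sorted(path.name for path in pushed.iterdir())
    print(rejected, pushed.name, pushed_files)
```

Keeping the blessing check inside the push function means no caller can deploy a model that failed validation by accident.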

Page 30:

4 Practical guide

A summary of practical experience

Page 31:

1. Data

Page 32:

2. Tensorboard

Page 33:

3. Fine tune

Page 34:

4. Checkpoint

Page 35:

5. Pipeline

Components: ExampleGen, StatisticsGen, SchemaGen, ExampleValidator, Transform, Trainer, Evaluator, ModelValidator, Pusher

Shared: TFX Config, Metadata Store

Input: Training + Eval Data

Deployment targets: TensorFlow Serving, TensorFlow Hub, TensorFlow Lite, TensorFlow JS

Runtimes: Kubeflow Runtime, Airflow Runtime
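
A pipeline runtime has to run the components in an order that respects their inputs. The sketch below derives one valid order with the standard library (Python 3.9 or later). The dependency table is read off the "Inputs and Outputs" slides for each component; it is an illustration and not how the Kubeflow or Airflow runtimes are implemented.

```python
from graphlib import TopologicalSorter

# Each component maps to the components whose outputs it consumes.
DEPENDENCIES = {
    "ExampleGen": set(),
    "StatisticsGen": {"ExampleGen"},
    "SchemaGen": {"StatisticsGen"},
    "ExampleValidator": {"StatisticsGen", "SchemaGen"},
    "Transform": {"ExampleGen", "SchemaGen"},
    "Trainer": {"Transform", "SchemaGen"},
    "Evaluator": {"ExampleGen", "Trainer"},
    "ModelValidator": {"ExampleGen", "Trainer"},
    "Pusher": {"Trainer", "ModelValidator"},
}

# static_order() yields every component after all of its dependencies.
order = list(TopologicalSorter(DEPENDENCIES).static_order())
print(" -> ".join(order))
```

Several orders are valid, for example Evaluator and ModelValidator can run in either order or in parallel, so a scheduler is free to pick any of them.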

Page 36:

6. Collaboration

Page 37:

Takeaways

• Run training on different devices

• An end-to-end workflow built on AI products

• Deep learning in practice: quality, efficiency, focus, stability

• Get involved and take action!

Page 38:

THANK YOU

I hope this has been helpful and inspiring