Data-driven machine control

Citation for published version (APA):
Mehrafrouz, M., & Technische Universiteit Eindhoven (TUE). Stan Ackermans Instituut. Software Technology (ST) (2014). Data-driven machine control: a feasibility study on YieldStar. Eindhoven: Technische Universiteit Eindhoven.

Document status and date:
Published: 01/10/2014

Document Version:
Publisher’s PDF, also known as Version of Record (includes final page, issue and volume numbers)

Please check the document version of this publication:

• A submitted manuscript is the version of the article upon submission and before peer-review. There can be important differences between the submitted version and the official published version of record. People interested in the research are advised to contact the author for the final version of the publication, or visit the DOI to the publisher's website.
• The final author version and the galley proof are versions of the publication after peer review.
• The final published version features the final layout of the paper including the volume, issue and page numbers.

General rights
Copyright and moral rights for the publications made accessible in the public portal are retained by the authors and/or other copyright owners and it is a condition of accessing publications that users recognise and abide by the legal requirements associated with these rights.

• Users may download and print one copy of any publication from the public portal for the purpose of private study or research.
• You may not further distribute the material or use it for any profit-making activity or commercial gain.
• You may freely distribute the URL identifying the publication in the public portal.

If the publication is distributed under the terms of Article 25fa of the Dutch Copyright Act, indicated by the “Taverne” license above, please follow below link for the End User Agreement: www.tue.nl/taverne

Take down policy
If you believe that this document breaches copyright please contact us at: [email protected] providing details and we will investigate your claim.

Download date: 12. Jul. 2020


Data-Driven Machine Control

A Feasibility study on YieldStar

Mohsen Mehrafrouz

September 2014


Data-Driven Machine Control A Feasibility study on YieldStar

By: Mohsen Mehrafrouz

Eindhoven University of Technology

Stan Ackermans Institute / Software Technology

Partners
ASML Holding N.V.
Eindhoven University of Technology

Steering Group
Rik Peeters
Pieter Cuijpers
Harold Weffers
Ad Aerts

Date
September 2014


Contact

Address

Eindhoven University of Technology

Department of Mathematics and Computer Science

MF 7.090, P.O. Box 513, NL-5600 MB, Eindhoven, The Netherlands

+31402474334

Published by Eindhoven University of Technology

Stan Ackermans Institute

Printed by Eindhoven University of Technology

Universiteit Drukkerij

ISBN 978-90-444-1324-3


Preferred reference

Mohsen Mehrafrouz, Data-Driven Machine Control: A Feasibility Study on YieldStar. Eindhoven University of Technology, SAI Technical Report, November, 2014. (978-90-444-1324-3)


Partnership This project was supported by Eindhoven University of Technology and ASML Holding.

Disclaimer

Endorsement

Reference herein to any specific commercial products, process, or service by trade name, trademark, manufacturer, or otherwise, does not necessarily constitute or imply its endorsement, recommendation, or favoring by the Eindhoven University of Technology or ASML Holding. The views and opinions of authors expressed herein do not necessarily state or reflect those of the Eindhoven University of Technology or ASML Holding, and shall not be used for advertising or product endorsement purposes.

Disclaimer

Liability

While every effort will be made to ensure that the information contained within this report is accurate and up to date, Eindhoven University of Technology makes no warranty, representation or undertaking whether expressed or implied, nor does it assume any legal liability, whether direct or indirect, or responsibility for the accuracy, completeness, or usefulness of any information.

Trademarks Product and company names mentioned herein may be trademarks and/or service marks of their respective owners. We use these names without any particular endorsement and without the intent to infringe the copyright of the respective owners.

Copyright Copyright © 2014. Eindhoven University of Technology. All rights reserved.

No part of the material protected by this copyright notice may be reproduced, modified, or redistributed in any form or by any means, electronic or mechanical, including photocopying, recording, or by any information storage or retrieval system, without the prior written permission of the Eindhoven University of Technology and ASML Holding.


Foreword

Traditionally, machine control software focusses on the control flow; this is also the situation within ASML and YieldStar. With the increased complexity of the machine control software, more and more data is needed to accurately control a tool like YieldStar. In other software application areas, like web applications, similar problems are addressed by more data-centered solutions that apply design patterns like producer-consumer or pipes and filters. This approach has resulted in standard implementations for these patterns. Within .NET, two of these implementations are now available: Dataflow and Reactive Extensions. This project is about prototyping these new .NET implementations within the context of machine control software and embedded systems for the YieldStar.

Within this project Mohsen quickly gathered knowledge about the YieldStar software design and implementation. He was also able to quickly learn the new .NET technologies of Dataflow and Reactive Extensions. By adapting the existing software he was able to show the feasibility of these techniques within the existing YieldStar machine control software. After these prototyping steps Mohsen also made design improvements to the YieldStar machine control software to fully use the possibilities of data-driven software. This helped the team to quickly apply Reactive Extensions for image acquisition, where timing and performance are very important.

Rik Peeters

Project Supervisor


Acknowledgements

First and foremost, I would like to thank my supervisors of this project, Rik Peeters and Pieter Cuijpers, for their valuable guidance and advice. They inspired me greatly to work on this project. Their willingness to motivate me contributed tremendously to my project. Deepest gratitude is also due to the members of the FaFi team, without whose knowledge and assistance this study would not have been successful. I wish to express my sincere gratitude to Borre Sanders, group leader of AP DE SW in YieldStar, for providing me with an opportunity to do my project in a well-facilitated environment.

Mohsen Mehrafrouz


Table of Contents

Foreword

Acknowledgements

Table of Contents

List of Tables

1. Introduction
1.1 Context
1.2 Outline

2. Problem Analysis
2.1 Domain Analysis
2.1.1. Machine Controlling
2.2 Problem description
2.3 Roadmaps
2.3.1. Technology Roadmap
2.3.2. Product Roadmap
2.4 Stakeholders
2.5 Design Opportunities
2.6 Risks and challenges

3. Literature Review
3.1 Introduction
3.1.1. What is Reactive Extensions (Rx)
3.1.2. What is TPL Dataflow (TDF)
3.2 Rx and Data-Driven Software
3.3 TDF and Task Communications
3.4 Conclusion

4. System Requirements
4.1 The Process

5. System Architecture
5.1 Introduction
5.1.1. T systems (integrated) Vs. S Systems (stand-alone)
5.2 Machine Control (MC)
5.3 Data-Driven YieldStar

6. System Design
6.1 Introduction
6.2 The Data-Driven Patterns
6.2.1. The Simple Producer/Consumer scenario
6.2.2. The Slow Consumer
6.2.3. The Backward Channel
6.2.4. Pipeline extensions

7. Project Management
7.1 Introduction
7.2 Work-Breakdown Structure (WBS)
7.2.2. Activity-Cost Estimations
7.3 Project Planning Tools
7.4 Feasibility Study
7.4.1. Mockups
7.4.2. Conclusions
7.5 Project Plan

8. Conclusions
8.1 Validation
8.2 Technical Conclusions
8.3 Future Developments

Bibliography
References

About the Authors


List of Tables

Table 6.1 – Dataflow properties in YieldStar


1. Introduction

This report describes the Data Driven Machine Control project at ASML. The project is the final assignment of the PDEng Software Technology program at Eindhoven University of Technology. The report covers the theoretical and practical aspects of the project, ranging from a literature review of the required technologies to the steps towards the design of the final deliverables. Please note that, due to confidentiality concerns within ASML, descriptions of product-specific design details have been avoided. In this chapter the reader will receive the big picture of the project domain and the problem description.

1.1 Context

YieldStar is a relatively young and evolving product of ASML. It provides accurate feedback for the exposure phase. This feedback is the result of measuring a predefined number of targets on an exposed wafer. Ultimately, YieldStar enables the customer to expose at a smaller scale with higher accuracy, which in the semiconductor industry translates into more benefit.

In order to capture the correct targets, YieldStar is given a set of instructions, namely the Process Job. The Process Job refers to a “recipe” from which a set of measuring instructions is derived for each wafer.

In the bigger picture, YieldStar (YS) receives a set of data (i.e. the Process Job) and produces a different set of data (i.e. measurement results). One can therefore already see YS as a dataflow system. The entire functionality of the YieldStar machine and its software boils down to producing reports of measurements, which are the results of accurate instructions. Thus, data plays a crucial role here.
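The dataflow view described here can be illustrated with a small chain of stages, each consuming the output of the previous one. The sketch below is written in Python purely for illustration and is not the YieldStar code or a .NET API; the function names `expand_job`, `measure` and `report`, and the dictionary layout of the job, are assumptions made for this example.

```python
from typing import Dict, Iterable, Iterator, List, Tuple


def expand_job(process_job: Dict) -> Iterator[Tuple[str, str]]:
    """Turn a (hypothetical) Process Job into a stream of (wafer, target) instructions."""
    for wafer in process_job["wafers"]:
        for target in process_job["targets"]:
            yield wafer, target


def measure(instructions: Iterable[Tuple[str, str]]) -> Iterator[Dict]:
    """Stand-in for the machine: emit one result record per instruction."""
    for wafer, target in instructions:
        # No real measurement is performed in this sketch, so the value stays empty.
        yield {"wafer": wafer, "target": target, "value": None}


def report(results: Iterable[Dict]) -> List[Dict]:
    """Collect the stream of results into a report."""
    return list(results)


# Data in (the job), data out (the measurement results).
job = {"wafers": ["W1", "W2"], "targets": ["T1", "T2", "T3"]}
results = report(measure(expand_job(job)))
print(len(results), "measurement records")
```

Each stage only knows the shape of the data it receives, which is the property that makes such a system a candidate for dataflow-style libraries.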

On the other hand, as the YS software grows, maintenance and change management become bigger problems. For example, the learning curve of the current software is considered to be too steep. That is mainly due to the complexity of the code and, at many points, legacy code with neither clear descriptions nor up-to-date documentation. However, as the .NET platform evolves, more opportunities emerge to reduce the complexity and increase the readability of the code, and thus to improve the maintainability and testability of our code base.

Reactive Extensions (Rx) from Microsoft’s DevLabs and TPL Dataflow (TDF), part of the Task Parallel Library (TPL), are two dataflow-based technologies. Therefore they can be good candidates for our dataflow system (i.e. YieldStar). This project is defined as a feasibility study of applying those two technologies to the Machine Controlling module of the YieldStar project. The aim is to investigate the potential capabilities of the two libraries, then to define the procedure for applying them to the software, and most importantly to be able to show that the whole initiative has improved the software from the maintainability and testability points of view.

The expected results of this project are:

• A guideline that explains the know-how of applying Rx and TDF to YS

• An estimation of the required effort (e.g. man-hours) for the refactoring

• A prototype of the Machine Controlling module in which Rx and/or TDF have been applied and which has passed the tests

1.2 Outline

The first six chapters of the report present the general context of the project. That information builds up the understanding of the domain and supports the technical discussions in the following chapters. Chapters seven to twelve follow the same structure: after a short introduction and a description of the content (perspective), each scenario is discussed separately with respect to the perspective of the current chapter (e.g. architecture, implementation). That approach enables selective use of the report by readers who are interested in a top-down treatment of certain scenarios only.


2. Problem Analysis

So far the reader has received a general description of the problem as well as a detailed account of the people and organizations that are involved in the project.

The next logical step is to define the problem at a level of detail that provides helpful input for the design and a concrete measure for both the quality and the quantity of the delivery. In this chapter the expectations of all the stakeholders are reflected in terms of functional and non-functional requirements. Moreover, the reason and motivation behind each is explained.

2.1 Domain Analysis

ASML has a relatively new product, YieldStar (YS), which is capable of measuring the overlay, critical dimension and focus of structures that have been printed on wafers by a TwinScan. The main goal is to deliver a closed-loop solution to customers, in which the drift of a TwinScan can be corrected automatically by measuring the printed wafers with YieldStar.

Unlike the situation for the TwinScan machine, ASML is not the world leader in measurement machines like YS.

The YieldStar software (YS) is a 32-bit Windows application written in C# on .NET 4. The basic inputs of YS are the Process Job, the Recipe and a set of wafers. Machine Control is the component within YS that is in charge of producing and running certain instructions on a wafer in an optimal order, based on the recipe that is provided as part of the Process Job definition.

Before we go into more detail about machine control, it is important to define a number of components precisely.

The list below defines the fundamental concepts, objects and operations in the context of this project:

• Process Job

A Process Job has a list of wafers to be processed. Each wafer is represented as a set of instructions called WPI. (In reality each wafer can have multiple WPIs, which is out of the scope of this project.) The list of wafers is filtered by the host during the creation of the Process Job.

After YS has finished a Process Job, the flag "pjPROCESS_COMPELETE" is set to true and the IMM or LOT Control modules trigger the reporting.

• Wafer Processing Instructions (WPI)

The WPI contains information regarding the recipe and the selected sample scheme(s) to be measured. Based on the WPI information, instructions need to be created for wafer alignment and for target acquisition, sensing and measuring.

• Recipe

A process program is the pre-planned and reusable set of instructions, settings, and parameters that determine the processing environment seen by the manufactured object. Process programs are also called recipes. A process program may be changed between runs and processing cycles.

Process programs allow the equipment process, and/or the parameters used by that process, to be set and modified by the engineer to achieve different results. Different process programs may be required for different chips, while often the same process program will be used for all lots of a given wafer. The engineer must be able to create such programs, to modify current programs, and to delete programs from equipment storage.

For the host to ensure that the proper process programs are on the equipment, there must be a means of transferring them from the equipment to the host and from the host to the equipment. The host may also need to delete process programs from the equipment storage to make room for a new process program. In addition, the host must be kept informed whenever a local change occurs in the contents or status of a process program.
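The relationship between the three concepts defined in this list can be sketched as a small data model. This is only an illustration in Python under stated assumptions: the class and field names (`Recipe`, `WaferProcessingInstructions`, `ProcessJob`, `sample_schemes`, `process_complete`, `finish`) are hypothetical and do not reflect the actual YieldStar types, which are not described in this report.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Recipe:
    """A process program: reusable settings and parameters (fields are assumed)."""
    name: str
    settings: Dict[str, float] = field(default_factory=dict)


@dataclass
class WaferProcessingInstructions:
    """One WPI: which recipe and which sample scheme(s) apply to a wafer."""
    wafer_id: str
    recipe: Recipe
    sample_schemes: List[str]


@dataclass
class ProcessJob:
    """A Process Job: one WPI per wafer, as scoped in this project."""
    wpis: List[WaferProcessingInstructions]
    # Hypothetical counterpart of the "pjPROCESS_COMPELETE" flag.
    process_complete: bool = False

    def finish(self) -> None:
        """Mark the job as done; reporting would be triggered elsewhere."""
        self.process_complete = True


recipe = Recipe("demo-recipe")
job = ProcessJob([WaferProcessingInstructions("W1", recipe, ["scheme-A"])])
job.finish()
```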

2.1.1. Machine Controlling

YS has two major inputs: a set of wafers and a Process Job. In the context of YS, “Machine Controlling” refers to the software component that is in charge of generating the different sets of instructions with which YS reacts to the Process Job.

The cluster ‘machine control’ involves controlling the modules in the YS in order to perform optimal wafer processing. The central modules for wafer processing are the ‘processor’ and ‘machine control acquire’.

The processor module receives “wafer processing” instructions that, together with the recipe, must result in a series of actions to acquire and measure the requested targets on the wafer in an optimal way. Two important aspects of the processor are:

• Instruction generation for acquire, sense and measure modules

• Instruction sorting for optimized throughput
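The two processor aspects listed above can be sketched as two small functions. The example is a Python illustration under loud assumptions: the report does not state how instructions are sorted, so ordering by target position is a hypothetical criterion chosen only to make the idea concrete, and the names `Instruction`, `generate` and `sort_for_throughput` are invented for this sketch.

```python
from typing import Dict, List, NamedTuple, Tuple


class Instruction(NamedTuple):
    module: str                    # "acquire", "sense" or "measure"
    target: str
    position: Tuple[float, float]  # assumed (x, y) location of the target


def generate(targets: Dict[str, Tuple[float, float]]) -> List[Instruction]:
    """Create one acquire/sense/measure triple per requested target."""
    return [
        Instruction(module, name, position)
        for name, position in targets.items()
        for module in ("acquire", "sense", "measure")
    ]


def sort_for_throughput(instructions: List[Instruction]) -> List[Instruction]:
    """Order instructions by position (hypothetical criterion).

    Python's sort is stable, so the acquire, sense, measure order
    within each target is preserved.
    """
    return sorted(instructions, key=lambda instruction: instruction.position)


targets = {"T1": (5.0, 1.0), "T2": (0.0, 2.0), "T3": (2.0, 0.0)}
ordered = sort_for_throughput(generate(targets))
print([instruction.target for instruction in ordered])
```

Keeping generation and sorting as separate steps means the ordering criterion can be replaced without touching the code that creates the instructions.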

The machine control acquire module takes care of handling acquire commands. Its most important task is to translate the acquire command into lower-level commands for the PLC and the motion controller.

These aspects are presented in the chapters that follow. In addition, separate chapters are dedicated to important features within the machine control cluster, like ‘throughput braking’.

2.2 Problem description

When YS was first developed, around 10 years ago, the people in charge of programming were mostly mathematicians and physicists who had a very good knowledge of C programming. However, the software was supposed to be written in C#. As a result, the current implementation of YS is C-like C# code. That is, in many situations the wheel has, metaphorically, been reinvented and/or the code is too complex to deal with intuitively. This makes it more difficult to maintain the code and to find potential points of improvement.

On the other hand, Microsoft is introducing numerous libraries for the .NET platform that can be used to reduce the complexity and improve the performance of the code. Two of these newly introduced technologies are of interest to ASML: Dataflow from Microsoft's TPL library and Reactive Extensions from Microsoft DevLabs.

The goal of this project is to investigate these new approaches to improve the architecture of the YS software in terms of lower complexity and better maintainability. The results of the project should also contain a feasibility study that estimates the effort needed to transform the current implementation into a new architecture using Rx and Dataflow, and a guideline for applying those technologies.

The scope of this project is the Machine Control component of the YS software. Within the Machine Control component, the Processor class is the main point of interest for the stakeholders of the project. Given the Wafer Processing Instructions (WPI), the Processor extracts the proper commands for each step of measuring a wafer and initiates (if needed) the threads responsible for further steps. At a certain point those instructions are translated to PLC codes to perform the actual control of the physical components of a YS machine.

Figure 2.1

2.3 Roadmaps

2.3.1. Technology Roadmap

Throughout its development YieldStar migrated step by step from .NET 2.0 to .NET 4.0 and, very recently, to .NET 4.5. It is worth mentioning that the solution has so far been deployed on .NET 4.5, but the code still does not use any libraries from the new version. Moving to .NET 4.5 was a strategic update with regard to this project, because TDF is only available on .NET 4.5. One should therefore consider it a prerequisite for applying the modifications proposed in this report.

2.3.2. Product Roadmap

As part of ASML's holistic lithography roadmap, ASML is developing several application-specific optimization and control applications, such as LithoTuner Pattern Matcher and BaseLiner. These applications are all explicitly designed to improve the scanner process window (overlay, focus, CDU and matching). All these applications have in common that they require vast amounts of precise, accurate and process-robust wafer data (either taken on product stacks or on so-called monitor wafers). To provide such essential data in a cost-effective manner, ASML developed a metrology platform called YieldStar. This platform is based on an angle-resolved high-NA scatterometer. It is versatile, as YieldStar's sensor can measure overlay, CD and focus in a single measurement. Thanks to its high speed, large amounts of measurements can be collected quickly. In this paper the latest generation YieldStar is presented, the so-called 200 platform. This YieldStar 200 can be used in a stand-alone configuration (S-200) or as an integrated module in a lithography track (T-200).

2.4 Stakeholders

TU/e, ASML and ASML’s customers are the three major stakeholders of this project (Figure 2.1).

As depicted in Figure 2.1, ASML pays for a nine-month project and naturally owns the project. TU/e evaluates the project as it proceeds and at its end, in order to qualify the ST trainee for the PDEng degree. Moreover, the results of this project will improve the maintenance of the code, which results in the satisfaction of the customers who use the YieldStar software. Hence, the customers of ASML are implicitly affected by the results of this project, which makes them a secondary stakeholder.

This project is carried out in the FaFi team. The FaFi team is mainly in charge of the development and maintenance of the Machine Controlling of YieldStar. As depicted in Figure 2.2, this team has close communication with Application SW and Machine SW. That is due to the critical role of MC in extracting the instructions from the recipe and compiling the report from the measurement results. Guarding the whole process is also one of the guarantees that the FaFi team provides for the rest of the YieldStar software team.

• TU/e as one of the main stakeholders of the project:

o Expects enough design challenges to prove the eligibility of the OOTI trainee for the PDEng degree. That requires providing enough measurable metrics as well as a precise description of the design stages and of the design decisions made along the way.

o Expects a clear description of the problem, a detailed list of deliverables, and the connection between the deliverables and the requirements. All of this should be made available in the final technical report (this document). The final report will be reviewed and scored by a steering group consisting of the TU/e and ASML supervisors plus an additional technical person from the management of the PDEng program.

Figure 2.2

• ASML as the problem owner:

o Expects an accurate estimation, based on a list of metrics, of how far utilizing the new technologies can improve the YS software. The list of metrics is explained in detail in chapter 6, where the system requirements are discussed.

o Expects a design guideline in case ASML decides to apply the changes in their product.

o Expects a working prototype in which Rx and Dataflow are applied.

2.5 Design Opportunities

This problem introduces several design opportunities from different perspectives:

• Introduce new patterns and coding standards

Use of these new technologies introduces a new architectural approach that may disclose a number of problems in the system. TDF and Rx in particular add another layer of abstraction on top of the design of dataflow systems. That is, the designer or developer no longer has to be concerned with the details of the complex low-level design. For example, in many scenarios the current implementation of YieldStar uses threads and memory allocations explicitly. Although this gives the designer more freedom, the complexity of the design makes it very time-consuming to maintain. If the designer can instead look at the bigger picture and focus on the real problem, he can eventually solve more important and fundamental issues.

• Data Driven development and refactoring

Developing a different way of looking at the software is another design opportunity in this project. The YieldStar software already exists and works, so no new functionality is developed here. This is a unique chance to experiment with industrial-scale software and to focus on the data rather than just the functionality.

• Focusing on maintainability and testability.

Another interesting design challenge of this project is to reflect these two non-functional requirements in the final design and, moreover, to find quantifiable measures for these qualities.

2.6 Risks and challenges

Considering the nature of this project (a feasibility study), the risk analysis is not the most crucial part of the report. However, here I provide a list of risks and possible challenges that may become obstacles on the way to achieving the results of this study.

• Porting YS to .NET 4.5

TDF is only provided on .NET 4.5, and at the time this project started the YieldStar solution was being developed on .NET 4.0. This could potentially rule out any use of TDF.

In order to mitigate this risk, and after getting the initial go-ahead from the project supervisor, I initiated the migration of a snapshot of the solution to .NET 4.5.

Meanwhile I followed the technology roadmap of YieldStar to make sure that the migration would at least be in progress before the end of this project. Fortunately, due to a bug in .NET 4.0, the order to migrate the solutions was issued earlier than expected, which assures that the findings of the project will be applicable to the solution.

• Failure in performance testing

Performance is not a strict requirement of this project. On the other hand, performance testing is the final step (after implementation) in this context, so it may turn out at the end that these new libraries fail to pass certain performance tests.

In order to reduce the effects of this risk, I started testing heavy loads of large pseudo-messages on my lite-implementations¹.

Moreover, the Sensing Team within YieldStar software has successfully tested a part of our joint implementation and will officially deliver it in September 2014. That was another reassuring factor: it shows that Rx can handle the heavy load of image and instruction messages in practice, in a time-critical part of YS.

• Unclear Requirements

As in any other project, unclear requirements can have a negative effect on the satisfaction of the customer and thus cause the project to fail. The irony is that in most cases the customer herself is the main obstacle. Luckily, in this case the members of the FaFi team and of other related teams were quite cooperative; the only problem was the distribution of the knowledge.

In order to mitigate this risk, I first made sure to know the people from whom I needed input and insight, along with their roles and availability. Secondly, I tried to arrange a number of meetings with them in advance, also to estimate the limits of their knowledge.

Having lite-implementations of certain scenarios and regularly verifying the results with my supervisors was another approach to prevent the unexpected consequences of this challenge.²

• Unclear measures to qualify the results of the project

Having two stakeholders monitoring the results and the progress of the project requires a common language for communicating expectations clearly. Moreover, where the main requirements are non-functional, it is prudent to have quantifiable measures to prove the validity of the findings.

To tackle this challenge, I defined a number of measures that reflect the level to which the non-functional requirements of the project have been met.

¹ You can find more about lite-implementations in chapter 5.
² You can find more about lite-implementations in chapter 5.

3.Literature Review

A clear definition of the problem requires clear inputs. As a main input, this chapter presents a small part of the literature and its connection to the context of the current project. That is, the results of studying Reactive Extensions (Rx) and TPL Dataflow (TDF) are gathered, categorized and compared. This helps us to answer the following questions:

• What are Rx and TDF?

• When/Where are they useful?

Afterwards we conclude the chapter with a clear guideline for the use-cases of those technologies. The content of this chapter is mostly derived from the official website of Reactive Extensions, MSDN Library and related web-forums. Unfortunately, all the acquired knowledge from literature would not fit in the scope of this document. Thus the extra material and guidelines can be found as an appendix to this document.

3.1 Introduction

At first glance TPL Dataflow may seem to overlap with Reactive Extensions, and the sets of features they provide do overlap considerably. While they both involve moving data around, Reactive Extensions focuses on the ability to write complicated push-based data streams in a very succinct fashion. TPL Dataflow is more about providing the fundamental building blocks for building up actors and agents, with an emphasis on controlling aspects such as where to do buffering and when to block producers.

In order to decide which technology to use for the later stages of the project, a number of selection criteria were needed. These were derived from the requirements of the project and the feedback from the meetings with the project owner.

The evaluation criteria were as follows:

• Robustness
• Maintainability
• Performance
• Credibility
• Terms and duration of support
• Popularity
• Project size
• Code Consistency
• Learning curve
• Ease of Development
• Number of newly introduced development concepts

3.1.1. What is Reactive Extensions (Rx)

The Reactive Extensions (Rx) is a library for composing asynchronous and event-based programs using observable sequences and LINQ-style query operators.

Rx = Observables + LINQ + Schedulers

In contrast to the traditional pull-based way of transferring messages in dataflow systems, Rx offers the functionality to implement push-based data sequences. Rx is a way to look at the dataflow from a higher level of abstraction, and it allows you to implement it at that same level. Consider the following example:

    IObservable<int> o = Observable.Create<int>(observer =>
    {
        observer.OnNext(42);
        observer.OnNext(24);
        observer.OnCompleted();
        // Create expects the subscribe delegate to return a disposable
        // (Disposable is in System.Reactive.Disposables); nothing needs
        // to be cleaned up here.
        return Disposable.Empty;
    });

    IDisposable subscription = o.Subscribe(
        onNext: x => { Console.WriteLine("Next: " + x); },
        onError: ex => { Console.WriteLine("Oops: " + ex); },
        onCompleted: () => { Console.WriteLine("Done!"); }
    );

In this example, an IObservable<int> object “o” is created using the static Create method of Rx. This observable represents a sequence of integers: it posts the two numbers 42 and 24 and then declares the completion of the sequence.

Next, we subscribe three Actions that respond, in order, to the OnNext(), OnError() and OnCompleted() calls. The Subscribe(…) method returns an IDisposable object. That object can later be used to cancel the subscription and release the resources that it holds.

Consequently, the output of this part of the code is:

    Next: 42
    Next: 24
    Done!

Microsoft Open Technologies, Inc. is open sourcing Rx. Its source code is now hosted on CodePlex³ to increase the community of developers seeking a more consistent interface to program against, and one that works across several development languages. The goal is to expand the number of frameworks and applications that use Rx in order to achieve better interoperability across devices and the cloud.

Rx was developed by Microsoft Corp. architect Erik Meijer and his team, and is currently used in products in various divisions at Microsoft. Microsoft decided to transfer the project to MS Open Tech in order to capitalize on MS Open Tech’s best practices with open development.

³ CodePlex is Microsoft's free open source project hosting site. You can create projects to share with the world, collaborate with others on their projects, and download open source software.

OnError and OnCompleted

Both OnError and OnCompleted signify the completion of a sequence. If your sequence publishes an OnError or OnCompleted, it will be the last publication and no further calls to OnNext can be performed. Of course, you could implement your own IObservable<T> that allows publishing after an OnCompleted or an OnError; however, it would not follow the precedent of the current Subject types and would be a non-standard implementation. It is safe to say that such inconsistent behavior would cause unpredictable behavior in the applications that consume your code.

An interesting thing to consider is that when a sequence completes or errors, you should still dispose of your subscription.
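The termination rules described for OnError and OnCompleted can be observed directly. The following is a minimal sketch, assuming the System.Reactive package is referenced; the class name TerminationSketch and the error message are made up for illustration and do not come from YieldStar. It relies on the observer handed out by Observable.Create, which stops forwarding notifications once the sequence has terminated.

```csharp
using System;
using System.Collections.Generic;
using System.Reactive.Disposables;
using System.Reactive.Linq;

static class TerminationSketch
{
    // Subscribes to a sequence that fails after one value and
    // returns every notification the observer actually received.
    public static List<string> Run()
    {
        var log = new List<string>();

        IObservable<int> failing = Observable.Create<int>(observer =>
        {
            observer.OnNext(1);
            observer.OnError(new InvalidOperationException("sensor lost")); // hypothetical error
            observer.OnNext(2);     // not delivered: OnError already ended the sequence
            observer.OnCompleted(); // not delivered either
            return Disposable.Empty;
        });

        // The using statement disposes the subscription even though
        // the sequence has already terminated.
        using (failing.Subscribe(
            x => log.Add("Next: " + x),
            ex => log.Add("Error: " + ex.Message),
            () => log.Add("Done!")))
        {
        }

        return log;
    }

    static void Main()
    {
        foreach (string line in Run())
            Console.WriteLine(line);
    }
}
```

Only the first value and the error reach the observer; wrapping the subscription in a using statement is one simple way to make sure it is always disposed.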

The Rx library provides a considerable number of extension methods for the IObservable<T> interface in order to support a variety of operations on data sequences, such as transformation, aggregation and inspection of the data elements.

Scheduling

The various scheduler types provided by Rx all implement the IScheduler interface. Each of them can be obtained through a static property of the Scheduler type:

• ImmediateScheduler (static Immediate property) starts the specified action immediately.
• CurrentThreadScheduler (static CurrentThread property) schedules actions to be performed on the thread that makes the original call. The action is not executed immediately, but is placed in a queue and only executed after the current action is complete.
• DispatcherScheduler (static Dispatcher property) schedules actions on the current Dispatcher, which is beneficial to Silverlight developers who use Rx. Specified actions are then delegated to the Dispatcher.BeginInvoke() method in Silverlight.
• NewThreadScheduler (static NewThread property) schedules actions on a new thread, and is optimal for scheduling long-running or blocking actions.
• TaskPoolScheduler (static TaskPool property) schedules actions on a specific Task Factory.
• ThreadPoolScheduler (static ThreadPool property) schedules actions on the thread pool.

Both pool schedulers are optimized for short-running actions.
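To make the effect of a scheduler visible, the following sketch records the thread on which a notification is observed. It assumes the System.Reactive package; the class and method names are hypothetical. Instead of the static properties of the Scheduler type, it uses ImmediateScheduler.Instance and NewThreadScheduler.Default, which, to my knowledge, are the accessors that later Rx releases recommend.

```csharp
using System;
using System.Reactive.Concurrency;
using System.Reactive.Linq;
using System.Threading;

static class SchedulerSketch
{
    // Returns the managed thread id on which OnNext runs
    // when notifications are observed on the given scheduler.
    public static int ObservedThreadId(IScheduler scheduler)
    {
        int observedOn = -1;
        using (var done = new ManualResetEventSlim())
        using (Observable.Return(42)
            .ObserveOn(scheduler)
            .Subscribe(
                _ => observedOn = Thread.CurrentThread.ManagedThreadId,
                () => done.Set()))
        {
            // Wait until the sequence has completed on whatever thread it runs.
            done.Wait();
        }
        return observedOn;
    }

    static void Main()
    {
        int caller = Thread.CurrentThread.ManagedThreadId;
        Console.WriteLine("Immediate runs on caller thread: "
            + (ObservedThreadId(ImmediateScheduler.Instance) == caller));
        Console.WriteLine("NewThread runs on caller thread: "
            + (ObservedThreadId(NewThreadScheduler.Default) == caller));
    }
}
```

The observer code is identical in both calls; only the scheduler argument decides where it executes, which is what keeps threading decisions out of the dataflow logic.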

Subject

The Subject<T> type implements both IObservable<T> and IObserver<T>, so it is both an observer and an observable. You can use a subject to subscribe all the observers, and then subscribe the subject to a backend data source. In this way, the subject can act as a proxy for a group of subscribers and a source. You can use subjects to implement a custom observable with caching, buffering and time shifting. In addition, you can use subjects to broadcast data to multiple subscribers.
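The proxy role of a subject can be sketched as follows, assuming the System.Reactive package. The observer labels "A" and "B" and the class name are illustrative only; Observable.Range stands in for a backend data source.

```csharp
using System;
using System.Collections.Generic;
using System.Reactive.Linq;
using System.Reactive.Subjects;

static class SubjectSketch
{
    // Broadcasts the values of one source to two observers through a subject.
    public static List<string> Run()
    {
        var log = new List<string>();
        var subject = new Subject<int>();

        // The subject acts as an observable: two observers subscribe to it.
        using (subject.Subscribe(x => log.Add("A: " + x)))
        using (subject.Subscribe(x => log.Add("B: " + x)))
        // The subject acts as an observer: it subscribes to the source.
        using (Observable.Range(1, 2).Subscribe(subject))
        {
        }

        return log;
    }

    static void Main()
    {
        foreach (string line in Run())
            Console.WriteLine(line);
    }
}
```

Each value of the source is delivered to both observers, in the order in which they subscribed.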

3.1.2. What is TPL Dataflow (TDF)

TPL Dataflow is focused on providing building blocks for message passing and parallelizing CPU- and I/O-intensive applications with high throughput and low latency, while also giving developers explicit control over how data is buffered and moves about the system. The TPL Dataflow Library consists of dataflow blocks, which are data structures that buffer and process data. The TPL defines three kinds of dataflow blocks: source blocks, target blocks, and propagator blocks. A source block acts as a source of data and can be read from. A target block acts as a receiver of data and can be written to. A propagator block acts as both a source block and a target block, and can be read from and written to. You can connect dataflow blocks to form pipelines, which are linear sequences of dataflow blocks, or networks, which are graphs of dataflow blocks.

The TPL Dataflow Library provides several predefined dataflow block types. These types are divided into three categories: buffering blocks, execution blocks, and grouping blocks. Buffering blocks hold data for use by data consumers. Execution blocks call a user-provided delegate for each piece of received data. Grouping blocks combine data from one or more sources and under various constraints.
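As an illustration of the three categories, the sketch below links one block of each kind into a small pipeline. It assumes the System.Threading.Tasks.Dataflow library is available; the class name, the squaring step and the batch size of two are arbitrary choices for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

static class PipelineSketch
{
    public static async Task<List<string>> RunAsync()
    {
        var results = new List<string>();

        // Buffering block: holds the incoming numbers.
        var input = new BufferBlock<int>();
        // Execution block (propagator): calls a delegate for each item.
        var square = new TransformBlock<int, int>(x => x * x);
        // Grouping block: combines every two items into one array.
        var pair = new BatchBlock<int>(2);
        // Execution block (target): terminal node of the pipeline.
        var sink = new ActionBlock<int[]>(batch => results.Add(string.Join("+", batch)));

        // Link the blocks into a linear pipeline and let completion flow downstream.
        var options = new DataflowLinkOptions { PropagateCompletion = true };
        input.LinkTo(square, options);
        square.LinkTo(pair, options);
        pair.LinkTo(sink, options);

        for (int i = 1; i <= 4; i++)
            input.Post(i);
        input.Complete();

        await sink.Completion;
        return results;
    }

    static void Main()
    {
        foreach (string line in RunAsync().GetAwaiter().GetResult())
            Console.WriteLine(line);
    }
}
```

With the default block options each block processes one message at a time and preserves order, so the sink receives the batches 1+4 and 9+16.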

The dataflow programming model is related to the concept of message passing, where independent components of a program communicate with one another by sending messages. One way to propagate messages among application components is to call the Post<TInput> and DataflowBlock.SendAsync methods to send messages to target dataflow blocks (Post<TInput> acts synchronously; SendAsync acts asynchronously) and the Receive, ReceiveAsync, and TryReceive<TOutput> methods to receive messages from source blocks. You can combine these methods with dataflow pipelines or networks by sending input data to the head node (a target block), and receiving output data from the terminal node of the pipeline or the terminal nodes of the network (one or more source blocks). You can also use the Choose method to read from the first of the provided sources that has data available and perform an action on that data.
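The send and receive methods named above can be exercised on a single BufferBlock<T>. This is a sketch under the assumption that the System.Threading.Tasks.Dataflow library is available; the "images" and "commands" blocks are hypothetical sources chosen only to show Choose.

```csharp
using System;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

static class MessagingSketch
{
    public static async Task<string> RunAsync()
    {
        var numbers = new BufferBlock<int>();

        numbers.Post(1);            // synchronous send
        await numbers.SendAsync(2); // asynchronous send

        int first = numbers.Receive();               // blocks until an item is available
        int second = await numbers.ReceiveAsync();   // awaits instead of blocking
        int ignored;
        bool more = numbers.TryReceive(out ignored); // the buffer is empty by now

        // Choose: act on whichever of the two sources has data first.
        var images = new BufferBlock<byte[]>();   // hypothetical source, left empty
        var commands = new BufferBlock<string>(); // hypothetical source with one message
        commands.Post("align");

        string handled = null;
        int index = await DataflowBlock.Choose(
            images, image => handled = "image of " + image.Length + " bytes",
            commands, command => handled = "command " + command);

        return first + "," + second + "," + more + "," + index + "," + handled;
    }

    static void Main()
    {
        Console.WriteLine(RunAsync().GetAwaiter().GetResult());
    }
}
```

Choose reports the zero-based index of the source whose handler ran, which here is the second source because only "commands" holds a message.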

Source blocks offer data to target blocks by calling the ITargetBlock<TInput>.OfferMessage method. The target block responds to an offered message in one of three ways: it can accept the message, decline the message, or postpone the message. When the target accepts the message, the OfferMessage method returns Accepted. When the target declines the message, the OfferMessage method returns Declined. When the target requires that it no longer receives any messages from the source, OfferMessage returns DecliningPermanently. The predefined source block types do not offer messages to linked targets after such a return value is received, and they automatically unlink from such targets.

When a target block postpones the message for later use, the OfferMessage method returns Postponed. A target block that postpones a message can later call the ISourceBlock<TOutput>.ReserveMessage method to try to reserve the offered message. At this point, the message is either still available and can be used by the target block, or the message has been taken by another target. When the target block later requires the message or no longer needs the message, it calls the ISourceBlock<TOutput>.ConsumeMessage or ReleaseReservation method, respectively. Message reservation is typically used by the dataflow block types that operate in non-greedy mode. Non-greedy mode is explained later in this document. Instead of reserving a postponed message, a target block can also use the ISourceBlock<TOutput>.ConsumeMessage method to attempt to directly consume the postponed message.
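From the caller's side, the difference between a message that is not accepted and one that is taken later shows up when a block is full. The sketch below is an illustration under the assumption that the System.Threading.Tasks.Dataflow library is available; it uses a BufferBlock<int> with a BoundedCapacity of 1, and the class name is made up. As far as I understand the library, Post never waits for room, whereas the message offered by SendAsync is kept on offer and consumed once the block has capacity again.

```csharp
using System;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

static class BackpressureSketch
{
    public static async Task<string> RunAsync()
    {
        // A buffer that holds at most one message at a time.
        var buffer = new BufferBlock<int>(new DataflowBlockOptions { BoundedCapacity = 1 });

        bool first = buffer.Post(1);              // accepted: there is room
        bool second = buffer.Post(2);             // not accepted: Post does not wait for room
        Task<bool> pending = buffer.SendAsync(3); // stays on offer while the buffer is full
        bool doneEarly = pending.IsCompleted;     // nothing has freed the slot yet

        int received = buffer.Receive();          // frees the only slot
        bool third = await pending;               // the waiting message is taken now
        int next = buffer.Receive();

        return string.Join(",", first, second, doneEarly, received, third, next);
    }

    static void Main()
    {
        Console.WriteLine(RunAsync().GetAwaiter().GetResult());
    }
}
```

The message 2 is lost to the producer unless it retries, while the message 3 arrives as soon as the consumer makes room; this is the kind of explicit control over buffering and blocking of producers that TDF offers.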

3.2 Rx and Data-Driven Software

This part of the report describes the connection between Rx and data-driven software. It also describes the advantages and limitations of using Rx by providing example scenarios and code snippets.

Using Rx, you can represent multiple asynchronous data streams (that come from diverse sources, e.g., stock quote, tweets, computer events, web service requests, etc.), and subscribe to the event stream using the IObserver<T> interface. The IObservable<T> interface notifies the subscribed IObserver<T> interface whenever an event occurs.

First of all, let us explore where Rx is applicable and what the added value of applying it is:

• Simpler, more expressive asynchronous implementations

The Rx APIs allow a uniform approach to structuring asynchronous flows using the observer pattern. They do so by exposing a LINQ-like library around the interface pair IObservable<T> and IObserver<T>. By providing cleaner and more expressive asynchronous code, this library wraps the existing .NET infrastructure used in asynchronous programming, including events, the thread pool and the Task Parallel Library (TPL).

• Reduced Learning curve

.NET developers who are familiar with LINQ can easily apply their knowledge along with the observer pattern to understand how to compose observables using Rx Libraries. (Composing observables in Rx Libraries is similar to composing collections using LINQ Library.)

• Easier method for designing asynchronous flows

When designing asynchronous applications, features such as events, threads and tasks can be used directly. However, the level of care and attention that is required when implementing with these features is very high. Rx simplifies this thought process by using observables and observers as the common nomenclature and by implementing thread-safe data sharing internally.
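The similarity between composing collections with LINQ and composing observables with Rx can be seen by writing the same query twice. This is a minimal sketch, assuming the System.Reactive package; the query itself (squares of the even numbers from 1 to 6) is an arbitrary example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reactive.Linq;

static class CompositionSketch
{
    // Pull-based: the consumer enumerates the collection.
    public static List<int> Pull()
    {
        return Enumerable.Range(1, 6)
            .Where(x => x % 2 == 0)
            .Select(x => x * x)
            .ToList();
    }

    // Push-based: the sequence calls the observer for each value.
    public static List<int> Push()
    {
        var result = new List<int>();
        using (Observable.Range(1, 6)
            .Where(x => x % 2 == 0)
            .Select(x => x * x)
            .Subscribe(x => result.Add(x)))
        {
        }
        return result;
    }

    static void Main()
    {
        Console.WriteLine(string.Join(",", Pull()));
        Console.WriteLine(string.Join(",", Push()));
    }
}
```

Both methods produce the same values with the same operators; only the source type and the way the results are consumed differ.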

3.3 TDF and Task Communications

This part of the document describes how to use the TPL Dataflow Library to implement a producer-consumer pattern. The text and the example are taken from the MSDN Library. This example is very important for understanding the basic elements of TDF; it also demonstrates one of the main possible usages of this library within YieldStar.

The following code example demonstrates a basic producer-consumer model that uses dataflow. The Produce method writes arrays that contain random bytes of data to a System.Threading.Tasks.Dataflow.ITargetBlock<TInput> object and the ConsumeAsync method reads bytes from a System.Threading.Tasks.Dataflow.ISourceBlock<TOutput> object. By acting on the ISourceBlock<TOutput> and ITargetBlock<TInput> interfaces, instead of their derived types, you can write reusable code that can act on a variety of dataflow block types. This example uses the BufferBlock<T> class. Because the BufferBlock<T> class acts as both a source block and as a target block, the producer and the consumer can use a shared object to transfer data.

The Produce method calls the Post<TInput> method in a loop to synchronously write data to the target block. After the Produce method writes all data to the target block, it calls the Complete method to indicate that the block will never have additional data available. The ConsumeAsync method uses the async and await operators (Async and Await in Visual Basic) to asynchronously compute the total number of bytes that are received from the ISourceBlock<TOutput> object.

To act asynchronously, the ConsumeAsync method calls the OutputAvailableAsync method to receive a notification when the source block has data available and when the source block will never have additional data available.

using System;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

// Demonstrates a basic producer and consumer pattern that uses dataflow.
class DataflowProducerConsumer
{
    // Demonstrates the production end of the producer and consumer pattern.
    static void Produce(ITargetBlock<byte[]> target)
    {
        // Create a Random object to generate random data.
        Random rand = new Random();

        // In a loop, fill a buffer with random data and
        // post the buffer to the target block.
        for (int i = 0; i < 100; i++)
        {
            // Create an array to hold random byte data.
            byte[] buffer = new byte[1024];

            // Fill the buffer with random bytes.
            rand.NextBytes(buffer);

            // Post the result to the message block.
            target.Post(buffer);
        }

        // Set the target to the completed state to signal to the consumer
        // that no more data will be available.
        target.Complete();
    }

    // Demonstrates the consumption end of the producer and consumer pattern.
    static async Task<int> ConsumeAsync(ISourceBlock<byte[]> source)
    {
        // Initialize a counter to track the number of bytes that are processed.
        int bytesProcessed = 0;

        // Read from the source buffer until the source buffer has no
        // available output data.
        while (await source.OutputAvailableAsync())
        {
            byte[] data = source.Receive();

            // Increment the count of bytes received.
            bytesProcessed += data.Length;
        }

        return bytesProcessed;
    }

    static void Main(string[] args)
    {
        // Create a BufferBlock<byte[]> object. This object serves as the
        // target block for the producer and the source block for the consumer.
        var buffer = new BufferBlock<byte[]>();

        // Start the consumer. The ConsumeAsync method runs asynchronously.
        var consumer = ConsumeAsync(buffer);

        // Post source data to the dataflow block.
        Produce(buffer);

        // Wait for the consumer to process all data.
        consumer.Wait();

        // Print the count of bytes processed to the console.
        Console.WriteLine("Processed {0} bytes.", consumer.Result);
    }
}

/* Output:
Processed 102400 bytes.
*/

3.4 Conclusion

After an in-depth exploration of Rx and TDF, I decided to devote the major part of my investigation to Rx. This decision was based on the following reasons:

• Rx has gone open source. This is a huge opportunity to receive support from a larger and more diverse community of developers. We therefore also expect more extensions to be introduced by developers from different disciplines. One can expect the project to keep growing through the contributions of the open-source community, and Rx to gain more users. As Rx becomes more popular, its common use eventually makes it easier to learn and to adopt the libraries in practice.

• Rx introduces a standard coding convention that is in line with making gradual changes to legacy software by adopting its interfaces. TDF also introduces common practices and methods of implementation, but in a more flexible and sometimes confusing way. While both libraries introduce a new development concept, with TDF the developer can still adopt the building blocks and keep following the same “process-oriented” way of design, whereas in this investigation the focus is more on how the data is processed and passed on.

• TDF focuses more on how the processes are handled: the communication between different processes and the related multi-threading issues. Rx, in contrast, introduces a higher level of abstraction for producing dataflow structures.

• In order to compare the performance of these technologies, I implemented two mockup applications in which the same dataflow scenario was tested using the same messages. One mockup uses Rx exclusively and the other uses TDF exclusively. The execution results showed no considerable difference between TDF and Rx in timing and memory usage. Moreover, the runtimes and library-specific delays of both are negligible compared to the network load and I/O latencies in industrial-scale software such as YieldStar.

4.System Requirements

In this chapter the requirement elicitation step is presented. It was carried out through a number of meetings and by reading the available documents on YS. For this project most of the requirements are non-functional, as the functionality of the system should remain the same. The functional requirements of the related libraries are guarded by a comprehensive set of unit tests currently available for YS. As a result, the main functional requirement is to keep the current functionality of the system. Consequently, the main focus of this project is on the maintainability and testability of the refined architecture. These are measured using specific metrics agreed upon with the project’s stakeholders.

4.1 The Process

The very first and single requirement of this project was “improving the maintainability of Machine Controlling module using Rx and TDF”. Of course this is too general to start with. So, through a number of meetings with the supervisors, a more tangible list of requirements was developed, divided into two groups: functional and non-functional requirements.

The functional requirements of the project are quite simple, as expected. In a feasibility study it is not very clear what the possibilities for functionality are. Therefore, most of the functional requirements listed below are there to guard the quality of the results:

• Use of Rx and Dataflow standard code

These standards can be found on the MSDN and DevLabs official web sites.

• Passing the standard unit tests and behavioral tests of YS

There is already a testing team working on a huge set of unit tests to ensure that the overall and detailed behavior of the system follows the requirements and coding guidelines of YieldStar. Therefore, whatever changes this project proposes, the code should still pass the same tests, with the exception of those tests that address removed or changed blocks of the code structure.

• Successful build and run on YS simulator

On every developer’s machine there is a simulator that allows testing of the software with images loaded from disk instead of being captured from the cameras. One of the functional standards of the project is a successful build and deployment of the solution on the simulator after applying the target technologies.

• A guideline document to help ASML in later attempts to use the above-mentioned libraries

The situation is more complicated with the non-functional requirements. The word “maintainable” can be interpreted in countless ways. Therefore, with the help of the stakeholders, I tried to break it down into smaller pieces, that is, to translate the concept into concrete processes and measures that bring it closer to the functional level. This makes it possible to evaluate the results of the project, and it is also a main input for the planning and task definition. The breakdown resulted in the following items:

• Separation of control-flow and data-flow

• Less explicit use of multi-threading

• Replacing ASML specific code with generic C# code

• Adhering to .NET coding standards
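To illustrate what "less explicit use of multi-threading" could mean in practice, the sketch below performs the same work twice: once with a hand-managed worker thread and queue, and once with a TDF ActionBlock. It is a hypothetical example, not YieldStar code; the class name, the doubling step and the input values are assumptions made for illustration, and the System.Threading.Tasks.Dataflow library is assumed to be available.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks.Dataflow;

static class ThreadingSketch
{
    // Explicit style: the caller creates, feeds and joins a worker thread.
    public static List<int> WithExplicitThread(IEnumerable<int> items)
    {
        var results = new List<int>();
        using (var queue = new BlockingCollection<int>())
        {
            var worker = new Thread(() =>
            {
                foreach (int item in queue.GetConsumingEnumerable())
                    results.Add(item * 2);
            });
            worker.Start();

            foreach (int item in items)
                queue.Add(item);
            queue.CompleteAdding();
            worker.Join();
        }
        return results;
    }

    // Dataflow style: the block owns the queue and the scheduling.
    public static List<int> WithActionBlock(IEnumerable<int> items)
    {
        var results = new List<int>();
        var block = new ActionBlock<int>(item => results.Add(item * 2));

        foreach (int item in items)
            block.Post(item);
        block.Complete();
        block.Completion.Wait();
        return results;
    }

    static void Main()
    {
        var input = new[] { 1, 2, 3 };
        Console.WriteLine(string.Join(",", WithExplicitThread(input)));
        Console.WriteLine(string.Join(",", WithActionBlock(input)));
    }
}
```

Both variants produce the same result, but in the second one the thread, the queue and the shutdown protocol are no longer visible in the application code, which is the kind of separation between control-flow and data-flow that the items above aim for.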

5.System Architecture

Understanding the current YS architecture has the highest priority and is a very influential prerequisite. This chapter starts with a helicopter view of the design and behavior of the whole MC module of YS. Afterwards, the common behavioral and structural architectural patterns within YS are described. These patterns are the later targets of the design phase. For each pattern, the pros and cons are presented along with the semantics and the historical reasoning. The results of this chapter are expected to give the reader enough technical background on the problem.

5.1 Introduction

Modeling and understanding the YS software architecture was in general the most time-consuming activity of this project. This was mainly the result of poorly documented code and the distribution of knowledge within the development team. A detailed study was necessary to be able to see the problem. At some points, the semantics of the code had to be studied to understand the behavior.

5.1.1. T systems (integrated) Vs. S Systems (stand-alone)

There are basically two types of YieldStar machines: the integrated systems (aka T-Systems) and the stand-alone systems (aka S-Systems). In the stand-alone case the scenario is straightforward. YieldStar receives the wafer through the so-called WaferHandler and receives the ProcessJob through a network interface. As depicted, YS, as a data processing system, has two main inputs, namely the ProcessJob and the wafer.

Figure 5.1 shows the overview of a T-System. The Track handles the integration of YieldStar with the other modules (e.g. TwinScan) at the customer site.

As shown, the Track provides YieldStar with a wafer and, through the IMM software module as a common interface, with the ProcessJob.

Figure 5.1

5.2 Machine Control (MC)

The next diagram (Figure 5.2) shows the path of a ProcessJob object from its delivery to the Machine Control module until it reaches the processing level.

Figure 5.2


Figure 5.3 (class diagram "Machine Control"; main classes: Processor, BlockBuilder, BlockInfoCreator, BlockInfo, BlockDurationEstimator, BlockProcessor, BlockingAutoPumpQueue<BlockInfo>, TimeConsumer, InspectionTimeDurationEstimator, TargetSeriesDurationEstimator, InstructionBuilder, MeasurementBuilder, AlignmentBuilder, AcquirementBuilder, InstructionSorter, AcquireOptimizer, MachineControlAcquire, HardwareSequenceGenerator, HardwareSequenceGenerator200, Sensing, Measuring)

This model shows the path of the dataflow and the associated control flows within the Machine Control module. Note that this diagram only covers the communications required for the main success scenario. Due to the complexity of the model, the abort and exception dataflows are not shown.


Figure 5.4


Figure 5.3, on the other hand, shows the complicated overview of the structure of the Machine Control module in YieldStar. The main point of interest in this project is where instructions move from the Processor to the BlockProcessor, and from there to the MachineControlAcquire, Sensing and Measuring modules. These modules, in that order, capture the image from a wafer, refine the image, and measure the predefined targets from the data they receive from the measuring module.

It can be useful to know the hierarchy of the objects within YieldStar from the high-level ProcessJob to the measurement instructions. (Figure 5.4)

5.3 Data-Driven YieldStar

Among the most important models in the context of this project are the data-driven models of YieldStar. These models will be the main input for the design chapter of this report.

Models like Figure 5.2 and Figure 5.5 are not standard UML models; instead, they depict the dataflow along with the actors involved in the manipulation and production of the data.

Figure 5.6 shows the current dataflow model of YieldStar. This model is a BPMN model with extra “call” associations that represent the control flow between different blocks inside Machine Control.

These models are the main verification tools for any future changes to the system. They are also used to communicate the changes and design decisions in this project while implementing the modifications to the behavior and structure of the software.


Figure 5.6 (dataflow diagram "Machine Control Dataflow - Big Picture"; nodes: ProcessJobManager, Processor, BlockBuilder, BlockInfoCreator, BlockProcessor, BlockingAutoPumpQueue, Aligner, MachineControlAcquire, Sensing, Measurer, LOTControl, DB, Reporting, Storage; data: WPI, Recipe, BlockInfo; control flow: «call» associations)


6. System Design

This chapter starts with a short contract, including definitions and terms regarding dataflow systems. The connection between those concepts and YS is also described through structural and behavioral diagrams.

By the end of this chapter the reader should be aware of the different patterns of dataflow within YS and the way they are addressed in the design. This is achieved by describing scenarios in MC and parameterizing their function and behavior in order to identify the pattern. In addition, we introduce solutions based on Rx and/or TPL Dataflow for each pattern. Please note that, due to the explicit system requirement of “keeping the ASML specific code to the minimum”, there has been an effort to keep the solutions and implementation as generic as possible.

6.1 Introduction

System Design is indeed a very general name for this chapter. Due to the nature of the project, defining the problem required much field research and included a number of decision-making processes. Therefore, I decided not only to introduce and verify my designs but also to describe how they relate to the requirements. In other words, this chapter answers the question: why do I think that a certain part of YS should be the target of my design?

There are two influential ideas behind most of the design decisions: YS as a dataflow system, and actor-based programming. This means that in every situation that requires a design decision there are two centers of attention: I have to identify the parameters of the dataflow as well as the behavior of the nodes (actors).

One of the simplest ways of presenting a dataflow can be as follows:

Figure 6.1

As shown in Figure 6.1, A sends data to B. Let us call them nodes. The data travels from node A to node B in the form of separate packs of data, which we call messages in this context. Messages can be of any data type. In this project, messages are standard YieldStar objects. Of course, there are many variants of this simple model, which differ in certain features.

There are many different criteria in the literature for categorizing dataflow systems. However, in order to make the categorization of those variants more relevant to the context of this project, a limited set of features was selected, based on the requirements of the project as well as on an investigation of common scenarios within the code.

In table 6.1 you can find the criteria based on which we will differentiate dataflow scenarios within YS.


Feature | Description
Multiplicity | Defines the number of senders and receivers (nodes on each side) of a dataflow
Production/Consumption Rate Balance | In principle, the producer (e.g. node A) and the consumer (e.g. node B) do not always have the same throughput. This feature indicates whether the production/consumption rate is balanced or not
Degree of parallelism | Defines the number of actors that are working on the task in the same node
Message Size * | Indicates if the messages contain big chunks of data (<0.5 MB)
Direction | Whether the communication between the nodes is unidirectional or bidirectional

Table 6.1 – Dataflow properties in YieldStar

(*) Message Size is neither at the same level of abstraction as the other parameters nor independent of them. That is, wherever the throughput is balanced, the Message Size is no longer an influential factor. However, it can lead to failures in future designs if the reader of this report does not consider this factor in the early stages. Rx makes the Observer queues transparent to the user, but any system still has its limitations in terms of memory. So if a process is allocated a fixed amount of memory, it is wise to think about the Message Size in advance. Please note that the measure for the size of the message absolutely depends on the

Note: Our entry point for the design is to look at YS with a focus on data circulation within the software. That is mainly because of the nature of Rx and TDF: both support data and event streams, asynchronous processing, and different methods of pushing data around. So if there is a way to find a common pattern where we can use Rx and/or TDF, it is to look at YS in the same fashion.

6.2 The Data-Driven Patterns

6.2.1. The Simple Producer/Consumer scenario

Normally, in this kind of scenario an instance of class A tries to make a piece of data available to class B. Both classes work on their pre-assigned threads. The direction of the communication is from A to B. Last but not least, A and B are balanced in terms of throughput in this case.

In the current implementation of YieldStar this is usually done using so-called “SignalQueues”. This is a classic way of object communication, where a signal queue notifies the receiver of the data as soon as the message is added to the queue.

Let us explore this scenario in depth using an example from YieldStar.

BlockInfo and BlockingAutoPumpQueue

Figure 6.2 (class diagram "Machine Control": BlockBuilder with CreateBlocks, BlockingAutoPumpQueue<BlockInfo>, BlockProcessor with ProcessInstructionBlock; stereotypes «queue block» and «process block»)


[Dataflow diagram "Machine Control Dataflow - Big Picture": ProcessJobManager, Processor, BlockBuilder, BlockInfoCreator, BlockProcessor, BlockingAutoPumpQueue, LOTControl; data: WPI, Recipe, BlockInfo]

BlockInfo: In order to avoid memory problems, measurement targets on a wafer are grouped together in ‘blocks’. The ‘BlockBuilder’ does the creation of blocks. This block builder creates all necessary blocks where every block contains the required instructions.

As you can see in Figure 6.2, BlockBuilder queues the BlockInfos in the BlockingAutoPumpQueue in order to be processed by the BlockProcessor. One can clearly notice the flow of BlockInfos from BlockBuilder (a.k.a. producer) to BlockProcessor (a.k.a. consumer). This process happens at the heart of the MC module, inside the Processor class.

Figure 6.3 depicts the dataflow of the current scenario. After receiving the WPI, the Processor passes the recipe to the BlockBuilder. The BlockBuilder then produces the instructions and groups them into BlockInfos.

So the next step is to address this scenario with an Rx design.

With Rx, creating such a pipeline can be as simple as follows:

• Implementing an IObservable<BlockInfo> interface for both the producer and the consumer class

• Subscribing the appropriate functions to the IObservable<BlockInfo> interface on the consumer side, remembering that only one item will be processed at a time, since Rx delivers notifications sequentially and introduces no concurrency by default.

• Using a Subject<BlockInfo> on the producer side in order to inject newly produced BlockInfos into the data sequence.

• Assigning the Subject to the IObservable<BlockInfo> interface using the AsObservable() extension method

Note: In order to maintain the integrity of the sequence and to make sure that no class other than the producer can manipulate it, Subjects should not be accessible outside the boundaries of the producer.
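The steps above can be sketched as follows. This is a minimal illustration rather than the YieldStar implementation, and it assumes that the System.Reactive package is referenced. The BlockInfo class is a stand-in with a hypothetical Id property, and BlockProducer and BlockConsumer are illustrative names.

```csharp
using System;
using System.Collections.Generic;
using System.Reactive.Linq;
using System.Reactive.Subjects;

// Stand-in for the YieldStar BlockInfo type (hypothetical shape).
public sealed class BlockInfo
{
    public BlockInfo(int id) { Id = id; }
    public int Id { get; }
}

// Producer: owns the Subject privately and exposes only IObservable<BlockInfo>.
public sealed class BlockProducer
{
    private readonly Subject<BlockInfo> m_Blocks = new Subject<BlockInfo>();

    // AsObservable() hides the Subject, so subscribers cannot cast the
    // sequence back and call OnNext themselves.
    public IObservable<BlockInfo> Blocks
    {
        get { return m_Blocks.AsObservable(); }
    }

    // Injects 'count' newly produced blocks into the sequence, then completes it.
    public void Produce(int count)
    {
        for (int i = 0; i < count; i++)
        {
            m_Blocks.OnNext(new BlockInfo(i));
        }
        m_Blocks.OnCompleted();
    }
}

// Consumer: subscribes its processing function to the sequence.
public sealed class BlockConsumer
{
    public List<int> Processed { get; } = new List<int>();
    public bool Completed { get; private set; }

    public IDisposable Attach(IObservable<BlockInfo> blocks)
    {
        return blocks.Subscribe(
            block => Processed.Add(block.Id),   // one block at a time
            () => Completed = true);
    }
}
```

A caller would create both objects, pass producer.Blocks to consumer.Attach(...), and then call producer.Produce(...). Because the Subject never leaves BlockProducer, only the producer can push items, signal completion, or post an error.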

Design Decision: Using Subjects Vs. Observable.Create()

Generation of an observable sequence covers the complicated aspects of functional programming, i.e. corecursion and unfold. You can also start a sequence by simply making a transition from an existing synchronous or asynchronous paradigm into the Rx paradigm.

Figure 6.3

Subjects are the mutable variables of Rx. So, in general, using Subjects is sometimes a way to cop out of thinking functionally, and if you use them too much, you are trying to row upstream. However, Subjects are extremely useful when you are interfacing with the non-functional world of .NET, for instance when wrapping an event or callback method, or when trying to put an Rx "interface" onto some existing code.

Of course, Observable.Create() is also a good way to wrap the transition from an IEnumerable to an IObservable. Still, it does not provide the same flexibility when other private modules of the same class need to post an exception, or to overrule the production function and stop the process. This may seem odd at first glance, but it is a very common scenario in YieldStar!

One Step Further

So far so good: we have the first scenario! But as I looked further into the structure and behavior of the MC module, I realized there is another underlying dataflow pattern. As shown in Figure 9.5, the BlockBuilder creates the instructions and puts them in a number of BlockInfos. These blocks are then pushed into a queue, waiting to be processed by the BlockProcessor, where they are broken down into instructions again. That is, behind the BlockInfo structure there are streams of instructions. So I took one step further and started to change the structure in order to prove my idea.

As shown in Figure 6.4, all the BlockInfo-related classes were removed, and instead of BlockBuilder we now have the InstructionProducer. InstructionProducer provides IObservable<T> interfaces of instructions. The example below shows one of the instruction interfaces of InstructionProducer, which exposes an interface for AngledAcquirements:

public IObservable<AngledAcquirements> AngledAcquirementsSeq
{
    get { return m_AcquirementBuilder.AaSeq; }
}

Figure 6.4 (class diagram "Machine Control - Redesign": Processor with SenseAndMeasureWafer; Instruction Processor with ProcessInstruction; InstructionBuilder with BuildInstructions; InstructionProducer with CreateBlocksForEstimation, CreateBlocksForProduction, GetAngledAcquirements_S_xx0, GetAngledAcquirements_T_xx0)


Figure 6.5 (class diagram "Machine Control": BlockBuilder, BlockingAutoPumpQueue<BlockInfo>, BlockProcessor, InstructionBuilder, MeasurementBuilder, AlignmentBuilder, AcquirementBuilder, InstructionSorter, AcquireOptimizer, MachineControlAcquire, HardwareSequenceGenerator, HardwareSequenceGenerator200, Measuring, Sensing, IIlluminationControl, MotionController, IPLC; stereotypes include «queue block», «process block», «queue PLC commands», «queue stage moves» and «aperture actions»)


Figure 6.6 shows the new dataflow diagram of the same scenario. The design is now simpler and more intuitive, and it also conforms to the idea of actor-based programming, where the Processor monitors and controls the dataflow between InstructionProducer and InstructionHandler.

This transition will look like this in the implementation:

public class Processor : SwCtrlBase, IProcessor, IDataSink
{
    …
    private void SenseAndMeasureWafer(IStarsContext waferCtx,
        ProcessWaferAdministration pwa, WaitHandle abortSignal)
    {
        …
        // Making an instance of InstructionHandler and InstructionProducer
        var instructionConsumer = new InstructionHandler(m_Aligner, m_Acquire,
            m_SensingCmdControl, m_MeasuringSwController,
            m_MeasuringIdleTimerCurrentWaferNonLastBlock, pwa, abortSignal,
            m_EarlyWarningAdministration);
        var instructionProducer = new InstructionProducer(waferCtx, wpi.Recipe,
            wpi.SampleSchemeList, wpi.IsASACFinished,
            SwCtrlFactory.Instance.PEPProvider,
            SwCtrlFactory.Instance.SensingConfig,
            SwCtrlFactory.Instance.SensingConfig.GetAvailableSensingLights(),
            pwa.AdvFineAlignmentInput);

        // Plug in the interfaces
        instructionConsumer.RegisterInstructionSequences(temp,
            instructionProducer.SensIns, instructionProducer.MeasureIns,
            instructionProducer.SensSkipIns);
        instructionConsumer.SubscribeASMRModules();

        // Initiating the dataflow
        instructionProducer.ProduceInstructiuons();
        …
    }
    …
}

Figure 6.6 (dataflow diagram "Redesign Dataflow": ProcessJobManager, Processor, LOTControl, InstructionProducer, InstructionHandler; data: WPI, Recipe, Instructions; control flow: «call» associations)


6.2.2. The Slow Consumer

This scenario has one important difference: the throughput of A and B is not balanced. This has a significant effect on the implementation if the messages are large. The effect is more visible when we talk in terms of an Rx implementation.

A slow consumer in the Rx world means that the OnNext(x) calls are received and pushed into an internal queue behind the scenes, waiting to be observed. In order to control the scheduling of the Observers on a sequence, Rx introduces the ObserveOn() extension method. This method is especially useful if you want to apply a certain concurrency policy through a customized or generic Scheduler object. The main concern is that if the internal queue of the observers is unbounded, this may lead to memory problems, for example with 1000 queued messages of 1 MB each!
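A rough model makes the concern concrete. Assuming, for illustration only, a constant production rate $r_p$ and a constant consumption rate $r_c$ (in messages per second) with $r_p > r_c$, and a fixed message size $s$, the backlog in an unbounded queue and its memory footprint after $t$ seconds are approximately:

```latex
Q(t) = (r_p - r_c)\, t, \qquad M(t) = s \cdot Q(t)
```

With the figures from the text, a backlog of $Q = 1000$ messages of $s = 1\,\text{MB}$ each occupies $M = 1000\,\text{MB}$, roughly 1 GB, and the backlog keeps growing for as long as the producer stays faster than the consumer. The symbols $r_p$, $r_c$, $s$, $Q$ and $M$ are introduced here and are not part of the YieldStar models.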

As I studied YS, I noticed that for big messages there are two different situations. In the first, the producer has a uniform production rate: it produces continuously, with more or less fixed intervals between the messages. In the second, the producer has bulk production: it produces batches of messages with a certain idle time between batches.

Figure 6.7 depicts the scenario where the producer has a uniform production rate, and Figure 6.8 shows the situation where the producer has an idle time between the batches.

Backpressure the Producer:

This solution suggests that, in order to avoid memory problems, the throughput of the producer is lowered by blocking it while the consumer is still busy. With this solution, the producer will not push any further messages as long as the actors on the consumer side are still busy with previous messages. To implement this solution, an extension method for IObservable<T> was developed. This method, presented below, returns an IObservable of the same type but with a bounded Observer queue. The “maximumCount” parameter defines the maximum number of threads that are allowed to observe the sequence at the same time. It is also possible to pass a scheduler to the function. Still, passing the current thread as the scheduler and controlling the scheduling through the ObserveOn method would be the better solution, as that maintains the consistency of the Rx usage throughout the code. However, for simple scenarios with no complicated concurrency concerns, the scheduler may be set at this point.

Figure 6.7

Figure 6.8


// Requires: System, System.Threading, System.Reactive.Concurrency,
// System.Reactive.Disposables, System.Reactive.Linq
public static IObservable<TSource> Semaphore<TSource>(
    this IObservable<TSource> source,
    int maximumCount,
    TimeSpan timeout,
    IScheduler scheduler)
{
    return Observable.Create<TSource>(observer =>
    {
        var gate = new Semaphore(maximumCount, maximumCount);
        var disposables = new CompositeDisposable();

        #region Action
        // Blocks the calling (producer) thread until the gate has a free
        // slot, then schedules the action on the given scheduler.
        Action<Action> blockAndSchedule = action =>
        {
            if (!gate.WaitOne(timeout))
            {
                observer.OnError(new TimeoutException());
                return;
            }
            var schedule = new SingleAssignmentDisposable();
            disposables.Add(schedule);
            schedule.Disposable = scheduler.Schedule(() =>
            {
                action();
                gate.Release();
                disposables.Remove(schedule);
            });
        };
        #endregion

        try
        {
            var subscription = source.Subscribe(
                value => blockAndSchedule(() => observer.OnNext(value)),
                ex => blockAndSchedule(() => observer.OnError(ex)),
                () => blockAndSchedule(() => observer.OnCompleted()));
            disposables.Add(subscription);
            disposables.Add(gate);
            return disposables;
        }
        catch
        {
            gate.Dispose();
            throw;
        }
    });
}

Design Decision: Having two solutions based on production rate.

It seems beneficial to address the above-mentioned scenarios in different ways when implementing the solutions. In the Backpressure scenario, the consumer may stop a producer in the middle of a batch and still suffer the idle time of the production, unless we provide a different solution that buffers the whole batch and uses the idle time of the producer to consume the messages. Of course, in the case of large messages there is always a bound on the size of the buffer (depending on the system specification). Otherwise we will run into the same memory problems that we have been trying to avoid.


As a result, the buffering solution does not appear to be a consistent way to address the problem. However, if the parameters of the producer are fixed and isolated, the following implementation strategy can be applied:

Design Decision: Using TDF or BlockingCollections to provide the Buffer for an Rx sequence

Both TDF and BlockingCollection are candidates for providing the buffer. In both cases the buffer can be connected to an IObservable: a TDF block through the AsObservable() extension method that TPL Dataflow provides (as shown in the code snippet below), and a BlockingCollection through GetConsumingEnumerable() combined with the ToObservable() extension method of Rx.

BufferBlock<string> instructionBuffer = new BufferBlock<string>();
var InstructionSequence = instructionBuffer.AsObservable();

In the case of TDF the implementation is a bit more complicated. That is, the messages are offered to the block; if the buffer is full, the message is reserved and the producer is blocked. This functionality is already provided by a BlockingCollection, which naturally requires less “ASML specific” coding around the solution.
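The blocking behavior of a bounded BlockingCollection can be sketched with the .NET base class library alone. The example below is an illustration under simplifying assumptions: the messages are plain integers instead of YieldStar instructions, the slow consumer is simulated with a short sleep, and BoundedBufferSketch is a hypothetical name.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class BoundedBufferSketch
{
    // Runs a fast producer against a slow consumer through a bounded buffer.
    // Returns the largest number of messages the producer observed in the buffer.
    public static int Run(int messageCount, int capacity, List<int> consumed)
    {
        int maxQueued = 0;
        using (var buffer = new BlockingCollection<int>(capacity))
        {
            var consumer = Task.Run(() =>
            {
                // The enumeration ends once CompleteAdding() has been called
                // and the buffer has been drained.
                foreach (int message in buffer.GetConsumingEnumerable())
                {
                    Thread.Sleep(5); // simulate a slow consumer
                    consumed.Add(message);
                }
            });

            for (int i = 0; i < messageCount; i++)
            {
                // Add() blocks while the buffer already holds 'capacity' items,
                // which is what pushes back on the producer.
                buffer.Add(i);
                maxQueued = Math.Max(maxQueued, buffer.Count);
            }

            buffer.CompleteAdding();
            consumer.Wait();
        }
        return maxQueued;
    }
}
```

Calling BoundedBufferSketch.Run(20, 3, results) delivers all 20 messages in order while the buffer never holds more than 3 of them, so the memory used by waiting messages stays bounded regardless of how much faster the producer is.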

As part of the investigation to find the optimal solution for this scenario, I communicated with Lee Campbell, the author of “Introduction to Rx”, one of the main references of this project. The question was how to deal with a slow consumer when a huge number of big messages is being produced.

Here is the quotation of the answer from Lee Campbell:

“Your assumption is correct. The ObserveOn operator has an internal queue so that it can safely serialize your OnNext calls onto the provided scheduler.

As you said, this is a classic problem. You have many options to choose from:

1. Enqueue values that are produced faster than the consumer can process them, which is what ObserveOn is doing. (Buffer)

2. You might only take the most recent value, or aggregate all buffered data (count/sum/average) or apply another algorithm to deal with bursts of data. (Conflation)

3. Block the producer to stop it producing data. (Backpressure)

Each of these will have different impacts on your application and some are more appropriate for various solutions. Currently Rx doesn't support the Backpressure option. I would suggest that if you did want the backpressure option, then you may want to consider a different technology to Rx, maybe the Disruptor of the Blocking Collections from TPL. If you really want to stick with Rx, then I would suggest creating your own version of the ObserveOn operator that still takes an IScheduler but also takes an integer for the max queue length.”

We are already familiar with the first and the third solution. The second solution is not applicable in the case of YieldStar, for the following reasons:

• “…You might only take the most recent value…” is not possible, as we cannot drop any of the messages without compromising the integrity and accuracy of the measurements

• “… aggregate all buffered data (count/sum/average)” requires intensive software changes in the measurement modules to recognize bulk messages, and adds extra processing load and ASML specific code.


6.2.3. The Backward Channel

In dataflow systems, a backward channel is a channel that passes data in the direction opposite to that of its associated forward channel. The backward channel is usually used for the transmission of request, supervisory, acknowledgement, or error-control signals. The direction of flow of these signals is opposite to that in which user information is being transferred. The backward-channel bandwidth is usually less than that of the primary channel, that is, the forward (user information) channel.

One can consider this scenario as the bidirectional version of the simple or any other scenarios. The following scenario will help you to understand why this is important in this context.

Currently the same functionality is provided in YieldStar by the means of passing delegates and using exceptions as a feedback channel.

Passing delegates creates an unnecessary dependency between the blocks at the higher abstraction levels and other classes at lower levels of the system. As for exceptions, it is strongly advised not to use them for this purpose. Not only are exceptions slow (as the name implies, they are meant to be used only in exceptional cases), but a lot of try/catch blocks also make the code harder to follow. Proper class design can accommodate common return values. If you really need to return data as an exception, your method is probably doing too much and needs to be split.

Our Options in Rx:

• Using recursion + IObservable<T>

With this method, we first have to implement an extension method for the IObservable interface. This "feedback" method requires a produce function and a feedback function. The complexity of implementing recursion in Rx is high, and it introduces a risk of memory overflow. After all, it would also be considered ASML specific code, which does not conform to the objectives of the project.

• Using In/Out IObservable<T> interfaces

With this method, the parent class can subscribe to the "out" IObservable interface in order to manage any feedback from lower layers. This solution seems to be very generic and the closest to the dataflow way of thinking.

• Using ISubject<T>

With this approach, the lower layers of the pipeline can also call the OnCompleted() and OnError() methods. This may seem convenient, but it risks the integrity of the dataflow throughout the different layers. It also adds considerable complexity to our implementation, since any actor in the software can now call those methods and we have to provide a mechanism to manage the calls from different components.

The second solution seems to be the best choice, as it provides a closer realization of the scenario and also guarantees the integrity of the data sequence. As shown in the example code, the consumer also provides an “out” channel to which the producer can subscribe. Any change in the behavior of the producer regarding the feedback will therefore not affect the design of the rest of the classes.
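The In/Out approach can be sketched as follows. This is an illustrative sketch, assuming that the System.Reactive package is referenced. Instructions are plain strings here, Notification is a hypothetical stand-in for the feedback message type, and InstructionConsumer is an illustrative name.

```csharp
using System;
using System.Collections.Generic;
using System.Reactive.Linq;
using System.Reactive.Subjects;

// Hypothetical feedback message used only in this sketch.
public sealed class Notification
{
    public Notification(string text) { Text = text; }
    public string Text { get; }
}

// Consumer: receives the forward ("in") sequence and exposes a backward ("out") one.
public sealed class InstructionConsumer : IDisposable
{
    private readonly Subject<Notification> m_Notifications = new Subject<Notification>();
    private IDisposable m_Subscription;

    // Out channel: a read-only view, so only the consumer can publish feedback.
    public IObservable<Notification> Notifications
    {
        get { return m_Notifications.AsObservable(); }
    }

    // In channel: every outcome on the forward sequence is reported as feedback
    // instead of being thrown back to the caller as an exception.
    public void Register(IObservable<string> instructions)
    {
        m_Subscription = instructions.Subscribe(
            instruction => m_Notifications.OnNext(new Notification("done: " + instruction)),
            error => m_Notifications.OnNext(new Notification("failed: " + error.Message)),
            () => m_Notifications.OnNext(new Notification("all instructions processed")));
    }

    public void Dispose()
    {
        if (m_Subscription != null) m_Subscription.Dispose();
        m_Notifications.Dispose();
    }
}
```

The parent or producer subscribes to Notifications before calling Register(...), and can then react to feedback without the consumer holding a delegate to it. The forward sequence itself remains untouched by the lower layers.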


(*) An interesting part is how abort is handled: a WaitHandle is passed through the functions and can be signaled at any point. The parent object passes a WaitHandle so that it can stop everything at once; in return it receives an AbortException, which confirms that the abort worked and lets it react upon receiving the exception.

Example: As a part of the wafer processing procedure, ProcessJobManager is notified whenever the measurements of a wafer (WPI) are finished.

6.2.4. Pipeline extensions

This was one of the most interesting scenarios within YS. It shows how Rx sequences can be extended. For instance, under a certain condition we may want to send the data from A to C and then from C to B. That means we have to provide a forking mechanism for the dataflow that also guards a predefined condition. The Early Warning functionality in the (integrated) YieldStar is a good example of this feature.

Early Warning: The integrated YieldStar must send a pre-ready-to-unload trigger (a.k.a. ‘early warning’ signal) to the track, so that the track can prepare the next wafer for a quick wafer exchange with the current wafer.

The track material handling robot uses the pre-warning signal to move to the next wafer for YieldStar (e.g. stored in a buffer position) more efficiently. It picks the wafer and moves to the YieldStar module. This takes about 7 seconds in total; without pre-warning, this time would be on the critical path (for YieldStar).

Figure 6.9 (timing diagram of the wafer exchange, time axis from -7 to 0; track steps: move, pick, move, sync, exchange; step durations: 2s, 2s, 1s, 2s; YieldStar steps: move-acquiring, prep, exchange; markers: pre-warning, pre-warning time, wafer exchange)

Example code for the backward channel scenario (Section 6.2.3):

internal BlockProcessor(
    IAlign aligner,
    IImageSequence acquire,
    ISensingCmdControl sensing,
    IMeasuringSwController measurer,
    …
    ProcessWaferAdministration pwa,
    WaitHandle abortSignal, (*)
    EarlyWarningAdministration earlyWarningAdministration)

public void RegisterInstructionSequences(
    IObservable<AngledAcquirements> aa,
    IObservable<ImageAcquisitionData> iad,
    IObservable<MeasurementInstruction> mi,
    IObservable<SkippedSensorDataID> ssid,
    out IObservable<WPINotifications> notificationObservable)

[Dataflow diagram "EarlyWarning DFD": Processor, MachineControlAcquire]

Figure 6.10 shows the classes that are involved in the EarlyWarnning scenario. The BlockDurationEstimator marks the instruction on which the EarlyWarning should be triggered during the execution. So regardless of the fact whether the code is running on an integrated for stand-alone machine this estimation is happening in the Block-Builder. That is an unnecessary dependency between the producer and consumer of the message. Also it makes the Machine Controlling module dependent to the type of the machine.

Ideally, the consumer of the instructions should be concerned with this scenario. However, the duration estimation has to be applied to all instructions in order to find the trigger-instruction. A good way of achieving the same functionality in a dataflow model is therefore to send the instructions to another filter and from there to the consumer module. Figure 6.11 shows the standard flow of the instructions from the Processor to the very first consumer, MachineControlAcquire. The extension is just a small detour to another module before the instructions are executed (Figure 6.12).

[Figure 6.10: class diagram "Machine Control". Classes: BlockInfo (AngledAcquirements, ImageAcquisitionSkips, MeasurementInstructions; EstimateBlockDuration), BlockDurationEstimator (BlockDuration: double; Estimate), TimeConsumer (UpdateConsumerDuration), MoveAcquirementTimeConsumer, AlignmentTimeConsumer, AcquisitionCompleteToWaferReadyUnloadTimeConsumer. One association carries the multiplicities 1 and 0..*.]

[Figure 6.11: dataflow diagram "EarlyWarning DFD" with the nodes Processor and MachineControlAcquire.]


In order to make the above-mentioned changes to the MC module, the structure of the code had to change as well. Conceptually, the BlockDurationEstimator should not be a part of the instruction production process. Based on our dataflow model, it can be one of the sub-modules of the Processor. The BlockDurationEstimator was therefore renamed to SequenceDurationEstimator and added directly to the Processor class (Figure 6.13). Fortunately, the internal design of the BlockDurationEstimator allows us to use it as an independent module. In the new structure, the SequenceDurationEstimator acts as if the whole set of instructions were one big BlockInfo. The number of instructions can reach 100,000; the SequenceDurationEstimator still goes through all of them and marks the trigger-instruction.
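The marking step can be sketched as a single backward pass over the estimated durations. The sketch below is not the YieldStar implementation: the Instruction type, its members, and the rule for choosing the trigger (the last instruction at whose start at least the pre-warning time of estimated work remains) are assumptions made for illustration only.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical instruction type; the real YieldStar classes are not shown in this report.
public sealed class Instruction
{
    public Instruction(string name, double estimatedSeconds)
    {
        Name = name;
        EstimatedSeconds = estimatedSeconds;
    }

    public string Name { get; }
    public double EstimatedSeconds { get; }
    public bool IsEarlyWarningTrigger { get; set; }
}

public static class SequenceDurationSketch
{
    // Assumed rule: mark the last instruction at whose start at least
    // preWarningSeconds of estimated work remain. Returns its index, or -1
    // if the whole sequence is shorter than the pre-warning time.
    public static int MarkTrigger(IReadOnlyList<Instruction> instructions, double preWarningSeconds)
    {
        double remaining = 0.0;
        // Walk backwards so that 'remaining' is the estimated time from the
        // start of instruction i to the end of the sequence.
        for (int i = instructions.Count - 1; i >= 0; i--)
        {
            remaining += instructions[i].EstimatedSeconds;
            if (remaining >= preWarningSeconds)
            {
                instructions[i].IsEarlyWarningTrigger = true;
                return i;
            }
        }
        return -1;
    }
}
```

Because the pass is linear, a sequence of 100,000 instructions is traversed once, which matches the "goes through all of them" behavior described above.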

The key point of the implementation phase is simply registering the proper data sequence with the consumer. The code below is the same registration function as in the first scenario, with one small difference: when the system is integrated, the data sequence is first sent to another function, which handles the subscriptions of the SequenceDurationEstimator and returns the same sequence with the trigger-instruction marked.

// On an integrated machine the acquisition instructions take the detour through MarkInsObservable.
instructionConsumer.RegisterInstructionSequences(
    m_IsIntegrated ? MarkInsObservable(instructionProducer.AcqIns) : instructionProducer.AcqIns,
    instructionProducer.SensIns,
    instructionProducer.MeasureIns,
    instructionProducer.SensSkipIns);
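A minimal sketch of how such a guarded detour could look with Rx operators is given below. Only the names MarkInsObservable and the integrated/stand-alone switch come from the code above; the AcqInstruction type, the triggerId parameter and the SelectAcqSequence and FlaggedIds helpers are hypothetical, and the real function delegates to the SequenceDurationEstimator rather than comparing identifiers. The sketch assumes the System.Reactive package is referenced.

```csharp
using System;
using System.Collections.Generic;
using System.Reactive.Linq;

// Hypothetical stand-in for an acquisition instruction.
public sealed class AcqInstruction
{
    public AcqInstruction(int id, bool isTrigger = false)
    {
        Id = id;
        IsTrigger = isTrigger;
    }

    public int Id { get; }
    public bool IsTrigger { get; }
}

public static class EarlyWarningSketch
{
    // Returns the same sequence, with the instruction identified by triggerId
    // re-emitted as a copy that is flagged as the trigger-instruction.
    public static IObservable<AcqInstruction> MarkInsObservable(
        IObservable<AcqInstruction> source, int triggerId)
    {
        return source.Select(ins =>
            ins.Id == triggerId ? new AcqInstruction(ins.Id, true) : ins);
    }

    // The guard: only an integrated machine takes the detour through the marker.
    public static IObservable<AcqInstruction> SelectAcqSequence(
        bool isIntegrated, IObservable<AcqInstruction> acqIns, int triggerId)
    {
        return isIntegrated ? MarkInsObservable(acqIns, triggerId) : acqIns;
    }

    // Helper for trying the sketch: collects the ids that arrive flagged.
    public static IList<int> FlaggedIds(bool isIntegrated)
    {
        var acqIns = Observable.Range(1, 5).Select(id => new AcqInstruction(id));
        return SelectAcqSequence(isIntegrated, acqIns, 4)
            .Where(ins => ins.IsTrigger)
            .Select(ins => ins.Id)
            .ToList()
            .Wait();
    }
}
```

The consumer receives an IObservable of the same element type in both cases, so it does not need to know whether the detour was taken.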

[Figure 6.12: dataflow diagram "EarlyWarning DFD" with the nodes Processor, EarlyWarningHandler and MachineControlAcquire.]

[Figure 6.13: class diagram "Machine Control - Redesign". Classes: Processor (SenseAndMeasureWafer), SequenceDurationEstimator (BlockDuration: double; Estimate), Instruction Processor (ProcessInstruction), InstructionProducer (CreateBlocksForEstimation, CreateBlocksForProduction, GetAngledAcquirements_S_xx0, GetAngledAcquirements_T_xx0).]

7. Project Management

In this chapter I explain the course of my actions throughout the project and, more importantly, the reasons behind them. I also make explicit how they contributed to the progress of the project. In addition, the actual data of the project in terms of time and effort are presented in concrete figures. This chapter is an input for further developments around this topic.

7.1 Introduction

Project management of a feasibility study can be quite different from that of other types of projects. Of course, the basic concepts of project management, such as work breakdown, time management and risk analysis, also apply in this case. As for the risk analysis, there is not much to say beyond the political and/or technical issues that may be an obstacle to this feasibility study. Those risks are described in the “Problem Analysis” chapter of this report.

Management of this project followed a RUP-based approach to the work breakdown. The iterative nature of RUP fits best with the type of the project, i.e. a feasibility study.

Figure 7.1 depicts the five main product efforts of each iteration. Inside each circle you can see the general list of tasks for each phase. Naturally, some of the tasks are not executed in every iteration. For instance, setting up the YieldStar may happen only once. The extension of the scenarios mostly depended on ASML’s points of interest. These extensions were decided in a few meetings, following the presentation or description of the results of the previous phase.

Figure 7.1


7.2 Work-Breakdown Structure (WBS)

The outline of the work breakdown is given below.

Understanding the problem:

• System (YieldStar) Overview

• Technology review

• Current Implementation

• Structural

• Behavioral

• Modeling the problem

• Dataflow models

• Class models

• Interaction models

Designing the Solution

• Defining the scope

• Identifying the impact points

• Modeling the target design

• Defining the implementation spec.

Prototyping

• Implementation Plan

o Target Scenarios

o Affected components

• Extreme Programming (XP)

Testing and Improvement

7.2.2. Activity-Cost Estimations

At the beginning of the project, the very first problem seemed to be understanding the design problem, getting familiar with the YieldStar software and reviewing the literature. The initial estimate was that 40% of the project time would be spent on the literature review and on clarifying the problem. By the end of the project, the planning results showed that 60% of the time had actually gone to such tasks. The remaining 40% was spent on implementation and documentation. This was mainly due to the vague nature of the problem: in the initial phase, the literature study and the code investigation did not have a concrete goal, because I first had to figure out which applications of those technologies would fit the behavior of the YS software. The literature study therefore had two phases: first deciding what to read, and afterwards reading and understanding the actual material.


7.3 Project Planning Tools

A main concern in the management of this project was the communication between the two stakeholders of the project. This requires a defined way of working in which the progress and the state of the project are available to both parties whenever they need them. Assembla was chosen as the tool.

Assembla is a set of cloud-based task and code management tools for software developers. Assembla hosts over 100,000 commercial and open-source projects and is used by over 800,000 users in more than 100 countries.

I mainly used Assembla for the following purposes:

• Sharing the milestones and tickets (tasks) with the supervisors, so that they are aware of the tasks in hand and of the progress of the project plan

• Sharing documents and solution files

• Using the Git repository for the documentation and mockup development

Using Assembla helped me keep track of the activities. However, it was not very attractive to the team, as it has a steep learning curve in some cases. Still, the file sharing and the repository were quite useful.

When dealing with new ideas and concepts, as was the case here, mind maps can be very effective for keeping track of new ideas and of the decisions made along the way, especially in the first phase of the literature review, when the path of the research is still very unclear.

7.4 Feasibility Study

The first thing to say is that the project itself is a feasibility study. Naturally, doing a feasibility study for a feasibility study did not seem to bring much value to the report. Therefore, in a meeting with the supervisors of the project, it was decided to use this part of the report to describe the feasibility of applying the selected technologies to the YS project.

Here is the overview of my activities:

• Two mockup applications were developed

• Special behaviors of YS were investigated

• Pre-conditions of possible solutions were explored and/or tested.

7.4.1. Mockups

The mockups, or lite-implementations, are two C# solutions that provide the same blocks as the Machine Control module. These blocks perform pseudo tasks instead of the actual processing. A delay is added to all the functionalities in order to test the scheduling features of Rx and TDF.

In the mockups, Rx and TDF were implemented strictly separately. The sole purpose was to find out whether either of the two can meet all the requirements on its own and, if not (which was the case for TDF), how the other can be used to aid the design.

The reason I developed the mockups is that the YieldStar solution is fairly complicated and, in many parts, fragile to changes. Applying the changes to the actual software and testing the functionality and behavior on the YieldStar simulator can therefore be unacceptably time consuming. Additionally, these two applications served as a common language between me and the stakeholders of the project: I could verify my design ideas and their expected behaviors. If the behavior and the functionality were as expected, I could try the same code on the same blocks of the actual YS code base. Adhering to the naming and coding conventions in my mockups thus made them, to some extent, reusable source code for the actual implementation.


7.4.2. Conclusions

TDF, as it appears in the literature, represents a lower level of abstraction than Rx. TDF is mostly useful when there are a number of parallel tasks that communicate with each other. Although concurrency is not the main concern in this project, TDF blocks still seem very useful for managing the internal functions of the classes. For creating the pipeline between different classes, Rx provides more efficient interfacing, which allows easy integration of the library into legacy code. The processing blocks in this project are much more complex and heavier than the fundamental block that TDF uses for interfacing. Moreover, it is not a good idea to pass the blocks around to different classes, because that would undermine the goal of the project regarding the separation of the dataflow and the control flow. Of course, TDF is meant to create blocks with data flowing between them, but those blocks are quite fundamental and have a single task running internally.

7.5 Project Plan

Figures 7.2 and 7.3 present, in that order, the previous and the updated version of the project timeline. There were more versions of the project plan, but these two diagrams depict the two major ones. There are three main differences between the two plans.

First of all, another two weeks were spent on the lite-implementations in order to expand the scenarios they cover and to make the code reusable for the next milestone. Moreover, having complete mockups makes it very quick to produce the design models and the implementation plan. Therefore, the remaining time for the next point of action seems more reasonable.

“Prototype Plus Abort Scenarios” was omitted from the plan because it was already covered in the previous milestone. Given the facilities that Rx provides, such as the OnCompleted and OnError channels, it was easier to consider and implement the abort and error scenarios during the implementation of the main success scenarios.
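How the three Rx channels let the success and abort paths share one subscription can be sketched as follows. The instruction names and the use of OperationCanceledException as the abort signal are assumptions for illustration, not the YieldStar abort mechanism; Subject is used only to push values by hand, and the sketch assumes the System.Reactive package is referenced.

```csharp
using System;
using System.Collections.Generic;
using System.Reactive.Subjects;

public static class AbortChannelSketch
{
    // Pushes a short instruction sequence to one subscriber and returns
    // what the subscriber logged on each of the three channels.
    public static List<string> Run(bool abort)
    {
        var log = new List<string>();
        using (var instructions = new Subject<string>())
        using (instructions.Subscribe(
            ins => log.Add("execute " + ins),          // OnNext: main success scenario
            ex => log.Add("aborted: " + ex.Message),   // OnError: abort or error scenario
            () => log.Add("completed")))               // OnCompleted: normal end of the sequence
        {
            instructions.OnNext("align");
            if (abort)
            {
                // Hypothetical abort: the sequence terminates on the error channel.
                instructions.OnError(new OperationCanceledException("abort requested"));
            }
            else
            {
                instructions.OnNext("acquire");
                instructions.OnCompleted();
            }
        }
        return log;
    }
}
```

The handlers for the abort and completion cases sit next to the success handler, which is why these scenarios could be covered while the main success scenarios were being implemented.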

Due to the extension of the requirements, and with the agreement of the supervisors of the project, the delivery of the prototype was delayed. This delivery takes the form of an independent, reusable Machine Control module that can directly replace the current one in the YieldStar code base.


Figure 7.2 Figure 7.3


8. Conclusions

8.1 Validation

Throughout the project, and particularly during requirements elicitation, a number of functional and non-functional requirements were defined. These requirements were translated into quantifiable measures in order to make it possible to validate the results of the project. In this part of the report I argue how and why the results satisfy the requirements, and I justify the claims that are made regarding the measures.

In the conclusion of the Feasibility Study in the Project Management chapter, I discussed why I chose Rx over TDF for most parts of the design. The following points also support this choice:

• Rx is more popular in the developer community, judging by the variety of applications and the number of web forums that now support reactive development using Rx.

• Rx is now open source and is supported by a large community, so the project itself keeps growing.

• The Rx implementation is going to be delivered in the next service pack of YieldStar as a part of the Sensing module.

Sensing is the layer of the system closest to the hardware and to the real-time part of the system (where the size of each message, e.g. an image, may reach close to 2 MB). As a result, robustness of the software is of high importance in this module. The fact that the Rx implementation has passed the functional and performance tests is good evidence of its robustness and performance.

Next, I describe why I think Rx was a good choice and how it satisfied the system requirements.

One of the main requirements of the project was to improve the maintainability of YieldStar software. According to the System Requirements chapter “improving the maintainability” in the context of this project means:

• Separation of control-flow and data-flow

Creating pipelines and realizing the producer/consumer pattern was one of the main features that Rx provided. In Rx, the creation of a data stream and the subscription to a data stream are two separate actions. Taking the simple producer/consumer from Chapter 6 as an example, one can clearly see that in the new design the Processor only instantiates the producer (InstructionBuilder) and the consumer (InstructionHandler) and links the proper interfaces. The Processor controls and initiates the dataflow, while the calculations and tasks themselves are handled inside InstructionBuilder and InstructionHandler. This is one of the examples of Rx design in this project that show the separation between controlling the dataflow and the actual generation and processing of the data.
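The wiring described above can be illustrated with a small sketch. The class names InstructionBuilder and InstructionHandler are taken from the text; their members (Build, Produced, Register, Handled), the string payload and the ProcessorSketch class are hypothetical. The sketch assumes the System.Reactive package is referenced.

```csharp
using System;
using System.Collections.Generic;
using System.Reactive.Linq;

// Producer: only describes the data stream; nothing is generated until someone subscribes.
public sealed class InstructionBuilder
{
    public int Produced { get; private set; }

    public IObservable<string> Build(int count)
    {
        return Observable.Range(1, count).Select(i =>
        {
            Produced++;
            return "instruction " + i;
        });
    }
}

// Consumer: only knows the IObservable interface, not the producer class.
public sealed class InstructionHandler
{
    public List<string> Handled { get; } = new List<string>();

    public IDisposable Register(IObservable<string> instructions)
    {
        return instructions.Subscribe(ins => Handled.Add(ins));
    }
}

// Control flow: instantiate both sides and link the interfaces.
public static class ProcessorSketch
{
    public static (int producedBeforeSubscribe, int handled) Run()
    {
        var builder = new InstructionBuilder();
        var handler = new InstructionHandler();

        IObservable<string> stream = builder.Build(3);  // creation of the data stream
        int before = builder.Produced;                   // nothing has been produced yet
        using (handler.Register(stream))                 // subscription starts the dataflow
        {
        }
        return (before, handler.Handled.Count);
    }
}
```

In this sketch the controlling class contains no processing logic of its own: it creates the stream, hands it to the consumer and disposes the subscription.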

• Less explicit use of multi-threading

In this project I used Schedulers as the means of providing multi-threading. This can be done either by passing one of the standard schedulers exposed by the Scheduler class or by passing a custom implementation of the IScheduler interface. Any implementation of IScheduler can be passed to an Rx sequence to schedule some action to be performed, either as soon as possible or at a given point in the future. The question now is: why does using the IScheduler interface lead to less explicit use of multi-threading?

The IScheduler interface adds another level of abstraction over threads in C#. This abstraction allows low-level plumbing to remain agnostic towards the implementation of the concurrency model. When using Rx, we only have to apply the proper scheduler, with the proper behavior, to the data sequence, independently of where and how the threads are actually managed. This makes the use of multi-threading less explicit and also facilitates a more intuitive way of thinking about concurrency issues in dataflow systems.
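As an illustration of this point, the sketch below writes a pipeline once and leaves the threading decision to the caller, who passes an IScheduler. The SchedulerSketch class and the squaring step are hypothetical placeholders for real processing; ObserveOn, ToList and Wait are standard Rx operators, and the sketch assumes the System.Reactive package is referenced.

```csharp
using System;
using System.Collections.Generic;
using System.Reactive.Concurrency;
using System.Reactive.Linq;

public static class SchedulerSketch
{
    // The pipeline contains no thread handling of its own; the scheduler
    // passed by the caller decides where the downstream work runs.
    public static IList<int> SquareOn(IScheduler scheduler)
    {
        return Observable.Range(1, 5)
            .ObserveOn(scheduler)   // hand the items over to the chosen scheduler
            .Select(x => x * x)     // placeholder for the actual processing step
            .ToList()
            .Wait();                // block until the sequence has completed
    }
}
```

Calling SchedulerSketch.SquareOn(TaskPoolScheduler.Default) or SchedulerSketch.SquareOn(NewThreadScheduler.Default) runs the same pipeline on the thread pool or on a dedicated thread, without any change to the pipeline code.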

• Replacing ASML specific code with generic C# code

In the process of applying Rx to MachineControl, some changes to the structure seemed beneficial. These changes (as described in Chapter 6) supported the principles of data-driven software development. In several meetings with current developers and designers of YieldStar, most of them found the new structure more intuitive and a more natural way of describing the system.

In some cases these changes even led to the removal of entire blocks of the structure (e.g. BlockInfo, BlockProcessor, BlockingAutoPumpQueue). Considering only the MachineControl module, 6 blocks were removed, 10 blocks were modified, 7 blocks were replaced in order to support the data-driven way of design, and 2 blocks became internal parts of other blocks.

As a result, applying Rx not only replaced ASML-specific parts of the code but, at a higher level of abstraction, also led to the removal of ASML-specific structure blocks.

8.2 Technical Conclusions

• Rx can be used for applying a data-driven technology to YieldStar, and it can cover a variety of scenarios. Once the target dataflow model is approved, adding an Rx component can be very quick (at most a day for each block), but only when the changes do not include intense architectural changes.

• 6 blocks of ASML-specific code were omitted from the structure and replaced by generic functionality of Rx and TDF.

• 10 blocks were modified in order to accommodate the proposed changes.

• With the separation of dataflow and control flow, one can follow the data sequences throughout the code and the

• The main challenge of applying these technologies would be the learning curve of the libraries. Moreover, the standard usage of Rx and TDF requires the adoption of a data-driven way of thinking; otherwise it may add to the complexity of the solution.

8.3 Future Developments

• Fresh development of the scenarios discussed in the design chapter from scratch (instead of the modification and adaptation approach), in order to get the most out of the facilities that these new libraries can provide.

• Dataflow analysis of YieldStar as a data-driven system, in order to improve the performance and the throughput of the pipelines. These changes are very fundamental and will require large-scale modifications to the structure of YieldStar’s architecture.

• Providing a standard for the objects that are passed along the dataflow network within YieldStar. This will also support the previous two options by providing standard methods for memory management in all the objects involved in the main success scenarios. It will also include the implementation of certain standard Microsoft interfaces, such as IDisposable.




About the Authors

Mohsen Mehrafrouz holds a Bachelor’s degree from Sheikh Baha’i University of Isfahan in Iran. He carried out his final project on the design and development of a “Secure Web-Mail Service Using X509 Certificates”. He also worked for two years as a part-time software developer. He completed his Master’s course in Computer Security with Distinction at the University of Birmingham in 2010, where his dissertation on Android’s security aspects resulted in his own version of “Froyo” for the HTC Hero. Moreover, he released a review of the security features of SSL/TLS. From 2011 to mid-2012 he worked as RFID project manager at Iran Software and Hardware Co., where he designed and improved a few software solutions for industrial applications. Currently, he is attending the PDEng Software Technology program at Eindhoven University of Technology, and this document is the report of his final project.
