  • Static and Dynamic Modelling of Materials Forging

    Coryn A.L. Bailer-Jones, David J.C. MacKay
    Cavendish Laboratory, University of Cambridge

    email: [email protected];[email protected]

    Tanya J. Sabin and Philip J. Withers
    Department of Materials Science and Metallurgy, University of Cambridge

    email: [email protected];[email protected]

    ABSTRACT

    The ability to model the thermomechanical processing of materials is an increasingly important requirement in many areas of engineering. This is particularly true in the aerospace industry, where high material and process costs demand models that can reliably predict the microstructures of forged materials. We analyse two types of forging, cold forging in which the microstructure develops statically upon annealing, and hot forging for which it develops dynamically, and present two different models for predicting the resultant material microstructure. For the cold forging problem we employ the Gaussian process model. This probabilistic model can be seen as a generalisation of feedforward neural networks with equally powerful interpolation capabilities. However, as it lacks weights and hidden layers, it avoids ad hoc decisions regarding how complex a ‘network’ needs to be. Results are presented which demonstrate the excellent generalisation capabilities of this model. For the hot forging problem we have developed a type of recurrent neural network architecture which makes predictions of the time derivatives of state variables. This approach allows us to simultaneously model multiple time series operating on different time scales and sampled at non-constant rates. This architecture is very general and likely to be capable of modelling a wide class of dynamic systems and processes.

    1. Introduction

    The problem in the modelling of materials forging can be broadly stated as follows: given a certain material which undergoes a specified forging process, what are the final properties of this material? Typical final properties in which we are interested are the microstructural properties, such as the mean grain size and shape and the extent of grain recrystallisation. Relevant forging process control variables are the strain, strain rate and temperature, all of which may be functions of time.

    A trial-and-error approach to solving this problem has often been taken in the materials industry, with many different forging conditions attempted to achieve a given final product. The obvious drawbacks of this approach are large time and financial costs and the lack of any reliable predictive capability. Another method is to develop a parameterised, physically-motivated model, and to solve for the parameters using empirical data [1]. However, the limitation with this approach is that in terms of the physical theory the microstructural evolution depends upon several “intermediate” microscopic variables which have to be measured in order to apply the model. Some of these variables, such as dislocation density, are difficult and time-consuming to measure, making it impracticable to apply such an approach to large-scale industrial processes.

    Our approach to the prediction of forged microstructures is therefore to develop an empirical model in which we define a parameterised, non-linear relationship between the microstructural variables of interest and those easily measured process variables. Such a model could be implemented, for example, as a neural network with the hidden nodes essentially playing a role analogous to the “intermediate” microscopic variables.

    2. Materials Forging

    When a material is deformed, potential energy is put into the system by virtue of work having been done to move crystal planes relative to one another. The material is therefore not in equilibrium and has a tendency to lower its potential energy by atomic rearrangement, through the competing processes of recovery, recrystallisation and grain growth. These processes are encouraged by raising the temperature of the material (annealing). Forge deformation processes can be divided into two classes. In cold working the recrystallisation rate is so low that recrystallisation essentially does not occur during forging. Recrystallisation is subsequently achieved statically by annealing. In contrast, hot working refers to the high temperature forging of materials in which recrystallisation occurs dynamically during forging. This process is considerably more complex than

  • Fig. 1: Deformation geometries. (a) Plane-strain diametrical compression. The workpiece is subsequently sectioned into many nominally identical specimens which are annealed at different combinations of temperature and time. The compression gives rise to a non-linear distribution of strains across the specimen (see Figure 2b). These allow us to obtain many input training vectors, x, for our model using a single compression test. (b) Axisymmetric axial compression.

    cold working as now the final microstructure of the material is generally a path-dependent function of the history of the process variables. This is particularly true of the Aluminium–Magnesium alloy considered here, which has a relatively long ‘memory’ of the process, thus necessitating a model which keeps track of the history of the material. We shall consider a model for this dynamic process in Section 4.

    The ultimate goal of forge modelling is the inverse problem: given a set of desired final properties for a component, what is the optimal material and forging process which will realise these properties? This is a considerably harder problem since there may be a one-to-many mapping between the desired properties and the necessary forging process. This problem will not be addressed in this paper.

    3. Static Modelling

    Cold forging can in general be modelled with the equation

        y = f(x),    (1)

    where y is a microstructural variable, x is the set of process variables and f is some non-linear function. In our particular implementation we are interested in predicting a single microstructural variable, namely grain size, in a given material (an Al-1%Mg alloy) as a function of the total strain, \epsilon, annealing temperature, T, and annealing time, t. The experimental set-up for obtaining these data is as follows. A workpiece of the material is compressed in plane-strain compression at room temperature, as shown in Figure 1a. After the specimen has been annealed, it is etched and the grain sizes measured with an optical microscope. The local strain experienced at each point in the material is evaluated using a Finite Element (FE) model, the parameters of this model being determined by the known material properties, forging geometries, friction factors and so on. Figure 2b shows

    Fig. 2: (a) The left half of this diagram shows the microstructure of half of a sectioned specimen which has been deformed under a plane-strain compression. The material has been annealed for 30 mins, producing many recrystallised grains. (b) The right half of this diagram is the corresponding strain contour map produced by the Finite Element model. Note that the areas of high strain in (b) correspond to small grains in (a).

    an example of an FE map. Many grain sizes within a single small area are averaged to give a mean grain size. Thus we now have a set of model inputs, \epsilon, T and t, associated with a single mean grain size, which can be used to develop a static microstructural model of forging. Further details of the experimental procedure can be found in Sabin et al. [7].

    3.1. The Gaussian Process Model

    The Gaussian process model [3] [8] assumes that the prior joint probability distribution of a set of any N observations is given by an N-dimensional Gaussian, i.e.

        P(y_N \mid x_N, \mu, C_N) \propto \exp\left[ -\tfrac{1}{2} (y_N - \mu)^{\mathsf T} C_N^{-1} (y_N - \mu) \right],    (2, 3)

    where y_N = (y(x_1), y(x_2), \ldots, y(x_N))^{\mathsf T} is the set of N observations corresponding to the set of N input vectors x_N = \{x_1, x_2, \ldots, x_N\}. \mu and C_N, respectively the mean and covariance matrix of the distribution, parameterise this model. The elements of the covariance matrix are specified by the covariance function, which is a function of the input vectors x_N and a set of hyperparameters. A typical form of the covariance function is

        C_{pq} = \theta_1 \exp\left[ -\tfrac{1}{2} \sum_{i=1}^{L} \frac{(x_i^{(p)} - x_i^{(q)})^2}{r_i^2} \right] + \theta_2 + \theta_3 \delta_{pq}.    (4)

    This equation gives the covariance between any two values y_p and y_q with corresponding L-dimensional input vectors x_p and x_q respectively, and is capable of implementing a wide class of functions, f, that could appear in equation 1. (The Gaussian process model has a scalar ‘output’, y; to model several microstructural variables we would use several independent models.) The first term in equation 4 expresses our belief that the function we are modelling is smoothly varying, where r_i is the length scale over which the function varies in the i-th input dimension. The second term allows the functions to
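    As an illustration, the covariance function of equation 4 can be implemented directly. The sketch below is a minimal numpy version (not the authors' software, which was Mark Gibbs's); the names theta1, theta2, theta3 and the length scales r follow equation 4.

```python
import numpy as np

def covariance(xp, xq, r, theta1, theta2, theta3, same_index=False):
    """Covariance function of equation 4 between two input vectors.

    r[i] is the length scale of the i-th input dimension, theta1 scales
    the smooth term, theta2 is the constant offset and theta3 is the
    input-independent noise level, applied only when p == q (the
    Kronecker delta term).
    """
    smooth = theta1 * np.exp(-0.5 * np.sum(((xp - xq) / r) ** 2))
    return smooth + theta2 + (theta3 if same_index else 0.0)

def covariance_matrix(X, r, theta1, theta2, theta3):
    """Build the N x N matrix C_N for an (N, d) array of input vectors."""
    N = len(X)
    C = np.empty((N, N))
    for p in range(N):
        for q in range(N):
            C[p, q] = covariance(X[p], X[q], r, theta1, theta2, theta3,
                                 same_index=(p == q))
    return C
```

    The resulting matrix is symmetric, with every diagonal element equal to theta1 + theta2 + theta3.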

  • have a constant offset and the third is a noise term: this particular form is a model for input-independent Gaussian noise. The hyperparameters, r_i (i = 1, \ldots, L), \theta_1, \theta_2, \theta_3, specify the function, and are generally inferred from a set of training data in a fashion analogous to training a neural network. They are called hyperparameters rather than parameters because they explicitly parameterise a probability distribution rather than the function itself. This distinguishes them from weights in a neural network, which are rather “arbitrary”, in that adding another hidden node could change the weights yet leave the input–output mapping essentially unaltered.

    Once the hyperparameters are known, the probability distribution of a new predicted value, y_{N+1}, corresponding to a new ‘input’ vector, x_{N+1}, is

        P(y_{N+1} \mid y_N, x_N, x_{N+1}, C_{N+1}) \propto \exp\left[ -\frac{(y_{N+1} - \hat{y}_{N+1})^2}{2\sigma^2_{\hat{y}_{N+1}}} \right],    (5, 6)

    i.e. a one-dimensional Gaussian, where \hat{y}_{N+1} and \sigma_{\hat{y}_{N+1}} are evaluated in terms of the covariance function and the training data. We would typically report our prediction as \hat{y}_{N+1} \pm \sigma_{\hat{y}_{N+1}}. These errors reflect both the noise in the data (third term in equation 4) and the model uncertainty in interpolating the training data. The fact that the Gaussian process model naturally produces confidence intervals on its predictions is important in the materials industry, where material properties must often be specified within certain tolerances.
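    The predictive mean and error bar of equations 5 and 6 have a closed form. A compact numpy sketch, assuming a zero prior mean \mu for brevity and using the covariance function of equation 4, might look like:

```python
import numpy as np

def cov(A, B, r, th1, th2):
    """Smooth-plus-offset part of the covariance function (equation 4)
    between every row of A and every row of B."""
    d2 = np.sum(((A[:, None, :] - B[None, :, :]) / r) ** 2, axis=-1)
    return th1 * np.exp(-0.5 * d2) + th2

def gp_predict(X, y, x_new, r, th1, th2, th3):
    """Predictive mean and one-sigma error bar at a single new input
    (equations 5-6), assuming a zero prior mean; th3 is the noise term."""
    C = cov(X, X, r, th1, th2) + th3 * np.eye(len(X))      # C_N
    k = cov(X, x_new[None, :], r, th1, th2)[:, 0]          # cov(x_n, x_new)
    kappa = th1 + th2 + th3                                # cov(x_new, x_new)
    mean = k @ np.linalg.solve(C, y)
    var = kappa - k @ np.linalg.solve(C, k)
    return mean, np.sqrt(var)
```

    With a small noise level the model interpolates the training targets closely, and the predicted error bar grows as x_new moves away from the training inputs.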

    Our model assumes that the measurement noise and the prior probability of the unknown function can be described by a Gaussian distribution. In our application it is more sensible to assume that it is the logarithm of grain size which is distributed as a Gaussian, rather than the grain sizes themselves. This is because uncertainties in measuring grain size scale with the mean grain size, and are therefore more appropriately expressed as a fraction of the mean grain size rather than a fixed absolute grain size. Moreover, empirical evidence suggests that grain size distributions are well described by a log-normal distribution.
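    A practical consequence of modelling log grain size is that a symmetric Gaussian error bar in log space becomes a multiplicative (fractional) error bar on the grain size itself. A small sketch, with purely illustrative numbers:

```python
import numpy as np

def grain_size_interval(log_mean, log_std):
    """Convert a Gaussian prediction on log grain size into a grain size
    with asymmetric, multiplicative one-sigma bounds."""
    d = np.exp(log_mean)
    return d, d * np.exp(-log_std), d * np.exp(log_std)

# e.g. a predicted log grain size of ln(50) +/- 0.1 (sizes in microns,
# hypothetical) gives bounds that are a fixed fraction of the mean:
d, lo, hi = grain_size_interval(np.log(50.0), 0.1)
```

    The upper and lower bounds differ from the mean by the same factor, matching the observation that measurement uncertainty scales with the mean grain size.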

    3.2. Model Predictions

    A Gaussian process model was trained using a set of 46 data pairs obtained from the plane-strain geometry, with the total strain, the annealing temperature and the annealing time (between 1 and 60 mins) as the inputs. Once trained, the model was used to produce predictions of grain sizes for a range of the input variables. These predictions, shown in Figure 3, agree well with metallurgical expectations.

    One of the assumptions implicit in our model of cold forging is that, given the local strain conditions, the microstructure is independent of the material shape and forge geometry. In other words, we assume that predictions can be obtained given only the local accumulated strain (and annealing conditions). This is an important requirement as it means that a single model could

    Fig. 3: Grain size predictions obtained with the Gaussian process model trained on data from the plane-strain compression geometry. In each of the three plots, two of the input variables are held constant and the other varied. When not being varied, the inputs were held constant at fixed values. The crosses in the strain plot are data from the training set. As the Gaussian process is an interpolation model, predictions at any values of the inputs are constrained by the entire training set.

    be applied to a range of industrial forging geometries, provided that the local strains could be obtained (e.g. with an FE model). We tested the validity of this assumption by using the Gaussian process model trained on plane-strain data to predict grain sizes in a material compressed using a different geometry, namely an axial compression (Figure 1b). As before, after compression the material was annealed, sectioned and grain sizes measured. A new FE model gave the concomitant local strains. These process inputs were then used to obtain predictions of the grain sizes using the previous Gaussian process model. Figure 4 plots these predictions against the measurements. We see remarkable agreement, well within the predicted errors, thus validating our modelling approach. A practical application of our model is to produce diagrams such as that shown in Figure 5, a map of the grain sizes. Such a map is important for engineers who need to know the grain sizes at different points in the material, and can thus assess its resistance to phenomena such as creep and fatigue.

    It should be noted that this method contains other implicit assumptions. The final material microstructure is very strongly dependent upon the material composition. It is well known that even small changes in the fractions of the alloying constituents (and by extension, impurities) can have a strong effect on the thermomechanical processing of the material. One way forward is to include further input variables corresponding to composition [2]. A second implicit assumption has been the constancy of the initial microstructure. Depending

  • Fig. 4: Gaussian process model predictions compared with measured values. The Gaussian process model was trained on data from one compressional geometry (plane-strain) and its performance evaluated using data from another geometry (axial compression) which was not seen during training. The line is to guide the eye. Note that not even a perfect model would produce predictions on this line, due to finite noise in the data.

    Fig. 5: The left half is an image of the microstructure in the axially compressed specimen. The right half is the corresponding grain size predictions from the Gaussian process model, shown as a contour map.

    upon the material and the degree of thermomechanical processing, the final microstructure may retain some ‘memory’ of its initial microstructure, thus necessitating a model which has “initial conditions” as additional input variables.

    4. A Recurrent Neural Network for Dynamic Process Modelling

    For the hot working problem, we assume that there are two sets of variables which are relevant in describing the behaviour of the dynamical system. The first, x, are external variables which influence the behaviour of the system, such as the strain, strain rate and temperature. It is assumed that all of these can be measured. The second set of variables, y, are the state variables which describe the system itself. These are split into two categories. The first are measured, such as grain size, and the second are unmeasured, such as dislocation density. Note that the unmeasured variables are not intrinsically unmeasurable: this is simply a category for all of the state variables which we believe to be relevant

    Fig. 6: A recurrent neural network architecture (‘dynet’) for modelling dynamical systems. The outputs, z, from the network are the time derivatives of the state variables of the dynamical system. The recurrent inputs, y, are these state variables. The values of y at the next time step are evaluated (using equation 9) from the outputs (via the recurrent connections) and the previous values of y. All connections and the two bias nodes are shown.

    but which, for whatever reason, we do not measure. Both x and y are functions of time.

    A general dynamical equation which describes the temporal evolution of the state variables in response to the external variables is

        \frac{dy(t)}{dt} = g(x(t), y(t)),    (7)

    where g is some non-linear function. To a first-order approximation, we can write

        y(t + \delta t) = y(t) + \frac{dy(t)}{dt}\, \delta t.    (8)

    This dynamical system can be modelled with the recurrent neural network architecture shown in Figure 6. This is a discrete-time network in which the input data are provided as a discrete list of values separated by known time intervals. The input–output mapping of this network implements equation 7 directly: rather than producing the state variables at the output of the network, as is often the case with recurrent networks (e.g. [6]), we produce the time derivatives of the state variables, for reasons that are expounded on below. The hidden nodes compute a non-linear function of both the external and the recurrent inputs with a sigmoid function (e.g. tanh), as conventionally used in feedforward networks. A linear hidden–output function is used to allow for an arbitrary scale of the outputs. The recurrent part of the dynamical system, viz. equation 8, is implemented with the recurrent loops shown in Figure 6, by setting the weights of these recurrent loops to the size of the time step, \delta t, between successive epochs. Explicitly, the j-th recurrent input at time step k is given by

        y_j(k) = y_j(k-1) + z_j(k-1)\, \delta t(k),    (9)

    where z_j(k-1) = dy_j(k-1)/dt and \delta t(k) is the time between epoch (k-1) and epoch k.
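    The forward pass of this architecture (equations 7-9) is straightforward to sketch. The weight-matrix names and shapes below are illustrative assumptions, not the authors' implementation; the key points are that the network outputs the derivatives z = dy/dt and that the Euler step \delta t(k) may change from epoch to epoch.

```python
import numpy as np

def dynet_forward(x_seq, dt_seq, y0, W_in, W_rec, b_h, W_out, b_out):
    """Run the 'dynet' recurrent network over one time series.

    x_seq:  external inputs x(k), one row per epoch
    dt_seq: the (possibly non-constant) time steps delta_t(k)
    y0:     initial values of the state variables y
    Returns the state variables at every epoch.
    """
    y = np.array(y0, dtype=float)
    trajectory = [y.copy()]
    for x, dt in zip(x_seq, dt_seq):
        h = np.tanh(W_in @ x + W_rec @ y + b_h)  # sigmoidal hidden layer
        z = W_out @ h + b_out                    # outputs: z = dy/dt (linear)
        y = y + z * dt                           # Euler update, equation 9
        trajectory.append(y.copy())
    return np.array(trajectory)
```

    For example, with three external variables (strain, strain rate, temperature), two state variables and four hidden nodes, W_in is (4, 3), W_rec is (4, 2) and W_out is (2, 4), and the entries of dt_seq need not be equal.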

  • The principal reason for developing a network which predicts the time derivatives of the state variables is that it can be trained on time-series data in which the separations between the epochs, \delta t(k), need not be constant: at each epoch k we simply set the weights of the recurrent feedback loops to \delta t(k). Furthermore, the network can be trained on multiple time series in which the time scales for each time series may be very different. This is important in forging applications as the forging of large components would occur over a longer time scale than for small components, whereas the microscopic behaviour of the materials would essentially be the same (for a given material). In such a case we would want to incorporate data from both forgings into the same model, but without having to obtain measurements at the same rate in both cases.

    While our network is similar to that of Jordan [4], our architecture has the important attributes that: 1) the outputs are time derivatives of the state variables, and 2) in training the network the error derivatives can be propagated via the recurrent connections to the arbitrarily distant past. Our training algorithm can be seen as a generalisation of the method described by Williams & Zipser [9] extended to multiple time series. Although necessarily only the feedforward weights are trainable, the input–hidden weights, for example, are nonetheless dependent upon the values of the hidden nodes by virtue of the recurrent connections, and this dependency is taken into account. Training proceeds by minimizing an error function, typically the sum of squares error, by gradient descent or a conjugate gradient algorithm. The weights can be updated after each epoch of each time series (i.e. Real Time Recurrent Learning [9]), after all epochs of all patterns, or at any intermediate point.
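    To make the training idea concrete, here is a deliberately simple sketch that minimises a sum-of-squares error on a final-epoch target by plain gradient descent. Central-difference gradients stand in for the analytic RTRL-style gradients of Williams & Zipser [9]; the network sizes, names and data are all illustrative assumptions.

```python
import numpy as np

def forward(params, x_seq, dt_seq, y0):
    """Forward pass: the network outputs dy/dt, and the state is advanced
    by an Euler step of size dt(k) (equation 9)."""
    W_in, W_rec, b_h, W_out, b_out = params
    y = np.array(y0, dtype=float)
    for x, dt in zip(x_seq, dt_seq):
        h = np.tanh(W_in @ x + W_rec @ y + b_h)
        y = y + (W_out @ h + b_out) * dt
    return y

def sse(params, x_seq, dt_seq, y0, target):
    """Sum-of-squares error on the final-epoch state variables."""
    e = forward(params, x_seq, dt_seq, y0) - target
    return 0.5 * float(e @ e)

def train(params, x_seq, dt_seq, y0, target, lr=0.05, steps=200, eps=1e-6):
    """Gradient descent with central-difference gradients (a didactic
    stand-in for the RTRL-style analytic gradients)."""
    for _ in range(steps):
        grads = []
        for P in params:
            g = np.zeros_like(P)
            for i in np.ndindex(P.shape):
                old = P[i]
                P[i] = old + eps
                up = sse(params, x_seq, dt_seq, y0, target)
                P[i] = old - eps
                dn = sse(params, x_seq, dt_seq, y0, target)
                P[i] = old
                g[i] = (up - dn) / (2.0 * eps)
            grads.append(g)
        for P, g in zip(params, grads):
            P -= lr * g
    return params
```

    Because the time steps enter only through the Euler update, the same loop trains on series sampled at different, non-constant rates simply by supplying the appropriate dt_seq for each series.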

    To train the network we need at least one target value at at least one epoch. Note that the training algorithm is not restricted to use targets only for the ‘outputs’: errors can be propagated from any node. Generally we would have values of the state variables (recurrent inputs) for the final epoch. However, in metallurgical applications we would typically be able to obtain additional measurements at intermediate epochs, thus improving the accuracy of the derived input–output function. We will of course not have any target values for the ‘unmeasured’ state variables. Hence these variables will not even correspond to any physical variables, instead acting as ‘hidden’ variables which convey some state information not contained in the ‘measured’ state variables. Nonetheless we may be able to provide some loose physical interpretation for unmeasured variables. Once trained, the network produces a complete time sequence of the state variables given a sequence of the external inputs, i.e. the forging process.

    5. Future Work

    Future work will focus on the application of the recurrent neural network described to the dynamical hot-forging problem. Given that the time-series training data will typically be made up of many epochs for which there are no target outputs, regularization is likely to be necessary, and we will therefore examine the application of the Bayesian methods developed by MacKay [5]. Furthermore, we will investigate the feasibility of using the Hessian matrix for evaluating confidence intervals on the network predictions.

    Acknowledgements

    The authors are grateful to the EPSRC (grant number GR/L10239), DERA and INCO Alloys Ltd. for financial support, and to Mark Gibbs for use of his Gaussian process software.

    References

    [1] T. Furu, H.R. Shercliff, C.M. Sellars, M.F. Ashby, “Physically-based modelling of strength, microstructure and recrystallisation during thermomechanical processing of Al–Mg alloys”, Materials Sci. Forum, vol. 217–222, pp. 453–458, 1996.

    [2] L. Gavard, H.K.D.H. Bhadeshia, D.J.C. MacKay, S. Suzuki, “Bayesian neural network model for austenite formation in steels”, Materials Sci. Technol., vol. 12, pp. 453–463, 1996.

    [3] M.N. Gibbs, Bayesian Gaussian processes for regression and classification, PhD thesis, University of Cambridge, 1997.

    [4] M.I. Jordan, “Attractor dynamics and parallelism in a connectionist sequential machine”, Proc. of the Eighth Ann. Conf. of the Cognitive Sci. Soc., Hillsdale, NJ: Erlbaum, 1986.

    [5] D.J.C. MacKay, “Probable networks and plausible predictions: a review of practical Bayesian methods for supervised neural networks”, Network: Computation in Neural Systems, vol. 6, pp. 469–505, 1995.

    [6] A.J. Robinson, F. Fallside, “A recurrent error propagation network speech recognition system”, Computer Speech and Language, vol. 5, pp. 259–274, 1991.

    [7] T.J. Sabin, C.A.L. Bailer-Jones, S.M. Roberts, D.J.C. MacKay, P.J. Withers, “Modelling the evolution of microstructures in cold-worked and annealed aluminium alloy”, Proc. of the Int. Conf. on Thermomechanical Processing, in press, 1997.

    [8] C.K.I. Williams, C.E. Rasmussen, “Gaussian processes for regression”, in D.S. Touretzky, M.C. Mozer, M.E. Hasselmo (eds), Neural Information Processing Systems 8, Boston, MA: MIT Press, 1996.

    [9] R.J. Williams, D. Zipser, “A learning algorithm for continually running fully recurrent neural networks”, Neural Computation, vol. 1, pp. 270–280, 1989.