Aggregation of Epistemic Uncertainty · kjs.nagaokaut.ac.jp/yamada/papers/ICTer2018-Keynote.pdf · ICTer...
2018/Sep/27
ICTer2018, University of Colombo School of Computing
Aggregation of Epistemic Uncertainty
- Certainty Factors and Possibility Theory -
Koichi Yamada, Nagaoka Univ. of Tech.
What is Epistemic Uncertainty?
Epistemic Uncertainty
- cognitive uncertainty caused by incomplete knowledge / lack of information
- subjective uncertainty
Aleatoric Uncertainty (Statistical / Objective Uncertainty)
- related to frequency
• Examples of Epistemic Uncertainty
- Q1: Gravitational acceleration in this room: about 9.8 m/s²
- Q2: Whether the arrested suspect is the real murderer or not.
The uncertainty contained in the answers cannot be represented by frequency. It is uncertainty regarded as a degree of our belief.
Which Uncertainty Does Probability Represent?
• Originally, probability was a measure to represent "frequency."
• In the 1740s, Thomas Bayes considered a way to deal with degrees of belief in the framework of Probability theory, which is called Bayesian probability or subjective probability.
Examples of subjective probabilities:
- It will rain tomorrow with a probability of 50%.
- Our team will win tomorrow with a probability of 99%.
These percentages are not frequencies, because tomorrow will come just once. They are our beliefs.
Probability can be used both for frequency (Aleatoric uncertainty) and for degrees of belief (Epistemic uncertainty).
Other Theories for Representing Uncertainty
• Dempster-Shafer theory of Evidence
- a generalized theory of uncertainty
• Rough Set theory
- indiscernibility and approximation due to our limited knowledge
• Fuzzy set theory
- vagueness contained in concepts and words
• Multi-valued logics
- truth between "completely true" and "completely false"
• Possibility Theory
- a theory related to an adjective, "Possible"
• Certainty Factor
- uncertainty representation employed in MYCIN (1974)
⇒ focus on the degree of beliefs
Note: These are all theories to deal with Epistemic Uncertainty, which suggests there are many aspects in Epistemic Uncertainty.
What is Important for Dealing with Epistemic Uncertainty?
• The capability to deal with "ignorance" / an "unknown situation" is important.
Example: Suppose there is a gang in a small town, and your old friend Tom has joined it. One day, a murder happened in the town. There was perfect evidence that one of the gang members did it. No other information is given.
How do you represent the uncertainty that "Tom is the murderer"?
Probability theory: $P(\mathrm{Tom}) = 1/n$
n: the number of gang members, but we do not know the exact number.
• Probability cannot represent the uncertainty of this situation.
Representation in Other Theories
Possibility theory: an uncertain situation is represented by a pair of possibilities;
$\pi(\mathrm{Tom}) = 1.0$: Possibility that Tom is the murderer is 1.0.
$\pi(\overline{\mathrm{Tom}}) = 1.0$: Possibility that Tom is NOT the murderer is 1.0.
$1.0 \ge \pi \ge 0.0$
Dempster-Shafer theory of Evidence: uncertainty is represented by two measures;
$Pl(\mathrm{Tom}) = 1.0$: Plausibility that Tom is the murderer is 1.0.
$Bel(\mathrm{Tom}) = 0.0$: Belief that Tom is the murderer is 0.0.
$1.0 \ge Pl, Bel \ge 0.0$
Certainty Factor Model
$CF(\mathrm{Tom}) = 0.0$
$1.0 \ge CF \ge -1.0$
+1: perfect affirmation, −1: perfect negation, 0: unknown or no information
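The Dempster-Shafer values above can be reproduced with a small sketch. It assumes the standard basic mass assignment of evidence theory (not shown on the slide) and a hypothetical two-element frame, "Tom" versus "other", standing in for the unknown number of gang members. Total ignorance puts all mass on the whole frame.

```python
# Hypothetical frame of discernment: "Tom" vs. "other" (any other gang member).
FRAME = frozenset({"Tom", "other"})

# Total ignorance: all mass sits on the whole frame, none on proper subsets.
mass = {FRAME: 1.0}


def belief(mass, a):
    """Bel(A): total mass of focal sets entirely contained in A."""
    return sum(m for s, m in mass.items() if s <= a)


def plausibility(mass, a):
    """Pl(A): total mass of focal sets that intersect A."""
    return sum(m for s, m in mass.items() if s & a)


tom = frozenset({"Tom"})
print("Bel(Tom) =", belief(mass, tom))
print("Pl(Tom)  =", plausibility(mass, tom))
```

Unlike P(Tom) = 1/n, this representation does not need the size of the gang.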
Aggregation of Epistemic Uncertainty
• In everyday decision-making, we frequently gather multiple pieces of uncertain information and aggregate them.
- Which beach resort shall we go to during the next vacation?
- Which is telling the truth, President Trump or the New York Times?
- Which job should I choose among the multiple offers?
• We need to gather and aggregate much uncertain information to answer these questions.
- Aggregation is one of the most important kinds of information processing for Epistemic Uncertainty.
- Many applications need information aggregation: decision-making, affective information processing, sensor fusion, flexible information retrieval, etc.
• There are only a few theories that provide a standard aggregation function.
- Dempster-Shafer Theory of Evidence: Dempster's rule of combination
- Certainty Factor Model
Certainty Factor Model
• The CF was devised and used, instead of Probability, to represent the uncertainty of a hypothesis given some evidence in the famous expert system MYCIN,
because:
- Probability cannot express the unknown situation (ignorance).
- There is no standard way to aggregate multiple probability distributions derived from multiple pieces of evidence.
• The CF model was evaluated as "practical" by many practitioners, but was also criticized harshly by theoreticians as theoretically wrong.
- There was no sound interpretation of the CF model in the framework of Probability theory.
Original Definition of Certainty Factors
• CF of hypothesis h given evidence e
$Cf(h, e) \in [-1, +1]$
+1: perfect affirmation
−1: perfect negation
0: neither is supported (unknown, no evidence)
MB(h, e): degree to which belief in h is revised by e toward affirmation
MD(h, e): degree to which belief in h is revised by e toward negation
- We do not adopt this definition, because it was proved that the definition is not consistent with the aggregation function of CFs.
Aggregation Function of CF Model
• Let x (y) be the CF of hypothesis h given evidence e_x (e_y):
$x = Cf(h, e_x), \quad y = Cf(h, e_y)$
Then, the aggregation (combination) function is given as follows;
• The equation is commutative and associative. So, aggregation results do not depend on the order of the sequence when there are multiple CFs.
• The aggregated CF could be interpreted as follows;
$f_M(x, y) = Cf(h, e_x e_y)$
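The combination equation itself is not preserved in this transcript. As a hedged sketch, the code below uses the combination rule commonly cited for MYCIN/EMYCIN, on the assumption that this is the f_M meant here. The name `combine_cf` is hypothetical.

```python
def combine_cf(x, y):
    """Combine two certainty factors in [-1, 1] (commonly cited MYCIN/EMYCIN rule)."""
    if x >= 0 and y >= 0:
        return x + y - x * y          # both pieces of evidence affirm h
    if x <= 0 and y <= 0:
        return x + y + x * y          # both pieces of evidence negate h
    denom = 1 - min(abs(x), abs(y))   # conflicting evidence
    if denom == 0:
        raise ValueError("CFs +1 and -1 are in total conflict")
    return (x + y) / denom


# Commutativity and associativity checked on a few sample values.
a = combine_cf(combine_cf(0.3, 0.5), -0.4)
b = combine_cf(0.3, combine_cf(0.5, -0.4))
print(a, b)
```

The two printed values agree up to floating-point rounding, which illustrates the order independence stated on the slide.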
New Sound Interpretation of CFs with Possibility Theory
• The rest of this presentation discusses a new sound interpretation of Certainty Factors using Possibility theory.
A Possibility Distribution ⇔ A Certainty Factor
Transformable to each other; both represent the same uncertainty.
• Examine new aggregation functions in the framework of Possibility theory.
- One of the aggregation functions is exactly the same as the one used in MYCIN.
⇒ This gives a theoretical basis to MYCIN's aggregation function.
Koichi Yamada: Aggregation of Epistemic Uncertainty: A New Interpretation of the Certainty Factor with Possibility Theory and Causation Events, SCIS&ISIS2018 (submitted)
Possibility Theory
• "Possibility" is another scale to measure uncertainty with a value in [0,1], similar to "Probability."
• According to L. A. Zadeh,
- Essentially, humans utilize possibility rather than probability in decision-making.
- Vagueness contained in Natural Language is principally possibilistic.
• Some impressive statements about Possibility and Probability
- What is impossible is improbable. (Zadeh)
- What is possible can be improbable. (Zadeh)
- What is improbable is not necessarily impossible. (Zadeh)
- What is probable must be possible. (D. Dubois and H. Prade)
Possibility seems appropriate for representing epistemic uncertainty.
Possibility Measure
Possibility measure: $\Pi : 2^U \to [0,1]$
Axioms of possibility measures
$\Pi(\emptyset) = 0, \quad \Pi(U) = 1$
$\Pi(A \cup B) = \max(\Pi(A), \Pi(B))$ (A and B do not need to be disjoint.)
Properties
$A \subseteq B \Rightarrow \Pi(A) \le \Pi(B)$
Necessity measure: $N(A) = 1 - \Pi(A^c)$ (A is necessary = "Not A" is not possible)
Probability measure: $P : 2^U \to [0,1]$
Axioms of probability measures
$P(\emptyset) = 0, \quad P(U) = 1$
$P(A \cup B) = P(A) + P(B)$ if $A \cap B = \emptyset$
Properties
$A \subseteq B \Rightarrow P(A) \le P(B)$
$P(A) = 1 - P(A^c)$
$P(A \cup B) = P(A) + P(B) - P(A \cap B)$
U: the universal set
• A possibility measure is defined in a similar way to a probability measure.
• The algebraic sum of Probability is replaced by the Max operation in Possibility.
Possibility Distribution
Possibility distribution: $\pi : U \to [0,1]$, $\pi(u_i) = \Pi(\{u_i\})$
Properties
$\Pi(A) = \max_{u_i \in A} \Pi(\{u_i\}) = \max_{u_i \in A} \pi(u_i)$
$\Pi(U) = \max_{u_i \in U} \pi(u_i) = \max\{\pi(u_1), \pi(u_2), \ldots, \pi(u_n)\} = 1$
A possibility distribution must be "normal."
Probability distribution: $p : U \to [0,1]$, $p(u_i) = P(\{u_i\})$
Properties
$P(A) = \sum_{u_i \in A} P(\{u_i\}) = \sum_{u_i \in A} p(u_i)$
$P(U) = \sum_{u_i \in U} p(u_i) = p(u_1) + p(u_2) + \cdots + p(u_n) = 1$
• Both Probability and Possibility have distribution functions.
• The algebraic sum of Probability is replaced by the Max operation in Possibility.
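A minimal sketch of these definitions follows: a possibility distribution stored as a dictionary, with the possibility and necessity measures derived from it by Max. The universe and the values in `pi` are made-up illustration data, and the function names are hypothetical.

```python
# Illustrative possibility distribution over a three-element universe U.
pi = {"u1": 1.0, "u2": 0.7, "u3": 0.2}


def is_normal(pi):
    """A possibility distribution is normal if its maximum is 1."""
    return max(pi.values()) == 1.0


def possibility(pi, a):
    """Pi(A) = max of pi(u) over u in A; Pi(empty set) = 0."""
    return max((pi[u] for u in a), default=0.0)


def necessity(pi, a):
    """N(A) = 1 - Pi(complement of A)."""
    return 1.0 - possibility(pi, set(pi) - set(a))


A, B = {"u2"}, {"u2", "u3"}
print(is_normal(pi))
print(possibility(pi, A | B), max(possibility(pi, A), possibility(pi, B)))
print(necessity(pi, {"u1", "u2"}))
```

The second line of output shows the Max axiom holding for sets that overlap, which is where possibility differs from the additive probability axiom.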
Conditional Possibility
Conditional Possibility: $\Pi(B \mid A)$
$\Pi(A \cap B) = \min(\Pi(A), \Pi(B \mid A))$
$\Pi(B \mid A) = \begin{cases} 1, & \text{if } \Pi(A) = \Pi(A \cap B) \ne 0 \\ \Pi(A \cap B), & \text{if } \Pi(A) > \Pi(A \cap B) \end{cases}$
(1) When A is independent of B, $\Pi(A \mid B) = \Pi(A)$
(2) When B is independent of A, $\Pi(B \mid A) = \Pi(B)$
(3) When (1) or (2) holds, A and B are non-interactive: $\Pi(A \cap B) = \min(\Pi(A), \Pi(B))$
Conditional Probability: $P(B \mid A)$
$P(A \cap B) = P(A) \cdot P(B \mid A)$
$P(B \mid A) = \dfrac{P(A \cap B)}{P(A)}$, if $P(A) \ne 0$
A is independent of B ⇔ B is independent of A.
$P(A \mid B) = P(A), \quad P(B \mid A) = P(B), \quad P(A \cap B) = P(A) \cdot P(B)$
• Conditioning and independence are defined in a similar way, although the details are different.
• The algebraic product of Probability is replaced by the Min operation in Possibility.
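The min-based conditioning rule can be sketched as follows. The distribution `pi` and the sets `A` and `B` are made-up illustration data, and the function names are hypothetical. The sketch then checks that Pi(A ∩ B) = min(Pi(A), Pi(B | A)) holds for the computed conditional value.

```python
# Illustrative possibility distribution over a three-element universe.
pi = {"a": 1.0, "b": 0.6, "c": 0.3}


def possibility(pi, s):
    """Pi(S) = max of pi(u) over u in S; Pi(empty set) = 0."""
    return max((pi[u] for u in s), default=0.0)


def conditional_possibility(pi, b, a):
    """Pi(B | A) under min-based conditioning."""
    p_a = possibility(pi, a)
    p_ab = possibility(pi, a & b)
    if p_a == 0.0:
        raise ValueError("Pi(A) = 0: the conditional possibility is not defined here")
    # Pi(B|A) is 1 when A and (A and B) are equally possible, else Pi(A and B).
    return 1.0 if p_ab == p_a else p_ab


A, B = {"a", "b"}, {"b", "c"}
cond = conditional_possibility(pi, B, A)
print(cond, min(possibility(pi, A), cond), possibility(pi, A & B))
```

Where probability multiplies P(A) by P(B | A), the same role is played here by Min.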
Hypothesis and Opposite Hypothesis
• We introduce the Opposite Hypothesis (O-hypothesis) k to a hypothesis h, which satisfies the following logical formulae.
Which is the murderer, male or female?
[Figure: CF axis running from +1 (h: male) through 0 ("unknown") to −1 (k: female)]
… represents "unknown" because of no evidence.
• The hypothesis h and O-hypothesis k satisfy the following;
… is not a tautology.
… is not a contradiction.
Note: Assuming the "closed world assumption," the assertion of male (female) should be rejected if there is no evidence for male (female). If we have no evidence for either male or female, we have to reject both assertions.
Causation Events
h:e - represents an event that evidence "e" supports hypothesis "h"
k:d - represents an event that evidence "d" supports O-hypothesis "k"
Y. Peng and J. A. Reggia (1987)
Hypothesis, O-hypothesis, Causation Event
E_h: the set of all pieces of evidence that possibly support h.
E_k: the set of all pieces of evidence that possibly support k.
Hypothesis h is true ⇔ at least one of the possible pieces of evidence supports h, and no possible piece of evidence supports k.
If we define h and k as above, they satisfy the formulae
Conditional Causation Possibility
• The Conditional Causation Possibility is the possibility that evidence e_i supports hypothesis h, given only the evidence e_i.
"!" represents that only e_i is present and the others are not.
Note: we assume … is not a contradiction, even if … .
This is possible, because we are considering the epistemic world.
… represents that neither evidence e_i nor e_j is present (found).
Proposition
• The possibility that "h is true" given only evidence e is the same as the possibility that "the evidence supports the hypothesis" given only the evidence.
(It is because there is no evidence that supports "h" other than "e".)
• The possibility that "h is false" given only "e" is the same as the possibility that "the evidence does not support the hypothesis" given only the evidence.
(It is because there is no evidence that supports h.)
Possibility Distribution Can Be Represented by a Single Value in [-1,1].
• The possibility distribution of hypothesis "h" given only "e" is represented by
$(\pi(h\,!\,e),\ \pi(\bar{h}\,!\,e))$, where $\max(\pi(h\,!\,e),\ \pi(\bar{h}\,!\,e)) = 1.0$
• The possibility distribution could be represented by a single value $g_h(h\,!\,e)$ in [-1,1].
• Then the possibility distribution is restored from $g_h(h\,!\,e)$ in [-1,1] using the following equations.
Transformation between $g_h(h\,!\,e)$ and Possibility Distributions
• $g_h(h\,!\,e)$ is regarded as a single-value representation of a possibility distribution.
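The transformation equations are not preserved in this transcript. One reading that is consistent with the aggregation slides (where a CF x ≥ 0 corresponds to the pair (1, 1 − x) and a CF x ≤ 0 to the pair (1 + x, 1)) is to take the single value as the difference of the two possibilities. The sketch below assumes that reading, and the function names are hypothetical.

```python
def to_distribution(g):
    """Single value g in [-1, 1] -> (possibility of h, possibility of not-h).

    Assumed reading: g >= 0 gives (1, 1 - g); g <= 0 gives (1 + g, 1).
    """
    if not -1.0 <= g <= 1.0:
        raise ValueError("g must lie in [-1, 1]")
    return min(1.0, 1.0 + g), min(1.0, 1.0 - g)


def to_single_value(pos, neg):
    """Normal possibility distribution (max = 1) -> single value in [-1, 1]."""
    if max(pos, neg) != 1.0:
        raise ValueError("the distribution must be normal (max = 1)")
    return pos - neg


for g in (0.6, 0.0, -0.25, 1.0, -1.0):
    pos, neg = to_distribution(g)
    print(g, (pos, neg), to_single_value(pos, neg))
```

Under this reading, g = 0 maps to (1, 1), the "unknown" situation in which both h and its negation are fully possible.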
Our Definition of Certainty Factors
• We define the CF by the next equation.
Aggregation of CFs (1)
• When two CFs x, y ≥ 0, it is supposed that the evidence "e_x" and "e_y" support the hypothesis "h" and …
In this case, x and y are transformed to the possibility distributions.
• Then we calculate the possibility distribution of "h" given only "e_x" and "e_y", using the propositions shown before, and assuming conditional independence and non-interactivity of causation events.
$\pi(h\,!\,e_x, e_y) = 1.0, \quad \pi(\bar{h}\,!\,e_x, e_y) = \min(1 - x,\ 1 - y)$
• Then, the above possibility distribution can be transformed back to a CF.
Aggregation of CFs (2)
• When two CFs x, y ≤ 0, it is supposed that the evidence "e_x" and "e_y" support the O-hypothesis "k" and …
In this case, x and y are transformed to the following possibility distributions.
• Then we calculate the possibility distribution of "h" given only "e_x" and "e_y", using the propositions and assuming conditional independence and non-interactivity of causation events.
$\pi(k\,!\,e_x, e_y) = 1.0, \quad \pi(\bar{k}\,!\,e_x, e_y) = \min(1 + x,\ 1 + y)$
• Then, the aggregated CF is obtained as follows;
Aggregation of CFs (3)
• When two CFs x > 0 > y, it is supposed that the evidence "e_x" and "e_y" support "h" and "k", respectively, and …
In this case, x and y are transformed to the following possibility distributions.
• Then we calculate the possibility distribution of "h" given only "e_x" and "e_y", using the propositions and assuming conditional independence and non-interactivity of causation events.
$\pi(h \wedge \bar{k}\,!\,e_x, e_y) = 1 + y, \quad \pi(\bar{h} \wedge k\,!\,e_x, e_y) = 1 - x$
• Then, the aggregated CF is obtained as follows;
Aggregation Functions of CFs
• By summing up the three cases before, we get the next aggregation function.
• The possibility distribution obtained in the case of x > 0 > y is not normal. (It is because x and y contradict each other.) If we normalize the distribution and then transform it to a CF, we get the next aggregation function.
Aggregation Functions of CFs (2)
• The standard operations used for Possibility theory are Min and Max. Mathematically, it is possible to use another t-norm and t-conorm.
• If we use the algebraic product and sum instead of Min and Max, we get the following aggregation functions.
- If we do not normalize the possibility distribution in the case of x > 0 > y,
- If we normalize the possibility distribution in the case of x > 0 > y,
Note: This is completely the same as MYCIN's aggregation function. The Certainty Factor model can be justified in the framework of Possibility Theory.
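The four aggregation equations are not preserved in this transcript. The sketch below is a reconstruction under stated assumptions, not the author's exact formulation: each CF is mapped to a pair (possibility of h, possibility of not-h) as (min(1, 1 + x), min(1, 1 − x)), the pairs are combined component-wise with a t-norm (Min or algebraic product), the result is optionally normalized, and the CF is read back as the difference. This reproduces the three case results shown on the preceding slides. All function names are hypothetical.

```python
def product(a, b):
    """Algebraic product t-norm."""
    return a * b


def aggregate(x, y, tnorm=min, normalize=False):
    """Aggregate two CFs in [-1, 1] via their possibility distributions."""
    pos = tnorm(min(1.0, 1.0 + x), min(1.0, 1.0 + y))  # possibility of h
    neg = tnorm(min(1.0, 1.0 - x), min(1.0, 1.0 - y))  # possibility of not-h
    if normalize:
        top = max(pos, neg)
        if top == 0.0:
            raise ValueError("CFs +1 and -1 are in total conflict")
        pos, neg = pos / top, neg / top
    return pos - neg


def f_min(x, y):    return aggregate(x, y, min, False)
def f_m_nor(x, y):  return aggregate(x, y, min, True)
def f_alg(x, y):    return aggregate(x, y, product, False)
def f_a_nor(x, y):  return aggregate(x, y, product, True)


# Associativity check on one sample triple (0.3, 0.5, -0.4).
for f in (f_min, f_m_nor, f_alg, f_a_nor):
    left = f(f(0.3, 0.5), -0.4)
    right = f(0.3, f(0.5, -0.4))
    print(f.__name__, round(left, 4), round(right, 4))
```

Under these assumptions, `f_a_nor` coincides with the commonly cited MYCIN/EMYCIN rule (x + y − xy, x + y + xy, or (x + y) / (1 − min(|x|, |y|)) depending on the signs), and it is the only one of the four for which the two groupings in the loop agree.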
Mathematical Properties of the Four Aggregation Functions
Aggregation rule                          Commutativity  Associativity  Continuity  Monotonicity
MIN/MAX w/o normalization                 ◯              ☓              ◯           Non-decreasing
MIN/MAX w/ normalization                  ◯              ☓              ◯           Non-decreasing
Algebraic Prod/Sum w/o normalization      ◯              ☓              ◯           Increasing
Algebraic Prod/Sum w/ normalization       ◯              ◯              ◯           Increasing
Numerical Example (1)
• Suppose we get five pieces of evidence for a hypothesis h. The CFs are given sequentially by 0.3, -0.5, 0.8, 0.4, -0.7.
f_min: Min/Max operators, no normalization
f_m-nor: Min/Max operators, with normalization
f_alg: Algebraic Product/Sum operators, no normalization
f_a-nor: Algebraic Product/Sum operators, with normalization
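The slide's result table is not preserved in this transcript. The sketch below folds the five CFs from left to right with each of the four functions, using an assumed reconstruction of them (component-wise t-norm on the pairs (min(1, 1 + x), min(1, 1 − x)), optional normalization, CF read back as the difference). The printed values therefore illustrate that reconstruction and are not the author's reported figures.

```python
from functools import reduce


def aggregate(x, y, tnorm=min, normalize=False):
    """Aggregate two CFs via their (assumed) possibility distributions."""
    pos = tnorm(min(1.0, 1.0 + x), min(1.0, 1.0 + y))  # possibility of h
    neg = tnorm(min(1.0, 1.0 - x), min(1.0, 1.0 - y))  # possibility of not-h
    if normalize:
        top = max(pos, neg)
        if top == 0.0:
            raise ValueError("CFs +1 and -1 are in total conflict")
        pos, neg = pos / top, neg / top
    return pos - neg


def product(a, b):
    return a * b


RULES = {
    "f_min":   lambda x, y: aggregate(x, y, min, False),
    "f_m-nor": lambda x, y: aggregate(x, y, min, True),
    "f_alg":   lambda x, y: aggregate(x, y, product, False),
    "f_a-nor": lambda x, y: aggregate(x, y, product, True),
}

cfs = [0.3, -0.5, 0.8, 0.4, -0.7]
results = {name: reduce(rule, cfs) for name, rule in RULES.items()}
for name, value in results.items():
    print(f"{name:8s} {value:+.4f}")
```

Passing a permuted list as `cfs` shows how the three non-associative rules depend on the order of the evidence.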
Numerical Example (2)
• Aggregation results when the order of the sequence is changed.
- Associativity
- Affected strongly by the recent information
Effects of Repetitive O-hypothesis with Low CF
• Suppose we have the hypothesis h with a high CF (0.9) at first, then the O-hypothesis k with a low CF (-0.1) is given repeatedly.
- Hypothesis h: true news (CF = 0.9)
- O-hypothesis k: fake news (CF = −0.1)
• Simulation Result
At first, CF = 0.9, then CF = -0.1 is repeated 20 times.
[Figure: aggregated CF over the sequence h -> k]
- Aggregations without normalization are sensitive to the repetitive O-hypothesis, and the value of the CF decreases rapidly.
- But in the case of the Min/Max operation, the result is bounded by the low CF (-0.1), while the algebraic operation accumulates the CF of the O-hypothesis.
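The simulation can be sketched as follows. The plot is not preserved in this transcript, so the code relies on an assumed reconstruction of the four aggregation functions (component-wise t-norm on the pairs (min(1, 1 + x), min(1, 1 − x)), optional normalization, CF read back as the difference) and only illustrates the qualitative behaviour described above.

```python
def aggregate(x, y, tnorm=min, normalize=False):
    """Aggregate two CFs via their (assumed) possibility distributions."""
    pos = tnorm(min(1.0, 1.0 + x), min(1.0, 1.0 + y))  # possibility of h
    neg = tnorm(min(1.0, 1.0 - x), min(1.0, 1.0 - y))  # possibility of not-h
    if normalize:
        top = max(pos, neg)
        if top == 0.0:
            raise ValueError("CFs +1 and -1 are in total conflict")
        pos, neg = pos / top, neg / top
    return pos - neg


def product(a, b):
    return a * b


RULES = {
    "f_min":   lambda x, y: aggregate(x, y, min, False),
    "f_m-nor": lambda x, y: aggregate(x, y, min, True),
    "f_alg":   lambda x, y: aggregate(x, y, product, False),
    "f_a-nor": lambda x, y: aggregate(x, y, product, True),
}


def simulate(rule, first=0.9, repeated=-0.1, times=20):
    """Return the trajectory of the aggregated CF, starting from `first`."""
    trajectory = [first]
    for _ in range(times):
        trajectory.append(rule(trajectory[-1], repeated))
    return trajectory


final = {name: simulate(rule)[-1] for name, rule in RULES.items()}
for name, value in final.items():
    print(f"{name:8s} final CF after 20 repetitions: {value:+.4f}")
```

In this sketch the Min/Max rule without normalization settles at the repeated CF, the algebraic rule without normalization keeps moving toward −1, and both normalized rules remain positive after 20 repetitions.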
Effects of Repetitive O-Hypothesis with Low CF (2)
• The case where the hypothesis h is given in the middle.
• The case where the hypothesis h is given at the end.
[Figure: aggregated CF over the sequence h -> k, one panel per case]
- Aggregations with algebraic operations have the effect of accumulating the low CFs.
=> This effect is risky in cases where much fake information is repeated.
=> On the other hand, it would be useful when we cannot get certain direct evidence but can get much indirect but reliable evidence with a low CF.
Lessons from Examples
• When much wrong evidence is contained, it is risky to use the "algebraic operation" because of the accumulation effects of the CFs of wrong evidence.
• It is better to use "normalization," because the decreasing effect of wrong evidence on the CF is small.
In situations where there is much wrong evidence (e.g. noise), it is better to use the Min/Max operation with normalization.
• In cases where most of the reliable evidence has low CFs and there is little wrong evidence, aggregation with the "algebraic operation" helps us.
In applications where most of the evidence is indirect with a low CF but reliable, it is better to use the algebraic operation with normalization (e.g. clinical diagnosis).
Position of Epistemic Uncertainty in AI
Artificial Intelligence
• Epistemic AI
  The goal: investigate human intelligence, and develop human-like machine intelligence.
  - Symbolistic AI (Traditional AI): Problem-solving, Heuristic Search, Deductive reasoning, Knowledge representation
  - Computational AI (Computational Intelligence): Connectionism (Neural Nets), Fuzzy Logic, Evolutionary Computation, Epistemic Uncertainty
• Data Science / Engineering
  The goal: analyze data, find patterns, and develop machines to judge something using the patterns.
  - Symbolistic AI (Knowledge Discovery): Attribute Oriented Induction, Rough Set Model, Association Learning
  - Computational AI (Machine Learning) (Statistical AI): Discriminant Analysis, Support Vector Machine, Decision Trees, Ensemble Learning, Bayesian Learning
Symbolistic AI: AI with symbol processing (symbols = concepts)
Computational AI: AI with numerical processing
Conclusion
• Epistemic Uncertainty was discussed. It is important even in the era of data science, as long as humans/robots have to make decisions based on much uncertain/vague information.
• Theories of epistemic uncertainty should be able to deal with "ignorance" or an "unknown situation" caused by lack of information/knowledge.
• The CF model, which had been criticized for a long time, was revisited and newly interpreted with Possibility theory, and its aggregation function was justified theoretically.
• Four simple aggregation functions of Certainty Factors were proposed, and their mathematical properties were discussed.