Turing’s Imitation Game: Role of Error-making in Intelligent Thought
Turing100:
Huma Shah (TCAC), Kevin Warwick (TCAC), Ian Bland, Chris Chapman & Marc Allen
Turing in Context II: 10-12 October, 2012
Overview
• Introduction to Turing’s two scenarios for his Imitation Game / Turing test
• University of Reading’s practical imitation game events 2008 & 2012
• Results from 120 (of 276) tests in which 3 error types were observed
• Sample conversation – audience participation
• Why intelligent judges make errors in practical Turing tests
Before we start
• Turing the man:
– If Turing, the genius, were alive today (and he’d be 100!) he’d be helping GCHQ fight cybercrime; we need more Turings (Iain Lobban, Head of GCHQ: Oct 2012)
– Lessons learnt (Cambridge historian, Prof Christopher Andrew: Sept 2012)
– Posthumous apology for mistreatment from a British PM (Gordon Brown: Sept 2009)
Turing’s Imitation Game
• Plurality of views on:
– What is it?
– How useful/damaging it is for AI
• All points of view tolerated
• We focus on results from actual imitation games involving over 50 judges, 40 hidden humans and 5 machines
Turing’s Imitation Game: Scenario 1
Simultaneous test: the judge interrogates two hidden entities in parallel (CMI, Mind, 1950)
Turing’s Imitation Game: Scenario 2
Viva voce test (direct questioning): 1952 & 1950, §6.4 The Argument from Consciousness, p. 446. Note: Turing’s revised prediction: “at least 100 years” (1952)
What Turing felt
• Idea of intelligence emotional rather than mathematical (1948)
• Learning language one of the most accomplished human feats (1948)
• Question/answer method “suitable for introducing almost any one of the fields of human endeavour” (1950: p. 435)
• Five minutes (1950: p. 442): a first impression (Willis & Todorov, 2005) and a thin slice of behaviour (Albrechtsen et al., 2009) are a sufficient duration for the game
• Thinking examined through satisfactory and sustained responses to any questions (1950: p.447)
Practical Turing tests reported here
• Scenario 1: simultaneous tests
• 120 machine-human Turing tests:
– 60 conducted at Reading in 2008
– 60 conducted at Bletchley Park in 2012
• Interrogators/judges made errors in identifying one or both hidden chat partners in a simultaneous test
Participants 2012: 23 June Bletchley Park
Judges
Hidden humans
Human participants
• Human participants (interrogators and hidden humans) recruited via calls in 2012, including social media:
– Local schools
– Local and national newspapers
– UoR internal call
– STEMNET
– Twitter
– Facebook
– Newsgroups
• Human interrogators and hidden humans included members of the public, computer scientists, philosophers and journalists; males/females; adults/teenagers; and native/non-native UK English speakers.
Machines
• 2008: five machines selected and invited on the basis of their performance in one-to-one online testing prior to the main event
• 2012: five machines invited (four of them from the 2008 tests):
– Cleverbot, Jfred (not in 2008), Elbot, Eugene Goostman, Ultra Hal
• Each machine had been successful in previous Turing test contests
What the Interrogators were told
• The task of each interrogator was to uncover all the machines and hidden humans.
• At the end of each 5-minute test, ‘score’ the hidden chatters:
– If machine, score its conversational ability 0-100
– If human, identify male/female; age range (child/teenager/adult); FLE (first-language English) or non-FLE
– An ‘unsure’ score was allowed
• Sample score sheet ….
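The scoring rules above can be sketched as a small record type. This is a hypothetical illustration only; the field names and labels are assumptions, not the wording of the organisers’ actual score sheet:

```python
# Hypothetical sketch of one judge's score-sheet entry for a single
# 5-minute simultaneous test (field names are assumptions).

from dataclasses import dataclass
from typing import Optional


@dataclass
class ScoreSheetEntry:
    entity_label: str                          # which hidden terminal, e.g. "left"
    verdict: str                               # "machine", "human", or "unsure"
    conversation_score: Optional[int] = None   # 0-100, only if judged a machine
    sex: Optional[str] = None                  # only if judged human
    age_range: Optional[str] = None            # "child", "teenager", or "adult"
    fle: Optional[bool] = None                 # first-language English speaker?


# A judge who believes the left terminal hides a machine scores it 0-100:
entry = ScoreSheetEntry("left", "machine", conversation_score=65)
```

A ‘human’ verdict would instead fill the sex, age-range and FLE fields, matching the two branches of the scoring instructions.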
What the Hidden-humans were told
• Please remember that it is the machines in the contest competing to show they are the humans; please do not make it easier for the machines by answering in robotic fashion. If you are not sure what this means, here is a machine's response to a question in a recent Turing test: – I can't deal with that syntactic variant yet.
• Also, please be aware that the tests will be open to the public for viewing and the transcripts used for research, i.e. anyone will be able to read your answers to the judges’ questions! Please do not reveal your identity (real name, sex, age range).
Error types in Practical Turing Tests
• Double error in the same test: the human interrogator classifies the machine as human (Eliza effect) and the hidden human (foil for the machine) as a machine (confederate effect)
• Type (a) single error – Eliza effect: the human interrogator classifies the machine as human
• Type (b) single error – confederate effect: the human interrogator classifies the hidden human as a machine
• Gender and age blur also feature in the tests
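The taxonomy above can be expressed as a small classification function. This is an illustrative sketch, not the organisers’ scoring code; the verdict labels are assumed names:

```python
# Sketch of the error taxonomy for one simultaneous test: the judge gives
# one verdict for the hidden machine and one for the hidden human.

def classify_test(machine_verdict: str, human_verdict: str) -> str:
    """Classify a test from the judge's two verdicts.

    machine_verdict: judge's call on the hidden machine ("human"/"machine"/"unsure")
    human_verdict:   judge's call on the hidden human   ("human"/"machine"/"unsure")
    """
    eliza = machine_verdict == "human"         # machine mistaken for a human
    confederate = human_verdict == "machine"   # hidden human mistaken for a machine
    if eliza and confederate:
        return "double error"
    if eliza:
        return "Eliza effect"
    if confederate:
        return "confederate effect"
    if "unsure" in (machine_verdict, human_verdict):
        return "unsure"
    return "correct identification"


# Judge ranks the machine as human AND the human as machine:
print(classify_test("human", "machine"))  # → double error
```

The double error is thus simply the conjunction of the two single-error types within one test.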
120 Simultaneous Tests
Error type              Number of judge errors
Double error            9
Eliza effect            7
Confederate effect      5
Unsure errors           2 (1 machine; 1 hidden human)
Total                   23
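As a quick consistency check of the tallies above (the category names here are just labels for the counts reported in the table):

```python
# Tally check for the judge errors reported across the 120 simultaneous tests.
errors = {
    "double error": 9,
    "Eliza effect": 7,
    "confederate effect": 5,
    "unsure": 2,  # 1 machine; 1 hidden human
}
total_errors = sum(errors.values())
assert total_errors == 23
print(f"{total_errors} judge errors in 120 tests ({total_errors / 120:.0%})")
# → 23 judge errors in 120 tests (19%)
```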
Analysing Transcripts
• English language knowledge features in judges’ decisions:
– First language English (FLE) speaking judges misclassified non-FLE hidden humans as machines, and vice versa
• Lack of mutual knowledge
• Subjective opinion on what constitutes satisfactory responses
Why do intelligent humans err?
• Some humans trust too easily and succumb to deception more than others
• Schulz 2010:
– Humans regard being right as their natural state
– Making errors makes us feel deflated and embarrassed
– The capacity to err is crucial to human cognition
– “wrongness is a window into normal human nature” (p. 5)
Problem with misidentification in cyberspace
• The CyberLover chatbot shows that some humans are susceptible to deception: it was developed to steal identities and conduct financial fraud in Internet chat rooms
Practical Uses of Imitation Game
• Encourage more children to take up computer science (a problem area in the UK)
• Gauge progress in machine conversation – chatbots are widely used in e-commerce
• Neurology: thought translation for locked-in patients (Stins & Laureys, 2009)
• Raise awareness of malfeasant programmes designed to conduct a particular cybercrime
References
– A.M. Turing (1948). Intelligent Machinery. In B.J. Copeland (Ed.), The Essential Turing: The Ideas that Gave Birth to the Computer Age. Clarendon Press: Oxford, 2004
– A.M. Turing (1950). Computing Machinery and Intelligence. Mind, 59(236), October, pp 433-460
– A.M. Turing (1952). Can Automatic Calculating Machines be said to Think? In B.J. Copeland (Ed.), The Essential Turing: The Ideas that Gave Birth to the Computer Age. Clarendon Press: Oxford, 2004
– H. Shah (2010). Deception-detection and Machine Intelligence in Practical Turing Tests. PhD thesis, The University of Reading, October 2010
– J.F. Stins and S. Laureys (2009). Thought Translation, Tennis and Turing Tests in the Vegetative State. Phenomenology and the Cognitive Sciences. DOI 10.1007/s11097-009-9124-8
– J.S. Albrechtsen, C.A. Meissner and K.J. Susa (2009). Can intuition improve deception detection performance? Journal of Experimental Social Psychology, 45, pp 1052-1055
– J. Willis and A. Todorov (2005). First Impressions: Making up your mind after 100-ms exposure to a face. Psychological Science, 17(7), pp 592-598
– K. Schulz (2010). Being Wrong. Portobello Books: London