
Introduction to information theory

LING 572

Fei Xia, Dan Jinguji

Week 1: 1/10/06


Today

• Information theory

• Hw #1

• Exam #1



Information theory

• Reading: M&S 2.2

• It is the use of probability theory to quantify and measure “information”.

• Basic concepts:

– Entropy

– Cross entropy and relative entropy

– Joint entropy and conditional entropy

– Entropy of the language and perplexity

– Mutual information

Entropy

• Entropy is a measure of the uncertainty associated with a distribution.

• It gives a lower bound on the average number of bits needed to transmit messages.

• An example:
– Display the results of horse races.
– Goal: minimize the number of bits to encode the results.

H(X) = -\sum_x p(x) \log_2 p(x)
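To make this concrete, here is a minimal Python sketch (ours, not from the course materials) of the entropy formula:

import math

def entropy(probs):
    # H(X) = -sum_x p(x) * log2 p(x); zero-probability outcomes contribute nothing
    return -sum(p * math.log2(p) for p in probs if p > 0)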

An example

• Uniform distribution: p_i = 1/8.

• Non-uniform distribution: (1/2,1/4,1/8, 1/16, 1/64, 1/64, 1/64, 1/64)

H(X) = -8 \times \frac{1}{8} \log_2 \frac{1}{8} = 3 bits

H(X) = -\left(\frac{1}{2} \log_2 \frac{1}{2} + \frac{1}{4} \log_2 \frac{1}{4} + \frac{1}{8} \log_2 \frac{1}{8} + \frac{1}{16} \log_2 \frac{1}{16} + 4 \times \frac{1}{64} \log_2 \frac{1}{64}\right) = 2 bits

A code matching the non-uniform distribution:
(0, 10, 110, 1110, 111100, 111101, 111110, 111111)

The uniform distribution has higher entropy. MaxEnt: make the distribution as “uniform” as possible.
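A quick check of the example in Python (ours; entropy() is the same helper sketched above, repeated so the snippet runs on its own):

import math

def entropy(probs):
    return -sum(p * math.log2(p) for p in probs if p > 0)

uniform = [1/8] * 8
skewed = [1/2, 1/4, 1/8, 1/16, 1/64, 1/64, 1/64, 1/64]
code = ["0", "10", "110", "1110", "111100", "111101", "111110", "111111"]

print(entropy(uniform))  # 3.0 bits
print(entropy(skewed))   # 2.0 bits
# average code length under the skewed distribution: sum_i p_i * len(code_i)
print(sum(p * len(c) for p, c in zip(skewed, code)))  # 2.0, matching H(X)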

Cross Entropy

• Entropy:

H(X) = -\sum_x p(x) \log_2 p(x)

• Cross entropy:

H_c(X) = -\sum_x p(x) \log_2 q(x)

• Cross entropy is a distance measure between p(x) and q(x): p(x) is the true probability; q(x) is our estimate of p(x).

• H_c(X) \geq H(X)
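A matching Python sketch (ours; it assumes p and q are parallel lists over the same outcomes):

import math

def cross_entropy(p, q):
    # H_c(X) = -sum_x p(x) * log2 q(x)
    return -sum(pi * math.log2(qi) for pi, qi in zip(p, q) if pi > 0)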

Relative Entropy

• Also called Kullback-Leibler divergence:

KL(p||q) = \sum_x p(x) \log_2 \frac{p(x)}{q(x)} = H_c(X) - H(X)

• Another “distance” measure between probability functions p and q.

• KL divergence is asymmetric (not a true distance):

KL(p||q) \neq KL(q||p)
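And a sketch of KL divergence in the same style (ours):

import math

def kl_divergence(p, q):
    # KL(p||q) = sum_x p(x) * log2(p(x) / q(x)) = H_c(X) - H(X)
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Asymmetry: kl_divergence(p, q) and kl_divergence(q, p) differ in general.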

Reading assignment #1

• Read M&S 2.2: Essential Information Theory

• Questions: For a random variable X, p(x) and q(x) are two distributions, where p is the true distribution:
– p(X=a) = p(X=b) = 1/8, p(X=c) = 1/4, p(X=d) = 1/2
– q(X=a) = q(X=b) = q(X=c) = q(X=d) = 1/4

(a) What is H(X)?
(b) What is H(X, q)?
(c) What is the KL divergence D(p||q)?
(d) What is D(q||p)?


H(X) and H(X, q)
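A worked calculation (ours, applying the definitions above):

H(X) = -\left(2 \times \frac{1}{8} \log_2 \frac{1}{8} + \frac{1}{4} \log_2 \frac{1}{4} + \frac{1}{2} \log_2 \frac{1}{2}\right) = 0.75 + 0.5 + 0.5 = 1.75 bits

H(X, q) = -\sum_x p(x) \log_2 q(x) = -\log_2 \frac{1}{4} = 2 bits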

D(p||q)
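Again as our worked calculation:

D(p||q) = \sum_x p(x) \log_2 \frac{p(x)}{q(x)} = -\frac{1}{8} - \frac{1}{8} + 0 + \frac{1}{2} = 0.25 bits = H(X, q) - H(X)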

D(q||p)
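Our worked calculation:

D(q||p) = \sum_x q(x) \log_2 \frac{q(x)}{p(x)} = \frac{1}{4}(1 + 1 + 0 - 1) = 0.25 bits

For this particular pair D(p||q) = D(q||p), but in general the two values differ.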

Joint and conditional entropy

• Joint entropy:

H(X,Y) = -\sum_x \sum_y p(x,y) \log_2 p(x,y)

• Conditional entropy:

H(Y|X) = \sum_x p(x) H(Y|X=x)
= -\sum_x \sum_y p(x) p(y|x) \log_2 p(y|x)
= -\sum_x \sum_y p(x,y) \log_2 p(y|x)
= H(X,Y) - H(X)
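A Python sketch (ours), storing the joint distribution as a dict from (x, y) pairs to probabilities:

import math
from collections import defaultdict

def joint_entropy(pxy):
    # H(X,Y) = -sum_{x,y} p(x,y) * log2 p(x,y)
    return -sum(p * math.log2(p) for p in pxy.values() if p > 0)

def conditional_entropy(pxy):
    # H(Y|X) = H(X,Y) - H(X), where p(x) is the marginal over y
    px = defaultdict(float)
    for (x, _), p in pxy.items():
        px[x] += p
    h_x = -sum(p * math.log2(p) for p in px.values() if p > 0)
    return joint_entropy(pxy) - h_x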

Entropy of a language (per-word entropy)

• The entropy of a language L:

H(L) = -\lim_{n \to \infty} \frac{1}{n} \sum_{x_{1n}} p(x_{1n}) \log_2 p(x_{1n})

• If we make certain assumptions that the language is “nice”, then the cross entropy can be calculated as:

H(L) = -\lim_{n \to \infty} \frac{1}{n} \log_2 p(x_{1n})

Per-word entropy (cont)

• p(x_{1n}) can be calculated by n-gram models

• Ex: unigram model
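For instance (our illustration), under a unigram model the probability factors as p(x_{1n}) = \prod_{i=1}^{n} p(x_i), so \log_2 p(x_{1n}) = \sum_{i=1}^{n} \log_2 p(x_i): the per-word entropy is just the average negative log probability assigned to each word.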


Perplexity

• Perplexity is 2^H.

• Perplexity is the weighted average number of choices a random variable has to make.

=> We learned how to calculate perplexity in LING570.
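A small Python sketch (ours; the unigram probabilities are made up for illustration):

import math

p_unigram = {"the": 0.5, "cat": 0.25, "sat": 0.25}  # hypothetical model
words = ["the", "cat", "sat"]

# per-word cross entropy: H = -(1/n) * sum_i log2 p(x_i)
h = -sum(math.log2(p_unigram[w]) for w in words) / len(words)
print(2 ** h)  # perplexity = 2^H, about 3.17 here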


Mutual information

• It measures how much information X and Y have in common:

I(X;Y) = \sum_x \sum_y p(x,y) \log_2 \frac{p(x,y)}{p(x) p(y)} = H(X) + H(Y) - H(X,Y)

• I(X;Y) = KL(p(x,y) || p(x)p(y))

• I(X;Y) = I(Y;X)
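A Python sketch (ours), reusing the dict-of-pairs representation from the joint entropy sketch:

import math
from collections import defaultdict

def mutual_information(pxy):
    # I(X;Y) = sum_{x,y} p(x,y) * log2( p(x,y) / (p(x) * p(y)) )
    px, py = defaultdict(float), defaultdict(float)
    for (x, y), p in pxy.items():
        px[x] += p
        py[y] += p
    return sum(p * math.log2(p / (px[x] * py[y]))
               for (x, y), p in pxy.items() if p > 0)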

Summary on Information theory

• Reading: M&S 2.2

• It is the use of probability theory to quantify and measure “information”.

• Basic concepts:

– Entropy

– Cross entropy and relative entropy

– Joint entropy and conditional entropy

– Entropy of the language and perplexity

– Mutual information


Hw1

• Q1-Q5: Information theory

• Q6: Condor submit

• Q7: Hw10 from LING570.
– You are not required to turn in anything for Q7.
– If you want feedback on this, you can choose to turn it in.
– It won’t be graded. You get 30 points for free.

Q6: condor submission

• http://staff.washington.edu/brodbd/orientation.pdf

• Especially slides #22-#28.


For a command we can run as:

mycommand -a -n <mycommand.in >mycommand.out

The submit file might look like this (save it as *.cmd):

# The command to run
Executable = mycommand
Universe = vanilla
getenv = true
# STDIN
input = mycommand.in
# STDOUT
output = mycommand.out
# STDERR
error = mycommand.error
# A log file that stores the results of the condor submission
Log = /tmp/brodbd/mycommand.log
# The arguments for the command
arguments = "-a -n"
transfer_executable = false
Queue


Submission and monitoring jobs on condor

• Submission:

condor_submit mycommand.cmd

=> get a job number

• List the job queue:

condor_q

• Status changes from “I” (idle) to “R” (run), and then:
– “H”: means the job failed. Look at the log file specified in *.cmd.
– Disappeared from the queue: you will receive an email.

• Use “man condor_q” etc. to learn more about those commands.

The path names for files in *.cmd

In the *.cmd file:

Executable = aa194.exec

input = file1

• The environment (e.g., ~/.bash_profile) might not be set properly

• It assumes that the files are in the current directory (the dir where the job is submitted)

=> Use full path names if needed.
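For example (the paths here are hypothetical, for illustration only):

Executable = /path/to/hw1/aa194.exec

input = /path/to/hw1/file1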