
Lecture 10

• The catalog of sources
  – Resolved sources
  – Selection biases
  – Luminosity (and mass) functions
  – Volume- vs flux-limited surveys
• Cross-matching two catalogs

NASSP Masters 5003S - Computational Astronomy - 2009

Python and tut oddments

• The module cPickle offers a useful way to save Python-generated data of arbitrary format to a disk file.
  – See http://www.python.org/doc/2.5/lib/module-cPickle.html
  – This can save you having to run a whole MC again just to check the details of a plot!
• I see that pyfits is set up to deliver numpy arrays on the NASSP machines. (It seems to return only Numarray objects on the Astronomy computers.)
• Assessment: let us take ‘code must run’ to mean ‘it must run on NASSP machines.’
  – If I claim your code won’t run, and you think I am wrong, by all means protest!
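As a minimal sketch of the pickling idea mentioned above: the example below saves a small dictionary of made-up Monte Carlo results and reads it back. It uses the `pickle` module of Python 3; on Python 2 the cPickle module referred to above offers the same `dump` and `load` calls. The helper names, the file name and the contents of `mc_results` are all hypothetical.

```python
import os
import pickle  # Python 3; on Python 2, cPickle provides the same dump/load calls
import tempfile


def save_results(obj, path):
    """Write any picklable Python object to a binary file."""
    with open(path, "wb") as f:
        pickle.dump(obj, f, protocol=pickle.HIGHEST_PROTOCOL)


def load_results(path):
    """Read back an object previously written by save_results."""
    with open(path, "rb") as f:
        return pickle.load(f)


# Hypothetical Monte Carlo output, invented for this illustration.
mc_results = {"n_trials": 1000, "fluxes": [1.2, 0.7, 3.4], "label": "run A"}

path = os.path.join(tempfile.gettempdir(), "mc_results.pkl")
save_results(mc_results, path)
restored = load_results(path)
print(restored == mc_results)
```

Once the results are on disk, the plotting script can be rerun as often as needed without repeating the simulation.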




Detecting resolved sources

• Our earlier assumption that we knew the form of S is no longer true.
• Some solutions:
1. Combine results of several filterings. (Crudely done in XMM.)
  • But the ‘space’ of possible shapes is large.
  • Difficult to calculate net sensitivity.
2. Wavelet methods.
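The first solution above, combining several filterings, can be sketched in one dimension as follows. This is only an illustration under stated assumptions: the source shape is taken to be Gaussian, the noise is taken to be white with known standard deviation, and the function names and scale values are invented here. It is not the XMM procedure.

```python
import numpy as np


def gaussian_kernel(sigma):
    """Gaussian filter truncated at 4 sigma, scaled to unit L2 norm.

    With unit L2 norm, filtering white noise of standard deviation s
    gives output noise of standard deviation s, so the filtered map
    divided by s can be read as a signal-to-noise map.
    """
    half = int(np.ceil(4 * sigma))
    x = np.arange(-half, half + 1)
    k = np.exp(-0.5 * (x / sigma) ** 2)
    return k / np.sqrt(np.sum(k ** 2))


def multiscale_snr(data, noise_sigma, scales):
    """Return an array of shape (n_scales, n_pixels) of filtered S/N."""
    snr = np.empty((len(scales), data.size))
    for i, s in enumerate(scales):
        snr[i] = np.convolve(data, gaussian_kernel(s), mode="same") / noise_sigma
    return snr


# Noise-free test source of width 5 pixels (assumed values).
x = np.arange(200)
source = 3.0 * np.exp(-0.5 * ((x - 100) / 5.0) ** 2)
scales = [1.0, 5.0, 20.0]
snr = multiscale_snr(source, 1.0, scales)
best = scales[int(np.argmax(snr[:, 100]))]
print("filter scale giving the highest S/N at the source position:", best)
```

The filter whose width matches the source gives the highest response, which is why a single filter is not enough when the source sizes are unknown.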


Wavelet example

Figure panels: ‘Raw data’ and ‘Wavelet smoothed’. Source: F Damiani et al (1997).

Multi-scale wavelets can be chosen to return best-fit ellipsoids.


Selection biases

• Fundamental aim of most surveys is to obtain measurements of an ‘unbiased sample’ of a type of object.
• Selection bias happens when the survey is more sensitive to some classes of source than others.
  – Eg, intrinsically brighter sources, obviously.
• Problem is even greater for resolved sources.
  – Note: ‘resolved’ does not just mean in spatial terms. Eg XMM (or single-dish HI surveys), in which most sources are unresolved spatially but well resolved spectrally.


Examples

• Optical surveys of galaxies. Easiest detected are:
  – The brightest (ie, numerically lowest apparent magnitude).
  – Edge-on spirals.
• HI (ie, 21 cm radio) surveys of galaxies. Easiest detected are:
  – Those with most HI mass (excludes ellipticals).
  – Those which don’t ‘fill the beam’ (ie are unresolved).
• Note: where sources are resolved, detection sensitivity tends to depend more on surface brightness than total flux.


Full spatial information

• Q: We have a low-flux source - how do we tell whether it is a high-luminosity but distant object, or a low-luminosity nearby one?
• A: Various distance measures.
  – Parallax - only for nearby stars – but Gaia will change that.
  – Special knowledge which lets us estimate luminosity (eg Hertzsprung-Russell diagram).
  – Redshift => distance via the Hubble relation. This is probably the most widely used method for extragalactic objects.
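The redshift route above can be illustrated with a short sketch. It assumes the low-redshift linear Hubble relation D = cz/H0 and an isotropically emitting source, so that L = 4πD²S. The value of H0 used here (70 km/s/Mpc) is an assumption, and the function names are invented for this example.

```python
import math

C_KM_S = 299792.458        # speed of light in km/s
H0_ASSUMED = 70.0          # assumed Hubble constant in km/s/Mpc
MPC_IN_M = 3.0857e22       # metres per megaparsec


def hubble_distance_mpc(z, h0=H0_ASSUMED):
    """Distance in Mpc from the linear Hubble relation (valid only for z << 1)."""
    return C_KM_S * z / h0


def luminosity_watts(flux_w_m2, distance_mpc):
    """Luminosity of an isotropic emitter from its flux and distance."""
    d_m = distance_mpc * MPC_IN_M
    return 4.0 * math.pi * d_m ** 2 * flux_w_m2


# Two sources with the same (made-up) flux but different redshifts.
flux = 1.0e-15
for z in (0.01, 0.02):
    d = hubble_distance_mpc(z)
    print(f"z = {z}: D = {d:.1f} Mpc, L = {luminosity_watts(flux, d):.3e} W")
```

Doubling the redshift at fixed flux quadruples the inferred luminosity, which is the ambiguity the question above refers to.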


Luminosity function

• Frequency distribution of luminosity (luminosity = intrinsic brightness).
• The faint end is the hardest to determine.
  – Stars – how many brown dwarfs?
  – Galaxies – how many dwarfs?
• Distribution for most objects has a long faint-end ‘tail’.
  – Schechter functions.

Figure credits: P Kroupa (1995); P Schechter (1976).
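For reference, the Schechter form mentioned above is φ(L) dL = φ* (L/L*)^α exp(-L/L*) d(L/L*). The sketch below simply evaluates it; the parameter values are arbitrary placeholders chosen for illustration, not fits to any survey.

```python
import numpy as np


def schechter(lum, phi_star, l_star, alpha):
    """Schechter luminosity function: number density per unit luminosity."""
    x = np.asarray(lum, dtype=float) / l_star
    return (phi_star / l_star) * x ** alpha * np.exp(-x)


# Placeholder parameters (assumptions for illustration only).
phi_star, l_star, alpha = 1.0, 1.0, -1.25

lums = np.logspace(-3, 1, 5)  # from 0.001 L* to 10 L*
for lum, phi in zip(lums, schechter(lums, phi_star, l_star, alpha)):
    print(f"L/L* = {lum:8.3f}   phi = {phi:.3e}")
```

With a negative α the function keeps rising toward low luminosities, which is the long faint-end tail; above L* the exponential cuts it off sharply.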


HI mass function

• Redshift is directly measured.
• Flux is proportional to mass of neutral hydrogen (HI).
  – Hence: usual to talk about HI mass function rather than luminosity function.

Figure credit: S E Schneider (1996)

FYI, HI is pronounced ‘aitch one’.


Relation to logN-logS

• Just as flux S is related to luminosity L and distance D by

S ∝ L/D²

• So is the logN-logS – or, to be more exact, the number density as a function of flux, n(S) - a convolution between the luminosity function n(L) and the true spatial distribution n(D).
• BUT…
  – The luminosity function can change with age – that is, with distance! (And with environment.)
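A small Monte Carlo sketch can make the relation between n(L), n(D) and the counts concrete. The assumptions are all mine: sources are spread uniformly through a Euclidean sphere, luminosities are log-uniform between two arbitrary limits, and the luminosity function does not vary with distance. Under those assumptions the cumulative counts N(>S) should follow a power law of slope -3/2 for fluxes bright enough that the edge of the sphere does not matter.

```python
import numpy as np


def simulate_fluxes(n_src, r_max, l_min, l_max, rng):
    """Draw distances uniform in volume and log-uniform luminosities."""
    dist = r_max * rng.random(n_src) ** (1.0 / 3.0)
    lum = l_min * (l_max / l_min) ** rng.random(n_src)
    return lum / (4.0 * np.pi * dist ** 2)


def cumulative_counts(flux, s_grid):
    """Number of sources with flux greater than each value in s_grid."""
    sorted_flux = np.sort(flux)
    return flux.size - np.searchsorted(sorted_flux, s_grid, side="right")


rng = np.random.default_rng(1)
r_max, l_min, l_max = 1.0, 1.0, 100.0
flux = simulate_fluxes(200_000, r_max, l_min, l_max, rng)

# Above this flux even the most luminous sources lie inside the sphere.
s_edge = l_max / (4.0 * np.pi * r_max ** 2)
s_grid = np.logspace(np.log10(s_edge), np.log10(30.0 * s_edge), 10)
counts = cumulative_counts(flux, s_grid)

slope = np.polyfit(np.log10(s_grid), np.log10(counts), 1)[0]
print(f"fitted logN-logS slope: {slope:.2f}")
```

Letting the luminosity limits depend on distance in `simulate_fluxes` is one way to explore the caveat that the luminosity function may change with age.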


Volume- vs flux-limited surveys

• Information about the distance of sources allows one to set a distance cutoff, within which one estimates the survey is reasonably complete (ie, nearly all the available sources are detected).
• Such a survey is called volume-limited. It allows the luminosity (or mass) function to be estimated without significant bias.
  – However, there may be few bright sources.
• Allow everything in, and you have a flux-limited survey.
  – Many more sources => better stats; but biased (Malmquist bias).

Malmquist bias

[Figure annotation: line of constant flux.]
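The difference between the two kinds of survey can be seen in a toy simulation. This is a sketch under assumed conditions: a uniform Euclidean distribution of sources, log-uniform luminosities, and a sharp flux limit. All numbers are arbitrary. The distance cutoff is chosen so that even the least luminous source is detectable inside it.

```python
import numpy as np

rng = np.random.default_rng(2)
n_src, r_max = 200_000, 1.0
l_min, l_max = 1.0, 100.0

# Toy population: uniform in volume, log-uniform in luminosity (assumed).
dist = r_max * rng.random(n_src) ** (1.0 / 3.0)
lum = l_min * (l_max / l_min) ** rng.random(n_src)
flux = lum / (4.0 * np.pi * dist ** 2)

# Flux limit set so that the faintest sources are seen out to d_complete.
d_complete = 0.3 * r_max
s_lim = l_min / (4.0 * np.pi * d_complete ** 2)

flux_limited = flux >= s_lim
volume_limited = flux_limited & (dist <= d_complete)

mean_true = lum.mean()
mean_flux_limited = lum[flux_limited].mean()
mean_volume_limited = lum[volume_limited].mean()

print(f"true mean luminosity:        {mean_true:.1f}")
print(f"flux-limited sample mean:    {mean_flux_limited:.1f} (N = {flux_limited.sum()})")
print(f"volume-limited sample mean:  {mean_volume_limited:.1f} (N = {volume_limited.sum()})")
```

The flux-limited sample is much larger but over-represents luminous sources, because they are visible through a larger volume. The volume-limited sample recovers the true mean at the cost of far fewer objects.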


Catalog cross-matching

• It sometimes happens that you have 2 lists of objects, which you want to cross-match.
  – Maybe the lists are sources observed at different frequencies.
  – The situation also arises in simulations.
• I’ll deal with the simulations situation first, because it is easier.
  – So: we start with a bunch of simulated sources. Let’s keep it simple and assume they all have the same brightness.
  – We add noise, then see how many we can find.


Catalog cross-matching

  – In order to know how well our source-detection machinery is working, we need to match each detection with one of the input sources.
• How do we do this?
• How do we know the ‘matched’ source is the ‘right one’?

CAVEAT: I haven’t done a rigorous search of the literature yet – these are just my own ideas.


Catalog cross-matching

Black: simulated sources
Red: 1 of many detections (with 68% confidence interval).

This case seems clear.


Catalog cross-matching

But what about these cases?

No matches inside confidence interval.

Too many matches inside confidence interval.


Catalog cross-matching

Or these?

Is any a good match? Which is ‘nearest’?


Catalog cross-matching

• My conclusion:
1. The shape of the confidence intervals affects which source is ‘nearest’.
2. The size of the confidence intervals has nothing to do with the probability that the ‘nearest’ match is non-random.

1. ‘Nearest neighbour’ turns out to be a slipperier concept than we at first think. To see this, imagine that we now have 1 spatial dimension and 1 flux dimension:


Catalog cross-matching

[Figure: two panels plotting flux S against position x, with candidate sources labelled 1 to 9.]

Which is the best match??? Source 5? Or source 8?

This makes more sense. Let’s then define r as:


Catalog cross-matching

2. As for the probability... well, what is the null hypothesis in this case?
  – Answer: that the two catalogs have no relation to each other.
  – So, we want the probability that, with a random distribution of the simulated sources, a source would lie as close or closer to the detected source than r_nearest.
  – This is given by:

Pnull = 1 – exp(-ρV)

  – where ρ is the expected density of sim sources and V is the volume inside r_nearest.
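The expression for Pnull can be checked numerically. The sketch below assumes two spatial dimensions, so that V is the area πr² of a circle, and scatters sources uniformly in a unit-area box centred on the detection. The function names and the values of ρ and r are arbitrary choices for this illustration.

```python
import numpy as np


def p_null(rho, volume):
    """Probability that at least one random source falls inside `volume`."""
    return -np.expm1(-rho * volume)  # equals 1 - exp(-rho * volume)


def mc_fraction(rho, r, n_trials, rng):
    """Fraction of random catalogs with a source within r of the origin.

    Each trial places round(rho) sources uniformly in a unit-area box,
    so the expected density is rho sources per unit area.
    """
    n_src = int(round(rho))
    pts = rng.uniform(-0.5, 0.5, size=(n_trials, n_src, 2))
    nearest = np.hypot(pts[..., 0], pts[..., 1]).min(axis=1)
    return np.mean(nearest <= r)


rng = np.random.default_rng(3)
rho, r = 100.0, 0.05
analytic = p_null(rho, np.pi * r ** 2)
simulated = mc_fraction(rho, r, 2000, rng)
print(f"analytic Pnull = {analytic:.3f}, simulated fraction = {simulated:.3f}")
```

The two numbers should agree to within the Monte Carlo scatter. The simulation uses a fixed number of sources per trial rather than a Poisson draw, which makes a small difference at this density.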


Catalog cross-matching

• So the procedure for matching to a simulated catalog is:
1. For each detection, find the input source for which r is smallest.
2. Calculate the probability of the null hypothesis from Pnull = 1 – exp(-ρV).
3. Discard those matches for which Pnull is greater than a pre-decided (low) cutoff.
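The three steps above can be sketched as follows. Several things here are assumptions made for the sake of a runnable example: positions are two-dimensional, r is taken to be the offset scaled by each detection's 1-sigma uncertainties, V is the area of the corresponding ellipse (π r² σx σy), and the cutoff of 0.05 is arbitrary. The function name and test catalog are invented.

```python
import numpy as np


def match_detections(det_xy, det_sigma, sim_xy, rho, p_cut=0.05):
    """Match each detection to its nearest simulated source.

    Returns a list of (detection index, matched source index or None, Pnull).
    A match is set to None when Pnull exceeds p_cut.
    """
    det_xy = np.asarray(det_xy, float)
    det_sigma = np.asarray(det_sigma, float)
    sim_xy = np.asarray(sim_xy, float)
    matches = []
    for i, (pos, sig) in enumerate(zip(det_xy, det_sigma)):
        # Step 1: nearest input source in sigma-scaled distance (assumed r).
        r = np.sqrt(np.sum(((sim_xy - pos) / sig) ** 2, axis=1))
        j = int(np.argmin(r))
        # Step 2: Pnull from the area of the ellipse reaching that source.
        volume = np.pi * r[j] ** 2 * sig[0] * sig[1]
        p = float(-np.expm1(-rho * volume))
        # Step 3: discard matches that chance alone could easily produce.
        matches.append((i, j if p <= p_cut else None, p))
    return matches


# Three simulated sources in a 100 x 100 field, two detections.
sim_xy = [(10.0, 10.0), (50.0, 50.0), (90.0, 90.0)]
rho = len(sim_xy) / (100.0 * 100.0)
det_xy = [(10.5, 9.8), (30.0, 70.0)]
det_sigma = [(1.0, 1.0), (1.0, 1.0)]

matches = match_detections(det_xy, det_sigma, sim_xy, rho)
for det, src, p in matches:
    print(f"detection {det}: match = {src}, Pnull = {p:.4f}")
```

The first detection lies close to an input source and is kept. The second has a nearest source so far away that a random catalog would often do as well, so its match is discarded.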

• What about the general situation of matching between different catalogs?


Catalog cross-matching

ASCA data – M Akiyama et al (2003)

Maybe a Bayesian approach would be best? Interesting area of research.