Fuzzy parameterization for analysis of natural phenomenon and use in other geophysical problems
Stanford Exploration Project Seminar
Stanford University, CA
12th December, 2008
Pritwiraj Moulik
Research Associate & Visiting Science student, Dept. of Earth Sciences, Univ. of Western Ontario, CANADA
Undergraduate student, Birla Institute of Technology & Science-Pilani, INDIA
Topics
• California fault system
  • Parkfield Earthquakes: Waveform modeling, GIS
  • Earthquake nucleation: Pattern Informatics
• Taiwan Landslides: Neuro-fuzzy framework
• Well log analysis
  • Prydz Bay: Fuzzy Inference system
  • Costa Rica convergent margin: Neuro-fuzzy framework
• Climate modeling
  • Monsoon Prediction
  • Paleo-climatic nonlinear time series analysis
Earthquakes: The unsolved questions…
• Location of earthquakes: nucleation
• Magnitude
• Self organization & Ergodicity?
• Precursory phenomena?
California fault system: Parkfield region
• Aims:
  • Characterize similarities in waveforms: GIS
  • Model the waveforms: Fuzzy membership functions
• Lessons learnt:
  • Similar magnitude and rupture extent: fault segmentation
  • Long-term non-randomness of earthquakes: remarkably similar in size and location of rupture, albeit not in epicentre or rupture propagation direction
Geospatial Analysis: Overview
• Filter and cluster the voluminous seismic data
• System constraints:
  • Earthquake parameters: events should have a similar faulting mechanism, magnitude and rupture direction, and should have occurred on the same fault segment or at the same epicenter. Lower variability may be achieved if events are further constrained to have the same rupture time history and distribution of slip.
  • Source and station characteristics: the geological setting of the station, the source and the path of propagation are also major considerations.
Why Parkfield?
• The 1934, 1966 and 2004 Parkfield earthquakes used to arrive at this model are remarkably similar in size and location of rupture, albeit not in epicenter or rupture propagation direction (Bakun & McEvilly, 1979; Bakun et al., 2005).
Details
• Earthquake data used: 1934, 1966 and 2004 Parkfield earthquakes [COSMOS]
• Conversion to Excel, then used in ArcMap 9.1
• Soil layer data: NRCS; DEM data: CGIAR-CSI
• System constraints used: hypocenter parameters, station/event parameters and sensor description
Geospatial Analysis: Results I
Type | Year | Station | MUID | Area (sq. km) | Bedrock depth (m) | Soil profile (L1-L11)
I | 1966 | Cholame Array 2 | CA501 | 53642 | 152 | 9;9;9;9;6;6;6;6;6;15;15
I | 1966 | Cholame-Shandon Array 5 & 8 | CA502 | 187777 | 150 | 9;9;9;16;16;16;16;16;16;15;15
II | 2004 | Vineyard Canyon | CA561 | 59483.1 | 140 | 6;6;6;6;6;12;12;12;7;15;15
II | 2004 | Jack Canyon | CA344 | 623444 | 73 | 6;6;3;3;3;15;15;15;15;15
II | 1966 | Temblor | CA344 | 623444 | 73 | 6;6;3;3;3;15;15;15;15;15
III | 1966 | San Luis Obispo | CA515 | 831475 | 89 | 12;12;9;11;11;11;11;11;15;15;15
III | 2004 | San Luis Obispo | CA515 | 831475 | 89 | 12;12;9;11;11;11;11;11;15;15;15
III | 2004 | Hollister; Airport Building | CA568 | 129241 | 152 | 12;12;12;12;12;12;12;12;12;15;15
IV | 1966 | Taft, Lincoln School Tunnel | CA347 | 2809850 | 152 | 3;3;3;3;3;3;3;3;3;15;15
IV | 1966 | Cholame Array 12 | CA503 | 23115.4 | 141 | 3;3;3;3;3;3;6;6;16;15;15
IV | 2004 | Fresno; VA Medical Center | CA309 | 41103.9 | 147 | 3;3;3;3;3;7;16;16;16;15;15
IV | 2004 | Fresno; NAMP USGS Office | CA307 | 2262820 | 152 | 3;3;3;3;3;3;3;3;3;15;15
IV | 2004 | Parkfield; Eades | CA503 | 29183.2 | 141 | 3;3;3;3;3;3;6;6;16;15;15
IV | 2004 | Coalinga; Fire Station | CA346 | 394110 | 152 | 3;3;3;6;6;6;6;6;6;15;15
V | 2004 | Hollister; City Hall | CA548 | 172888 | 152 | 9;9;9;9;9;9;16;16;16;15;15
V | 2004 | Joaquin Canyon | CA558 | 531502 | 63 | 9;9;9;9;9;11;15;15;15;15;15
V | 2004 | Donna Lee | CA558 | 531502 | 63 | 9;9;9;9;9;11;15;15;15;15;15
V | 2004 | Parkfield; Froelich | CA502 | 52690.8 | 150 | 9;9;9;16;16;16;16;16;16;15;15
VI | 2004 | Middle Mountain | CA555 | 49159.6 | 82 | 8;8;8;8;8;8;8;15;15;15;15
VI | 2004 | Parkfield; Gold Hill | CA505 | 529707 | 82 | 8;8;8;8;8;9;16;15;15;15;15
VI | 2004 | Hog Canyon | CA555 | 286145 | 82 | 8;8;8;8;8;8;8;15;15;15;15
VI | 2004 | Work Ranch | CA505 | 686713 | 82 | 8;8;8;8;8;9;16;15;15;15;15
VI | 2004 | Parkfield; Red Hills | CA505 | 529707 | 82 | 8;8;8;8;8;9;16;15;15;15;15
VI | 2004 | Parkfield; UPSAR (1-3,5-13) | CA555 | 286145 | 82 | 8;8;8;8;8;8;8;15;15;15;15
List of stations grouped into six types.
Geospatial Analysis: Results
• The station City Recreation Bldg-864 Santa Rosa, San Luis Obispo had records of both earthquakes.
• Comparative analysis of both earthquakes: average S.D. = 0.00693728 cm/s^2, visual similarity
Algorithm System: Basic Structure
[Flowchart. Box labels: If following P-S region thresholds for any magnitude, retrieve data; Calculate grade for every magnitude; Undetected quake; Clustered Seismic Database; Data processing; Filter magnitude < threshold; Fuzzy Segment Architecture; Incoming acceleration; Calculate cumulative membership grade for time interval; Check threshold; Alarm]
Algorithm: Process I (Clustering)
• Input: Seismic data identified by geospatial analysis
• Aim: To model the general waveform pattern from an active seismic zone
• Three processes:
  • Clustering
  • Membership function development
  • Evolutionary algorithm
Details
• A graph, specific to an instant from the onset of P waves, is plotted between acceleration and magnitude of the corresponding earthquake.
• The clustering algorithm used in the process is Ward's method.
• The process is repeated to find curves for every instant in the P-S interval.
• Input: the incoming value of acceleration at that station
• Output: earthquake magnitudes and corresponding membership grades
Algorithm: Process I (Clustering)
• Only magnitudes whose corresponding membership grade is greater than 0.8 are retained
• Grades are cumulated from t1 to t2
• An alarm is raised above a threshold of 0.73, tested for the earthquakes
Limitations
• Data dependent
• Membership function development: computationally intensive
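The grade, cumulate and threshold steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in the study: the triangular curve shape, the `CURVES` values and the use of a mean as the "cumulative" grade are all hypothetical. Only the 0.8 cutoff and the 0.73 threshold come from the slides.

```python
from typing import Dict, List, Tuple

GRADE_CUTOFF = 0.8      # per-instant grade a magnitude must exceed (from the slides)
ALARM_THRESHOLD = 0.73  # threshold on the cumulative grade (from the slides)

# Hypothetical membership curves: magnitude -> (left, peak, right) in cm/s^2.
CURVES: Dict[float, Tuple[float, float, float]] = {
    5.0: (0.0, 2.0, 4.0),
    6.0: (3.0, 6.0, 9.0),
}


def triangular(x: float, left: float, peak: float, right: float) -> float:
    """Triangular membership grade of x (assumed curve shape)."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def cumulative_grades(accelerations: List[float]) -> Dict[float, float]:
    """Mean grade per magnitude over the interval t1..t2.

    Grades at or below GRADE_CUTOFF contribute zero. Using the mean as the
    cumulative grade is an assumption made for this sketch.
    """
    if not accelerations:
        return {m: 0.0 for m in CURVES}
    totals = {m: 0.0 for m in CURVES}
    for acc in accelerations:
        for magnitude, params in CURVES.items():
            g = triangular(acc, *params)
            if g > GRADE_CUTOFF:
                totals[magnitude] += g
    return {m: t / len(accelerations) for m, t in totals.items()}


def alarm(accelerations: List[float]) -> List[float]:
    """Magnitudes whose cumulative grade exceeds ALARM_THRESHOLD."""
    grades = cumulative_grades(accelerations)
    return [m for m, g in grades.items() if g > ALARM_THRESHOLD]


print(alarm([5.5, 6.0, 6.2, 5.8]))
```

In practice the curves would be derived per instant in the P-S interval from the clustered data, so a real system would hold one set of curves per time step instead of the single set used here.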
The Pattern Informatics Method
• The PI index is an analytical method for quantifying the spatiotemporal seismicity rate changes in historic seismicity (Tiampo et al., 2002).
• The observed seismicity rate ψ_obs(x_i, t) is a proxy for the energy release: the number of earthquakes per unit time with M > M_cutoff within the box centred at x_i at time t.
• The average seismicity function S(x_i, t_0, t) over the time interval (t - t_0) is defined as:
  \[ S(x_i, t_0, t) = \frac{1}{t - t_0} \int_{t_0}^{t} \psi_{obs}(x_i, t') \, dt' \]
• The mean-zero, unit-norm function, obtained by subtracting the average and dividing by the standard deviation, both taken over all boxes, is defined as:
  \[ \hat{S}(x_i, t_0, t) = \frac{S(x_i, t_0, t) - \langle S(x_i, t_0, t) \rangle}{\lVert S(x_i, t_0, t) \rVert} \]
• Physically, the important changes in seismicity are given by:
  \[ \Delta\hat{S}(x_i, t_1, t_2) = \hat{S}(x_i, t_0, t_2) - \hat{S}(x_i, t_0, t_1) \]
• The final calculation involves averaging \Delta\hat{S} over all the base years, t_0, to reduce the effects of noise.
• The PI index, measured relative to the time-independent background \mu_P, is:
  \[ \Delta P(x_i, t_1, t_2) = \left[ \Delta\hat{S}(x_i, t_1, t_2) \right]^2 - \mu_P \]
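A discretized version of the PI calculation can be sketched in a few lines. This is an illustrative sketch, not the code behind the forecasts shown here. It assumes the seismicity rate is given as a per-box list of counts per time step, that the integral becomes a simple mean, that base times t0 run over every step from tb up to t1, and that the background is the mean over boxes. The names `rates`, `normalized_rate` and `pi_index` are hypothetical.

```python
from statistics import fmean, pstdev
from typing import List


def normalized_rate(rates: List[List[float]], t0: int, t: int) -> List[float]:
    """S-hat(x_i, t0, t): mean rate per box over steps t0..t-1,
    shifted to zero mean and scaled to unit standard deviation across boxes."""
    s = [fmean(box[t0:t]) for box in rates]
    mu, sigma = fmean(s), pstdev(s)
    if sigma == 0.0:
        # All boxes identical: no spatial contrast to normalize.
        return [0.0] * len(s)
    return [(v - mu) / sigma for v in s]


def pi_index(rates: List[List[float]], tb: int, t1: int, t2: int) -> List[float]:
    """Delta-P per box for the change interval t1..t2 (requires tb < t1 < t2)."""
    n_boxes = len(rates)
    base_times = range(tb, t1)
    delta_sum = [0.0] * n_boxes
    for t0 in base_times:
        s1 = normalized_rate(rates, t0, t1)
        s2 = normalized_rate(rates, t0, t2)
        for i in range(n_boxes):
            delta_sum[i] += s2[i] - s1[i]
    # Average the change over all base times, then square it.
    p = [(d / len(base_times)) ** 2 for d in delta_sum]
    mu_p = fmean(p)  # time-independent background (mean over boxes)
    return [v - mu_p for v in p]


# Three boxes, six time steps; only the third box changes its rate.
rates = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 5, 5, 5],
]
delta_p = pi_index(rates, tb=0, t1=3, t2=6)
print(delta_p)
```

Boxes with positive values stand out above the background and are the candidate hotspots; by construction the values sum to zero over all boxes.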
Pertinent questions…
• Optimal temporal regions?
• Magnitude of forecasted earthquake?
• Cutoff magnitude to filter?
• Threshold PI for hotspots?
Target magnitude & Cutoff magnitude
• The success of a forecast is based on maximizing the fraction of earthquakes that occur in alarm cells and minimizing the fraction of alarm cells that do not result in earthquakes.
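The two fractions in this success criterion can be computed directly from a binary forecast. The sketch below is only an illustration: representing cells by integer identifiers and the function name `forecast_scores` are assumptions.

```python
from typing import Iterable, Set, Tuple


def forecast_scores(alarm_cells: Set[int], quake_cells: Iterable[int]) -> Tuple[float, float]:
    """Return (hit fraction, false alarm fraction).

    hit fraction: share of target earthquakes that fall in alarm cells.
    false alarm fraction: share of alarm cells that saw no target earthquake.
    """
    quakes = list(quake_cells)
    hits = sum(1 for cell in quakes if cell in alarm_cells)
    hit_fraction = hits / len(quakes) if quakes else 0.0

    struck = set(quakes)
    empty = sum(1 for cell in alarm_cells if cell not in struck)
    false_alarm_fraction = empty / len(alarm_cells) if alarm_cells else 0.0
    return hit_fraction, false_alarm_fraction


# Four alarm cells; three target earthquakes, one of them outside any alarm cell.
hit, false_alarm = forecast_scores({1, 2, 3, 4}, [2, 3, 9])
print(hit, false_alarm)
```

Sweeping the target and cutoff magnitudes and recomputing both fractions is one way to explore the trade-off described above.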
A closer look…
Threshold PI for identifying hotspots
Identify optimal temporal regions in a catalog
• The TM fluctuation metric measures effective ergodicity, or the difference between the time average of a quantity and its ensemble average over the entire system (Thirumalai et al., 1989).
• Identify the regions of parameter space that exhibit stationary behaviour and thereby give an optimal forecast (Tiampo et al., 2003, 2007).
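A minimal sketch of the TM fluctuation metric, assuming the observable is available as one time series per cell (for example, event counts per time step). The metric at step t is the mean squared difference between each cell's running time average and the ensemble average of those time averages. The name `tm_metric` and the input layout are hypothetical.

```python
from typing import List


def tm_metric(series: List[List[float]]) -> List[float]:
    """Omega(t) for t = 1..T, where series[i][k] is the observable in cell i at step k."""
    n_cells = len(series)
    n_steps = len(series[0])
    running = [0.0] * n_cells
    omega = []
    for k in range(n_steps):
        for i in range(n_cells):
            running[i] += series[i][k]
        time_avg = [total / (k + 1) for total in running]    # per-cell time average
        ensemble_avg = sum(time_avg) / n_cells               # average over all cells
        omega.append(sum((a - ensemble_avg) ** 2 for a in time_avg) / n_cells)
    return omega


# Two cells that alternate activity: the time averages converge quickly.
omega = tm_metric([[1, 0, 1, 0], [0, 1, 0, 1]])
print(omega)
```

For an effectively ergodic system the inverse metric 1/Omega(t) grows roughly linearly with t, so intervals where that linear trend holds are candidates for the stationary regions mentioned above.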
Optimal forecast for California
• Bin size, dX = 0.1
• Target forecasting magnitude, Mtarget = 5.1
• Threshold PI for binary forecast = 0 for the used bin size
• Catalog magnitude cutoff, Mc = 3.1
• tb = 1932, t1 = 1968, t2 = 1986, t3 = 2004, where t2-t3 is the forecasting interval
Ongoing work…
• Inversion model for forecasting magnitudes of future earthquakes
• Rupture area from PI (Tiampo, 2007)
• Fault segmentation
• PI value of the hotspot
Landslide prediction
• Aim: to formulate and validate a neuro-fuzzy framework and compare it with other empirical approaches
• Study area:
  • Taiwan: circum-Pacific seismic belt
  • Fractured rock mass along highways
  • Heavy rainfall
• Previous work (Lee et al., 1996; Lu, 2001; Chang, 2005):
  • Landslides typically occurred in weathered soils at low elevation
  • They happened at different:
    1. Slope grades
    2. Slope heights
    3. Slope shapes
    4. Geological formations
Framework synopsis
• Parameters
  • Topographic
    • Grade
    • Height
    • Aspect
    • Shape
  • Geological
    • Formation
    • Thickness of soil layer
Geological Formation
• Kuantaoshan Sandstone (type 1)
• Tawo Sandstone (type 2)
• Cholan Formation (type 3)
• Nanchuang Formation (type 4)
• Shihliufen Shale (type 5)
• Chinshui Shale (type 6)
Outcome
Correlation coefficients
Influencing Factor
I II III IV V VI VII VIII IX
I 1.000 -0.016 -0.059 -0.103 -0.047 -0.380 -0.026 -0.096 -0.025
II -0.016 1.000 0.089 0.108 0.070 0.001 -0.026 -0.096 -0.025
III -0.059 0.089 1.000 0.266 0.173 0.102 0.123 -0.074 0.152
IV -0.103 0.108 0.266 1.000 0.246 0.209 0.151 -0.240 0.130
V -0.047 0.070 0.173 0.173 1.000 0.209 0.215 -0.400 -0.014
VI -0.380 0.001 0.102 0.102 0.209 1.000 0.144 -0.098 0.034
VII -0.026 -0.026 0.123 0.123 0.215 0.144 1.000 0.020 0.010
VIII -0.096 -0.096 -0.074 -0.074 -0.400 -0.098 0.020 1.000 -0.056
IX -0.025 -0.025 0.152 0.152 -0.014 0.034 0.010 -0.056 1.000
Results
• Frequency distribution of output (1-Landslide, 0-No landslide)
• ANN (80.62%)
• MSA (75.29%)
• Neuro-Fuzzy (86.47%)
Objective and the studied region
• The identification of groundwater, oil and gas formation lithology from well log data largely depends on expert experience and some subjective rules: "if the natural gamma ray reading is high and the separation between shallow formation resistivity and deep formation resistivity is small, then the formation lithology is probably shale" (Chapellier, 1992).
• The well logging data from ODP Leg 188 borehole sites 1166A and 1165C were taken as the case study for the present work (O'Brien et al., 2001).
Modeling Parameters
• Input variables used:
  • Porosity
  • Gamma ray
  • Bulk density
  • Transit time interval
  • Resistivity difference
• Linguistic terms: very low (VL), low (L), medium (M), high (H), and very high (VH)
• Output variables: sand (%), gravel (%) and major soil component size (MSCS), where H -> clay, M -> silt, and L -> sand
• Characterization of diamicts, gravels/conglomerates and breccias modified after Moncrieff (1989)
Linguistic term of output variable | Grain size range of matrix (cm) | Reference boundary of linguistic term
Sand | 2^0 > AGS > 2^-4 | [0, -4]
Silt | 2^-4 > AGS > 2^-8 | [-4, -8]
Clay | 2^-8 > AGS > 2^-12 | [-8, -12]
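Input & Output trapezoidal membership functions
Rule base: if (POR, GR, DEN, ΔT, ΔR) is ..., then (% Sand, % Gravel, MSCS) is ...; the last column is the rule weight.
POR | GR | DEN | ΔT | ΔR | % Sand | % Gravel | MSCS | Weight
VL | M | VH | M | NA | M | M | H | 0.8
L | M | VH | M | NA | H | M | H | 1
VL | M | H | NA | L | H | M | M | 0.8
NA | L | H | M | L | M | L | M | 1
L | M | NA | H | VL | M | VL | M | 1
VL | M | NA | L | L | L | VL | M | 1
M | M | NA | NA | VL | L | VL | L | 1
H | L | NA | H | VL | L | VL | L | 1
H | L | NA | H | VL | L | VL | L | 1
H | M | NA | M | H | VH | VL | H | 1
VH | L | NA | H | VH | H | VL | H | 1
NA | L | NA | L | L | VH | VL | H | 1
NA | M | H | NA | VL | L | VL | L | 1
NA | H | H | NA | VL | L | VL | L | 1
NA | VH | H | NA | VL | L | VL | L | 1
VL | NA | NA | NA | VL | L | VL | L | 1
VL | M | VH | NA | VL | H | M | H | 1
VL | M | VH | NA | L | H | M | H | 1
NA | L | NA | NA | Not L | VH | VL | H | 0.8
H | NA | M | NA | NA | VH | VL | H | 0.8
VL | M | NA | VL | NA | L | VL | M | 1
L | M | NA | VL | NA | M | VL | M | 1
NA | M | H | M | L | L | L | M | 0.8
NA | M | VH | NA | L | M | M | H | 0.8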
Abbreviations: POR: porosity log; GR: gamma ray log; DEN: bulk density; ΔT: compressional transit time interval; ΔR: separation between phasor deep induction and spherically focused resistivity logs; MSCS: major soil component size; NA: rule did not use this component after system training.
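How a single rule from the table fires can be sketched with trapezoidal membership functions. Everything numeric here is hypothetical: the trapezoid corners in `TERMS` assume inputs rescaled to [0, 1], the rule strength is taken as the minimum antecedent grade multiplied by the rule weight (a common convention, assumed here), and "Not L" is read as the complement of L. The rule shown is the first row of the table.

```python
from typing import Dict

# Hypothetical trapezoid corners (a, b, c, d) on inputs rescaled to [0, 1].
TERMS = {
    "VL": (0.00, 0.00, 0.10, 0.25),
    "L": (0.10, 0.25, 0.35, 0.50),
    "M": (0.35, 0.45, 0.55, 0.65),
    "H": (0.50, 0.65, 0.75, 0.90),
    "VH": (0.75, 0.90, 1.00, 1.00),
}


def trapezoid(x: float, a: float, b: float, c: float, d: float) -> float:
    """Trapezoidal membership grade: 0 outside [a, d], 1 on [b, c], linear between."""
    if x < a or x > d:
        return 0.0
    if b <= x <= c:
        return 1.0
    if x < b:
        return (x - a) / (b - a)
    return (d - x) / (d - c)


def grade(term: str, x: float) -> float:
    """Grade of x for a linguistic term; 'Not T' is taken as 1 - grade(T)."""
    if term.startswith("Not "):
        return 1.0 - grade(term[4:], x)
    return trapezoid(x, *TERMS[term])


def firing_strength(antecedents: Dict[str, str], inputs: Dict[str, float], weight: float) -> float:
    """Weighted minimum of the antecedent grades; 'NA' antecedents are skipped."""
    grades = [grade(term, inputs[name]) for name, term in antecedents.items() if term != "NA"]
    return weight * min(grades) if grades else 0.0


# First row of the rule table: POR=VL, GR=M, DEN=VH, dT=M, dR unused; weight 0.8.
RULE_1 = {"POR": "VL", "GR": "M", "DEN": "VH", "DT": "M", "DR": "NA"}
sample = {"POR": 0.05, "GR": 0.50, "DEN": 0.95, "DT": 0.50, "DR": 0.20}
print(firing_strength(RULE_1, sample, 0.8))
```

A full inference system would evaluate every rule this way, clip or scale the output membership functions for % Sand, % Gravel and MSCS by the resulting strengths, and then aggregate and defuzzify them.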
[Figure: Comparison between true lithology and fuzzy lithology versus depth, with the major soil component versus depth, for sites 1165C and 1166B. Lithology codes: 1-Diamictite, 2-Clay/Silt, 3-Sand.]
Well logs & fuzzy lithology – 1165C
[Figure: depth profiles of Gamma Ray, Shallow Resistivity and Deep Resistivity, Bulk Density, and Porosity.]
Well logs & fuzzy lithology – 1166A
[Figure: depth profiles of Gamma Ray, Shallow Resistivity and Deep Resistivity, Bulk Density, Transit Time Interval, and Porosity.]
Performance analysis
• 80% training data; 20% testing data
• Borehole site 1166A:
  • Training performance: 214 of the 258 training data sets were identified correctly, a success rate of 82.95%
  • Testing performance: 57 of the 65 testing data sets were predicted correctly (Fig. 7), an accuracy of 87.69%
• This technique is also capable of providing significant lithology information where core recovery is incomplete.
• Core analysis provides a more subjective interpretation, but well log analysis may easily:
  • define a permeable sand formation
  • distinguish between silts and sands
  • determine grain size variation in sands
• Errors are due to:
  • heterogeneous and/or anisotropic conditions existing at this depth between the two wells, which resulted in the wrong prediction
  • some factors that were not considered in this study, such as the photoelectric log, which may provide another perspective
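The success rates quoted for borehole site 1166A can be re-derived from the stated counts, which also confirms the approximate 80/20 split. The helper name `success_rate` is only illustrative.

```python
def success_rate(correct: int, total: int) -> float:
    """Percentage of correctly classified data sets, rounded to two decimals."""
    return round(100.0 * correct / total, 2)


training = success_rate(214, 258)   # 82.95
testing = success_rate(57, 65)      # 87.69
training_share = round(100.0 * 258 / (258 + 65), 1)  # 79.9, i.e. roughly 80% of the data

print(training, testing, training_share)
```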
[Figure: Correlation with Geochemical Analysis: 1165C. Depth (m) profiles of Methane (C1), Ethane (C2) and Propane (C3) concentrations, Cn (ppmv); the C1/C2 ratio; and Organic carbon (wt %).]
Conclusions
• Natural systems show evidence of imprecise parameters which may be modeled using Fuzzy Parameterization
• Earthquake fault systems show nucleation and ergodicity, which may help produce better forecasts using fuzzy logic
• Landslide prediction parameters are inherently imprecise and are best modeled using fuzzy parameterization
• Well log analysis may be made more objective while incorporating the expertise of the analyst using an inference engine
• There are limitations in each application which may be considered before using the paradigm.
THANK YOU!!!
• Stanford Exploration Project & Stanford Geophysics
• Mentors, collaborators and supervisors
• Kristy Tiampo – University of Western Ontario
• Gerhard Pratt – Queen’s/ UWO
• J. Srinivasan – Indian Institute of Science
• Der-Har Lee – NCKU, Taiwan
• K. Srinivasa Raju – BITS-Pilani
• Upendra K. Singh – Indian School of Mines, Dhanbad
• Data Sources
• COSMOS, CGIAR-CSI, NRCS, ANSS, ODP, NCKU
QUESTIONS….