SPARK Search Engine
![Page 1: SPARK Search Engine](https://reader035.fdocuments.net/reader035/viewer/2022081419/56816617550346895dd965b7/html5/thumbnails/1.jpg)
SPARK
Search Engine
![Page 2: SPARK Search Engine](https://reader035.fdocuments.net/reader035/viewer/2022081419/56816617550346895dd965b7/html5/thumbnails/2.jpg)
Martijn Harthoorn
Programmer at Furore
Implementer of the Search Engine of SPARK
http://spark.furore.com/fhir/patient?...
The work after the question mark.
Who am I?
![Page 3: SPARK Search Engine](https://reader035.fdocuments.net/reader035/viewer/2022081419/56816617550346895dd965b7/html5/thumbnails/3.jpg)
The place of Search
REST Service
Storage
Index & Search
MongoDB
Spark
![Page 4: SPARK Search Engine](https://reader035.fdocuments.net/reader035/viewer/2022081419/56816617550346895dd965b7/html5/thumbnails/4.jpg)
Paradigm
FHIR client should be easy. FHIR server needs to solve the complex issues.
Search
Search has some…
![Page 5: SPARK Search Engine](https://reader035.fdocuments.net/reader035/viewer/2022081419/56816617550346895dd965b7/html5/thumbnails/5.jpg)
First there was Storage
Search
Then there was Search
![Page 6: SPARK Search Engine](https://reader035.fdocuments.net/reader035/viewer/2022081419/56816617550346895dd965b7/html5/thumbnails/6.jpg)
Connectathon
To test a client – you must have a tested server
To test a server – you must have a tested client
“One fool can ask more questions than seven wise men can answer”
![Page 7: SPARK Search Engine](https://reader035.fdocuments.net/reader035/viewer/2022081419/56816617550346895dd965b7/html5/thumbnails/7.jpg)
Connectathon
“But what if you are wrong?”
![Page 8: SPARK Search Engine](https://reader035.fdocuments.net/reader035/viewer/2022081419/56816617550346895dd965b7/html5/thumbnails/8.jpg)
History
Version 1.
- A Generics based implementation
- On top of the FHIR data model.
- Programmed per search parameter.
- No meta data available yet.
- No indexing.
- Slow.
![Page 9: SPARK Search Engine](https://reader035.fdocuments.net/reader035/viewer/2022081419/56816617550346895dd965b7/html5/thumbnails/9.jpg)
History
Version 2.
- Data Model independent
- Meta data not available - manually added
- Lucene.NET as indexer (Index in Lucene, Database in Mongo)
- Fast
- Standardised all parameter specifics into standard “modifiers”.
- All Code based on search parameter types.
- Joins are client side
![Page 10: SPARK Search Engine](https://reader035.fdocuments.net/reader035/viewer/2022081419/56816617550346895dd965b7/html5/thumbnails/10.jpg)
History
Version 3.
- Modified to store the Lucene index in Mongo
- Index storage unreliable.
- Never saw the light of day
![Page 11: SPARK Search Engine](https://reader035.fdocuments.net/reader035/viewer/2022081419/56816617550346895dd965b7/html5/thumbnails/11.jpg)
History
Version 4. CURRENT
- Index storage to a dedicated Mongo collection
- Build expression tree from parameters
- Chained parameters have full functionality (modifiers, operators)
- Joins are client side
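As an illustration of what "build expression tree from parameters" could look like, here is a minimal Python sketch. The `Criterion` class and `parse_criterion` function are hypothetical names chosen for this example and are not taken from SPARK (which is a .NET project); the sketch only splits one query item into reference chain, parameter, modifier and value.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Criterion:
    """One node of a hypothetical search expression tree."""
    param: str                              # search parameter name, e.g. "name"
    modifier: Optional[str] = None          # e.g. "partial"
    value: Optional[str] = None             # only set on the innermost criterion
    chained: Optional["Criterion"] = None   # criterion for the referenced resource


def parse_criterion(item: str) -> Criterion:
    """Parse one 'key=value' query item such as 'provider.name:partial=Health'."""
    key, _, value = item.partition("=")
    head, dot, rest = key.partition(".")
    name, _, modifier = head.partition(":")
    if dot:
        # Chained parameter: 'name' is a reference, the rest applies to its target.
        return Criterion(
            param=name,
            modifier=modifier or None,
            chained=parse_criterion(f"{rest}={value}"),
        )
    return Criterion(param=name, modifier=modifier or None, value=value)


tree = parse_criterion("provider.name:partial=Health")
print(tree)
```

Because the chained part is parsed by the same function, a chained parameter keeps its own modifier, which is one way to read the "full functionality" bullet above.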
![Page 12: SPARK Search Engine](https://reader035.fdocuments.net/reader035/viewer/2022081419/56816617550346895dd965b7/html5/thumbnails/12.jpg)
Indexing
Why indexing?
![Page 13: SPARK Search Engine](https://reader035.fdocuments.net/reader035/viewer/2022081419/56816617550346895dd965b7/html5/thumbnails/13.jpg)
Why indexing
http://spark.furore.com/fhir/patient?provider.name:partial=Health
![Page 15: SPARK Search Engine](https://reader035.fdocuments.net/reader035/viewer/2022081419/56816617550346895dd965b7/html5/thumbnails/15.jpg)
Indexing. HOW-TO
You DO want the data de-serialized to an object with all values strongly typed.
You DON’T want to spend time analyzing and interpreting JSON and/or XML.
1. Harvest the Resource
2. Determine data type
3. Groom your data
4. Store data in Index
![Page 16: SPARK Search Engine](https://reader035.fdocuments.net/reader035/viewer/2022081419/56816617550346895dd965b7/html5/thumbnails/16.jpg)
Indexing. 1. Harvesting
Resource: Patient
Search parameter: family
Searches for the family name and prefix of every HumanName that is registered with a Patient.
Usage:
http://spark.furore.com/fhir/patient?family=White
![Page 17: SPARK Search Engine](https://reader035.fdocuments.net/reader035/viewer/2022081419/56816617550346895dd965b7/html5/thumbnails/17.jpg)
Indexing. 1. Harvesting
Patient
List<Name>
Name (HumanName)
Name (HumanName)
Name (HumanName)
Family
Prefix
Given
Suffix
Resource: Patient
Search parameter: family
Using the Visitor pattern
Path from Meta data: "patient.Name.Prefix" "patient.Name.Family"
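The harvesting step can be sketched in Python as a recursive walk along the path from the meta data. This is a simplified stand-in for the Visitor pattern named on the slide, not SPARK's implementation; the `Patient` and `HumanName` classes, the `harvest` and `harvest_path` functions, and the lower-casing of path segments are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Any, List


@dataclass
class HumanName:
    family: List[str] = field(default_factory=list)
    prefix: List[str] = field(default_factory=list)
    given: List[str] = field(default_factory=list)


@dataclass
class Patient:
    name: List[HumanName] = field(default_factory=list)


def harvest(node: Any, path: List[str]) -> List[Any]:
    """Collect every value reachable from node along the property path."""
    if node is None:
        return []
    if isinstance(node, list):
        # Repeating element: visit every item with the same remaining path.
        found: List[Any] = []
        for item in node:
            found.extend(harvest(item, path))
        return found
    if not path:
        return [node]
    # Assumption: attribute names are the lower-cased path segments.
    return harvest(getattr(node, path[0].lower(), None), path[1:])


def harvest_path(resource: Any, path: str) -> List[Any]:
    """Resolve a path such as 'patient.Name.Family'; the first segment names the resource."""
    return harvest(resource, path.split(".")[1:])


patient = Patient(name=[
    HumanName(family=["Robinson"], prefix=["Mrs."]),
    HumanName(family=["Obama"]),
])
print(harvest_path(patient, "patient.Name.Family"))
print(harvest_path(patient, "patient.Name.Prefix"))
```

The walk fans out over every `HumanName` in the list, so one search parameter can yield several values for a single resource.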
![Page 18: SPARK Search Engine](https://reader035.fdocuments.net/reader035/viewer/2022081419/56816617550346895dd965b7/html5/thumbnails/18.jpg)
Indexing. 2. Determine data type
> patient (Patient) > Name (HumanName) > LastName (string)
Data type: string
Search parameter type: string
Selected indexing method:
- Single value – as string
- More values – as string array
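The indexing method chosen for string parameters is small enough to show as a Python sketch. The function name `index_string` is hypothetical, and returning `None` for "no values" is an assumption the slide does not cover.

```python
from typing import List, Optional, Union


def index_string(values: List[str]) -> Optional[Union[str, List[str]]]:
    """Pick the stored form for harvested string values."""
    if not values:
        return None            # assumption: nothing harvested, nothing indexed
    if len(values) == 1:
        return values[0]       # single value: stored as a plain string
    return list(values)        # more values: stored as a string array


print(index_string(["Michelle"]))
print(index_string(["LaVaughn", "Robinson", "Obama"]))
```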
![Page 19: SPARK Search Engine](https://reader035.fdocuments.net/reader035/viewer/2022081419/56816617550346895dd965b7/html5/thumbnails/19.jpg)
Indexing. 2. Determine data type
> patient (Patient) > Gender (Coding) > Coding (List<Coding>) > Code (CodeableConcept)
Data type: Code
Search parameter type: Token
Selected indexing method:
Store each codeable concept in an array:
- System (uri)
- Code (string)
- Display (string)
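A Python sketch of the token indexing method follows. The `Coding` class and `index_token` function are hypothetical, and the lower-case key names and sample values are invented for the example; only the three stored parts (system, code, display) come from the slide.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Coding:
    system: Optional[str] = None
    code: Optional[str] = None
    display: Optional[str] = None


def index_token(codings: List[Coding]) -> List[Dict[str, Optional[str]]]:
    """Store every coding as one array element holding system, code and display."""
    return [
        {"system": c.system, "code": c.code, "display": c.display}
        for c in codings
    ]


# Made-up sample value; the system URI is a placeholder, not a real code system.
gender = [Coding(system="http://example.org/gender", code="F", display="Female")]
print(index_token(gender))
```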
![Page 20: SPARK Search Engine](https://reader035.fdocuments.net/reader035/viewer/2022081419/56816617550346895dd965b7/html5/thumbnails/20.jpg)
Indexing. 3. Groom your data
- Remove dashes, dots, slashes from dates etc.
- If you implement a like search from the left side, you might want to split names at the dash into multiple hits.
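Both grooming rules can be sketched in a few lines of Python. The function names `groom_date` and `groom_name` are hypothetical, and keeping the full name next to its parts is an assumption; the slide only says that splitting at the dash may be useful.

```python
import re
from typing import List


def groom_date(raw: str) -> str:
    """Strip dashes, dots and slashes from a date string."""
    return re.sub(r"[-./]", "", raw)


def groom_name(raw: str) -> List[str]:
    """Return the full name plus each dash-separated part as separate hits."""
    parts = [part for part in raw.split("-") if part]
    return [raw] if len(parts) < 2 else [raw] + parts


def starts_with(hits: List[str], prefix: str) -> bool:
    """Left-anchored, case-insensitive match against any hit."""
    return any(hit.lower().startswith(prefix.lower()) for hit in hits)


print(groom_date("1964-01-17"))
print(groom_name("Smith-Jones"))
print(starts_with(groom_name("Smith-Jones"), "jon"))
```

Without the split, a left-anchored search for "Jon" would miss "Smith-Jones"; with it, the second part becomes a hit of its own.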
![Page 21: SPARK Search Engine](https://reader035.fdocuments.net/reader035/viewer/2022081419/56816617550346895dd965b7/html5/thumbnails/21.jpg)
Indexing. 4. Store in the index
| Field | Value |
| --- | --- |
| Resource | "Patient" |
| Local ID | patient/1 |
| Level | 0 |
| Family | ["LaVaughn", "Robinson", "Obama"] |
| Given | "Michelle" |
| Gender | [ { System: “…”, Code: “..”, Display: “..” } , … |
| … | |
* Level: The patient is not a contained resource (level 0)
* Family: In Mongo you can store an array that can be searched like a normal string.
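The note about arrays refers to MongoDB's equality matching: a filter such as `{"Family": "Obama"}` also matches a document whose `Family` field is an array containing "Obama". The Python sketch below mimics that behaviour in memory so that it runs without a database; the `matches` function is hypothetical, and the dictionary keys simply mirror the field labels in the table above, which may not be the names SPARK uses.

```python
from typing import Any, Dict

# Index entry with the Gender field left out (its values are elided on the slide).
index_entry: Dict[str, Any] = {
    "Resource": "Patient",
    "Local ID": "patient/1",
    "Level": 0,
    "Family": ["LaVaughn", "Robinson", "Obama"],
    "Given": "Michelle",
}


def matches(entry: Dict[str, Any], field: str, value: Any) -> bool:
    """Equality test that treats an array as 'any element equals value'."""
    stored = entry.get(field)
    if isinstance(stored, list):
        return value in stored
    return stored == value


print(matches(index_entry, "Family", "Obama"))
print(matches(index_entry, "Given", "Michelle"))
```

With a real collection, the same lookup would be a plain equality filter, for example `collection.find({"Family": "Obama"})` in pymongo.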
![Page 22: SPARK Search Engine](https://reader035.fdocuments.net/reader035/viewer/2022081419/56816617550346895dd965b7/html5/thumbnails/22.jpg)
Future
Version 5. NEXT
- All parameters based on FHIR data types?
- Joins using Mongo Map-Reduce?
![Page 23: SPARK Search Engine](https://reader035.fdocuments.net/reader035/viewer/2022081419/56816617550346895dd965b7/html5/thumbnails/23.jpg)
Complexity
So what is the issue?
![Page 24: SPARK Search Engine](https://reader035.fdocuments.net/reader035/viewer/2022081419/56816617550346895dd965b7/html5/thumbnails/24.jpg)
Complexity
Include & Chained parameters
- Joining over references returns multiple resource types
- Client side (not in Mongo database) joins
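A client side join for a chained parameter can be sketched as two passes over the index: first resolve the inner criterion, then select the resources that reference one of the results. Everything in this Python example is an assumption made for illustration: the `search_chained` function, the flat in-memory index, the made-up sample entries, and the reading of `:partial` as a case-insensitive substring match.

```python
from typing import Any, Dict, List

# Made-up sample index entries.
index: List[Dict[str, Any]] = [
    {"resource": "Organization", "id": "organization/1", "name": "Health Level Seven"},
    {"resource": "Organization", "id": "organization/2", "name": "Acme Clinic"},
    {"resource": "Patient", "id": "patient/1", "provider": "organization/1"},
    {"resource": "Patient", "id": "patient/2", "provider": "organization/2"},
]


def search_chained(
    index: List[Dict[str, Any]],
    resource: str,
    reference_param: str,
    target_param: str,
    needle: str,
) -> List[str]:
    """Resolve e.g. patient?provider.name:partial=Health with a join outside the database."""
    needle = needle.lower()
    # Pass 1: ids of every entry, of any resource type, matching the inner criterion.
    target_ids = {
        entry["id"]
        for entry in index
        if needle in str(entry.get(target_param, "")).lower()
    }
    # Pass 2: entries of the requested type whose reference points at one of those ids.
    return [
        entry["id"]
        for entry in index
        if entry["resource"] == resource and entry.get(reference_param) in target_ids
    ]


print(search_chained(index, "Patient", "provider", "name", "Health"))
```

Pass 1 does not filter on resource type, which reflects the first bullet: a reference may point at more than one type of resource.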
![Page 25: SPARK Search Engine](https://reader035.fdocuments.net/reader035/viewer/2022081419/56816617550346895dd965b7/html5/thumbnails/25.jpg)
Complexity
Transactions
- FHIR has bulk POST
- Split between Indexing and storage
![Page 26: SPARK Search Engine](https://reader035.fdocuments.net/reader035/viewer/2022081419/56816617550346895dd965b7/html5/thumbnails/26.jpg)
Complexity
Multiple types
Some properties do not have a fixed type.
Example: observation.value
Can be a:
- CodeableConcept
- String
- Quantity (number + unit)
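One way to cope with a property such as observation.value is to choose the index form from the runtime type of the harvested value. The Python sketch below shows that idea only; the classes, the `index_value` function and the shape of the returned dictionaries are hypothetical and do not describe how SPARK handles this case.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional, Union


@dataclass
class Coding:
    system: Optional[str] = None
    code: Optional[str] = None
    display: Optional[str] = None


@dataclass
class CodeableConcept:
    coding: List[Coding] = field(default_factory=list)


@dataclass
class Quantity:
    value: float
    unit: str


def index_value(value: Union[str, Quantity, CodeableConcept]) -> Dict[str, Any]:
    """Pick an index representation based on the actual type of the value."""
    if isinstance(value, str):
        return {"type": "string", "value": value}
    if isinstance(value, Quantity):
        return {"type": "quantity", "value": value.value, "unit": value.unit}
    if isinstance(value, CodeableConcept):
        return {
            "type": "token",
            "value": [
                {"system": c.system, "code": c.code, "display": c.display}
                for c in value.coding
            ],
        }
    raise TypeError(f"no indexing method for {type(value).__name__}")


print(index_value(Quantity(value=6.3, unit="mmol/l")))
print(index_value("negative"))
```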