Basic Concepts. Webinar 4: Advanced indexing, text and geospatial indexes
Transcript of Basic Concepts. Webinar 4: Advanced indexing, text and geospatial indexes
![Page 1: Conceptos básicos. Seminario web 4: Indexación avanzada, índices de texto y geoespaciales](https://reader031.fdocuments.net/reader031/viewer/2022022203/5873ec671a28abb1528b46a7/html5/thumbnails/1.jpg)
![Page 2: Conceptos básicos. Seminario web 4: Indexación avanzada, índices de texto y geoespaciales](https://reader031.fdocuments.net/reader031/viewer/2022022203/5873ec671a28abb1528b46a7/html5/thumbnails/2.jpg)
MongoDB Europe 2016
Old Billingsgate, London
15th November
Use my code rubenterceno20 for 20% off tickets
mongodb.com/europe
![Page 3: Conceptos básicos. Seminario web 4: Indexación avanzada, índices de texto y geoespaciales](https://reader031.fdocuments.net/reader031/viewer/2022022203/5873ec671a28abb1528b46a7/html5/thumbnails/3.jpg)
Basic Concepts 2016
Advanced Indexing: Text and Geospatial Indexes
Rubén Terceño
Senior Solutions Architect, [email protected]
@rubenTerceno
![Page 4: Conceptos básicos. Seminario web 4: Indexación avanzada, índices de texto y geoespaciales](https://reader031.fdocuments.net/reader031/viewer/2022022203/5873ec671a28abb1528b46a7/html5/thumbnails/4.jpg)
Course Agenda

| Date | Time | Webinar |
| --- | --- | --- |
| 25-May-2016 | 16:00 CEST | Introduction to NoSQL |
| 7-June-2016 | 16:00 CEST | Your first MongoDB application |
| 21-June-2016 | 16:00 CEST | Document-oriented schema design |
| 07-July-2016 | 16:00 CEST | Advanced indexing, text and geospatial indexes |
| 19-July-2016 | 16:00 CEST | Introduction to the Aggregation Framework |
| 28-July-2016 | 16:00 CEST | Production deployment |
![Page 5: Conceptos básicos. Seminario web 4: Indexación avanzada, índices de texto y geoespaciales](https://reader031.fdocuments.net/reader031/viewer/2022022203/5873ec671a28abb1528b46a7/html5/thumbnails/5.jpg)
Summary of what we have covered so far
• Why does NoSQL exist?
• Types of NoSQL databases
• Key features of MongoDB
• Installation and creation of databases and collections
• CRUD operations
• Indexes and explain()
• Dynamic schema design
• Hierarchy and embedded documents
• Polymorphism
![Page 6: Conceptos básicos. Seminario web 4: Indexación avanzada, índices de texto y geoespaciales](https://reader031.fdocuments.net/reader031/viewer/2022022203/5873ec671a28abb1528b46a7/html5/thumbnails/6.jpg)
Indexing
• An efficient way to look up data by its value
• Avoids table scans
![Page 7: Conceptos básicos. Seminario web 4: Indexación avanzada, índices de texto y geoespaciales](https://reader031.fdocuments.net/reader031/viewer/2022022203/5873ec671a28abb1528b46a7/html5/thumbnails/7.jpg)
Traditional Databases Use B-trees
• … and so does MongoDB
![Page 8: Conceptos básicos. Seminario web 4: Indexación avanzada, índices de texto y geoespaciales](https://reader031.fdocuments.net/reader031/viewer/2022022203/5873ec671a28abb1528b46a7/html5/thumbnails/8.jpg)
O(log(n)) Time
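To make the O(log(n)) claim concrete, here is a minimal sketch in plain JavaScript. It is not MongoDB's B-tree implementation: a sorted array with binary search stands in for the index (an assumption made purely for illustration), and a linear scan stands in for a table scan. The function names are hypothetical.

```javascript
// Count how many comparisons each strategy needs to find a value.

// Stand-in for a table scan: look at every element in turn.
function linearSearch(sorted, target) {
  let comparisons = 0;
  for (let i = 0; i < sorted.length; i++) {
    comparisons++;
    if (sorted[i] === target) return { index: i, comparisons };
  }
  return { index: -1, comparisons };
}

// Stand-in for an index lookup: halve the search range on every step.
function binarySearch(sorted, target) {
  let lo = 0;
  let hi = sorted.length - 1;
  let comparisons = 0;
  while (lo <= hi) {
    const mid = (lo + hi) >>> 1;
    comparisons++;
    if (sorted[mid] === target) return { index: mid, comparisons };
    if (sorted[mid] < target) lo = mid + 1;
    else hi = mid - 1;
  }
  return { index: -1, comparisons };
}

// One million sorted values; look up the last one (worst case for the scan).
const values = Array.from({ length: 1000000 }, (_, i) => i);
const scan = linearSearch(values, 999999);
const lookup = binarySearch(values, 999999);
console.log("scan comparisons:", scan.comparisons);
console.log("binary search comparisons:", lookup.comparisons);
```

The scan needs one comparison per element, while the halving search needs at most about log2(n) steps, which is 20 for one million values.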
![Page 9: Conceptos básicos. Seminario web 4: Indexación avanzada, índices de texto y geoespaciales](https://reader031.fdocuments.net/reader031/viewer/2022022203/5873ec671a28abb1528b46a7/html5/thumbnails/9.jpg)
Creating a Simple Index
db.coll.createIndex( { fieldName : <Direction> } )
• db: Database Name
• coll: Collection Name
• createIndex: Command
• fieldName: Field Name to be indexed
• <Direction>: Ascending : 1, Descending : -1
![Page 10: Conceptos básicos. Seminario web 4: Indexación avanzada, índices de texto y geoespaciales](https://reader031.fdocuments.net/reader031/viewer/2022022203/5873ec671a28abb1528b46a7/html5/thumbnails/10.jpg)
Two Other Kinds of Indexes
• Full Text Index
  • Allows searching inside the text of a field or several fields, ordering the results by relevance.
• Geospatial Index
  • Allows geospatial queries
    • People around me.
    • Countries I’m traversing during my trip.
    • Restaurants in a given neighborhood.
• These indexes do not use B-trees
![Page 11: Conceptos básicos. Seminario web 4: Indexación avanzada, índices de texto y geoespaciales](https://reader031.fdocuments.net/reader031/viewer/2022022203/5873ec671a28abb1528b46a7/html5/thumbnails/11.jpg)
Full Text Indexes
• An “inverted index” on all the words inside text fields (only one text index per collection)
{ "comment" : "I think your blog post is very interesting and informative. I hope you will post more info like this in the future" }
>> db.posts.createIndex( { "comment" : "text" } )
MongoDB Enterprise > db.posts.find( { $text: { $search : "info" }} )
{ "_id" : ObjectId("…"), "comment" : "I think your blog post is very interesting and informative. I hope you will post more info like this in the future" }
MongoDB Enterprise >
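The idea of an inverted index can be sketched in a few lines of plain JavaScript. This is a toy model under stated assumptions, not MongoDB's text index: it has no stemming, stop words or weights, and the helper names (`tokenize`, `buildInvertedIndex`, `search`) and sample documents are hypothetical. It maps each lower-cased word to the set of document ids that contain it, and treats a multi-word search as an OR of its terms.

```javascript
// Split text into lower-cased words (letters and digits only).
function tokenize(text) {
  return text.toLowerCase().split(/[^\p{L}\p{N}]+/u).filter(Boolean);
}

// Build the inverted index: term -> Set of ids of documents containing it.
function buildInvertedIndex(docs, field) {
  const index = new Map();
  for (const doc of docs) {
    for (const term of tokenize(doc[field])) {
      if (!index.has(term)) index.set(term, new Set());
      index.get(term).add(doc._id);
    }
  }
  return index;
}

// Return ids of documents matching ANY query term,
// ordered by how many query terms they matched.
function search(index, query) {
  const hits = new Map();
  for (const term of tokenize(query)) {
    for (const id of index.get(term) || []) {
      hits.set(id, (hits.get(id) || 0) + 1);
    }
  }
  return [...hits.entries()].sort((a, b) => b[1] - a[1]).map(([id]) => id);
}

// Illustrative sample documents.
const posts = [
  { _id: 1, comment: "Red yellow orange green" },
  { _id: 2, comment: "Pink purple blue" },
  { _id: 3, comment: "Red Pink" },
];
const index = buildInvertedIndex(posts, "comment");
console.log(search(index, "red"));        // documents containing "red"
console.log(search(index, "Pink Green")); // documents containing "pink" OR "green"
```

Because the lookup starts from the word rather than from the document, a search never has to read the text of documents that do not contain the term.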
![Page 12: Conceptos básicos. Seminario web 4: Indexación avanzada, índices de texto y geoespaciales](https://reader031.fdocuments.net/reader031/viewer/2022022203/5873ec671a28abb1528b46a7/html5/thumbnails/12.jpg)
On The Server
2016-07-07T09:48:48.605+0200 I INDEX [conn4] build index on: indexes.products properties: {
  v: 1,
  key: { _fts: "text", _ftsx: 1 },
  name: "longDescription_text_shortDescription_text_name_text",
  ns: "indexes.products",
  weights: { longDescription: 1, name: 10, shortDescription: 3 },
  default_language: "english",
  language_override: "language",
  textIndexVersion: 3 }
![Page 13: Conceptos básicos. Seminario web 4: Indexación avanzada, índices de texto y geoespaciales](https://reader031.fdocuments.net/reader031/viewer/2022022203/5873ec671a28abb1528b46a7/html5/thumbnails/13.jpg)
More Detailed Example
>> db.posts.insert( { "comment" : "Red yellow orange green" } )
>> db.posts.insert( { "comment" : "Pink purple blue" } )
>> db.posts.insert( { "comment" : "Red Pink" } )
>> db.posts.find( { "$text" : { "$search" : "Red" }} )
{ "_id" : ObjectId("…"), "comment" : "Red yellow orange green" }
{ "_id" : ObjectId("…"), "comment" : "Red Pink" }
>> db.posts.find( { "$text" : { "$search" : "Pink Green" }} )
{ "_id" : ObjectId("…"), "comment" : "Red Pink" }
{ "_id" : ObjectId("…"), "comment" : "Red yellow orange green" }
>> db.posts.find( { "$text" : { "$search" : "red" }} ) // <- case insensitive
{ "_id" : ObjectId("…"), "comment" : "Red yellow orange green" }
{ "_id" : ObjectId("…"), "comment" : "Red Pink" }
>>
![Page 14: Conceptos básicos. Seminario web 4: Indexación avanzada, índices de texto y geoespaciales](https://reader031.fdocuments.net/reader031/viewer/2022022203/5873ec671a28abb1528b46a7/html5/thumbnails/14.jpg)
Using Weights
• We can assign different weights to different fields in the text index
• E.g. I want to favour name over shortDescription in searching
• So I increase the weight for the name field
>> db.blog.createIndex(
     { shortDescription: "text", longDescription: "text", name: "text" },
     { weights: { shortDescription: 3, longDescription: 1, name: 10 } } )
• Now searches will favour name over shortDescription over longDescription
![Page 15: Conceptos básicos. Seminario web 4: Indexación avanzada, índices de texto y geoespaciales](https://reader031.fdocuments.net/reader031/viewer/2022022203/5873ec671a28abb1528b46a7/html5/thumbnails/15.jpg)
textScore
• We may want results with higher scores to come first, so we project the score and sort by it:
>> db.products.find(
     { $text : { $search: "humongous" } },
     { score: { $meta : "textScore" }, name: 1, longDescription: 1, shortDescription: 1 }
   ).sort( { score: { $meta: "textScore" } } )
![Page 16: Conceptos básicos. Seminario web 4: Indexación avanzada, índices de texto y geoespaciales](https://reader031.fdocuments.net/reader031/viewer/2022022203/5873ec671a28abb1528b46a7/html5/thumbnails/16.jpg)
Other Parameters
• Language: pick the language you want to search in, e.g.
  • $language : "spanish"
• Support case sensitive searching
  • $caseSensitive : true (default false)
• Support accented characters (diacritic sensitive search, e.g. café is distinguished from cafe)
  • $diacriticSensitive : true (default false)
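What the case and diacritic switches mean can be sketched with plain string normalization. This is only an illustration of the matching behaviour, not how MongoDB implements it; `normalizeTerm` and `termsMatch` are hypothetical helpers, and the option names merely mirror the `$caseSensitive` and `$diacriticSensitive` parameters.

```javascript
// Reduce a term to the form used for comparison.
// Both options default to false, mirroring the defaults on the slide.
function normalizeTerm(term, { caseSensitive = false, diacriticSensitive = false } = {}) {
  let out = term;
  if (diacriticSensitive) {
    // Keep accents, but use one canonical encoding for them.
    out = out.normalize("NFC");
  } else {
    // Decompose accented letters, then drop the combining accent marks.
    out = out.normalize("NFD").replace(/[\u0300-\u036f]/g, "");
  }
  if (!caseSensitive) out = out.toLowerCase();
  return out;
}

function termsMatch(a, b, options) {
  return normalizeTerm(a, options) === normalizeTerm(b, options);
}

// Default behaviour: case and accents are ignored.
console.log(termsMatch("Café", "cafe"));
// Diacritic sensitive: "café" and "cafe" are different terms.
console.log(termsMatch("Café", "cafe", { diacriticSensitive: true }));
// Case sensitive: "Café" and "café" are different terms.
console.log(termsMatch("Café", "café", { caseSensitive: true }));
```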
![Page 17: Conceptos básicos. Seminario web 4: Indexación avanzada, índices de texto y geoespaciales](https://reader031.fdocuments.net/reader031/viewer/2022022203/5873ec671a28abb1528b46a7/html5/thumbnails/17.jpg)
Geospatial Indexes
• 2d
  • Represents a flat surface. A good fit if:
    • You have legacy coordinate pairs (MongoDB 2.2 or earlier).
    • You do not plan to use GeoJSON objects.
    • You don’t worry about the Earth's curvature. (Yup, the Earth is not flat)
• 2dsphere
  • Represents a flat surface on top of a spheroid.
  • It should be the default choice for geo data
  • Coordinates are (usually) stored in GeoJSON format
  • The index is based on a QuadTree representation
  • The index is based on the WGS 84 standard
![Page 18: Conceptos básicos. Seminario web 4: Indexación avanzada, índices de texto y geoespaciales](https://reader031.fdocuments.net/reader031/viewer/2022022203/5873ec671a28abb1528b46a7/html5/thumbnails/18.jpg)
Coordinates
• Coordinates are represented as longitude, latitude
• Longitude
  • Measured from the Greenwich meridian (0 degrees)
  • For locations east, up to +180 degrees
  • For locations west, we specify negative values down to -180
• Latitude
  • Measured from the equator north and south (0 to 90 north, 0 to -90 south)
• Coordinates in MongoDB are stored in Longitude/Latitude order
• Coordinates in Google Maps are stored in Latitude/Longitude order
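Since the two orderings are easy to mix up, a small conversion helper can make the swap explicit. The sketch below is a hypothetical helper (`toGeoJsonPoint` is not a MongoDB or Google Maps API) that takes a latitude/longitude pair as a mapping site would show it, checks the ranges listed above, and returns a GeoJSON Point in longitude/latitude order. The sample values are illustrative.

```javascript
// Convert "lat, lng" (Google Maps order) into a GeoJSON Point (lng, lat order).
function toGeoJsonPoint(lat, lng) {
  if (!Number.isFinite(lat) || lat < -90 || lat > 90) {
    throw new RangeError("latitude must be between -90 and 90, got " + lat);
  }
  if (!Number.isFinite(lng) || lng < -180 || lng > 180) {
    throw new RangeError("longitude must be between -180 and 180, got " + lng);
  }
  // GeoJSON stores longitude first, then latitude.
  return { type: "Point", coordinates: [lng, lat] };
}

// Illustrative values: latitude 43.47, longitude -3.81.
const here = toGeoJsonPoint(43.47, -3.81);
console.log(JSON.stringify(here));
```

The range check catches some swapped pairs (any latitude beyond ±90), but not all of them, so it is a safety net rather than a guarantee.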
![Page 19: Conceptos básicos. Seminario web 4: Indexación avanzada, índices de texto y geoespaciales](https://reader031.fdocuments.net/reader031/viewer/2022022203/5873ec671a28abb1528b46a7/html5/thumbnails/19.jpg)
2dSphere Versions
• Three versions of the 2dSphere index in MongoDB
  • Version 1 : Up to MongoDB 2.4
  • Version 2 : From MongoDB 2.6 onwards
  • Version 3 : From MongoDB 3.2 onwards
• We will only be talking about Version 3 in this webinar
![Page 20: Conceptos básicos. Seminario web 4: Indexación avanzada, índices de texto y geoespaciales](https://reader031.fdocuments.net/reader031/viewer/2022022203/5873ec671a28abb1528b46a7/html5/thumbnails/20.jpg)
Creating a 2dSphere Index
db.collection.createIndex ( { <location field> : "2dsphere" } )
• Location field must be coordinate or GeoJSON data
![Page 21: Conceptos básicos. Seminario web 4: Indexación avanzada, índices de texto y geoespaciales](https://reader031.fdocuments.net/reader031/viewer/2022022203/5873ec671a28abb1528b46a7/html5/thumbnails/21.jpg)
Example
>> db.wines.createIndex( { geometry: "2dsphere" } )
{
  "createdCollectionAutomatically" : false,
  "numIndexesBefore" : 1,
  "numIndexesAfter" : 2,
  "ok" : 1
}
![Page 22: Conceptos básicos. Seminario web 4: Indexación avanzada, índices de texto y geoespaciales](https://reader031.fdocuments.net/reader031/viewer/2022022203/5873ec671a28abb1528b46a7/html5/thumbnails/22.jpg)
Testing Geo Queries
• Let's search for wine regions in the world
• Using two collections from my GitHub repo
  • https://github.com/terce13/geoData
• Import them into MongoDB
  • mongoimport -c wines -d geo wine_regions.json
  • mongoimport -c countries -d geo countries.json
![Page 23: Conceptos básicos. Seminario web 4: Indexación avanzada, índices de texto y geoespaciales](https://reader031.fdocuments.net/reader031/viewer/2022022203/5873ec671a28abb1528b46a7/html5/thumbnails/23.jpg)
Country Document (Vatican)
{
  "_id" : ObjectId("577e2ebd1007503076ac8c86"),
  "type" : "Feature",
  "properties" : {
    "featurecla" : "Admin-0 country",
    "sovereignt" : "Vatican",
    "type" : "Sovereign country",
    "admin" : "Vatican",
    "adm0_a3" : "VAT",
    "name" : "Vatican",
    "name_long" : "Vatican",
    "abbrev" : "Vat.",
    "postal" : "V",
    "formal_en" : "State of the Vatican City",
    "name_sort" : "Vatican (Holy Sea)",
    "name_alt" : "Holy Sea",
    "pop_est" : 832,
    "economy" : "2. Developed region: nonG7",
    "income_grp" : "2. High income: nonOECD",
    "continent" : "Europe",
    "region_un" : "Europe",
    "subregion" : "Southern Europe",
    "region_wb" : "Europe & Central Asia"
  },
  "geometry" : {
    "type" : "Polygon",
    "coordinates" : [ [
      [12.439160156250011, 41.898388671875],
      [12.430566406250023, 41.89755859375],
      [12.427539062500017, 41.900732421875],
      [12.430566406250023, 41.90546875],
      [12.438378906250023, 41.906201171875],
      [12.439160156250011, 41.898388671875] ] ]
  }
}
![Page 24: Conceptos básicos. Seminario web 4: Indexación avanzada, índices de texto y geoespaciales](https://reader031.fdocuments.net/reader031/viewer/2022022203/5873ec671a28abb1528b46a7/html5/thumbnails/24.jpg)
Wine region document
MongoDB Enterprise > db.wines.findOne()
{
  "_id" : ObjectId("577e2e7e1007503076ac8769"),
  "properties" : {
    "name" : "AOC Anjou-Villages",
    "description" : null,
    "id" : "a629ojjxl15z"
  },
  "type" : "Feature",
  "geometry" : {
    "type" : "Point",
    "coordinates" : [ -0.618980171610645, 47.2211343496821 ]
  }
}
You can type these coordinates into Google Maps, but remember to reverse the coordinate order
![Page 25: Conceptos básicos. Seminario web 4: Indexación avanzada, índices de texto y geoespaciales](https://reader031.fdocuments.net/reader031/viewer/2022022203/5873ec671a28abb1528b46a7/html5/thumbnails/25.jpg)
Add Indexes
MongoDB Enterprise > db.wines.createIndex({ geometry: "2dsphere" })
{
"createdCollectionAutomatically" : false,
"numIndexesBefore" : 1,
"numIndexesAfter" : 2,
"ok" : 1
}
MongoDB Enterprise > db.countries.createIndex({ geometry: "2dsphere" })
{
"createdCollectionAutomatically" : false,
"numIndexesBefore" : 1,
"numIndexesAfter" : 2,
"ok" : 1
}
MongoDB Enterprise >
![Page 26: Conceptos básicos. Seminario web 4: Indexación avanzada, índices de texto y geoespaciales](https://reader031.fdocuments.net/reader031/viewer/2022022203/5873ec671a28abb1528b46a7/html5/thumbnails/26.jpg)
$geoIntersects to find our country
• Assume we are at lat: 43.47, lon: -3.81
• What country are we in? Use $geoIntersects
db.countries.findOne(
  { geometry: { $geoIntersects: { $geometry: { type: "Point", coordinates: [ -3.81, 43.47 ] } } } },
  { "properties.name": 1 }
)
![Page 27: Conceptos básicos. Seminario web 4: Indexación avanzada, índices de texto y geoespaciales](https://reader031.fdocuments.net/reader031/viewer/2022022203/5873ec671a28abb1528b46a7/html5/thumbnails/27.jpg)
Results
{
  "_id" : ObjectId("577e2ebd1007503076ac8be5"),
  "properties" : {
    "name" : "Spain"
  }
}
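The question "which polygon contains this point?" can be illustrated with a classic ray-casting test. This sketch is a planar approximation written for illustration only: a 2dsphere index works with spherical geometry and this is not its algorithm. The function name `pointInRing` is hypothetical, and the ring is the Vatican polygon from the country document shown earlier.

```javascript
// Ray casting: shoot a ray to the east of the point and count how many
// polygon edges it crosses. An odd count means the point is inside.
// Coordinates are [longitude, latitude] pairs, treated as flat x/y values.
function pointInRing(point, ring) {
  const [x, y] = point;
  let inside = false;
  for (let i = 0, j = ring.length - 1; i < ring.length; j = i++) {
    const [xi, yi] = ring[i];
    const [xj, yj] = ring[j];
    // Does this edge straddle the ray's latitude, east of the point?
    const crosses =
      (yi > y) !== (yj > y) &&
      x < ((xj - xi) * (y - yi)) / (yj - yi) + xi;
    if (crosses) inside = !inside;
  }
  return inside;
}

// Outer ring of the Vatican polygon (closed: first point equals last point).
const vatican = [
  [12.439160156250011, 41.898388671875],
  [12.430566406250023, 41.89755859375],
  [12.427539062500017, 41.900732421875],
  [12.430566406250023, 41.90546875],
  [12.438378906250023, 41.906201171875],
  [12.439160156250011, 41.898388671875],
];

console.log(pointInRing([12.4332, 41.9017], vatican)); // a point near the polygon's centre
console.log(pointInRing([-3.81, 43.47], vatican));     // the query point used in the $geoIntersects example
```

For a shape as small as this one, the flat approximation and the spherical computation agree; over large areas they can differ.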
![Page 28: Conceptos básicos. Seminario web 4: Indexación avanzada, índices de texto y geoespaciales](https://reader031.fdocuments.net/reader031/viewer/2022022203/5873ec671a28abb1528b46a7/html5/thumbnails/28.jpg)
Wine regions around me
• Use $near (results ordered by distance)
db.wines.find({
  geometry: {
    $near: {
      $geometry: { type: "Point", coordinates: [-3.81, 43.47] },
      $maxDistance: 250000
    }
  }
})
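With a GeoJSON point and a 2dsphere index, $maxDistance is expressed in meters, so 250000 means 250 km. To get a feeling for that radius, the sketch below estimates great-circle distances with the haversine formula. It is an approximation for illustration (it assumes a perfectly spherical Earth with a mean radius of 6,371 km, and `haversineMeters` is a hypothetical helper), not the computation MongoDB performs.

```javascript
// Mean Earth radius in meters (an approximation: the Earth is not a perfect sphere).
const EARTH_RADIUS_METERS = 6371000;

// Great-circle distance between two [longitude, latitude] points, in meters.
function haversineMeters([lon1, lat1], [lon2, lat2]) {
  const toRad = (deg) => (deg * Math.PI) / 180;
  const dLat = toRad(lat2 - lat1);
  const dLon = toRad(lon2 - lon1);
  const a =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(lat1)) * Math.cos(toRad(lat2)) * Math.sin(dLon / 2) ** 2;
  return 2 * EARTH_RADIUS_METERS * Math.asin(Math.sqrt(a));
}

const me = [-3.81, 43.47];                            // query point from the slide
const anjou = [-0.618980171610645, 47.2211343496821]; // AOC Anjou-Villages, from the wine region document
const maxDistance = 250000;                           // meters, as in the $near query

const distance = haversineMeters(me, anjou);
console.log(Math.round(distance / 1000) + " km away:",
  distance <= maxDistance ? "inside the radius" : "outside the radius");
```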
![Page 29: Conceptos básicos. Seminario web 4: Indexación avanzada, índices de texto y geoespaciales](https://reader031.fdocuments.net/reader031/viewer/2022022203/5873ec671a28abb1528b46a7/html5/thumbnails/29.jpg)
Results (Projected)
{ "properties" : { "name" : "DO Arabako-Txakolina" } }
{ "properties" : { "name" : "DO Chacoli de Vizcaya" } }
{ "properties" : { "name" : "DO Chacoli de Guetaria" } }
{ "properties" : { "name" : "DO Rioja" } }
{ "properties" : { "name" : "DO Navarra" } }
{ "properties" : { "name" : "DO Cigales" } }
{ "properties" : { "name" : "AOC Irouléguy" } }
{ "properties" : { "name" : "DO Ribera de Duero" } }
{ "properties" : { "name" : "DO Rueda" } }
{ "properties" : { "name" : "AOC Béarn-Bellocq" } }
![Page 30: Conceptos básicos. Seminario web 4: Indexación avanzada, índices de texto y geoespaciales](https://reader031.fdocuments.net/reader031/viewer/2022022203/5873ec671a28abb1528b46a7/html5/thumbnails/30.jpg)
But screens are not circular
db.wines.find({
  geometry: {
    $geoWithin: {
      $geometry: {
        type : "Polygon",
        coordinates : [[[-51,-29],[-71,-29],[-71,-33],[-51,-33],[-51,-29]]]
      }
    }
  }
})
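A map view is usually described by its west, south, east and north edges rather than by a polygon. The sketch below shows one way to turn such bounds into the closed GeoJSON ring that $geoWithin expects. `boundsToPolygon` is a hypothetical helper, and it makes a simplifying assumption: the view does not cross the antimeridian (west is less than east).

```javascript
// Build a GeoJSON Polygon from rectangular map bounds (degrees).
// The ring must be closed: its last position repeats the first one.
function boundsToPolygon({ west, south, east, north }) {
  if (west >= east || south >= north) {
    throw new RangeError("expected west < east and south < north");
  }
  return {
    type: "Polygon",
    coordinates: [[
      [east, north], // north-east corner
      [west, north], // north-west corner
      [west, south], // south-west corner
      [east, south], // south-east corner
      [east, north], // back to the start to close the ring
    ]],
  };
}

// The same rectangle as the query on the slide.
const viewport = boundsToPolygon({ west: -71, south: -33, east: -51, north: -29 });

// Filter document that could be passed to db.wines.find(...).
const filter = { geometry: { $geoWithin: { $geometry: viewport } } };
console.log(JSON.stringify(filter));
```

Rebuilding the filter from the current bounds each time the user pans or zooms keeps the query limited to what is actually visible.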
![Page 31: Conceptos básicos. Seminario web 4: Indexación avanzada, índices de texto y geoespaciales](https://reader031.fdocuments.net/reader031/viewer/2022022203/5873ec671a28abb1528b46a7/html5/thumbnails/31.jpg)
Results – (Projected)
{ "properties" : { "name" : "Pinheiro Machado" } }
{ "properties" : { "name" : "Rio Negro" } }
{ "properties" : { "name" : "Tacuarembó" } }
{ "properties" : { "name" : "Rivera" } }
{ "properties" : { "name" : "Artigas" } }
{ "properties" : { "name" : "Salto" } }
{ "properties" : { "name" : "Paysandú" } }
{ "properties" : { "name" : "Mendoza" } }
{ "properties" : { "name" : "Luján de Cuyo" } }
{ "properties" : { "name" : "Aconcagua" } }
![Page 32: Conceptos básicos. Seminario web 4: Indexación avanzada, índices de texto y geoespaciales](https://reader031.fdocuments.net/reader031/viewer/2022022203/5873ec671a28abb1528b46a7/html5/thumbnails/32.jpg)
Use geo objects smartly
• Use polygons and/or multipolygons from a collection to query a second one.
var mex = db.countries.findOne({"properties.name" : "Mexico"})
db.wines.find({geometry: { $geoWithin: {$geometry: mex.geometry}}})
{ "_id" : ObjectId("577e2e7e1007503076ac8ab9"), "properties" : { "name" : "Los Romos", "description" : null, "id" : "a629ojjkguyw" }, "type" : "Feature", "geometry" : { "type" : "Point", "coordinates" : [ -102.304048304437, 22.0992980768825 ] } }
{ "_id" : ObjectId("577e2e7e1007503076ac8a8d"), "properties" : { "name" : "Hermosillo", "description" : null, "id" : "a629ojiw0i7f" }, "type" : "Feature", "geometry" : { "type" : "Point", "coordinates" : [ -111.03600413129, 29.074715739466 ] } }
![Page 33: Conceptos básicos. Seminario web 4: Indexación avanzada, índices de texto y geoespaciales](https://reader031.fdocuments.net/reader031/viewer/2022022203/5873ec671a28abb1528b46a7/html5/thumbnails/33.jpg)
Let’s do crazy things
var wines = db.wines.find();
while (wines.hasNext()) {
  var wine = wines.next();
  // Find the country whose geometry intersects this wine region
  var country = db.countries.findOne({geometry : {$geoIntersects : {$geometry : wine.geometry}}});
  if (country != null) {
    // Store the country name on the wine region document
    db.wines.update({"_id" : wine._id}, {$set : {"properties.country" : country.properties.name}});
  }
}
![Page 34: Conceptos básicos. Seminario web 4: Indexación avanzada, índices de texto y geoespaciales](https://reader031.fdocuments.net/reader031/viewer/2022022203/5873ec671a28abb1528b46a7/html5/thumbnails/34.jpg)
Summary of Operators
• $geoIntersects: Find areas or points that overlap or are adjacent
  • Points or polygons, doesn’t matter.
• $geoWithin: Find areas or points that lie within a specific area
  • Use screen limits smartly
• $near: Returns locations in order from nearest to furthest away
  • Find closest objects.
![Page 35: Conceptos básicos. Seminario web 4: Indexación avanzada, índices de texto y geoespaciales](https://reader031.fdocuments.net/reader031/viewer/2022022203/5873ec671a28abb1528b46a7/html5/thumbnails/35.jpg)
Summary
• Text indexes enable searches in the style of Google, SOLR or ElasticSearch
  • They can take the weights of different fields into account
  • They can be combined with other query criteria
  • They can return results ordered by relevance
  • They can be multilingual and case/accent insensitive
• Geospatial indexes handle GeoJSON objects
  • They support proximity, inclusion and intersection queries
  • They use the most common reference system, WGS84
    • Careful! Latitude and longitude are in the reverse order from Google Maps.
  • They can be combined with other query criteria
  • There is a special index (2d) for flat surfaces (a football pitch, a virtual world, etc.)
![Page 36: Conceptos básicos. Seminario web 4: Indexación avanzada, índices de texto y geoespaciales](https://reader031.fdocuments.net/reader031/viewer/2022022203/5873ec671a28abb1528b46a7/html5/thumbnails/36.jpg)
Next Webinar
Introduction to the Aggregation Framework
• July 19, 2016 – 16:00 CEST, 11:00 ART, 9:00
• Register if you have not done so yet!
• The MongoDB Aggregation Framework lets developers run advanced analytical processing inside the database.
• It processes data in a Unix-style pipeline and lets developers:
  • Reshape, transform and extract data.
  • Apply standard analytical functions ranging from sums and averages to standard deviation.
• Register at: https://www.mongodb.com/webinars
• Please give us your feedback: [email protected]
![Page 37: Conceptos básicos. Seminario web 4: Indexación avanzada, índices de texto y geoespaciales](https://reader031.fdocuments.net/reader031/viewer/2022022203/5873ec671a28abb1528b46a7/html5/thumbnails/37.jpg)
Questions?
![Page 38: Conceptos básicos. Seminario web 4: Indexación avanzada, índices de texto y geoespaciales](https://reader031.fdocuments.net/reader031/viewer/2022022203/5873ec671a28abb1528b46a7/html5/thumbnails/38.jpg)