Rpsonmongodb

Rediff News Publishing System using MongoDB Subramanyam Yeleswarapu

Transcript of Rpsonmongodb

Page 1: Rpsonmongodb

Rediff News Publishing System using MongoDB

Subramanyam Yeleswarapu

Page 2: Rpsonmongodb

Agenda

•  Use Cases
•  MongoDB Usage
•  Architecture
•  Q & A

Page 3: Rpsonmongodb

Use Cases

•  Rediff Maps
•  Core Publishing System
•  Newsletter

Page 4: Rpsonmongodb

Rediff Maps Use Case

•  Upload an Excel file and select the data
•  Match data to map attributes
•  Author an article that consumes data generated by the data science team
•  Visualize data on the map

Page 5: Rpsonmongodb

Upload data

[Screenshot: RPS Login, with callouts 1 to 3 described below]

1.  A login is required to post to the Rediff News backyard

2.  Select your Excel file to upload

3.  Upload the data file to the server; it is displayed below

Page 6: Rpsonmongodb

Upload data…


4.  If checked, the first row is not treated as data; it supplies the column header names

5.  Select the region column, which holds the district or state names

6.  If checked, multiple data columns can be selected

7.  Select the data column that you want to show on the map

(If the first row doesn’t contain headers, the select options are A, B, C, ...)
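The header and column-selection options above can be sketched as follows. This is a minimal illustration, not the RPS implementation: the function names (`column_letter`, `split_header`, `select_columns`) are hypothetical, the sheet is assumed to be already parsed into a list of rows, and the sample values are made up.

```python
from string import ascii_uppercase


def column_letter(index):
    """Return the spreadsheet-style letter (A, B, ..., Z, AA, ...) for a zero-based index."""
    letters = ""
    index += 1
    while index > 0:
        index, remainder = divmod(index - 1, 26)
        letters = ascii_uppercase[remainder] + letters
    return letters


def split_header(rows, first_row_is_header):
    """Return (column_names, data_rows) for an uploaded sheet."""
    if not rows:
        return [], []
    if first_row_is_header:
        # Option 4 checked: the first row names the columns and is not data.
        return [str(cell) for cell in rows[0]], rows[1:]
    # No header row: fall back to A, B, C, ... as the selectable names.
    return [column_letter(i) for i in range(len(rows[0]))], rows


def select_columns(rows, first_row_is_header, region_column, data_columns):
    """Map each region name to the values of the selected data columns."""
    names, data = split_header(rows, first_row_is_header)
    region_idx = names.index(region_column)
    data_idx = [names.index(name) for name in data_columns]
    return {row[region_idx]: [row[i] for i in data_idx] for row in data}


# Made-up sample sheet, with and without a header row.
sheet = [["State", "Value"], ["Kerala", 10], ["Bihar", 20]]
with_header = select_columns(sheet, True, "State", ["Value"])
without_header = select_columns(sheet[1:], False, "A", ["B"])
```

Both calls select the same data; only the names offered for the columns differ.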

Page 7: Rpsonmongodb

Workflow

1.  Check the details of the data

•  Area: Your data coverage. It may be India or any state of India.

•  Data Unit: What each record in the data pertains to: state, district or constituency.

•  Classification:

•  Select Categorized for regions (records) with a category.

•  Select Graduated / Quantile for regions (records) with a quantity. If you want to highlight the records on the map relative to each other, Quantile may be a good option.

•  Intervals: Select the intervals to simplify your data. For Categorized, the intervals are taken automatically from the data.
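To make the difference between the two quantity classifications concrete, here is a small sketch. It assumes that "Graduated" means equal-width intervals between the minimum and maximum, and that "Quantile" means classes holding roughly equal numbers of records; the slides do not define either term, so both readings are assumptions. The function names and the sample values are hypothetical.

```python
def graduated_breaks(values, intervals):
    """Equal-width upper bounds between min and max (assumed meaning of Graduated)."""
    low, high = min(values), max(values)
    width = (high - low) / intervals
    # The last bound is set to the maximum exactly to avoid float drift.
    return [low + width * i for i in range(1, intervals)] + [high]


def quantile_breaks(values, intervals):
    """Upper bounds chosen so each class holds roughly the same number of records."""
    ordered = sorted(values)
    n = len(ordered)
    breaks = []
    for i in range(1, intervals + 1):
        # Integer ceiling of i * n / intervals gives the size of the first i classes.
        count = (i * n + intervals - 1) // intervals
        breaks.append(ordered[max(0, count - 1)])
    return breaks


def classify(value, breaks):
    """Return the zero-based class index of value given ascending upper bounds."""
    for index, upper in enumerate(breaks):
        if value <= upper:
            return index
    return len(breaks) - 1


# Made-up, skewed sample: two large outliers dominate the range.
values = [1, 2, 3, 4, 100, 200]
graduated = [classify(v, graduated_breaks(values, 3)) for v in values]
quantile = [classify(v, quantile_breaks(values, 3)) for v in values]
```

With skewed data like this, the equal-width classes put the four small records into one class, while the quantile classes spread the records evenly. That is consistent with the advice above to use Quantile when records should stand out relative to each other.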


Page 8: Rpsonmongodb

Workflow

2.  Select Colour Palette (Categorized): Click on the colour palette to select more. For Categorized data, colours can also be changed in the Legend box.

2.  Select Colour Palette (Graduated / Quantile): Click on the colour palette to select more.

Page 9: Rpsonmongodb

Output of Categorized map

If any of the options are changed, click Render Map again to reflect the changes on the map.

Page 10: Rpsonmongodb

Output of Quantile map

Page 11: Rpsonmongodb

Output of map with time series data

Page 12: Rpsonmongodb

Push to publishing system

Page 13: Rpsonmongodb

Where do we use

•  Management of the life cycle of articles
•  Articles’ metadata storage
•  Role, access and workflow management
•  Acquisition of external feeds
•  Tagging
•  Notification
•  Search
•  Integrating data on maps
•  Compose newsletters
–  Subscription based
–  Customized newsletters based on user habits/profiling

Page 14: Rpsonmongodb

Why MongoDB

•  Write throughput performance
•  Flexible schema design (document style)
–  Allows the data model to be modified as the business demands
•  Read throughput (moderate)
•  New document storage is future ready
–  Data mining, sharding and clustering as per the volume and features of the business

Page 15: Rpsonmongodb

Architecture

•  Schema is defined in POJOs
–  Reflection is used to discover the data structure
•  Custom dimensions are created on the fly
–  Use standard indices
–  Create specialized named collections
–  Counters
–  All defined in a simple config file
–  Storage is totally abstracted from the apps layer
•  REST layer
–  Auto-wiring apps’ collections and exposing data as resources
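The idea of discovering the schema from the model class and taking storage details from a config file can be sketched as below. The system described uses Java POJOs and reflection. This Python analogue uses dataclass introspection purely for illustration. The `Article` model, its fields, the `CONFIG` layout and the function names are all hypothetical.

```python
from dataclasses import dataclass, field, fields
from typing import List


@dataclass
class Article:
    """Hypothetical article model; the field names are illustrative only."""
    headline: str
    author: str
    tags: List[str] = field(default_factory=list)


def discover_schema(model):
    """Return {field name: declared type} by introspecting the model class."""
    return {f.name: f.type for f in fields(model)}


# Hypothetical config: the collection name and indexed fields per model.
CONFIG = {"Article": {"collection": "articles", "indexes": ["author", "tags"]}}


def storage_plan(model, config):
    """Combine the discovered schema with the config into a storage description."""
    schema = discover_schema(model)
    settings = config[model.__name__]
    unknown = [name for name in settings["indexes"] if name not in schema]
    if unknown:
        raise ValueError(f"config indexes unknown fields: {unknown}")
    return {
        "collection": settings["collection"],
        "fields": sorted(schema),
        "indexes": list(settings["indexes"]),
    }


plan = storage_plan(Article, CONFIG)
```

The application code only declares `Article`. Which collection it lands in and which fields are indexed come from the config, which is one way to keep storage abstracted from the apps layer.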

Page 16: Rpsonmongodb

Architecture

Page 17: Rpsonmongodb

Create additional datasets

[Diagram labels: RPS Apps, MongoDB, ETL Tools, multiple datasets, and datasets built using MongoDB M/R (Map Reduce)]
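As a rough illustration of how a derived dataset could come out of a map-reduce pass, the sketch below counts tags across articles. MongoDB's own map-reduce runs JavaScript map and reduce functions on the server. This is an in-memory Python sketch of the same shape and does not talk to MongoDB. The document layout (a `tags` list per article) and the function names are assumptions.

```python
from collections import defaultdict


def map_article(article):
    """Map step: emit a (tag, 1) pair for every tag on an article."""
    for tag in article.get("tags", []):
        yield tag, 1


def reduce_counts(tag, counts):
    """Reduce step: combine all the counts emitted for one tag."""
    return sum(counts)


def map_reduce(documents, map_fn, reduce_fn):
    """Group the emitted values by key, then reduce each group to one value."""
    grouped = defaultdict(list)
    for document in documents:
        for key, value in map_fn(document):
            grouped[key].append(value)
    return {key: reduce_fn(key, values) for key, values in grouped.items()}


# Made-up articles; the last one has no tags at all.
articles = [{"tags": ["cricket", "india"]}, {"tags": ["india"]}, {}]
tag_counts = map_reduce(articles, map_article, reduce_counts)
```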

Page 18: Rpsonmongodb

Out-bound Datasets

•  Based on uploaded photos’ metadata
•  Trends analysis on tags
•  Timelines on geo location
•  Popular topics / editorial wise analysis

Page 19: Rpsonmongodb

Use Case

•  Article Publishing
•  Newsletter Publishing

Page 20: Rpsonmongodb

Features

•  Search filters based on author, classification and date range
•  Scheduling articles to be published live
•  Role based approval process and publishing life cycle (for control and editorial reviews)
•  Easy content versioning of articles
•  Notification on the application’s tab / email
•  Provides a channel to publish “Breaking News” on web and mobile platforms in real time
•  Integrates with existing in-house systems
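A role based approval process and publishing life cycle can be modelled as a small state machine. The sketch below is one possible shape only: the states, actions and role names are hypothetical and are not taken from RPS.

```python
# Hypothetical life cycle: (current state, action) -> (next state, roles allowed).
TRANSITIONS = {
    ("draft", "submit"): ("in_review", {"author", "editor"}),
    ("in_review", "approve"): ("approved", {"editor"}),
    ("in_review", "reject"): ("draft", {"editor"}),
    ("approved", "publish"): ("published", {"editor", "publisher"}),
}


def apply_action(state, action, role):
    """Return the next state, or raise if the move or the role is not allowed."""
    try:
        next_state, allowed_roles = TRANSITIONS[(state, action)]
    except KeyError:
        raise ValueError(f"cannot {action} an article in state {state!r}") from None
    if role not in allowed_roles:
        raise PermissionError(f"role {role!r} may not {action}")
    return next_state


# An author submits a draft, then an editor approves and publishes it.
state = apply_action("draft", "submit", "author")
state = apply_action(state, "approve", "editor")
state = apply_action(state, "publish", "editor")
```

Keeping the transitions in one table makes the editorial rules easy to review and change without touching the code that applies them.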

Page 21: Rpsonmongodb

Add on features

•  Auto RSS feed creation and publishing
•  Data journalism simplified
•  SEO friendly (adding meta tags that help pages rank higher in search results)
•  Newsletter creation and publishing process

Page 22: Rpsonmongodb

+ Minimal, well-positioned buttons help in publishing faster with less hassle, and once you are used to it, it’s a game. For example, while copy editing, most of the buttons are positioned at the bottom-right, so the editor does not have to scroll in search of buttons when done editing; they are always in front.

Page 23: Rpsonmongodb

+ Image preview in slide-shows allows us to see which image is being uploaded with the content, so there is no mismatch of images.
+ Proper placing of the other required fields helps in updating them faster.
+ Fast navigation between slides; slides can be swapped by dragging them into the required sequence.

Page 24: Rpsonmongodb

All the versions of a copy get locked when an editor opens it for editing; this helps in keeping the data up to date and its versioning/publishing smooth.
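The locking behaviour described above might look like the following in-memory sketch. How RPS actually stores and enforces the lock is not stated in the slides. The `CopyLocks` class, its method names and the editor names are hypothetical.

```python
class CopyLocks:
    """In-memory sketch: one editor at a time holds the lock on a copy."""

    def __init__(self):
        self._holders = {}  # copy id -> editor currently editing it

    def open_for_edit(self, copy_id, editor):
        """Lock the copy for editor; fail if another editor already holds it."""
        holder = self._holders.setdefault(copy_id, editor)
        if holder != editor:
            raise RuntimeError(f"copy {copy_id!r} is locked by {holder!r}")
        return True

    def close(self, copy_id, editor):
        """Release the lock, but only if editor is the current holder."""
        if self._holders.get(copy_id) == editor:
            del self._holders[copy_id]
            return True
        return False


locks = CopyLocks()
locks.open_for_edit("article-1", "asha")  # asha now holds the lock
```

A second editor calling `open_for_edit("article-1", ...)` gets an error until the first editor closes the copy.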

Page 25: Rpsonmongodb

•  The newsletter system has amazingly reduced effort; it’s like selecting a headline and submitting it for today’s update.

•  The newsletter system allows editing and re-processing of the copy headline and abstract, which can be tweaked to get better clicks from email.

•  “Add URL” in Newsletter and Breaking News allows coverage and other content to be added alongside regular RPS content. A faster and smoother process.

Page 26: Rpsonmongodb

Thank You