Building scalable and language-independent Java services using Apache Thrift [5th IndicThreads.com...


Session presented at the 5th IndicThreads.com Conference On Java, held on 10-11 December 2010 in Pune, India. Web: http://J10.IndicThreads.com

Transcript of Building scalable and language-independent Java services using Apache Thrift [5th IndicThreads.com...

Page 1: Building scalable and language-independent Java services using Apache Thrift  [5th IndicThreads.com Conference On Java, Pune, India]

Thrift: Scalable Cross-Language Services Implementation

Sanjoy Singh Senior Team Lead

Talentica S/W (I) Pvt Ltd

Page 2

Scalability?

A design or program is said to scale if it is suitably efficient and practical when applied to large situations.

Measures:

- Load
- Functional

Page 3

Agenda

- Key components/challenges for cross-language interactions
- Various systems for cross-language interactions
- Dive into Apache Thrift
  - Principle of operation
  - Example
  - Thrift stack
  - Versioning
- Why use Thrift? Limitations?
- Quick code walkthrough

Page 4

LAMP + Services

High-Level Goal: Enable transparent interaction between these.

…and some others too.

Page 5

High-Level Goals

- Transparent interaction between multiple programming languages.
- Maintain the right balance between:
  - Performance
  - Ease and speed of development
  - Availability of existing libraries, etc.

Page 6

Simple Distributed Architecture

[Diagram: one side sending requests and getting results; the other side waiting for requests at a known location and known port; between them, the communication protocol and data format.]

Basic questions are:

- What kind of protocol to use, and what data to transmit
- What to do with requests on the server side

Page 7

Key Components/Challenges

- Type system
- Transport system
- Protocol system
- Versioning
- Processor
- Performance

"No problem can stand the assault of sustained thinking."

Page 8

Hasn’t this been done before? (yes.)

SOAP

CORBA

COM

Pillar

Protocol Buffers etc

Page 9

Should we pick up one of those? (not sure)

- SOAP: XML, XML, and more XML
- CORBA: over-designed and heavyweight
- COM: embraced mainly in Windows client software
- Pillar: slick! But no versioning/abstraction.
- Protocol Buffers etc.: closed source Google deliciousness (at the time Thrift was designed; Google open-sourced Protocol Buffers in 2008)

Page 10

As a developer, what are you looking for?

Be Patient, I have something for you in the subsequent slides !!

Decision Time !

Page 11

Solution

Apache Thrift

Software framework for scalable cross-language services development.

Page 12

Apache Thrift - Introduction

- Originally developed at Facebook
- Open sourced in April 2007
- Easy exchange of data
- Cross-language serialization with minimal overhead
- Thrift tools can generate code for C++, Java, Python, PHP, Ruby, Erlang, Perl, Haskell, C#, Cocoa, Smalltalk and OCaml

Page 13

Let's Dive In..

Page 14

Principle Of Operation

- Create a Thrift file, e.g. demo.thrift. Define data types and service interfaces.
- Build the Thrift platform files with the Thrift code generator tool (written in C++): Demo.php, Demo.java, Demo.py, Demo.cpp.
- Create the server/client app and run the server. The server implements the services and the client calls them.

Page 15

Thrift Cares About

- Type definitions
- Service definitions

Thrift Doesn't Care About

- Wire protocol (internal XML...)
- Transport (HTTP? Sockets? Whatevz!)
- Programming languages

Page 16

Enough Banter. Show Us the Goodz.

// Include other thrift files
include "shared.thrift"

namespace java calculator

enum Operation { // define enums
  ADD = 1,
  SUBTRACT = 2,
  MULTIPLY = 3,
  DIVIDE = 4
}

struct Work { // complex data structures
  1: i32 num1 = 0,
  2: i32 num2,
  3: Operation op,
  4: optional string comment,
}

Page 17

Enough Banter. Show Us the Goodz.

// Exception
exception InvalidOperation {
  1: i32 what,
  2: string why
}

// Service
service Calculator extends shared.SharedService {
  void ping(),
  i32 add(1:i32 num1, 2:i32 num2),
  i32 calculate(1:i32 logid, 2:Work w) throws (1:InvalidOperation ouch),
  oneway void zip()
}
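To make the service definition more concrete, here is a minimal Java sketch of what the server-side business logic for Calculator might look like. It is self-contained on purpose: the nested types (Operation, Work, InvalidOperation, Calculator) are hand-written stand-ins for what the Thrift compiler would generate, not the generated classes themselves, and the names CalculatorSketch and Handler are hypothetical. The shared.SharedService parent is left out because its definition is not shown here.

```java
/**
 * Self-contained sketch of a handler for the Calculator service.
 * The nested types are hand-written stand-ins for generated code;
 * they are NOT the classes the Thrift compiler emits.
 */
final class CalculatorSketch {

    /** Stand-in for the generated Operation enum. */
    enum Operation { ADD, SUBTRACT, MULTIPLY, DIVIDE }

    /** Stand-in for the generated Work struct. */
    static final class Work {
        int num1 = 0;      // default value taken from the IDL
        int num2;
        Operation op;
        String comment;    // optional field, may stay null

        Work(int num1, int num2, Operation op) {
            this.num1 = num1;
            this.num2 = num2;
            this.op = op;
        }
    }

    /** Stand-in for the generated InvalidOperation exception. */
    static final class InvalidOperation extends Exception {
        final int what;
        final String why;

        InvalidOperation(int what, String why) {
            super(why);
            this.what = what;
            this.why = why;
        }
    }

    /** Stand-in for the service interface a generator would emit. */
    interface Calculator {
        void ping();
        int add(int num1, int num2);
        int calculate(int logid, Work w) throws InvalidOperation;
        void zip();
    }

    /** Business logic only: no sockets and no serialization code here. */
    static final class Handler implements Calculator {
        @Override public void ping() { /* liveness check, nothing to do */ }

        @Override public int add(int num1, int num2) { return num1 + num2; }

        @Override public int calculate(int logid, Work w) throws InvalidOperation {
            switch (w.op) {
                case ADD:      return w.num1 + w.num2;
                case SUBTRACT: return w.num1 - w.num2;
                case MULTIPLY: return w.num1 * w.num2;
                case DIVIDE:
                    if (w.num2 == 0) {
                        // 4 is the IDL value of DIVIDE
                        throw new InvalidOperation(4, "Cannot divide by 0");
                    }
                    return w.num1 / w.num2;
                default:
                    throw new InvalidOperation(-1, "Unknown operation");
            }
        }

        @Override public void zip() { /* oneway in the IDL: the caller does not wait for a reply */ }
    }

    public static void main(String[] args) throws InvalidOperation {
        Calculator calc = new Handler();
        System.out.println(calc.add(1, 1));
        System.out.println(calc.calculate(1, new Work(15, 10, Operation.SUBTRACT)));
    }
}
```

In a real project the handler would implement the interface from the generated Calculator.java instead of the stand-in, and a Thrift server would dispatch incoming calls to it.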


Page 19

What DOES that do?

- Generates definitions for all the types in each language
- Generates client and server interfaces for each language

What DOESN'T that do?

- Anything to do with sockets
- Anything to do with serialization

Page 20

Magically Generated Files

gen-java/calculatordemo
    Calculator.java
    InvalidOperation.java
    Operation.java
    Work.java

gen-php/
    Calculator.php
    calculator_types

gen-py/
    ttypes.py
    Calculator.py
    Calculator-remote

Page 21

Thrift Philosophy

Create a system that is abstracted in a systematic way, such that developers can easily extend it to suit their needs and function in custom environments.

Page 22

Structs don’t have any code to do with serialization or sockets, etc.

But they know how to read and write themselves… How does that work?

Page 23

The Thrift Stack

The Thrift stack is a common class hierarchy implemented in each language that abstracts out the tricky details of protocol encoding and network communication. It provides a simple interface for generated code to use.

There are two key interfaces:

TTransport

- Decouples the transport layer from the code generation layer.
- Provides read() and write(), with a set of other helpers like open(), close(), etc.
- Implementations: TSocket, TFileTransport, TBufferedTransport, TFramedTransport, TMemoryBuffer.

TProtocol

- Separates the data structure from its transport representation.
- Provides the ability to read and write various types of data, e.g. readI32(), writeString(), etc.
- Supports bidirectional sequenced messaging and encoding of base types, containers and structs.
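The split between the two interfaces can be sketched in plain Java. This is an illustration only, not the Thrift library: Transport, MemoryTransport and BinaryProtocol are hypothetical stand-ins that borrow the method names mentioned above (read(), write(), readI32(), writeString()). The encoding chosen here (big-endian 32-bit integers, length-prefixed UTF-8 strings) is an assumption made for the sketch.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

/** Sketch of the transport/protocol layering; all types here are hypothetical. */
final class StackSketch {

    /** Transport: moves raw bytes and knows nothing about data types. */
    interface Transport {
        void write(byte[] buf, int off, int len);
        int read(byte[] buf, int off, int len);
    }

    /** In-memory transport standing in for a socket or a file. */
    static final class MemoryTransport implements Transport {
        private final ByteArrayOutputStream out = new ByteArrayOutputStream();
        private int readPos = 0;

        @Override public void write(byte[] buf, int off, int len) {
            out.write(buf, off, len);
        }

        @Override public int read(byte[] buf, int off, int len) {
            byte[] data = out.toByteArray();
            int n = Math.min(len, data.length - readPos);
            System.arraycopy(data, readPos, buf, off, n);
            readPos += n;
            return n;
        }
    }

    /** Protocol: encodes typed values onto whatever Transport it is given. */
    static final class BinaryProtocol {
        private final Transport trans;

        BinaryProtocol(Transport trans) { this.trans = trans; }

        void writeI32(int v) {
            byte[] b = { (byte) (v >>> 24), (byte) (v >>> 16), (byte) (v >>> 8), (byte) v };
            trans.write(b, 0, 4);
        }

        int readI32() {
            byte[] b = readFully(4);
            return ((b[0] & 0xff) << 24) | ((b[1] & 0xff) << 16)
                 | ((b[2] & 0xff) << 8) | (b[3] & 0xff);
        }

        void writeString(String s) {
            byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
            writeI32(utf8.length);              // length prefix
            trans.write(utf8, 0, utf8.length);  // payload
        }

        String readString() {
            int len = readI32();
            return new String(readFully(len), StandardCharsets.UTF_8);
        }

        /** Keeps reading until exactly len bytes have arrived. */
        private byte[] readFully(int len) {
            byte[] b = new byte[len];
            int got = 0;
            while (got < len) {
                int n = trans.read(b, got, len - got);
                if (n <= 0) throw new IllegalStateException("unexpected end of data");
                got += n;
            }
            return b;
        }
    }

    public static void main(String[] args) {
        BinaryProtocol proto = new BinaryProtocol(new MemoryTransport());
        proto.writeI32(42);
        proto.writeString("hello");
        System.out.println(proto.readI32());
        System.out.println(proto.readString());
    }
}
```

Because BinaryProtocol only talks to the Transport interface, swapping the in-memory transport for a socket-backed one would not change any of the encoding code, which is the point of the layering.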

Page 24

The Thrift Stack

[Diagram, information flow: Object write() -> TProtocol -> TTransport -> TTransport -> TProtocol -> Object read()]

Page 25

Versioning (applications change a lot, not protocols!)

What happens when definitions change?

- A struct needs a new member
- A function needs a new argument

No problem! We've got field identifiers!

Example:

struct Work {
  1: i32 num1 = 0,
  2: i32 num2,
  3: Operation op,
  4: optional string comment,
}

Page 26

Versioning - Case Analysis

Add a field

- New client, old server: the server sees a field id that it doesn't recognize and safely ignores it.
- Old client, new server: the server doesn't see the field id it expects. It leaves the field unset in the object, and the server implementation can handle that properly.

Remove a field

- New client, old server: the server doesn't see a field it expects. Analogous to the case above.
- Old client, new server: the old client sends the deprecated field. The server politely ignores it. Analogous to the top case.
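The "add a field" cases can be sketched in Java as follows. This is a simplified model rather than Thrift's actual reader: a Map from field id to value stands in for the bytes on the wire, and the names VersioningSketch, Work, read and highestKnownId are hypothetical. The idea it illustrates is that a reader dispatches on the field id, skips ids it does not know, and leaves fields unset when they never arrive.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Sketch of field-id based versioning; the Map is a stand-in for wire data. */
final class VersioningSketch {

    /** Reader-side copy of the Work struct. */
    static final class Work {
        int num1 = 0;        // IDL default
        int num2;
        int op;
        String comment;      // field 4, null means "unset"
        int skippedFields;   // sketch-only counter of ignored field ids
    }

    /**
     * Reads a Work from the wire fields. highestKnownId models how new the
     * reader's definition is: 3 for the old definition, 4 for the new one.
     */
    static Work read(Map<Integer, Object> wire, int highestKnownId) {
        Work w = new Work();
        for (Map.Entry<Integer, Object> field : wire.entrySet()) {
            int id = field.getKey();
            if (id > highestKnownId) {
                w.skippedFields++;   // unknown field id: skip it safely
                continue;
            }
            switch (id) {
                case 1: w.num1 = (Integer) field.getValue(); break;
                case 2: w.num2 = (Integer) field.getValue(); break;
                case 3: w.op = (Integer) field.getValue(); break;
                case 4: w.comment = (String) field.getValue(); break;
                default: w.skippedFields++; break;
            }
        }
        return w;
    }

    public static void main(String[] args) {
        // New client, old server: field 4 is sent but the server only knows 1..3.
        Map<Integer, Object> fromNewClient = new LinkedHashMap<>();
        fromNewClient.put(1, 15);
        fromNewClient.put(2, 10);
        fromNewClient.put(3, 2);
        fromNewClient.put(4, "please subtract");
        Work seenByOldServer = read(fromNewClient, 3);
        System.out.println(seenByOldServer.skippedFields);

        // Old client, new server: field 4 never arrives and stays unset.
        Map<Integer, Object> fromOldClient = new LinkedHashMap<>();
        fromOldClient.put(1, 15);
        fromOldClient.put(2, 10);
        fromOldClient.put(3, 2);
        Work seenByNewServer = read(fromOldClient, 4);
        System.out.println(seenByNewServer.comment == null);
    }
}
```

The "remove a field" cases follow the same two paths with the roles swapped, which is why the slide calls them analogous.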

Page 27

Why Use Thrift …

- Less time wasted by individual developers
  - No duplicated networking and protocol code
  - Less time dealing with boilerplate stuff
  - Write your client and server in about 5 minutes
- Less maintenance
  - One networking code base that needs maintenance
  - Fix bugs once, rather than repeatedly in every server
- Division of labour
  - Work on high-performance servers separate from applications
- Common toolkit
  - Code reuse and shared tools

Page 28

Why Use Thrift …

- Cross-language serialization with lower overhead than alternatives such as SOAP, due to the use of a binary format.
- A lean and clean library. No framework to code to. No XML configuration files.
- The language bindings feel natural. For example, Java uses ArrayList<String> and C++ uses std::vector<std::string>.
- The application-level wire format and the serialization-level wire format are cleanly separated. They can be modified independently.

Page 29

Why Use Thrift …

- The predefined serialization styles include binary, HTTP-friendly and compact binary.
- Soft versioning of the protocol.
- No build dependencies or non-standard software. No mix of incompatible software licenses.

Page 30

Limitations / Non-Features

- Is struct inheritance/polymorphism supported? No, it isn't.
- Can I overload service methods? Nope. Method names must be unique.
- Heterogeneous containers? Not supported.
- Is there enough documentation on Thrift development? I think this is one weak area.

Page 31

Steps/Code Walkthrough (Let's build the example described earlier)

Page 32

Some Real-World Examples

- Facebook Search Service: PHP-based web app -> Thrift PHP lib -> Search Service (implemented in C++)
- AdServer, Blogfeeds, CSSParser, Memcached, Network Selector, News Feed, Scribe, etc.

Page 33

Why should I not try this?

Guess the answer?

Answer: Please do let me know at [email protected]

Skype ID / Gtalk ID: sanjoy_17 / sanjoy17

Page 34

References

http://incubator.apache.org/thrift/

http://incubator.apache.org/thrift/static/thrift-20070401.pdf

Page 35

Thanks !!!