Apache Thrift


Feb 5, 2013


1

Thrift

Scalable Cross Language Services Implementation

Sanjoy Singh
Senior Team Lead
Talentica S/W (I) Pvt Ltd

2

Scalability ??

A design/program is said to scale if it is suitably efficient and practical when applied to large situations.

Measures
  Load
  Functional

3

Agenda

Key Components/Challenges for Cross Language Interactions
Various Systems for Cross Language Interactions
Dive Into Apache Thrift
Principle Of Operation
Example
Thrift Stack
Versioning
Why to use Thrift. Limitations?
Quick Code Walkthrough

4


LAMP + Services

High-Level Goal: Enable transparent interaction between these.

…and some others too.

5

High Level Goals !

Transparent interaction between multiple programming languages.

Maintain the right balance between
  Performance
  Ease and speed of development
  Availability of existing libraries, etc.

6

Simple Distributed Architecture

Client: sending requests, getting results
Server: waiting for requests (known location, known port)
Between them: communication protocol, data format

Basic questions are:

What kind of protocol to use, and what data to transmit
What to do with requests on the server side
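
The two basic questions can be made concrete with a minimal sketch. It is not Thrift code: it assumes a made-up data format (a 4-byte big-endian length followed by the payload) and a made-up server behaviour (upper-casing the request). The function names are hypothetical, and the OS picks a free port here only so the sketch is easy to run; a real server would listen on a known, fixed port.

```python
import socket
import struct
import threading


def send_msg(sock, payload: bytes) -> None:
    # Assumed data format for this sketch: 4-byte big-endian length, then payload.
    sock.sendall(struct.pack("!I", len(payload)) + payload)


def recv_exact(sock, n: int) -> bytes:
    # recv() may return fewer bytes than requested, so loop until we have n.
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection early")
        buf += chunk
    return buf


def recv_msg(sock) -> bytes:
    (length,) = struct.unpack("!I", recv_exact(sock, 4))
    return recv_exact(sock, length)


def serve_once(listener) -> None:
    # Server side: wait for one request, decide what to do with it, reply.
    conn, _ = listener.accept()
    with conn:
        request = recv_msg(conn)
        send_msg(conn, request.upper())  # the "service" in this sketch


def call(host: str, port: int, payload: bytes) -> bytes:
    # Client side: send a request, get the result.
    with socket.create_connection((host, port), timeout=5.0) as sock:
        send_msg(sock, payload)
        return recv_msg(sock)


def demo() -> bytes:
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.settimeout(5.0)  # do not wait forever if no client shows up
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    worker = threading.Thread(target=serve_once, args=(listener,))
    worker.start()
    try:
        return call("127.0.0.1", port, b"ping")
    finally:
        worker.join()
        listener.close()
```

Calling demo() runs one request/response round trip on the loopback interface. Even this small sketch has to hand-write framing, encoding and dispatch, and that is the part a framework is expected to take over.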

7

Key Components/Challenges !

Type system
Transport system
Protocol system
Versioning
Processor
Performance


No problem can stand the assault of sustained thinking.

8

Hasn’t this been done before? (yes.)

SOAP
CORBA
COM
Pillar
Protocol Buffers etc.

9

Should we pick up one of those? (not sure)

SOAP
  XML, XML, and more XML
CORBA
  Over-designed and heavyweight
COM
  Embraced mainly in Windows client software
Pillar
  Slick! But no versioning/abstraction.
Protocol Buffers etc.
  Closed source Google deliciousness (at the time Thrift was released; Google open sourced Protocol Buffers in 2008)


10





Decision Time !

As a developer, what are you looking for?

Be patient, I have something for you in the subsequent slides !!

11

Solution

Apache Thrift

Software framework for scalable cross-language services development.


12

Apache Thrift - Introduction

Originally developed at Facebook
Open sourced in April 2007
Easy exchange of data
Cross language serialization with minimal overhead.
Thrift tools can generate code for C++, Java, Python, PHP, Ruby, Erlang, Perl, Haskell, C#, Cocoa, Smalltalk and OCaml


13

Let's Dive In..

14

Principle Of Operation

Create a Thrift file, e.g. demo.thrift
  Define data types and service interfaces
Build Thrift platform files with the Thrift code generator tool (written in C++)
  Demo.php, Demo.java, Demo.py, Demo.cpp
Create server/client app
Run the server
  Server implements services and client calls them

15

Thrift Cares About
  Type Definitions
  Service Definitions

Thrift Doesn’t Care About
  Wire Protocol (internal XML...)
  Transport (HTTP? Sockets? Whatevz!)
  Programming Languages

16

Enough Banter. Show Us the Goodz.

// Include other Thrift files
include "shared.thrift"

namespace java calculator

// Define enums
enum Operation {
  ADD = 1,
  SUBTRACT = 2,
  MULTIPLY = 3,
  DIVIDE = 4
}

// Complex data structures
struct Work {
  1: i32 num1 = 0,
  2: i32 num2,
  3: Operation op,
  4: optional string comment,
}


17

Enough Banter. Show Us the Goodz.

// Exception
exception InvalidOperation {
  1: i32 what,
  2: string why
}

// Service
service Calculator extends shared.SharedService {
  void ping(),
  i32 add(1:i32 num1, 2:i32 num2),
  i32 calculate(1:i32 logid, 2:Work w) throws (1:InvalidOperation ouch),
  oneway void zip()
}


19

What DOES that do?
  Generates definitions for all the types in each language
  Generates client and server interfaces for each language

What DOESN'T that do?
  Anything to do with sockets
  Anything to do with serialization
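
To make the split concrete, here is a hand-written Python approximation of the pieces involved for the Calculator example. It is a sketch, not actual generator output. The types are stand-ins that mirror the IDL definitions, CalculatorHandler is a hypothetical server-side implementation a developer would write, and the validation rules and error messages are assumptions. Nothing in it touches sockets or serialization.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional


class Operation(IntEnum):
    # Stand-in for the type generated from `enum Operation`.
    ADD = 1
    SUBTRACT = 2
    MULTIPLY = 3
    DIVIDE = 4


@dataclass
class Work:
    # Stand-in for `struct Work`; num1 carries the IDL default of 0.
    num1: int = 0
    num2: Optional[int] = None
    op: Optional[Operation] = None
    comment: Optional[str] = None


class InvalidOperation(Exception):
    # Stand-in for `exception InvalidOperation`.
    def __init__(self, what: int, why: str):
        super().__init__(why)
        self.what = what
        self.why = why


class CalculatorHandler:
    """Hypothetical implementation of the Calculator service interface."""

    def ping(self) -> None:
        pass

    def add(self, num1: int, num2: int) -> int:
        return num1 + num2

    def calculate(self, logid: int, w: Work) -> int:
        # Assumed validation for this sketch: num2 and op must be set.
        if w.num2 is None or w.op is None:
            raise InvalidOperation(what=0, why="num2 and op must be set")
        if w.op == Operation.ADD:
            return w.num1 + w.num2
        if w.op == Operation.SUBTRACT:
            return w.num1 - w.num2
        if w.op == Operation.MULTIPLY:
            return w.num1 * w.num2
        if w.op == Operation.DIVIDE:
            if w.num2 == 0:
                raise InvalidOperation(what=int(w.op), why="Cannot divide by 0")
            return w.num1 // w.num2  # floor division; the result must fit an i32
        raise InvalidOperation(what=int(w.op), why="Unknown operation")

    def zip(self) -> None:
        # oneway in the IDL: the caller does not wait for a reply.
        pass
```

For example, CalculatorHandler().calculate(1, Work(num1=15, num2=10, op=Operation.SUBTRACT)) returns 5. The handler only contains business logic; wiring it to a network endpoint is left to the layers described on the following slides.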

20

Magically Generated Files

gen-java/calculatordemo
  Calculator.java
  InvalidOperation.java
  Operation.java
  Work.java

gen-php/
  Calculator.php
  calculator_types

gen-py/
  ttypes.py
  Calculator.py
  Calculator-remote

21


Thrift Philosophy


Create a system that is abstracted in a
systematic way, such that developers can
easily extend it to suit their needs and
function in custom environments.


22



Structs don’t have any code to do with
serialization or sockets, etc.


But they know how to read and write
themselves… How does that work?


23

The Thrift Stack

The Thrift stack is a common class hierarchy implemented in each language that abstracts out the tricky details of protocol encoding and network communication.

It provides a simple interface for generated code to use.

There are two key interfaces:

TTransport
  Decouples the transport layer from the code generation layer.
  Provides read() and write(), with a set of other helpers like open(), close(), etc.
  Implementations: TSocket, TFileTransport, TBufferedTransport, TFramedTransport, TMemoryBuffer.

TProtocol
  Separates the data structure from the transport representation.
  Provides the ability to read and write various types of data, e.g. readI32(), writeString(), etc.
  Supports bi-directional sequenced messaging and encoding of base types, containers and structs.
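
The layering can be illustrated with a toy sketch in plain Python. It does not use the Thrift library: MemoryTransport and SimpleBinaryProtocol are hypothetical stand-ins for the TTransport and TProtocol roles, the encoding (big-endian i32, length-prefixed UTF-8 strings) is an assumption of the sketch, and InvalidOperationStruct is a hand-written struct borrowing the field names from the earlier IDL example. The point is that the struct reads and writes itself by talking only to the protocol object.

```python
import struct


class MemoryTransport:
    """Toy transport: only moves raw bytes (here, through an in-memory buffer)."""

    def __init__(self, data: bytes = b""):
        self._buf = bytearray(data)
        self._pos = 0

    def write(self, data: bytes) -> None:
        self._buf += data

    def read(self, n: int) -> bytes:
        chunk = bytes(self._buf[self._pos:self._pos + n])
        if len(chunk) < n:
            raise EOFError("not enough bytes in transport")
        self._pos += n
        return chunk

    def getvalue(self) -> bytes:
        return bytes(self._buf)


class SimpleBinaryProtocol:
    """Toy protocol: encodes typed values onto whatever transport it is given."""

    def __init__(self, transport):
        self.trans = transport

    def write_i32(self, value: int) -> None:
        self.trans.write(struct.pack("!i", value))

    def read_i32(self) -> int:
        return struct.unpack("!i", self.trans.read(4))[0]

    def write_string(self, value: str) -> None:
        raw = value.encode("utf-8")
        self.write_i32(len(raw))  # length prefix, then the bytes
        self.trans.write(raw)

    def read_string(self) -> str:
        length = self.read_i32()
        return self.trans.read(length).decode("utf-8")


class InvalidOperationStruct:
    """Hand-written struct: knows its fields, nothing about bytes or sockets."""

    def __init__(self, what: int = 0, why: str = ""):
        self.what = what
        self.why = why

    def write(self, proto) -> None:
        proto.write_i32(self.what)
        proto.write_string(self.why)

    def read(self, proto) -> None:
        self.what = proto.read_i32()
        self.why = proto.read_string()


def round_trip(obj: InvalidOperationStruct) -> InvalidOperationStruct:
    # Writer side: object -> protocol -> transport.
    out = MemoryTransport()
    obj.write(SimpleBinaryProtocol(out))
    # Reader side: transport -> protocol -> object.
    copy = InvalidOperationStruct()
    copy.read(SimpleBinaryProtocol(MemoryTransport(out.getvalue())))
    return copy
```

Swapping MemoryTransport for a socket-backed class with the same read() and write() methods, or SimpleBinaryProtocol for a different encoding, would not require any change to the struct. That independence is the design choice the two interfaces are meant to provide.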



24

The Thrift Stack

Information flow:
Object write() -> TProtocol -> TTransport -> TTransport -> TProtocol -> Object read()

25

Versioning (applications change a lot, not protocols!)

What happens when definitions change?
  Struct needs a new member
  Function needs a new argument

No Problem! We’ve got Field Identifiers!

Example:

struct Work {
  1: i32 num1 = 0,
  2: i32 num2,
  3: Operation op,
  4: optional string comment,
}


26

Versioning - Case Analysis

Add a Field
  New Client, Old Server
    The server sees a field id that it doesn’t recognize and safely ignores it.
  Old Client, New Server
    The server doesn’t see the field id it expects. The field is left unset in the object, and the server implementation can handle the missing value properly.

Remove a Field
  New Client, Old Server
    The server doesn’t see a field it expects. Analogous to the case above.
  Old Client, New Server
    The old client sends the deprecated field. The server politely ignores it. Analogous to the top case.
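
The add-a-field cases can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the Thrift wire format: every field is written as a (type code, field id, value) triple followed by a stop marker, the type codes are chosen for this sketch, and encode_fields and decode_fields are hypothetical helpers. The reader keeps the field ids it knows and skips the rest.

```python
import struct

# Type codes chosen for this sketch.
T_STOP, T_I32, T_STRING = 0, 8, 11


def encode_fields(fields: dict) -> bytes:
    """Encode {field_id: int or str} as tagged fields followed by a stop byte."""
    out = bytearray()
    for field_id, value in sorted(fields.items()):
        if isinstance(value, int):
            out += struct.pack("!bhi", T_I32, field_id, value)
        else:
            raw = value.encode("utf-8")
            out += struct.pack("!bhi", T_STRING, field_id, len(raw)) + raw
    out += struct.pack("!b", T_STOP)
    return bytes(out)


def decode_fields(data: bytes, known_ids: set) -> dict:
    """Decode tagged fields, keeping known ids and skipping all others."""
    result = {}
    pos = 0
    while True:
        (ftype,) = struct.unpack_from("!b", data, pos)
        pos += 1
        if ftype == T_STOP:
            return result
        (field_id,) = struct.unpack_from("!h", data, pos)
        pos += 2
        if ftype == T_I32:
            (value,) = struct.unpack_from("!i", data, pos)
            pos += 4
        elif ftype == T_STRING:
            (length,) = struct.unpack_from("!i", data, pos)
            pos += 4
            value = data[pos:pos + length].decode("utf-8")
            pos += length
        else:
            raise ValueError(f"unknown type code {ftype}")
        if field_id in known_ids:
            result[field_id] = value
        # Otherwise: unrecognized field id, value already skipped, ignore it.


# New client, old server: the old server only knows fields 1-3 of Work.
new_client_msg = encode_fields({1: 15, 2: 10, 3: 2, 4: "added in v2"})
old_server_view = decode_fields(new_client_msg, known_ids={1, 2, 3})

# Old client, new server: field 4 never arrives and stays unset.
old_client_msg = encode_fields({1: 15, 2: 10, 3: 2})
new_server_view = decode_fields(old_client_msg, known_ids={1, 2, 3, 4})
```

Here old_server_view contains only fields 1 to 3, and new_server_view.get(4) is None, so the new server has to treat the comment as optional. The type code travels with each field precisely so that a reader can skip a value whose field id it has never seen.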

27

Why to use Thrift …

Less time wasted by individual developers
  No duplicated networking and protocol code
  Less time dealing with boilerplate stuff
  Write your client and server in about 5 minutes

Less maintenance
  One networking code base that needs maintenance
  Fix bugs once, rather than repeatedly in every server

Division of labour
  Work on high-performance servers separate from applications

Common toolkit
  Code reuse and shared tools

28

Why to use Thrift …

Cross-language serialization with lower overhead than alternatives such as SOAP due to use of binary format

A lean and clean library. No framework to code to. No XML configuration files.

The language bindings feel natural. For example Java uses ArrayList<String>. C++ uses std::vector<std::string>.

The application-level wire format and the serialization-level wire format are cleanly separated. They can be modified independently.









29

Why to use Thrift …

The predefined serialization styles include: binary, HTTP-friendly and compact binary.

Soft versioning of the protocol.

No build dependencies or non-standard software. No mix of incompatible software licenses.









30

Limitations / Non-Features

Is struct inheritance/polymorphism supported?
  No, it isn’t.

Can I overload service methods?
  Nope. Method names must be unique.

Heterogeneous containers are not supported.

Is there enough documentation on Thrift development?
  I think this is one weak area.










31



Steps/Code Walkthrough

(Let's build the example described earlier)







32

Some Real-World Examples

Facebook Search Service
  PHP-based web app -> Thrift PHP lib -> Search service (implemented in C++)

AdServer, Blogfeeds, CSSParser, Memcached, Network Selector, News Feed, Scribe etc.

33





Why Should I not try this?

Guess the answer?

Answer: Please do let me know at sanjoys@talentica.com
Skype_id/Gtalk_id: sanjoy_17 / sanjoy17









34

References

http://incubator.apache.org/thrift/
http://incubator.apache.org/thrift/static/thrift-20070401.pdf



35




Thanks !!!