iFCP in Storage Area Networks
R.V. College of Engineering
R.V. COLLEGE OF ENGINEERING, BANGALORE-560059
(Autonomous Institution Affiliated to VTU, Belgaum)
SELF STUDY REPORT ON
INTERNET FIBRE CHANNEL PROTOCOL
IN STORAGE AREA NETWORKS
Submitted by
SANJAY VINAYAK H K
1RV13CS139
Under the guidance of
Ms. Ganashree K.C, Assistant Professor, CSE
Mrs. Prapulla S.B, Assistant Professor, CSE
Ms. Vishalakshi Prabhu H, Assistant Professor, CSE
Dr. Neeta Shivakumar, Associate Professor, BT
Submitted to
COMPUTER SCIENCE AND ENGINEERING DEPARTMENT
R.V. College of Engineering, Bangalore-59
R.V. COLLEGE OF ENGINEERING, BANGALORE - 560059 (Autonomous Institution Affiliated to VTU, Belgaum)
DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING
CERTIFICATE
Certified that the Self Study work titled INTERNET FIBRE CHANNEL
PROTOCOL IN STORAGE AREA NETWORKS is carried out by SANJAY
VINAYAK H K (1RV13CS139), who is a bonafide student of R.V. College of
Engineering, Bangalore, in partial fulfillment for the award of the degree of Bachelor of
Engineering in Computer Science and Engineering of the Visvesvaraya
Technological University, Belgaum during the year 2014-2015. It is certified that all
corrections/suggestions indicated for the internal Assessment have been incorporated
in the report deposited in the departmental library. The Self Study report has been
approved as it satisfies the academic requirements in respect of Self Study work
prescribed by the institution for the said degree.
Ms. Ganashree K.C Mrs. Prapulla S.B
Assistant Professor, CSE Assistant Professor, CSE
Ms. Vishalakshi Prabhu H Dr. Neeta Shivakumar
Assistant Professor, CSE Associate Professor, BT
Dr. Shobha G
Head of Department,
Department of CSE,
R.V. College of Engineering, Bangalore-560059
TABLE OF CONTENTS
PROBLEM DEFINITION
DESIGN
IMPLEMENTATION
CONCLUSION
FUTURE ENHANCEMENT
LIST OF FIGURES AND TABLES
Figure 1: Traditional DAS architecture
Figure 2: Fibre Channel topologies
Figure 3: Fibre Channel protocol layer
Figure 4: CFSM for TCP/IP
LIST OF SYMBOLS, ACRONYMS / ABBREVIATION AND
NOMENCLATURE
SAN : Storage Area Network
NAS: Network Attached Storage
DAS: Direct Attached Storage
iSCSI: Internet Small Computer System Interface
iFCP: Internet Fibre Channel Protocol
FCIP: Fibre Channel over Internet Protocol
TCP/IP: Transmission Control Protocol/Internet Protocol
LAN: Local Area Network
WAN: Wide Area Network
IETF: Internet Engineering Task Force
PROBLEM DEFINITION
Today's applications are rapidly overwhelming the capacity of networks and of storage
space. In e-commerce, huge databases support electronic cataloging and ordering while large
numbers of customers attempt to simultaneously access the information. As corporations
grow and enter the international business environment, enterprise systems maintain corporate
information across not only states but countries. To maintain and make available to all users
that large amount of information reliably and in a timely manner is challenging to say the
least. More and more feature films are incorporating digital effects. Video editing software,
Computer Aided Drafting and photo-realistic rendering software are utilized to either modify
a film or even create one from scratch. Even a few seconds' worth of a film requires hundreds
of megabytes of storage space. When teams of 20 animators and digital artists are each working
on their own piece of a film, the burden on the storage and network facilities is
tremendous. Web sites that serve streaming audio and/or video consume more
resources as demand for these services goes up. In addition to simply supporting bandwidth
and storage increases, corporations now want to be able to safeguard their data. This typically
entails making backups of data (to tape) and saving data off the corporate premises. This is
an extremely small sample of the applications that are challenging the storage and networking
architectures.
Traditionally, these applications have been supported by file servers with either large internal
disks or disk farms directly attached to the server. The disks are typically connected to the
server via SCSI (Small Computer System Interface). The SCSI standard defines a high
throughput parallel interface that is used to connect up to 7 peripherals (including the host
adapter card itself) to the computer. Examples of these peripherals are scanners, CD
(Compact Disk) players/recorders, digitizers, tape drives and as previously stated hard disks.
This architecture has several limitations. The server can only access data on devices directly
attached to it. If a server or any part of its SCSI hardware fails, access to its data is cut off.
Also, SCSI supports a finite number of devices; therefore, the amount of data a server can
access is limited. If more storage space is needed, but there is no more room on the SCSI bus,
expansion is no longer possible. SCSI, due to its parallel structure, has distance limitations
as well. This requires that the storage be near the servers. These limitations are the driving
force behind a new paradigm for data storage and access.
Distance is one of the major drawbacks for those who rely solely on Fibre Channel. To
promote disaster recovery, particularly in areas experiencing frequent earthquakes, Fibre
Channel was originally designed to allow storage to take place over distances of up to
approximately 10 km from hosts. Even with various methods of signal enhancement,
which might extend that distance by several hundred kilometers, storing data over
distances of hundreds or thousands of miles is out of the question for devices connected
merely by Fibre Channel cables. More recently, other disaster recovery concerns have
resulted in the promotion of storage over longer distances, sometimes crossing international
boundaries, across both enterprise and small to medium sized business environments.
Additionally, Fibre Channel media and equipment can often be expensive and
cumbersome, both to install and manage. Although backbone technologies do exist which
can carry Fibre Channel data over longer distances, their installation, cost, and maintenance
would present many difficulties that could be handled easily by a technology such as TCP/IP,
which is already in place across the world. Thus, the relationship between SAN technologies
which run over TCP/IP and other SAN technologies can be considered similar to that between
Wide Area Networking and the LAN. Additionally, some SANs which run over TCP/IP can
sometimes replace other SAN technologies altogether. These factors, combined with the fact
that the Internet has become so widespread and convenient, are among the primary
motivations for this report, and for the rising popularity of storage over TCP/IP.
SANs also tend to enable more effective disaster recovery processes. A SAN can extend to a
distant location containing a secondary storage array. This enables storage replication either
implemented by disk array controllers, by server software, or by specialized SAN devices.
Figure 1: Traditional DAS architecture
DESIGN
The generic Fibre Channel network is composed of one or more bi-directional point-to-
point channels. The links support 1Gbps (or 100MBps) data rates in each direction. The transport
media may be fiber optic cable, copper twisted pair or coax cable. The links in the FC network are
between communication ports known as N_ports. N_port stands for Node Port where a node is a
device on the FC network. The links may be point-to-point between N_ports, or they may be set up
as a Fabric. A Fabric consists of several N_Ports connected to a switch. Note: Ports on the switch
are called F_ports. Finally, the ports may be daisy chained to form a ring. This is called an
Arbitrated Loop (FC-AL). In this configuration the ports are referred to as L_ports. No switch is
necessary for FC-AL. These basic layouts may be combined in different ways to create more
complex topologies.
FC is typically realized in one of 3 topologies: Point-To-Point, Loop or Fabric. The Point-To-Point
connection is the simplest type of connection. It can exist by itself or as a subset in a Fabric or Loop
topology.
Figure 2: Different Fibre Channel topologies
Fibre Channel uses a multi-layer protocol architecture along the lines of the seven-layer OSI model.
There are 5 layers: FC-0: Physical layer, FC-1: Encode/Decode layer, FC-2: Framing
Protocol/Flow Control, FC-3: Common Services and FC-4: Upper Level Protocol Support.
Additionally, there is another layer which, although not typically considered part of the basic
architecture, is important enough to warrant mention. This is the FC-AL (Arbitrated Loop) layer.
iFCP uses OSPF to implement addressing and routing.
Figure 3: Fibre Channel protocol layer
With iFCP, N_Port addressing can be locally assigned by each gateway for a Gateway Region
local mode operation. Alternatively, in address-transparent mode, N_Ports can be globally
assigned across an interconnected set of gateways.
The routing between Gateway Regions operates with IP only. Routing that takes place within a
Gateway Region (if there is any routing within) is opaque to the IP network. For example, Fibre
Channel routing and FSPF traffic that may be operating within a Gateway Region does not flow
between Gateway Regions.
Address Transparent Mode
In address transparent mode, the scope of N_Port addresses is fabric wide. The IP network fabric
is defined as a name server object containing a collection of gateways. The iSNS name server
acts as a fabric Domain Address Manager, and maintains a pool of Domain IDs for the fabric,
assigning FC domain IDs to each gateway within the fabric. Within each Gateway Region, the
gateway acts as the downstream principal switch. The advantage of address transparent mode is the
transparency across the fabric and the resulting simplification of gateway operation. The
disadvantage is that each Gateway Region consumes 65K Node IDs, which is inefficient when
the Gateway Region's N_Port count is low. Also, Address Transparent Mode is less scalable, as
communication among N_Ports is restricted to N_Ports within the fabric.
Gateway Region Local Mode
In Gateway Region Local mode, the scope of the N_Port addresses is local to the Gateway
Region. Each gateway maps N_Port network addresses of external devices to N_Port fabric
addresses. Normal inter-gateway frame traffic is mapped on the fly.
The advantage of Local Mode is scalability. N_Port connectivity is network-wide, allowing
unrestricted addresses within a Gateway Region. Since each gateway is individually responsible
for N_Port addresses allocated to its Gateway Region, the fabric becomes more stable as the
network scales in size. This is because there is no dependence on a central addressing authority, as
is the case with Fibre Channel and iFCP Transparent Mode fabrics.
Mapping of Fibre Channel to iFCP
Fibre Channel frames ingressing the iFCP gateway are converted to iFCP frames through the
process shown in Figure 7. The FC frames may be addressed to remote devices, or to other FC
devices attached to the same iFCP gateway. If the latter is the case, no address translation
mechanism is needed, and the frame is directly delivered to the local N_Port. If the former is the
case, then an address mapping function must occur that maps a key found in the D_ID to the TCP
connection addressed to the appropriate remote N_Port network address (N_Port ID and IP address).
IMPLEMENTATION
DAA Component:
Implementation of Dijkstra's algorithm in OSPF:
OSPF uses a shortest path first algorithm in order to build and calculate the shortest path to all
known destinations. The shortest path is calculated with the use of Dijkstra's algorithm. The
algorithm by itself is quite complicated. This is a very high-level, simplified way of looking at the
various steps of the algorithm:
1. Upon initialization or due to any change in routing information, a router generates a link-
state advertisement. This advertisement represents the collection of all link-states on that
router.
2. All routers exchange link-states by means of flooding. Each router that receives a link-state
update should store a copy in its link-state database and then propagate the update to other
routers.
3. After the database of each router is completed, the router calculates a Shortest Path Tree to
all destinations. The router uses the Dijkstra algorithm in order to calculate the shortest path
tree. The destinations, the associated cost and the next hop to reach those destinations form
the IP routing table.
4. In case no changes in the OSPF network occur, such as cost of a link or a network being
added or deleted, OSPF should be very quiet. Any changes that occur are communicated
through link-state packets, and the Dijkstra algorithm is recalculated in order to find the
shortest path.
The Dijkstra algorithm places each router at the root of a tree and calculates the shortest path to
each destination based on the cumulative cost required to reach that destination. Each router will
have its own view of the topology even though all the routers will build a shortest path tree using
the same link-state database. The following sections indicate what is involved in building a shortest
path tree.
OSPF Cost
The cost (also called metric) of an interface in OSPF is an indication of the overhead required to
send packets across a certain interface. The cost of an interface is inversely proportional to the
bandwidth of that interface. A higher bandwidth indicates a lower cost. There is more overhead
(higher cost) and time delays involved in crossing a 56k serial line than crossing a 10M ethernet
line. The formula used to calculate the cost is:
cost = 100,000,000 / bandwidth in bps
For example, it costs 10^8 / 10^7 = 10 to cross a 10M Ethernet line and 10^8 / 1,544,000 = 64
to cross a T1 line.
By default, the cost of an interface is calculated based on the bandwidth; you can force the cost of
an interface with the ip ospf cost interface subconfiguration mode command.
Shortest Path Tree
Assume we have the following network diagram with the indicated interface costs. In order to build
the shortest path tree for RTA, we would have to make RTA the root of the tree and calculate the
smallest cost for each destination.
The above is the view of the network as seen from RTA. Note the direction of the arrows in
calculating the cost. For example, the cost of RTB's interface to network 128.213.0.0 is not relevant
when calculating the cost to 192.213.11.0. RTA can reach 192.213.11.0 via RTB with a cost of 15
(10+5). RTA can also reach 222.211.10.0 via RTC with a cost of 20 (10+10) or via RTB with a
cost of 20 (10+5+5). In case equal cost paths exist to the same destination, Cisco's implementation
of OSPF will keep track of up to six next hops to the same destination.
After the router builds the shortest path tree, it will start building the routing table accordingly.
Directly connected networks will be reached via a metric (cost) of 0 and other networks will be
reached according to the cost calculated in the tree.
TOC COMPONENT:
In computer science, a communicating finite-state machine is a finite state
machine labelled with "receive" and "send" operations over some alphabet of channels. They were
introduced by Brand and Zafiropulo, and can be used as a model of concurrent processes like Petri
nets. Communicating finite state machines are used frequently for modelling a communication
protocol since they make it possible to detect major protocol design errors, including boundedness,
deadlocks, and unspecified receptions.
The advantage of communicating finite state machines is that they make it possible to decide many
properties in communication protocols, beyond the level of just detecting such properties. This
advantage rules out the need for human assistance or restriction in generality.
It has been proved with the introduction of the concept itself that when two finite state machines
communicate with only one type of messages, boundedness, deadlocks, and unspecified reception
state can be decided and identified while such is not the case when the machines communicate with
two or more types of messages. Later, it was further proved that when only one finite state
machine communicates with a single type of message while the communication of its partner is
unconstrained, boundedness, deadlocks, and unspecified reception
state can still be decided and identified.
It has been further proved that when the message priority relation is empty, boundedness, deadlocks
and unspecified reception state can be decided even under the condition in which there are two or
more types of messages in the communication between finite state machines.
Boundedness, deadlocks, and unspecified reception state are all decidable in polynomial time
(which means that these problems can be solved in a tractable amount of time),
since the decision problems regarding them are nondeterministic logspace complete.
Communicating finite state machines can be the most powerful in situations where the propagation
delay is not negligible (so that several messages can be in transit at one time) and in situations
where it is natural to describe the protocol parties and the communication medium as separate
entities.
Figure 4: CFSM for TCP/IP
OOPS COMPONENT:
Shortest path algorithms find wide applications. Given below is a practical implementation of
Dijkstra's algorithm in C++. Dijkstra's algorithm helps in finding the single-source shortest
path in OSPF.
#include <iostream>
using namespace std;
#define INFINITY 99

int dist[10], path[10], reach[10];
int adj[10][10], n;

// Pick the unreached node with the smallest tentative distance.
int choose()
{
    int min = INFINITY, j = -1;
    for (int w = 1; w <= n; w++)
        if (!reach[w] && dist[w] < min) { min = dist[w]; j = w; }
    return j;
}

// Single-source shortest paths from node s (Dijkstra's algorithm).
void sp(int s)
{
    for (int i = 1; i <= n; i++) { reach[i] = 0; dist[i] = adj[s][i]; path[i] = s; }
    reach[s] = 1; dist[s] = 0;
    for (int k = 1; k < n; k++) {
        int u = choose();                    // nearest unreached node
        if (u == -1) break;
        reach[u] = 1;
        for (int w = 1; w <= n; w++)         // relax edges out of u
            if (!reach[w] && dist[u] + adj[u][w] < dist[w]) {
                dist[w] = dist[u] + adj[u][w];
                path[w] = u;
            }
    }
}

int main()
{
    int i, j, s;
    cout << "Enter the number of nodes: ";
    cin >> n;
    cout << "Enter the cost matrix (0 = no edge):\n";
    for (i = 1; i <= n; i++)
        for (j = 1; j <= n; j++) {
            cin >> adj[i][j];
            if (i != j && adj[i][j] == 0) adj[i][j] = INFINITY;
        }
    cout << "Enter the source node: ";
    cin >> s;
    sp(s);
    for (i = 1; i <= n; i++)
        cout << "Node " << i << ": cost " << dist[i] << ", previous hop " << path[i] << "\n";
    return 0;
}
A* uses a best-first search and finds a least-cost path from a given initial node to one goal node (out
of one or more possible goals). As A* traverses the graph, it follows a path of the lowest expected
total cost or distance, keeping a sorted priority queue of alternate path segments along the way.
It uses a knowledge-plus-heuristic cost function of node x (usually denoted f(x)) to determine the
order in which the search visits nodes in the tree. The cost function is a sum of two functions:
the past path-cost function, which is the known distance from the starting node to the current
node x (usually denoted g(x))
a future path-cost function, which is an admissible "heuristic estimate" of the distance from x to
the goal (usually denoted h(x)).
The h(x) part of the f(x) function must be an admissible heuristic; that is, it must not overestimate
the distance to the goal. Thus, for an application like routing, h(x) might represent the straight-line
distance to the goal, since that is physically the smallest possible distance between any two points
or nodes. A practical implementation of A* search in C++ is shown:
#include <iostream>
#include <queue>
#include <string>
#include <cmath>
#include <cstdlib>
#include <ctime>
using namespace std;
const int n=60; // horizontal size of the map
const int m=60; // vertical size of the map
static int map[n][m];
static int closed_nodes_map[n][m]; // map of closed (tried-out) nodes
static int open_nodes_map[n][m]; // map of open (not-yet-tried) nodes
static int dir_map[n][m]; // map of directions
const int dir=8; // number of possible directions to go at any position
// if dir==4
//static int dx[dir]={1, 0, -1, 0};
//static int dy[dir]={0, 1, 0, -1};
// if dir==8
static int dx[dir]={1, 1, 0, -1, -1, -1, 0, 1};
static int dy[dir]={0, 1, 1, 1, 0, -1, -1, -1};
class node
{
// current position
int xPos;
int yPos;
// total distance already travelled to reach the node
int level;
// priority=level+remaining distance estimate
int priority; // smaller: higher priority
public:
node(int xp, int yp, int d, int p)
{xPos=xp; yPos=yp; level=d; priority=p;}
int getxPos() const {return xPos;}
int getyPos() const {return yPos;}
int getLevel() const {return level;}
int getPriority() const {return priority;}
void updatePriority(const int & xDest, const int & yDest)
{
priority=level+estimate(xDest, yDest)*10; //A*
}
// give better priority to going straight instead of diagonally
void nextLevel(const int & i) // i: direction
{
level+=(dir==8?(i%2==0?10:14):10);
}
// Estimation function for the remaining distance to the goal.
const int & estimate(const int & xDest, const int & yDest) const
{
static int xd, yd, d;
xd=xDest-xPos;
yd=yDest-yPos;
// Euclidian Distance
d=static_cast<int>(sqrt(xd*xd+yd*yd));
// Manhattan distance
//d=abs(xd)+abs(yd);
// Chebyshev distance
//d=max(abs(xd), abs(yd));
return(d);
}
};
// Determine priority (in the priority queue)
bool operator<(const node & a, const node & b)
{
return a.getPriority() > b.getPriority();
}
// A-star algorithm.
// The route returned is a string of direction digits.
string pathFind( const int & xStart, const int & yStart,
const int & xFinish, const int & yFinish )
{
static priority_queue<node> pq[2]; // list of open (not-yet-tried) nodes
static int pqi; // pq index
static node* n0;
static node* m0;
static int i, j, x, y, xdx, ydy;
static char c;
pqi=0;
// reset the node maps
for(y=0;y<m;y++)
for(x=0;x<n;x++)
{
closed_nodes_map[x][y]=0;
open_nodes_map[x][y]=0;
}
// create the start node and push into the list of open nodes
n0=new node(xStart, yStart, 0, 0);
n0->updatePriority(xFinish, yFinish);
pq[pqi].push(*n0);
open_nodes_map[xStart][yStart]=n0->getPriority(); // mark it on the open nodes map
// A* search
while(!pq[pqi].empty())
{
// get the current node w/ the highest priority
// from the list of open nodes
n0=new node( pq[pqi].top().getxPos(), pq[pqi].top().getyPos(),
pq[pqi].top().getLevel(), pq[pqi].top().getPriority());
x=n0->getxPos(); y=n0->getyPos();
pq[pqi].pop(); // remove the node from the open list
open_nodes_map[x][y]=0;
// mark it on the closed nodes map
closed_nodes_map[x][y]=1;
// quit searching when the goal state is reached
//if((*n0).estimate(xFinish, yFinish) == 0)
if(x==xFinish && y==yFinish)
{
// generate the path from finish to start
// by following the directions
string path="";
while(!(x==xStart && y==yStart))
{
j=dir_map[x][y];
c='0'+(j+dir/2)%dir;
path=c+path;
x+=dx[j];
y+=dy[j];
}
// garbage collection
delete n0;
// empty the leftover nodes
while(!pq[pqi].empty()) pq[pqi].pop();
return path;
}
// generate moves (child nodes) in all possible directions
for(i=0;i<dir;i++)
{
xdx=x+dx[i]; ydy=y+dy[i];
if(!(xdx<0 || xdx>n-1 || ydy<0 || ydy>m-1 || map[xdx][ydy]==1
|| closed_nodes_map[xdx][ydy]==1))
{
// generate a child node
m0=new node( xdx, ydy, n0->getLevel(),
n0->getPriority());
m0->nextLevel(i);
m0->updatePriority(xFinish, yFinish);
// if it is not in the open list then add into that
if(open_nodes_map[xdx][ydy]==0)
{
open_nodes_map[xdx][ydy]=m0->getPriority();
pq[pqi].push(*m0);
// mark its parent node direction
dir_map[xdx][ydy]=(i+dir/2)%dir;
}
else if(open_nodes_map[xdx][ydy]>m0->getPriority())
{
// update the priority info
open_nodes_map[xdx][ydy]=m0->getPriority();
// update the parent direction info
dir_map[xdx][ydy]=(i+dir/2)%dir;
// replace the node
// by emptying one pq to the other one
// except the node to be replaced will be ignored
// and the new node will be pushed in instead
while(!(pq[pqi].top().getxPos()==xdx &&
pq[pqi].top().getyPos()==ydy))
{
pq[1-pqi].push(pq[pqi].top());
pq[pqi].pop();
}
pq[pqi].pop(); // remove the wanted node
// empty the larger size pq to the smaller one
if(pq[pqi].size()>pq[1-pqi].size()) pqi=1-pqi;
while(!pq[pqi].empty())
{
pq[1-pqi].push(pq[pqi].top());
pq[pqi].pop();
}
pqi=1-pqi;
pq[pqi].push(*m0); // add the better node instead
}
else delete m0; // garbage collection
}
}
delete n0; // garbage collection
}
return ""; // no route found
}
int main()
{
srand(time(NULL));
// create an empty map (0 = free cell)
for(int y=0;y<m;y++)
for(int x=0;x<n;x++) map[x][y]=0;
// find a route from the top-left to the bottom-right corner
string route=pathFind(0, 0, n-1, m-1);
if(route=="") cout<<"An empty route generated!"<<endl;
cout<<"Route: "<<route<<endl;
return 0;
}
EB COMPONENT:
Most data centres, by design, consume vast amounts of energy in an incongruously wasteful
manner, interviews and documents show. Online companies typically run their facilities at
maximum capacity around the clock, whatever the demand. As a result, data centres can
waste 90 percent or more of the electricity they pull off the grid, The Times found. To guard
against a power failure, they further rely on banks of generators that emit diesel exhaust. The
pollution from data centres has increasingly been cited by the authorities for violating clean
air regulations, documents show.
In Silicon Valley, many data centres appear on the state government's Toxic Air Contaminant
Inventory, a roster of the area's top stationary diesel polluters. Worldwide, the digital
warehouses use about 30 billion watts of electricity, roughly equivalent to the output of 30
nuclear power plants, according to estimates industry experts compiled for The Times. Data
centres in the United States account for one-quarter to one-third of that load, the estimates
show. Energy efficiency varies widely from company to company. But at the request of The
Times, the consulting firm McKinsey & Company analysed energy use by data centres and
found that, on average, they were using only 6 percent to 12 percent of the electricity
powering their servers to perform computations. The rest was essentially used to keep servers
idling and ready in case of a surge in activity that could slow or crash their operations.
A server is a sort of bulked-up desktop computer, minus a screen and keyboard, that contains
chips to process data. The study sampled about 20,000 servers in about 70 large data centres
spanning the commercial gamut: drug companies, military contractors, banks, media
companies and government agencies.
The inefficient use of power is largely driven by a symbiotic relationship between users who
demand an instantaneous response to the click of a mouse and companies that put their
business at risk if they fail to meet that expectation.
Even running electricity at full throttle has not been enough to satisfy the industry. In addition
to generators, most large data centres contain banks of huge, spinning flywheels or thousands
of lead-acid batteries, many of them similar to automobile batteries, to power the
computers in case of a grid failure as brief as a few hundredths of a second, an interruption
that could crash the servers.
Rapid digitization of content has led to extreme demands on storage systems. The nature of
data access such as simulation data dumps, check-pointing, real-time data access
queries, data warehousing queries, etc., warrants an online data management solution. Most
online data management solutions make use of hierarchical storage management techniques
to accommodate the large volume of digital data. In such solutions, a major portion of the
data set is usually hosted by tape-based archival solutions, which offer cheaper storage at the
cost of higher access latencies. This loss in performance due to tape-based archive solutions
limits the performance of the higher level applications that make these different types of data
accesses. This is particularly true since many queries may require access to older, archived
data. The decreasing cost and increasing capacity of commodity disks are rapidly changing
the economics of online storage and making the use of these large disk arrays more practical
for applications of low latency. Large disk arrays also enable system scaling, an important
property as the growth of online content is predicted to be enormous. The enhanced
performance offered by disk-based solutions comes at a price, however. Keeping huge arrays
of spinning disks has a hidden cost, i.e., energy. Industry surveys suggest that the cost of
powering our country's data centers is growing at a rate of 25% every year. Among various
components of a data center, storage is one of the biggest energy consumers, consuming
almost 27% of the total.
Given the well-known growth in total cost of ownership, a solution that can mitigate the high
cost of power, yet keep data online, is needed. Various studies of data access patterns in data
centers suggest that on any given day, the total amount of data accessed is less than 5% of
the total stored. Most energy conservation techniques make use of various optimizations to
conserve energy, but this usually comes with a huge performance penalty. Massive array of
idle disks (MAID) is a recently adopted design philosophy. The central idea behind MAID
is that not all disks in a MAID storage array are spinning all the time. Within a MAID
subsystem, disks remain dormant (i.e., powered off) until the data they hold is requested.
When a request arrives for data on a disk that is off, the controller turns on the disk, which
takes around 7-10 s, and services the request. Additionally, a set of disks is designated as
cache disks, which are always spinning (i.e., never turned off).
This disk-based caching is necessary because the regular memory cache is usually not large enough to hold all of the frequently accessed data. The MAID concept works on the assumption that less than 5% of the stored data is actually accessed on any given day. With this in mind, the MAID controller tries to ensure that frequently accessed data are moved to the always-on cache disks. For this reason, the response time of the system is very tightly tied to the size of the cache-disk set. By increasing the cache hit ratio, the controller tries to minimize the response time and also conserve energy, and the savings grow as storage environments get larger. A commercial product based on this idea, Copan MAID, has seen a great deal of success in the realm of archival systems. One of the main drawbacks of the MAID approach is that, although it keeps the most frequently accessed data in the cache-disk set, it cannot ensure good response times for non-cached data. Non-cached data may include data being accessed for the first time, or data that cannot be cached because of their sheer volume or access pattern. A study of using application hints to increase the efficiency of prefetching and to achieve better energy efficiency has also been presented. Application hinting has drawn a great deal of interest in the high-performance computing community; the idea is to use hints from applications to prefetch data ahead of time, thereby reducing file-system I/O latencies. Other approaches to increasing the energy efficiency of storage systems are possible. A new energy conservation technique for disk-array-based network servers, called popular data concentration (PDC), has been proposed. Under this scheme, frequently accessed data are migrated to a subset of the disks, on the assumption that data popularity is heavily clustered. PDC tries to lay data out across the disk array so that the first disk stores the most popular data, the second disk stores the next most popular data, and so on. Since data blocks are continually moved to different locations in the disk array, this mapping mechanism becomes very important.
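The PDC layout step can be sketched in a few lines of Python. The function name and the block/disk model are illustrative assumptions, not part of the published scheme: blocks are ranked by observed access count and packed onto disks in popularity order, so later disks can stay spun down longer.

```python
# Hypothetical sketch of PDC-style data placement: rank blocks by access
# count and fill disk 0 with the hottest blocks, disk 1 with the next
# hottest, and so on.

def pdc_layout(access_counts, blocks_per_disk):
    """Map block IDs to disk indices by descending popularity.

    access_counts: {block_id: number_of_accesses} (illustrative model).
    """
    ranked = sorted(access_counts, key=access_counts.get, reverse=True)
    layout = {}
    for rank, block in enumerate(ranked):
        layout[block] = rank // blocks_per_disk  # fill disk 0 first, then 1, ...
    return layout

counts = {"a": 90, "b": 5, "c": 40, "d": 1}
print(pdc_layout(counts, blocks_per_disk=2))
# the hot blocks "a" and "c" land on disk 0; "b" and "d" on disk 1
```

Because popularity shifts over time, a real PDC system must re-run this placement periodically, which is why the block-to-disk mapping mechanism matters so much.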
A new solution called Hibernator has also been presented. The main idea here is the dynamic switching of disk speeds based on observed performance; this approach uses multi-speed disk drives that can run at different speeds but must be shut down to make a transition between speeds. The general consensus of these works is that conventional cache-management algorithms are not necessarily the best option when it comes to power conservation; specifically, they explore the combined use of spatial and temporal locality information to develop cache-replacement algorithms. A new type of hard disk drive that can operate at multiple speeds has also been explored for energy saving. It was demonstrated that using dynamic revolutions-per-minute (DRPM) speed control for power management in server disk arrays can provide large savings in power consumption with very little degradation in delivered performance.
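The DRPM policy can be illustrated with a toy speed-selection rule. The speed steps, the utilisation target, and the one-request-per-revolution service model below are all simplifying assumptions for illustration, not the actual DRPM algorithm: the controller picks the slowest RPM that still keeps utilisation under a target, trading a little latency for power savings.

```python
# Illustrative sketch of DRPM-style speed selection (assumed speed steps):
# choose the lowest RPM whose service rate keeps utilisation below a target.

SPEEDS_RPM = [5400, 7200, 10000, 15000]  # assumed available speed steps

def choose_speed(request_rate, max_util=0.8):
    """Return the slowest RPM keeping utilisation under max_util.

    request_rate: requests/s. Service rate is modelled as one request
    per revolution -- a deliberate simplification for illustration.
    """
    for rpm in SPEEDS_RPM:
        service_rate = rpm / 60.0  # revolutions (and requests) per second
        if request_rate / service_rate <= max_util:
            return rpm
    return SPEEDS_RPM[-1]  # saturated: run at full speed

print(choose_speed(50))   # light load -> a low RPM suffices
print(choose_speed(200))  # heavy load -> the drive must spin fast
```

Under this toy model a lightly loaded disk runs at the lowest speed and only ramps up as the request rate approaches its service capacity, which is the intuition behind the reported power savings with little performance loss.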
CONCLUSION
In this report, the design of iFCP in SANs has been discussed. In the implementation part, the design of OSPF, the design of protocols using CFSM, and a practical implementation of the shortest-path algorithms have been covered. Some of the basic fundamentals of Fibre Channel technology have been provided in this report, which allow us to understand the origins of many Storage Area Networking mechanisms. We can understand, at a basic level, how Fibre Channel devices discover one another during initialization, which enables them to establish the lower layers necessary for the transport of subsequent Fibre Channel frames.
FUTURE ENHANCEMENTS
Fibre Channel is a mature networking technology that is ideally suited for SANs. FC is a gigabit technology supporting speeds up to 1 gigabit per second, with faster rates being realized in the future. FC supports different transport media, such as copper for lower-cost, lower-capability configurations, or fiber optics for greater speed and distance at a higher cost. FC products support a SAN's need for reliability by incorporating self-configuring capabilities that allow network reconfiguration, faulty-equipment isolation, and maintenance of the network, all with minimal to no impact on SAN operations.
Clearly, the largest hindrance introduced with regard to the transmission of Fibre Channel over TCP/IP is the interface to the Fibre Channel hardware itself. Research in fibre-optic technology seeks to reduce this problem.