iFCP in Storage Area Networks
R.V. College of Engineering
R.V. COLLEGE OF ENGINEERING, BANGALORE-560059
(Autonomous Institution Affiliated to VTU, Belgaum)
SELF STUDY REPORT ON
INTERNET FIBRE CHANNEL PROTOCOL
IN STORAGE AREA NETWORKS
Submitted by
SANJAY VINAYAK H K
1RV13CS139
Under the guidance of
Ms. Ganashree K.C, Assistant Professor, CSE
Mrs. Prapulla S.B, Assistant Professor, CSE
Ms. Vishalakshi Prabhu H, Assistant Professor, CSE
Dr. Neeta Shivakumar, Associate Professor, BT
Submitted to
COMPUTER SCIENCE AND ENGINEERING DEPARTMENT
R.V. College of Engineering, Bangalore-59
R.V. COLLEGE OF ENGINEERING, BANGALORE - 560059 (Autonomous Institution Affiliated to VTU, Belgaum)
DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING
CERTIFICATE
Certified that the Self Study work titled INTERNET FIBRE CHANNEL
PROTOCOL IN STORAGE AREA NETWORKS is carried out by SANJAY
VINAYAK H K (1RV13CS139), who is a bonafide student of R.V. College of
Engineering, Bangalore, in partial fulfillment for the award of the degree of Bachelor of
Engineering in Computer Science and Engineering of the Visvesvaraya
Technological University, Belgaum during the year 2014-2015. It is certified that all
corrections/suggestions indicated for the internal Assessment have been incorporated
in the report deposited in the departmental library. The Self Study report has been
approved as it satisfies the academic requirements in respect of Self Study work
prescribed by the institution for the said degree.
Ms. Ganashree K.C Mrs. Prapulla S.B
Assistant Professor, CSE Assistant Professor, CSE
Ms. Vishalakshi Prabhu H Dr. Neeta Shivakumar
Assistant Professor, CSE Associate Professor, BT
Dr. Shobha G
Head of Department,
Department of CSE,
R.V. College of Engineering, Bangalore-560059
TABLE OF CONTENTS
PROBLEM DEFINITION
DESIGN
IMPLEMENTATION
CONCLUSION
FUTURE ENHANCEMENT
LIST OF FIGURES AND TABLES
Figure 1: Traditional DAS architecture
Figure 2: Fibre Channel topologies
Figure 3: Fibre Channel protocol layer
Figure 4: CFSM for TCP/IP
LIST OF SYMBOLS, ACRONYMS / ABBREVIATION AND
NOMENCLATURE
SAN : Storage Area Network
NAS: Network Attached Storage
DAS: Direct Attached Storage
iSCSI: Internet Small Computer System Interface
iFCP: Internet Fibre Channel Protocol
FCIP: Fibre Channel over Internet Protocol
TCP/IP: Transmission Control Protocol/Internet Protocol
LAN: Local Area Network
WAN: Wide Area Network
IETF: Internet Engineering Task Force
PROBLEM DEFINITION
Today's applications are rapidly overwhelming the capacity of networks and of storage
space. In e-commerce, huge databases support electronic cataloging and ordering while large
numbers of customers attempt to simultaneously access the information. As corporations
grow and enter the international business environment, enterprise systems maintain corporate
information across not only states but countries. To maintain and make available to all users
that large amount of information reliably and in a timely manner is challenging to say the
least. More and more feature films are incorporating digital effects. Video editing software,
Computer Aided Drafting and photo-realistic rendering software are utilized to either modify
a film or even create one from scratch. Even a few seconds' worth of a film requires hundreds
of megabytes of storage space. When teams of 20 animators and digital artists are each working
on their own piece of a film, the burden on the storage and network facilities is
tremendous. Web sites that serve streaming audio and/or video consume more
resources as demand for these services goes up. In addition to simply supporting bandwidth
and storage increases, corporations now want to be able to safeguard their data. This typically
entails making backups of data (to tape) and saving data off the corporate premises. This is
an extremely small sample of the applications that are challenging the storage and networking
architectures.
Traditionally, these applications have been supported by file servers with either large internal
disks or disk farms directly attached to the server. The disks are typically connected to the
server via SCSI (Small Computer System Interface). The SCSI standard defines a high
throughput parallel interface that is used to connect up to 7 peripherals (including the host
adapter card itself) to the computer. Examples of these peripherals are scanners, CD
(Compact Disk) players/recorders, digitizers, tape drives and as previously stated hard disks.
This architecture has several limitations. The server can only access data on devices directly
attached to it. If a server or any part of its SCSI hardware fails, access to its data is cut off.
Also, SCSI supports a finite number of devices; therefore, the amount of data a server can
access is limited. If more storage space is needed, but there is no more room on the SCSI bus,
expansion is no longer possible. SCSI, due to its parallel structure, has distance limitations
as well. This requires that the storage be near the servers. These limitations are the driving
force behind a new paradigm for data storage and access.
Distance is one of the major drawbacks for those who rely solely on Fibre Channel. To
promote disaster recovery, particularly in areas experiencing frequent earthquakes, Fibre
Channel was originally designed to allow storage to take place over distances of up to
approximately 10 km from hosts. Even with various methods of signal enhancement,
which might extend that distance by several hundred kilometers, storing data over
distances of hundreds or thousands of miles is out of the question for devices connected
merely by Fibre Channel cables. More recently, other disaster recovery concerns have
resulted in the promotion of storage over longer distances, sometimes crossing international
boundaries, across both enterprise and small to medium sized business environments.
Additionally, Fibre Channel media and equipment can often be expensive and
cumbersome, both to install and manage. Although backbone technologies do exist which
can carry Fibre Channel data over longer distances, their installation, cost, and maintenance
would present many difficulties that could be handled easily by a technology such as TCP/IP,
which is already in place across the world. Thus, the relationship between SAN technologies
which run over TCP/IP and other SAN technologies can be considered similar to that between
Wide Area Networking and the LAN. Additionally, some SANs which run over TCP/IP can
sometimes replace other SAN technologies altogether. These factors, combined with the fact
that the Internet has become so widespread and convenient, are among the primary
motivations for this report, and for the rising popularity of storage over TCP/IP.
SANs also tend to enable more effective disaster recovery processes. A SAN can extend to a
distant location containing a secondary storage array. This enables storage replication either
implemented by disk array controllers, by server software, or by specialized SAN devices.
Figure 1: Traditional DAS architecture
DESIGN
The generic Fibre Channel network is composed of one or more bi-directional point-to-
point channels. The links support 1Gbps (or 100MBps) data rates in each direction. The transport
media may be fiber optic cable, copper twisted pair or coax cable. The links in the FC network are
between communication ports known as N_ports. N_port stands for Node Port where a node is a
device on the FC network. The links may be point-to-point between N_ports, or they may be set up
as a Fabric. A Fabric consists of several N_Ports connected to a switch. Note: Ports on the switch
are called F_ports. Finally, the ports may be daisy chained to form a ring. This is called an
Arbitrated Loop (FC-AL). In this configuration the ports are referred to as L_ports. No switch is
necessary for FC-AL. These basic layouts may be combined in different ways to create more
complex topologies.
FC is typically realized in one of 3 topologies: Point-To-Point, Loop or Fabric. The Point-To-Point
connection is the simplest type of connection. It can exist by itself or as a subset in a Fabric or Loop
topology.
Figure 2: Different Fibre Channel topologies
Fibre Channel uses a multi-layer protocol architecture along the lines of the seven-layer OSI model.
There are 5 layers: FC-0: Physical layer, FC-1: Encode/Decode layer, FC-2: Framing
Protocol/Flow Control, FC-3: Common Services and FC-4: Upper Level Protocol Support.
Additionally, there is another layer which, although not typically considered part of the basic
architecture, is important enough to warrant mention. This is the FC-AL (Arbitrated Loop) layer.
iFCP uses OSPF to implement addressing and routing.
Figure 3: Fibre Channel protocol layer
With iFCP, N_Port addressing can be locally assigned by each gateway for a Gateway Region
local mode operation. Alternatively, in address-transparent mode, N_Ports can be globally
assigned across an interconnected set of gateways.
The routing between Gateway Regions operates with IP only. Routing that takes place within a
Gateway Region (if there is any routing within) is opaque to the IP network. For example, Fibre
Channel routing and FSPF traffic that may be operating within a Gateway Region does not flow
between Gateway Regions.
Address Transparent Mode
In address transparent mode, the scope of N_Port addresses is fabric wide. The IP network fabric
is defined as a name server object containing a collection of gateways. The iSNS name server
acts as a fabric Domain Address Manager, and maintains a pool of Domain IDs for the fabric,
assigning FC domain IDs to each gateway within the fabric. Within each Gateway Region, the
gateway acts as the downstream principal switch. The advantage of address transparent mode is the
transparency across the fabric and the resulting simplification of gateway operation. The
disadvantage is that each Gateway Region consumes 65K Node IDs, which is inefficient when
the Gateway Region's N_Port count is low. Also, Address Transparent Mode is less scalable, as
communication among N_Ports is restricted to N_Ports within the fabric.
Gateway Region Local Mode
In Gateway Region Local mode, the scope of the N_Port addresses is local to the Gateway
Region. Each gateway maps N_Port network addresses of external devices to N_Port fabric
addresses. Normal inter-gateway frame traffic is mapped on the fly.
The advantage of Local Mode is scalability. N_Port connectivity is network-wide, allowing
unrestricted addresses within a Gateway Region. Since each gateway is individually responsible
for N_Port addresses allocated to its Gateway Region, the fabric becomes more stable as the
network scales in size. This is because there is no dependence on a central addressing authority, as
is the case with Fibre Channel and iFCP Transparent Mode fabrics.
Mapping of Fibre Channel to iFCP
Fibre Channel frames ingressing the iFCP gateway are converted to iFCP frames through the
process shown in Figure 7. The FC frames may be addressed to remote devices, or to other FC
devices attached to the same iFCP gateway. If the latter is the case, no address translation
mechanism is needed, and the frame is directly delivered to the local N_Port. If the former is the
case, then an address mapping function must occur that maps a key found in the D_ID to the TCP
connection addressed to the appropriate remote N_Port network address (N_Port ID and IP address).
IMPLEMENTATION
DAA Component:
Implementation of Dijkstra's algorithm in OSPF:
OSPF uses a shortest path first algorithm in order to build and calculate the shortest path to all
known destinations. The shortest path is calculated with the use of Dijkstra's algorithm. The
algorithm by itself is quite complicated. This is a very high-level, simplified way of looking at the
various steps of the algorithm:
1. Upon initialization or due to any change in routing information, a router generates a link-
state advertisement. This advertisement represents the collection of all link-states on that
router.
2. All routers exchange link-states by means of flooding. Each router that receives a link-state
update should store a copy in its link-state database and then propagate the update to other
routers.
3. After the database of each router is completed, the router calculates a Shortest Path Tree to
all destinations. The router uses the Dijkstra algorithm in order to calculate the shortest path
tree. The destinations, the associated cost and the next hop to reach those destinations form
the IP routing table.
4. In case no changes in the OSPF network occur, such as cost of a link or a network being
added or deleted, OSPF should be very quiet. Any changes that occur are communicated
through link-state packets, and the Dijkstra algorithm is recalculated in order to find the
shortest path.
The Dijkstra algorithm places each router at the root of a tree and calculates the shortest path to
each destination based on the cumulative cost required to reach that destination. Each router will
have its own view of the topology even though all the routers will build a shortest path tree using
the same link-state database. The following sections indicate what is involved in building a shortest
path tree.
OSPF Cost
The cost (also called metric) of an interface in OSPF is an indication of the overhead required to
send packets across a certain interface. The cost of an interface is inversely proportional to the
bandwidth of that interface. A higher bandwidth indicates a lower cost. There is more overhead
(higher cost) and time delays involved in crossing a 56k serial line than crossing a 10M ethernet
line. The formula used to calculate the cost is:
cost = 100,000,000 / bandwidth in bps
For example, it costs 10^8 / 10^7 = 10 to cross a 10M Ethernet line and 10^8 / 1,544,000 = 64
to cross a T1 line.
By default, the cost of an interface is calculated based on the bandwidth; you can force the cost of
an interface with the ip ospf cost interface subconfiguration mode command.
Shortest Path Tree
Assume we have the following network diagram with the indicated interface costs. In order to build
the shortest path tree for RTA, we would have to make RTA the root of the tree and calculate the
smallest cost for each destination.
The above is the view of the network as seen from RTA. Note the direction of the arrows in
calculating the cost. For example, the cost of RTB's interface to network 128.213.0.0 is not relevant
when calculating the cost to 192.213.11.0. RTA can reach 192.213.11.0 via RTB with a cost of 15
(10+5). RTA can also reach 222.211.10.0 via RTC with a cost of 20 (10+10) or via RTB with a
cost of 20 (10+5+5). In case equal cost paths exist to the same destination, Cisco's implementation
of OSPF will keep track of up to six next hops to the same destination.
After the router builds the shortest path tree, it will start building the routing table accordingly.
Directly connected networks will be reached via a metric (cost) of 0 and other networks will be
reached according to the cost calculated in the tree.
TOC COMPONENT:
In computer science, a communicating finite-state machine is a finite state
machine labelled with "receive" and "send" operations over some alphabet of channels. They were
introduced by Brand and Zafiropulo, and can be used as a model of concurrent processes like Petri
nets. Communicating finite state machines are used frequently for modelling a communication
protocol since they make it possible to detect major protocol design errors, including boundedness,
deadlocks, and unspecified receptions.
The advantage of communicating finite state machines is that they make it possible to decide many
properties in communication protocols, beyond the level of just detecting such properties. This
advantage rules out the need for human assistance or restriction in generality.
It has been proved with the introduction of the concept itself that when two finite state machines
communicate with only one type of messages, boundedness, deadlocks, and unspecified reception
state can be decided and identified while such is not the case when the machines communicate with
two or more types of messages. Later, it was further proved that when only one finite state
machine communicates with a single type of message while the communication of its partner is
unconstrained, boundedness, deadlocks, and unspecified reception
state can still be decided and identified.
It has been further proved that when the message priority relation is empty, boundedness, deadlocks
and unspecified reception state can be decided even under the condition in which there are two or
more types of messages in the communication between finite state machines.
Boundedness, deadlocks, and unspecified reception state are all decidable in polynomial time
(which means that these problems can be solved in a tractable amount of time),
since the decision problems regarding them are nondeterministic logspace complete.
Communicating finite state machines can be the most powerful in situations where the propagation
delay is not negligible (so that several messages can be in transit at one time) and in situations
where it is natural to describe the protocol parties and the communication medium as separate
entities.
Figure 4: CFSM for TCP/IP
OOPS COMPONENT:
Shortest path algorithms find wide applications. Given below is a practical implementation of
Dijkstra's algorithm in C++. Dijkstra's algorithm helps in finding the single-source shortest
path in OSPF.
#include <iostream>
using namespace std;
#define INFINITY 99

int dist[10], path[10], reach[10];
int adj[10][10], n;

// Pick the unreached node with the smallest tentative distance.
int choose()
{
    int min = INFINITY, j = -1;
    for (int w = 1; w <= n; w++)
        if (!reach[w] && dist[w] < min) { min = dist[w]; j = w; }
    return j;
}

// Single-source shortest paths from node s (Dijkstra's algorithm).
void sp(int s)
{
    for (int i = 1; i <= n; i++) { reach[i] = 0; dist[i] = adj[s][i]; path[i] = s; }
    reach[s] = 1; dist[s] = 0;
    for (int k = 1; k < n; k++) {
        int u = choose();                    // nearest unreached node
        if (u == -1) break;
        reach[u] = 1;
        for (int w = 1; w <= n; w++)         // relax edges out of u
            if (!reach[w] && dist[u] + adj[u][w] < dist[w]) {
                dist[w] = dist[u] + adj[u][w];
                path[w] = u;
            }
    }
}

int main()
{
    int i, j, s;
    cout << "Enter the number of nodes: ";
    cin >> n;
    cout << "Enter the cost matrix (0 = no edge):\n";
    for (i = 1; i <= n; i++)
        for (j = 1; j <= n; j++) {
            cin >> adj[i][j];
            if (i != j && adj[i][j] == 0) adj[i][j] = INFINITY;
        }
    cout << "Enter the source node: ";
    cin >> s;
    sp(s);
    for (i = 1; i <= n; i++)
        cout << "Node " << i << ": cost " << dist[i] << ", previous hop " << path[i] << "\n";
    return 0;
}
A* uses a best-first search and finds a least-cost path from a given initial node to one goal node (out
of one or more possible goals). As A* traverses the graph, it follows a path of the lowest expected
total cost or distance, keeping a sorted priority queue of alternate path segments along the way.
It uses a knowledge-plus-heuristic cost function of node x (usually denoted f(x)) to determine the
order in which the search visits nodes in the tree. The cost function is a sum of two functions:
the past path-cost function, which is the known distance from the starting node to the current
node x (usually denoted g(x))
a future path-cost function, which is an admissible "heuristic estimate" of the distance from x to
the goal (usually denoted h(x)).
The h(x) part of the f(x) function must be an admissible heuristic; that is, it must not overestimate
the distance to the goal. Thus, for an application like routing, h(x) might represent the straight-line
distance to the goal, since that is physically the smallest possible distance between any two points
or nodes. A practical implementation of A* search in C++ is shown:
#include <iostream>
#include <queue>
#include <string>
#include <cmath>
#include <cstdlib>
#include <ctime>
using namespace std;
const int n=60; // horizontal size of the map
const int m=60; // vertical size of the map
static int map[n][m];
static int closed_nodes_map[n][m]; // map of closed (tried-out) nodes
static int open_nodes_map[n][m]; // map of open (not-yet-tried) nodes
static int dir_map[n][m]; // map of directions
const int dir=8; // number of possible directions to go at any position
// if dir==4
//static int dx[dir]={1, 0, -1, 0};
//static int dy[dir]={0, 1, 0, -1};
// if dir==8
static int dx[dir]={1, 1, 0, -1, -1, -1, 0, 1};
static int dy[dir]={0, 1, 1, 1, 0, -1, -1, -1};
class node
{
// current position
int xPos;
int yPos;
// total distance already travelled to reach the node
int level;
// priority=level+remaining distance estimate
int priority; // smaller: higher priority
public:
node(int xp, int yp, int d, int p)
{xPos=xp; yPos=yp; level=d; priority=p;}
int getxPos() const {return xPos;}
int getyPos() const {return yPos;}
int getLevel() const {return level;}
int getPriority() const {return priority;}
void updatePriority(const int & xDest, const int & yDest)
{
priority=level+estimate(xDest, yDest)*10; //A*
}
// give better priority to going straight instead of diagonally
void nextLevel(const int & i) // i: direction
{
level+=(dir==8?(i%2==0?10:14):10);
}
// Estimation function for the remaining distance to the goal.
const int & estimate(const int & xDest, const int & yDest) const
{
static int xd, yd, d;
xd=xDest-xPos;
yd=yDest-yPos;
// Euclidian Distance
d=static_cast<int>(sqrt(xd*xd+yd*yd));
// Manhattan distance
//d=abs(xd)+abs(yd);
// Chebyshev distance
//d=max(abs(xd), abs(yd));
return(d);
}
};
// Determine priority (in the priority queue)
bool operator<(const node & a, const node & b)
{
return a.getPriority() > b.getPriority();
}
// A-star algorithm.
// The route returned is a string of direction digits.
string pathFind( const int & xStart, const int & yStart,
const int & xFinish, const int & yFinish )
{
static priority_queue<node> pq[2]; // list of open (not-yet-tried) nodes
static int pqi; // pq index
static node* n0;
static node* m0;
static int i, j, x, y, xdx, ydy;
static char c;
pqi=0;
// reset the node maps
for(y=0;y<m;y++)
for(x=0;x<n;x++)
{
closed_nodes_map[x][y]=0;
open_nodes_map[x][y]=0;
}
// create the start node and push into the list of open nodes
n0=new node(xStart, yStart, 0, 0);
n0->updatePriority(xFinish, yFinish);
pq[pqi].push(*n0);
open_nodes_map[xStart][yStart]=n0->getPriority(); // mark it on the open nodes map
// A* search
while(!pq[pqi].empty())
{
// get the current node w/ the highest priority
// from the list of open nodes
n0=new node( pq[pqi].top().getxPos(), pq[pqi].top().getyPos(),
pq[pqi].top().getLevel(), pq[pqi].top().getPriority());
x=n0->getxPos(); y=n0->getyPos();
pq[pqi].pop(); // remove the node from the open list
open_nodes_map[x][y]=0;
// mark it on the closed nodes map
closed_nodes_map[x][y]=1;
// quit searching when the goal state is reached
//if((*n0).estimate(xFinish, yFinish) == 0)
if(x==xFinish && y==yFinish)
{
// generate the path from finish to start
// by following the directions
string path="";
while(!(x==xStart && y==yStart))
{
j=dir_map[x][y];
c='0'+(j+dir/2)%dir;
path=c+path;
x+=dx[j];
y+=dy[j];
}
// garbage collection
delete n0;
// empty the leftover nodes
while(!pq[pqi].empty()) pq[pqi].pop();
return path;
}
// generate moves (child nodes) in all possible directions
for(i=0;i<dir;i++)
{
xdx=x+dx[i]; ydy=y+dy[i];
if(!(xdx<0 || xdx>n-1 || ydy<0 || ydy>m-1 || map[xdx][ydy]==1
|| closed_nodes_map[xdx][ydy]==1))
{
// generate a child node
m0=new node( xdx, ydy, n0->getLevel(),
n0->getPriority());
m0->nextLevel(i);
m0->updatePriority(xFinish, yFinish);
// if it is not in the open list then add into that
if(open_nodes_map[xdx][ydy]==0)
{
open_nodes_map[xdx][ydy]=m0->getPriority();
pq[pqi].push(*m0);
// mark its parent node direction
dir_map[xdx][ydy]=(i+dir/2)%dir;
}
else if(open_nodes_map[xdx][ydy]>m0->getPriority())
{
// update the priority info
open_nodes_map[xdx][ydy]=m0->getPriority();
// update the parent direction info
dir_map[xdx][ydy]=(i+dir/2)%dir;
// replace the node
// by emptying one pq to the other one
// except the node to be replaced will be ignored
// and the new node will be pushed in instead
while(!(pq[pqi].top().getxPos()==xdx &&
pq[pqi].top().getyPos()==ydy))
{
pq[1-pqi].push(pq[pqi].top());
pq[pqi].pop();
}
pq[pqi].pop(); // remove the wanted node
// empty the larger size pq to the smaller one
if(pq[pqi].size()>pq[1-pqi].size()) pqi=1-pqi;
while(!pq[pqi].empty())
{
pq[1-pqi].push(pq[pqi].top());
pq[pqi].pop();
}
pqi=1-pqi;
pq[pqi].push(*m0); // add the better node instead
}
else delete m0; // garbage collection
}
}
delete n0; // garbage collection
}
return ""; // no route found
}
int main()
{
srand(time(NULL));
// create an empty map (0 = free cell)
for(int y=0;y<m;y++)
for(int x=0;x<n;x++) map[x][y]=0;
// find a route from the top-left to the bottom-right corner
string route=pathFind(0, 0, n-1, m-1);
if(route=="") cout<<"An empty route generated!"<<endl;
cout<<"Route: "<<route<<endl;
return 0;
}
EB COMPONENT:
Most data centres, by design, consume vast amounts of energy in an incongruously wasteful
manner, interviews and documents show. Online companies typically run their facilities at
maximum capacity around the clock, whatever the demand. As a result, data centres can
waste 90 percent or more of the electricity they pull off the grid, The Times found. To guard
against a power failure, they further rely on banks of generators that emit diesel exhaust. The
pollution from data centres has increasingly been cited by the authorities for violating clean
air regulations, documents show.
In Silicon Valley, many data centres appear on the state government's Toxic Air Contaminant
Inventory, a roster of the area's top stationary diesel polluters. Worldwide, the digital
warehouses use about 30 billion watts of electricity, roughly equivalent to the output of 30
nuclear power plants, according to estimates industry experts compiled for The Times. Data
centres in the United States account for one-quarter to one-third of that load, the estimates
show. Energy efficiency varies widely from company to company. But at the request of The
Times, the consulting firm McKinsey & Company analysed energy use by data centres and
found that, on average, they were using only 6 percent to 12 percent of the electricity
powering their servers to perform computations. The rest was essentially used to keep servers
idling and ready in case of a surge in activity that could slow or crash their operations.
A server is a sort of bulked-up desktop computer, minus a screen and keyboard, that contains
chips to process data. The study sampled about 20,000 servers in about 70 large data centres
spanning the commercial gamut: drug companies, military contractors, banks, media
companies and government agencies.
The inefficient use of power is largely driven by a symbiotic relationship between users who
demand an instantaneous response to the click of a mouse and companies that put their
business at risk if they fail to meet that expectation.
Even running electricity at full throttle has not been enough to satisfy the industry. In addition
to generators, most large data centres contain banks of huge, spinning flywheels or thousands
of lead-acid batteries, many of them similar to automobile batteries, to power the
computers in case of a grid failure as brief as a few hundredths of a second, an interruption
that could crash the servers.
Rapid digitization of content has led to extreme demands on storage systems. The nature of
data access such as simulation data dumps, check-pointing, real-time data access
queries, data warehousing queries, etc., warrants an online data management solution. Most
online data management solutions make use of hierarchical storage management techniques
to accommodate the large volume of digital data. In such solutions, a major portion of the
data set is usually hosted by tape-based archival solutions, which offer cheaper storage at the
cost of higher access latencies. This loss in performance due to tape-based archive solutions
limits the performance of the higher level applications that make these different types of data
accesses. This is particularly true since many queries may require access to older, archived
data. The decreasing cost and increasing capacity of commodity disks are rapidly changing
the economics of online storage and making the use of these large disk arrays more practical
for applications of low latency. Large disk arrays also enable system scaling, an important
property as the growth of online content is predicted to be enormous. The enhanced
performance offered by disk-based solutions comes at a price, however. Keeping huge arrays
of spinning disks has a hidden cost, i.e., energy. Industry surveys suggest that the cost of
powering our country's data centers is growing at a rate of 25% every year. Among various
components of a data center, storage is one of the biggest energy consumers, consuming
almost 27% of the total.
Given the well-known growth in total cost of ownership, a solution that can mitigate the high
cost of power, yet keep data online, is needed. Various studies of data access patterns in data
centers suggest that on any given day, the total amount of data accessed is less than 5% of
the total stored. Most energy conservation techniques make use of various optimizations to
conserve energy, but this usually comes with a huge performance penalty. Massive array of
idle disks (MAID) is a recently adopted design philosophy. The central idea behind MAID
is that not all disks in a MAID storage array are spinning all the time. Within a MAID
subsystem, disks remain dormant (i.e., powered off) until the data they hold is requested.
When a request arrives for data on a disk that is off, the controller turns on the disk, which
takes around 7-10 s, and services the request. Additionally, a set of disks is designated as
cache disks, which are always spinning (i.e., never turned off).
This disk-based caching is necessary because the regular memory cache is usually not large enough to hold all of the frequently accessed data. The MAID concept works on the assumption that less than 5% of the stored data is actually accessed on any given day. With this in mind, the MAID controller tries to ensure that frequently accessed data are moved to the always-on cache disks. For this reason, the response time of the system is very tightly tied to the size of the cache-disk set. By increasing the cache hit ratio, the controller tries to minimize the response time and also conserve energy, and the savings grow as storage environments get larger. A commercial product based on this idea, Copan MAID, has seen a great deal of success in the realm of archival systems. One of the main drawbacks of the MAID approach is that, although it keeps the most frequently accessed data in the cache-disk set, it cannot ensure good response times for non-cached data. Non-cached data may include data being accessed for the first time, or data that cannot be cached because of their sheer volume or access pattern. A study of using application hints to increase the efficiency of prefetching and to achieve better energy efficiency has also been presented. Application hinting has drawn a great deal of interest in the high-performance computing community; the idea is to use hints from applications to prefetch data ahead of time, thereby reducing file-system I/O latencies. Other approaches to increasing the energy efficiency of storage systems are possible. A new energy conservation technique for disk-array-based network servers, called popular data concentration (PDC), has been proposed. Under this scheme, frequently accessed data are migrated to a subset of the disks, on the assumption that data popularity is heavily clustered. PDC tries to lay data out across the disk array so that the first disk stores the most popular data, the second disk stores the next most popular data, and so on. Since data blocks are continually moved to different locations in the disk array, this mapping mechanism becomes very important.
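The PDC layout step can be sketched in a few lines of Python. The function name and the block/disk model are illustrative assumptions, not part of the published scheme: blocks are ranked by observed access count and packed onto disks in popularity order, so later disks can stay spun down longer.

```python
# Hypothetical sketch of PDC-style data placement: rank blocks by access
# count and fill disk 0 with the hottest blocks, disk 1 with the next
# hottest, and so on.

def pdc_layout(access_counts, blocks_per_disk):
    """Map block IDs to disk indices by descending popularity.

    access_counts: {block_id: number_of_accesses} (illustrative model).
    """
    ranked = sorted(access_counts, key=access_counts.get, reverse=True)
    layout = {}
    for rank, block in enumerate(ranked):
        layout[block] = rank // blocks_per_disk  # fill disk 0 first, then 1, ...
    return layout

counts = {"a": 90, "b": 5, "c": 40, "d": 1}
print(pdc_layout(counts, blocks_per_disk=2))
# the hot blocks "a" and "c" land on disk 0; "b" and "d" on disk 1
```

Because popularity shifts over time, a real PDC system must re-run this placement periodically, which is why the block-to-disk mapping mechanism matters so much.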
A new solution called Hibernator has also been presented. The main idea here is the dynamic switching of disk speeds based on observed performance; this approach uses multi-speed disk drives that can run at different speeds but must be shut down to make a transition between speeds. The general consensus of these works is that conventional cache-management algorithms are not necessarily the best option when it comes to power conservation; specifically, they explore the combined use of spatial and temporal locality information to develop cache-replacement algorithms. A new type of hard disk drive that can operate at multiple speeds has also been explored for energy saving. It was demonstrated that using dynamic revolutions-per-minute (DRPM) speed control for power management in server disk arrays can provide large savings in power consumption with very little degradation in delivered performance.
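The DRPM policy can be illustrated with a toy speed-selection rule. The speed steps, the utilisation target, and the one-request-per-revolution service model below are all simplifying assumptions for illustration, not the actual DRPM algorithm: the controller picks the slowest RPM that still keeps utilisation under a target, trading a little latency for power savings.

```python
# Illustrative sketch of DRPM-style speed selection (assumed speed steps):
# choose the lowest RPM whose service rate keeps utilisation below a target.

SPEEDS_RPM = [5400, 7200, 10000, 15000]  # assumed available speed steps

def choose_speed(request_rate, max_util=0.8):
    """Return the slowest RPM keeping utilisation under max_util.

    request_rate: requests/s. Service rate is modelled as one request
    per revolution -- a deliberate simplification for illustration.
    """
    for rpm in SPEEDS_RPM:
        service_rate = rpm / 60.0  # revolutions (and requests) per second
        if request_rate / service_rate <= max_util:
            return rpm
    return SPEEDS_RPM[-1]  # saturated: run at full speed

print(choose_speed(50))   # light load -> a low RPM suffices
print(choose_speed(200))  # heavy load -> the drive must spin fast
```

Under this toy model a lightly loaded disk runs at the lowest speed and only ramps up as the request rate approaches its service capacity, which is the intuition behind the reported power savings with little performance loss.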
CONCLUSION
In this report, the design of iFCP in SANs has been discussed. In the implementation part, the design of OSPF, the design of protocols using CFSM, and a practical implementation of the shortest-path algorithms have been covered. Some of the basic fundamentals of Fibre Channel technology have been provided in this report, which allow us to understand the origins of many Storage Area Networking mechanisms. We can understand, at a basic level, how Fibre Channel devices discover one another during initialization, which enables them to establish the lower layers necessary for the transport of subsequent Fibre Channel frames.
FUTURE ENHANCEMENTS
Fibre Channel is a mature networking technology that is ideally suited for SANs. FC is a gigabit technology supporting speeds up to 1 gigabit per second, with faster rates being realized in the future. FC supports different transport media, such as copper for lower-cost, lower-capability configurations, or fiber optics for greater speed and distance at a higher cost. FC products support a SAN's need for reliability by incorporating self-configuring capabilities that allow network reconfiguration, faulty-equipment isolation, and maintenance of the network, all with minimal to no impact on SAN operations.
Clearly, the largest hindrance introduced with regard to the transmission of Fibre Channel over TCP/IP is the interface to the Fibre Channel hardware itself. Research in fibre-optic technology seeks to reduce this problem.