bsdn_2


Transcript of bsdn_2

  • 7/30/2019 bsdn_2

    AIM

    How social and data networks can be integrated to enable enhanced and cost-effective performance.

    PROBLEM STATEMENT

    We use online applications to search for information and to share multimedia content with our friends and families; these activities generate enormous quantities of data traffic that burden the underlying transport network.

    INTRODUCTION

    Social networking applications have emerged as the platform of choice for

    carrying out a number of different activities online. In addition to their primary target of social

    interaction, we now also employ such applications to search for information online or to share

    multimedia content with our friends and families. For instance, according to recent statistics,

    each of us spends on average 15 minutes on YouTube every day. Similarly noteworthy is the daily rate of queries that Twitter's search engine receives: 600 million. These activities generate enormous quantities of data traffic that burden the underlying transport network. This article describes how social and data networks can be integrated to enable enhanced and cost-effective performance. Such an integrated approach to systems design will become even more important in the near future due to the ongoing convergence of computer-communication technologies, which will further increase the diversity of data-intensive social networking activities.


    EXISTING SYSTEM

    Distributed network directory systems feature a collection of peer computers

    interested in exchanging information of interest. Each peer hosts a set of information items, e.g.,

    data files that may be shared with other peers in the network upon demand. There is a registry

    server that maintains a directory of all hosts in the system. When the system is queried for

    information, it returns a subset of hosts, which are then directly contacted for the item of interest.

    The procedure is repeated until the search is successfully concluded. Since the hosts to be queried are selected at random by the registry server, they may not always have the desired information item; therefore, multiple query rounds may be needed before the item is found. In addition, the selected hosts may lie at large distances from the querying peer in the topology of the underlying transport network, which in turn makes the exchange of information between the hosts and the peer inefficient and costly.
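The query procedure above can be sketched as a toy simulation. The `NaiveTracker` class and its API are hypothetical, meant only to illustrate why random, content-oblivious host selection forces repeated query rounds.

```python
import random

class NaiveTracker:
    """Toy registry server (hypothetical API): keeps a directory of all
    hosts and answers queries with a random subset of them, oblivious
    to what each host stores and to network topology."""

    def __init__(self, hosts):
        # hosts: dict host_id -> set of item ids stored at that host
        self.hosts = hosts

    def query(self, k=3):
        # random selection of k hosts, as in the traditional approach
        return random.sample(sorted(self.hosts), k)

def search(tracker, item, max_rounds=10):
    """Contact the returned hosts directly; repeat query rounds until
    a host that stores `item` is found, or give up."""
    for rounds in range(1, max_rounds + 1):
        for h in tracker.query():
            if item in tracker.hosts[h]:
                return h, rounds
    return None, max_rounds
```

With ten hosts each storing one distinct item, a query for a specific item typically takes several rounds, since each round inspects only a random three of the ten hosts.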

    Disadvantage:

    The traditional approach keeps track of neither the information preferences affiliated with each peer computer in the system nor the relative distances between the peers in the underlying network topology; as a result, host selection is blind to both content and transport cost.

    PROPOSED SYSTEM

    There is a growing understanding on the part of network operators and peer-to-peer

    application developers that mutual benefits can be achieved by collaboration. The methods and

    systems presented in this article can lead to similar synergies between online communities,

    networked systems applications, and ISPs. In particular, they simultaneously enable enhanced

    application performance and data transport efficiency, to the benefit of both the end user and the network operator.

    Advantage:

    They simultaneously enable enhanced application performance and data transport efficiency.


    ARCHITECTURE

    As described in the Existing System section, the registry server selects the hosts to be queried at random, so they may not store the desired information item and may lie far from the querying peer in the topology of the underlying transport network, making the exchange of information inefficient and costly. These disadvantages can be overcome by keeping track of the information preferences affiliated with each peer computer in the system and of the relative distances between the peers in the underlying network topology, as shown in the figure. The former quantities can be

    obtained from the membership profiles affiliated with the peer computers in the online

    community in which their corresponding users participate. A preference vector p_i characterizes the preferences of user i over a collection of information themes, e.g., types of movies, images,


    or audio content. The latter quantities, describing the distance d_ij between two nodes i and j in a

    network, are increasingly available today through network operators such as Internet service

    providers (ISPs). By exploiting the above context, an intelligent tracking server could provide a

    suitable subset of information hosts to be queried. In particular, the server could select the hosts

    that maximize the probability of information discovery at the smallest possible data transport

    cost, relative to the querying peer.
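The intelligent tracking server's selection rule can be sketched as follows. The dot-product match score discounted by distance is an illustrative assumption standing in for the article's actual discovery-probability/transport-cost objective, and all names are hypothetical.

```python
def select_hosts(query_pref, host_prefs, distance, k=3, alpha=1.0):
    """Rank candidate hosts for a querying peer.

    query_pref : preference vector p_i of the querying peer
    host_prefs : dict host_id -> preference vector p_j
    distance   : dict host_id -> transport distance d_ij to the querying peer

    Score = preference match (dot product, a proxy for the probability
    that the host stores matching content) discounted by transport cost;
    alpha trades discovery probability off against cost.
    """
    def score(j):
        match = sum(a * b for a, b in zip(query_pref, host_prefs[j]))
        return match / (1.0 + alpha * distance[j])
    return sorted(host_prefs, key=score, reverse=True)[:k]
```

A nearby host with matching preferences then ranks above an equally matching but distant one, which is exactly the behavior the intelligent tracker is meant to achieve.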


    LITERATURE SURVEY

    1.1) YouTube facts and figures and Twitter statistics

    According to data analysis released in May 2012 by the Bitly Blog, day and time affect the amount of attention a post will receive on social media sites.

    The Bitly Blog's analysis shows that traffic peaks mid-week (from 1:00-3:00 PM EST) for the social networking leader, Facebook. As a result, links posted from 1:00-3:00 PM experience the highest click-through rates. The data suggests avoiding posting on weekends and after 4:00 PM, as traffic decreases dramatically during those timeframes. Being cognizant of the days and times that drive the highest click-through rates can potentially lead to increased conversions for social media marketers, driving revenue for the brand.

    Twitter exhibits similar patterns regarding traffic and links; however, for Twitter, high traffic volume spans the majority of weekdays rather than just mid-week. From the blog: "For Twitter, posting in the afternoon earlier in the week is your best chance at achieving a high click count (1:00-3:00 PM Monday through Thursday). Posting after 8:00 PM should be avoided. Specifically, don't bother posting after 3:00 PM on a Friday since, as far as being a gateway to drive traffic to your content, it appears that Twitter doesn't work on weekends." These trends show that day and time impact the level of attention a post will receive on Facebook or Twitter. If they properly leverage this data, marketers can increase the chances of social media posts going viral, which will ultimately drive exposure and the opportunity for conversions.

    Jacob Chakareski

    We consider a generic scenario of content browsing where a client is presented

    with a catalogue of items of interest. Upon the selection of an item from a page of the catalogue, the client can choose the next item to browse from a list of related items presented on the same

    page. The system has limited resources to have all items available for immediate access by the

    browsing client. Therefore, it pays a penalty when the client selects an unavailable item.

    Conversely, there is a reward that the system gains when the client selects an immediately

    available item. We formulate the optimization problem of selecting the subset of items that the


    system should have for immediate access such that its profit is maximized, for the given system

    resources. We design two techniques for solving the optimization problem in linear time, as a

    function of the catalogue size. We examine their performance via numerical simulations that

    reveal their core properties. We also study their operation on actual YouTube data and compare

    their efficiency relative to conventional solutions. Substantial performance gains are

    demonstrated, due to accounting for the content graph imposed by the catalogue of items.
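The item-selection problem above can be illustrated with a simple reward-density greedy pass. This is only a sketch under stated assumptions, not the paper's two linear-time techniques, and it ignores the content graph that those techniques exploit.

```python
def select_cache(items, budget):
    """Pick items to keep immediately available so that the expected
    reward is large while total resource cost stays within `budget`.

    items  : dict item_id -> (expected_reward, resource_cost)
    budget : total resources available for immediate-access items
    """
    chosen, used = [], 0.0
    # consider items in order of reward per unit of resource
    by_density = sorted(items, key=lambda i: items[i][0] / items[i][1],
                        reverse=True)
    for item in by_density:
        reward, cost = items[item]
        if used + cost <= budget:
            chosen.append(item)
            used += cost
    return chosen
```

The system pays a penalty when the client selects an unavailable item, so favoring items with high expected reward per unit of resource is a natural heuristic baseline.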

    1.2) Enable ISP and P2P to work together

    Today this infrastructure is supporting a rapidly growing quantity of multimedia

    traffic. The increasing popularity of such content and its expected migration to higher definition

    formats presages an explosion in total multimedia data volumes. As libraries and access

    bandwidth requirements grow to meet this demand, content providers are exploring peer-to-peer

    (P2P) content distribution architectures that scale content-delivery system capacity and access

    bandwidth with growing demand, by utilizing consumer storage and bandwidth as peers work

    together to cooperatively download and share popular content. While such architectures can be effective in reducing server capital and access expense, they also raise efficiency and performance concerns for ISPs.

    Recent theoretical work from Yale and proof-of-concept trials of their scheme by Pando,

    Verizon, and Telefonica demonstrate that cooperative mechanisms can indeed guide peer

    selection to be network-aware, resulting in simultaneous improvements in network efficiency

    and application performance. In the tradition of TCP, these results strongly suggest that there is a

    new opportunity for collaborative mechanisms to support global efficiencies in multimedia

    content delivery. But this efficiency can best be achieved if common standards can be

    established on an industry-led, collaborative basis so that P2P providers and ISPs can effectively

    and efficiently coordinate.

    Another major concern of ISPs is the risk of adverse impact of P2P-CDN clients' operations on the performance of subscribers' computing endpoints (PCs/laptops) and home network elements (residential gateways, home routers). As P2P-CDNs proliferate in the current unmanaged fashion, there will likely be multiple P2P clients concurrently running on


    each of several active PCs/laptops on a home network, potentially consuming PC/laptop/RG/home-router resources and thus causing performance degradation of other networked applications (Web surfing, VoIP, IPTV). Subscribers could perceive this as an ISP network/service problem, resulting in costly customer-care overhead for ISPs. This points to the need for home-computing as well as home-network resource management standards and tools.


    APPLICATION MOTIVATION

    There are two general motivations for application developers to work with ISPs to

    optimize P2P traffic: to improve delivery speed and performance to end users, and to encourage

    ISPs to work collaboratively on shared solutions for managing the bandwidth demands of P2P.

    In addition, there is the opportunity to mitigate the potential negative impacts of P2P applications on end users' other networked applications.

    All P2P networks attempt to maximize download speed to their users. P2P

    software generally includes strategies for optimizing data throughput. If working with ISPs allows P2P networks to provide superior performance, this is a strong incentive toward partnership. As some P2P platforms start implementing open standards that provide superior performance, competitive pressures will drive other P2P companies to follow suit.

    Optimum branching

    Recently, directed information graphs have been proposed as concise graphical

    representations of the statistical dynamics amongst multiple random processes. A directed edge

    from one node to another indicates that the past of one random process statistically affects the

    future of another, given the past of all other processes. When the number of processes is large,

    computing those conditional dependence tests becomes difficult. Also, when the number of

    interactions becomes too large, the graph no longer facilitates visual extraction of relevant

    information for decision-making. This work considers approximating the true joint distribution

    on multiple random processes by another, whose directed information graph has at most one

    parent for any node. Under a Kullback-Leibler (KL) divergence minimization criterion, we show

    that the optimal approximate joint distribution can be obtained by maximizing a sum of directed

    informations. In particular, (a) each directed information calculation only involves statistics between a pair of processes and can be efficiently estimated; (b) given all pairwise directed informations, an efficient minimum-weight spanning directed tree algorithm can be run to

    find the best tree. We demonstrate the efficiency of this approach using simulated and

    experimental data. In both, the approximations preserve the relevant information for decision

    making.
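The minimum-weight spanning directed tree step, (b), is the classic Chu-Liu/Edmonds optimum-branching algorithm. A minimal recursive sketch follows; to maximize a sum of directed informations, negate the weights first. The node labels and edge format are assumptions for illustration, and parallel duplicate edges are not handled.

```python
def min_arborescence(nodes, root, edges):
    """Chu-Liu/Edmonds: minimum-weight spanning directed tree
    (arborescence) rooted at `root`. `edges` is a list of distinct
    (u, v, w) directed edges u -> v with weight w.
    Returns {child: parent} covering every non-root node, or None."""
    chosen = _solve(list(nodes), root,
                    [(u, v, w, (u, v, w)) for u, v, w in edges])
    return None if chosen is None else {v: u for u, v, w in chosen}

def _solve(nodes, root, edges):
    # 1) cheapest incoming edge for every non-root node
    best = {}
    for e in edges:
        u, v, w, _ = e
        if v != root and u != v and (v not in best or w < best[v][2]):
            best[v] = e
    if any(v != root and v not in best for v in nodes):
        return None  # some node is unreachable from the root
    # 2) do the chosen edges contain a cycle?
    cycle = None
    for start in nodes:
        seen, u = set(), start
        while u != root and u not in seen:
            seen.add(u)
            u = best[u][0]
        if u != root and u in seen:
            cycle, cur = set(), u
            while True:
                cycle.add(cur)
                cur = best[cur][0]
                if cur == u:
                    break
            break
    if cycle is None:
        return [e[3] for e in best.values()]  # already a tree
    # 3) contract the cycle into a supernode, reweight entering edges,
    #    solve the smaller problem, then expand the cycle again
    S = ("supernode", len(nodes))
    head_of, new_edges = {}, []
    for u, v, w, orig in edges:
        if u in cycle and v in cycle:
            continue
        if v in cycle:
            new_edges.append((u, S, w - best[v][2], orig))
            head_of[orig] = v  # which cycle node this edge really enters
        elif u in cycle:
            new_edges.append((S, v, w, orig))
        else:
            new_edges.append((u, v, w, orig))
    sub = _solve([n for n in nodes if n not in cycle] + [S], root, new_edges)
    if sub is None:
        return None
    entering = next(o for o in sub if o in head_of)
    # keep all cycle edges except the one displaced by the entering edge
    return sub + [best[v][3] for v in cycle if v != head_of[entering]]
```

On a graph where each node's cheapest parent choices form a cycle, the contraction step resolves the cycle by charging each entering edge the cost of the cycle edge it displaces.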


    DATA FLOW DIAGRAM

    The data flow diagram connects a querying peer to the social network and the data network through the bridging layer:

    Querying peer -> Bridging Social and Data Networks -> Social Network / Data Network

    Social network side tables: T1 Register, T2 Request, T3 Friends, T4 Message, T5 Visitors, T6 Photo comment, T7 Community, T8 Community member.

    Data network side: a search engine over the Web, backed by the T1 Host Database, exchanging Request/Reply messages with the querying peer.


    LIST OF MODULES

    1. VIDEO SHARE
    2. AUDIO SHARE
    3. SOCIAL GRAPH REGULARIZATION

    MODULE WISE DESIGN

    1. VIDEO SHARE:

    Dissemination in online communities is frequently carried out in a viral fashion: we send the content of interest to all our contacts in the social graph, who then repeat the same procedure with their own contacts, and so forth. Such a mode of delivery is very inefficient from a data communication perspective and can also lead to poor timeliness of the multimedia application comprising the content. In addition

    to their primary target of social interaction, we now also employ such applications to

    search for information online or to share multimedia content with our friends and

    families.
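The viral mode of delivery and its inefficiency can be illustrated with a small flooding simulation (names hypothetical): every node forwards the content to all of its contacts, so densely connected graphs incur far more transmissions than the number of new recipients.

```python
from collections import deque

def viral_spread(graph, seed):
    """Flood a content item over the social graph: the seed sends to all
    contacts, each recipient repeats with its own contacts, and so on.
    Returns (reached_nodes, transmissions); transmissions counts every
    send, including redundant sends to nodes that already hold the item.

    graph: dict node -> list of contacts
    """
    reached = {seed}
    sends = 0
    queue = deque([seed])
    while queue:
        u = queue.popleft()
        for v in graph.get(u, []):
            sends += 1  # u forwards to v even if v already has the content
            if v not in reached:
                reached.add(v)
                queue.append(v)
    return reached, sends
```

On a 3-node clique, two transmissions would suffice to reach everyone, yet flooding performs six, which is the waste the proposed system aims to avoid.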

    2. AUDIO SHARE:

    Distributed network directory systems feature a collection of peer

    computers interested in exchanging information of interest. Each peer hosts a set of

    information items, e.g., data files and audio files, that may be shared with other peers in the

    network upon demand. There is a registry server that maintains a directory of all hosts in

    the system. When the system is queried for information, it returns a subset of hosts,

    which are then directly contacted for the item of interest. The procedure is repeated until

    the search is successfully concluded. Since the hosts to be queried are selected at random

    by the tracking server, they may not always have the desired information item. Therefore,

    multiple rounds of query may need to be initiated for detection to occur.


    3. SOCIAL GRAPH REGULARIZATION:

    The social graph can be regularized with data network information to enable

    efficient content file casting among its nodes. Without such regularization, the selected hosts may lie at large distances from the querying peer in the topology of the underlying transport network, which in turn makes the exchange of information between the hosts and the peer inefficient and costly.
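One simple way to realize this regularization, sketched here under assumed names and an assumed functional form, is to discount each social edge's affinity by the transport distance d_ij reported by the network operator, so that content casting prefers contacts who are both socially relevant and topologically nearby.

```python
def regularize(social_affinity, distance, beta=0.5):
    """Combine social and data-network information into one edge weight.

    social_affinity : dict (i, j) -> social closeness of users i and j
    distance        : dict (i, j) -> transport distance d_ij between
                      their machines in the underlying network
    beta            : how strongly distance discounts affinity
                      (illustrative assumption, as is the 1/(1+x) form)
    """
    return {(i, j): a / (1.0 + beta * distance[(i, j)])
            for (i, j), a in social_affinity.items()}
```

Of two equally close friends, the one whose machine is nearer in the transport network receives the larger regularized weight.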


    SCHEME DIAGRAM

    Social network site side database schema

    Tracker side database schema

    HostInfo (IPaddress, ID, Distance)
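Given the MySQL back end named in the requirements, the tracker-side HostInfo relation above might be declared as follows; the column types and key choice are illustrative assumptions.

```sql
-- Tracker-side schema sketch (column types are assumptions)
CREATE TABLE HostInfo (
    ID        INT         NOT NULL AUTO_INCREMENT,
    IPaddress VARCHAR(45) NOT NULL,  -- dotted-quad or IPv6 text form
    Distance  INT         NOT NULL,  -- transport distance d_ij to this host
    PRIMARY KEY (ID)
);
```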


    ER-DIAGRAM


    SOFTWARE AND HARDWARE REQUIREMENTS

    HARDWARE REQUIREMENTS

    System : Minimum Pentium IV
    Hard Disk Space : Minimum 20 GB HDD
    RAM : Minimum 1 GB DDR RAM
    LAN : Minimum 2 machines

    SOFTWARE REQUIREMENTS

    Operating System : Windows XP / Windows 7
    Coding Language : J2EE, HTML
    Server Side Coding : JSP
    Client Side Coding : JavaScript
    Tool Kit : JDK 1.6 or above
    Server : Tomcat
    IDE : FrontPage / Dreamweaver
    Back End : MySQL


    REFERENCES

    [1] Jacob Chakareski, "Bridging Social and Data Networks," IEEE Signal Processing Magazine, p. 141, March 2012.

    [2] Michael Beye, Arjan Jeckmans, Zekeriya Erkin, Pieter Hartel, Reginald Lagendijk and Qiang Tang, "Literature Overview - Privacy in Online Social Networks."

    [3] Elmasri and Navathe, Fundamentals of Database Systems, 5th Edition, Addison Wesley, 2007.