7/30/2019 bsdn_2
1/16
AIM
How social and data networks can be integrated to enable enhanced and cost-effective performance.
PROBLEM STATEMENT
We use applications to search for information online and to share multimedia content with our friends and families. These activities generate enormous quantities of data traffic that burden the underlying transport network.
INTRODUCTION
Social networking applications have emerged as the platform of choice for
carrying out a number of different activities online. In addition to their primary target of social
interaction, we now also employ such applications to search for information online or to share
multimedia content with our friends and families. For instance, according to recent statistics, each of us spends on average 15 min on YouTube every day. Similarly noteworthy is the daily rate of queries that Twitter's search engine receives: 600 million. These activities generate enormous quantities of data traffic that burden the underlying transport network. This article describes how social and data networks can be integrated to enable enhanced and cost-effective performance. Such an integrated approach to systems design will become even more important in the near future due to the ongoing convergence of computer-communication technologies that will further increase the diversity of data-intensive social networking activities.
EXISTING SYSTEM
Distributed network directory systems feature a collection of peer computers interested in exchanging information. Each peer hosts a set of information items, e.g., data files, that may be shared with other peers in the network upon demand. A registry server maintains a directory of all hosts in the system. When the system is queried for information, it returns a subset of hosts, which are then directly contacted for the item of interest. The procedure is repeated until the search is successfully concluded. Since the hosts to be queried are selected at random by the tracking server, they may not always have the desired information item. Therefore, multiple query rounds may need to be initiated before the item is found. In addition, the selected hosts may be located far from the querying peer in the topology of the underlying transport network, which in turn makes the exchange of information between the hosts and the peer inefficient and costly.
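The traditional directory protocol just described can be sketched as follows. All host names and parameters here are illustrative assumptions, not part of any real system: the tracker picks hosts blindly at random, so the querying peer may need many rounds.

```python
import random

def tracker_select(hosts, k):
    """Registry (tracking) server: return a random subset of k hosts.

    No context about preferences or distances is used -- this is exactly
    the weakness of the traditional approach.
    """
    return random.sample(hosts, min(k, len(hosts)))

def query_until_found(hosts, has_item, k=3, max_rounds=100):
    """Querying peer: contact random host subsets round by round.

    Returns the number of rounds needed, or None if the search fails.
    """
    for round_no in range(1, max_rounds + 1):
        for host in tracker_select(hosts, k):
            if has_item(host):          # direct contact with the host
                return round_no
    return None

hosts = list(range(20))
holders = {4, 17}                       # only two hosts store the item
rounds = query_until_found(hosts, lambda h: h in holders)
```

Because the subset is random, `rounds` varies from run to run, illustrating the multiple query rounds the text mentions.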
Disadvantage:
The hosts to be queried are selected at random, so they may not hold the desired information item, forcing multiple rounds of querying. Moreover, the selected hosts may lie far from the querying peer in the underlying network topology, making the information exchange inefficient and costly.
PROPOSED SYSTEM
There is a growing understanding on the part of network operators and peer-to-peer
application developers that mutual benefits can be achieved by collaboration. The methods and
systems presented in this article can lead to similar synergies between online communities,
networked systems applications, and ISPs. In particular, they simultaneously enable enhanced application performance and data transport efficiency, to the benefit of both the end user and the network operator.
Advantage:
The proposed methods simultaneously enable enhanced application performance and data transport efficiency.
ARCHITECTURE
The architecture builds on the distributed directory system described above. The disadvantages of the traditional approach can be overcome by keeping track of the information preferences affiliated with each peer computer in the system and the relative distances between the peers in the underlying network topology, as shown in the figure. The former quantities can be obtained from the membership profiles affiliated with the peer computers in the online community in which their corresponding users participate. A preference vector p_i characterizes the preferences of user i over a collection of information themes, e.g., types of movies, images,
or audio content. The latter quantities, describing the distance d_ij between two nodes i and j in a network, are increasingly available today through network operators such as Internet service providers (ISPs). By exploiting this context, an intelligent tracking server can provide a suitable subset of information hosts to be queried. In particular, the server can select the hosts that maximize the probability of information discovery at the smallest possible data transport cost relative to the querying peer.
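A minimal sketch of such an intelligent tracker follows. The combining rule (preference probability divided by transport distance) and all names are illustrative assumptions, not the paper's actual algorithm; they merely show how p_i and d_ij can jointly rank candidate hosts.

```python
def rank_hosts(query_theme, preferences, distances, k=3):
    """Return the k hosts with the best preference-to-distance ratio.

    preferences: {host: {theme: probability}}  -- the preference vector p_i
    distances:   {host: transport cost d_ij to the querying peer}
    """
    def score(host):
        p = preferences[host].get(query_theme, 0.0)
        return p / max(distances[host], 1e-9)   # guard against zero distance
    return sorted(preferences, key=score, reverse=True)[:k]

preferences = {
    "A": {"movies": 0.9, "audio": 0.1},
    "B": {"movies": 0.2, "audio": 0.8},
    "C": {"movies": 0.8, "audio": 0.2},
}
distances = {"A": 10.0, "B": 1.0, "C": 2.0}
best = rank_hosts("movies", preferences, distances, k=2)
```

Here host A has the strongest movie preference but is ten hops away, so the nearer host C ranks first: high discovery probability at low transport cost.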
LITERATURE SURVEY
1.1) YouTube facts and figures and Twitter statistics
According to data analysis released in May 2012 by the Bitly Blog, day and time affect the amount of attention a post will receive on social media sites.
Bitly Blog's analysis shows traffic peaks mid-week (from 1:00-3:00 PM EST) for the social networking leader, Facebook. As a result, links posted from 1:00-3:00 PM experience the highest click-through rates. The data suggests avoiding posting on weekends and after 4:00 PM, as traffic dramatically decreases during this timeframe. Being cognizant of the days and times that drive the highest click-through rates can potentially lead to increased conversions for social media marketers, driving revenue for the brand.
Twitter shows similar trends regarding traffic and links; however, for Twitter, high traffic volume spans the majority of weekdays, rather than just mid-week. From the blog: for Twitter, posting in the afternoon earlier in the week is your best chance at achieving a high click count (1:00-3:00 PM Monday through Thursday). Posting after 8:00 PM should be avoided. Specifically, don't bother posting after 3:00 PM on a Friday since, as far as being a gateway to drive traffic to your content, it appears that Twitter doesn't work on weekends. These trends show that day and time impact the level of attention a post will receive on Facebook or Twitter. If they properly leverage this data, marketers can increase the chances of social media posts going viral, which will ultimately drive exposure and the opportunity for conversions.
Jacob Chakareski
We consider a generic scenario of content browsing where a client is presented with a catalogue of items of interest. Upon the selection of an item from a page of the catalogue, the client can choose the next item to browse from a list of related items presented on the same page. The system has limited resources and cannot keep all items available for immediate access by the browsing client. Therefore, it pays a penalty when the client selects an unavailable item. Conversely, there is a reward that the system gains when the client selects an immediately available item. We formulate the optimization problem of selecting the subset of items that the
system should have for immediate access such that its profit is maximized, for the given system
resources. We design two techniques for solving the optimization problem in linear time, as a
function of the catalogue size. We examine their performance via numerical simulations that
reveal their core properties. We also study their operation on actual YouTube data and compare
their efficiency relative to conventional solutions. Substantial performance gains are
demonstrated, due to accounting for the content graph imposed by the catalogue of items.
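The selection problem above can be illustrated with a simplified greedy sketch: pick items by reward per unit of resource cost until capacity runs out. This is an illustrative stand-in for the paper's linear-time techniques, which additionally exploit the content graph of the catalogue; all names and numbers below are assumptions.

```python
def select_items(items, capacity):
    """Greedy sketch: choose items by reward-per-unit-cost under a budget.

    items:    list of (name, reward, cost) tuples
    capacity: total system resources available for immediate access
    """
    chosen, used = [], 0.0
    for name, reward, cost in sorted(items, key=lambda t: t[1] / t[2],
                                     reverse=True):
        if used + cost <= capacity:      # item fits in remaining resources
            chosen.append(name)
            used += cost
    return chosen

catalogue = [("clip1", 10.0, 2.0), ("clip2", 4.0, 1.0), ("clip3", 6.0, 3.0)]
cached = select_items(catalogue, capacity=3.0)
```

With the sample numbers, clip1 (ratio 5) and clip2 (ratio 4) fill the budget of 3, while clip3 (ratio 2) is left to incur the unavailability penalty.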
1.2) Enable ISP and P2P to work together
Today this infrastructure is supporting a rapidly growing quantity of multimedia traffic. The increasing popularity of such content and its expected migration to higher-definition formats presages an explosion in total multimedia data volumes. As libraries and access bandwidth requirements grow to meet this demand, content providers are exploring peer-to-peer (P2P) content distribution architectures that scale content-delivery system capacity and access bandwidth with growing demand, by utilizing consumer storage and bandwidth as peers work together to cooperatively download and share popular content. While such architectures can be effective in reducing server capital and access expense, they also raise new concerns for the underlying ISP networks.
Recent theoretical work from Yale and proof-of-concept trials of their scheme by Pando,
Verizon, and Telefonica demonstrate that cooperative mechanisms can indeed guide peer
selection to be network-aware, resulting in simultaneous improvements in network efficiency
and application performance. In the tradition of TCP, these results strongly suggest that there is a
new opportunity for collaborative mechanisms to support global efficiencies in multimedia
content delivery. But this efficiency can best be achieved if common standards can be
established on an industry-led, collaborative basis so that P2P providers and ISPs can effectively
and efficiently coordinate.
Another major concern of ISPs is the risk of adverse impact of P2P-CDN client operations on the performance of subscribers' computing endpoints (PCs/laptops) and home network elements (residential gateways, home routers). As P2P-CDNs proliferate in the current unmanaged fashion, there will likely be multiple P2P clients concurrently running on
each of several active PCs/laptops on a home network, potentially consuming PC/laptop/RG/home router resources and thus causing performance degradation of other networked applications (Web surfing, VoIP, IPTV), which could be perceived by subscribers as ISP network/service problems, resulting in costly customer care expenses for ISPs. This points to the need for home-computing as well as home-network resource management standards and tools.
APPLICATION MOTIVATION
There are two general motivations for application developers to work with ISPs to
optimize P2P traffic: to improve delivery speed and performance to end users, and to encourage
ISPs to work collaboratively on shared solutions for managing the bandwidth demands of P2P.
In addition, there is the opportunity to mitigate the potential negative impacts of P2P applications on end users' other networked applications.
All P2P networks attempt to maximize download speed for their users. P2P software generally includes strategies for optimizing data throughput. If working with ISPs allows P2P networks to provide superior performance, this is a strong incentive toward partnership. As some P2P platforms start implementing open standards that provide superior performance, competitive pressures will drive other P2P companies to follow suit.
Optimum branching
Recently, directed information graphs have been proposed as concise graphical
representations of the statistical dynamics amongst multiple random processes. A directed edge
from one node to another indicates that the past of one random process statistically affects the
future of another, given the past of all other processes. When the number of processes is large,
computing those conditional dependence tests becomes difficult. Also, when the number of
interactions becomes too large, the graph no longer facilitates visual extraction of relevant
information for decision-making. This work considers approximating the true joint distribution
on multiple random processes by another, whose directed information graph has at most one
parent for any node. Under a Kullback-Leibler (KL) divergence minimization criterion, we show
that the optimal approximate joint distribution can be obtained by maximizing a sum of directed
informations. In particular, (a) each directed information calculation only involves statistics between a pair of processes and can be efficiently estimated; (b) given all pairwise directed informations, an efficient minimum-weight spanning directed tree algorithm can be used to find the best tree. We demonstrate the efficiency of this approach using simulated and experimental data. In both, the approximations preserve the relevant information for decision making.
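The core combinatorial step can be illustrated with a brute-force sketch: give each node at most one parent so that the result is acyclic, maximizing the sum of pairwise directed-information weights. This is only a toy enumeration for intuition, not the efficient spanning directed tree algorithm the work describes; the weights below are made-up numbers.

```python
from itertools import product

def best_tree(nodes, w):
    """Exhaustively find the max-weight acyclic one-parent assignment.

    w[p][c] is the (hypothetical) directed information from process p
    to process c.  Returns ({child: parent or None}, total weight).
    """
    best_score, best_parents = float("-inf"), None
    # each child may pick any other node as parent, or no parent at all
    choices = [[None] + [p for p in nodes if p != c] for c in nodes]
    for assignment in product(*choices):
        parent = dict(zip(nodes, assignment))

        def acyclic(c):
            seen = set()
            while parent[c] is not None:
                if c in seen:
                    return False        # followed parents back to a cycle
                seen.add(c)
                c = parent[c]
            return True

        if not all(acyclic(c) for c in nodes):
            continue
        score = sum(w[parent[c]][c] for c in nodes if parent[c] is not None)
        if score > best_score:
            best_score, best_parents = score, parent
    return best_parents, best_score

nodes = ["X", "Y", "Z"]
w = {"X": {"Y": 2.0, "Z": 1.0},
     "Y": {"X": 0.5, "Z": 3.0},
     "Z": {"X": 0.1, "Y": 0.2}}
parents, score = best_tree(nodes, w)
```

The enumeration is exponential in the number of nodes, which is precisely why the cited work needs an efficient spanning directed tree algorithm for large process collections.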
DATA FLOW DIAGRAM
[Diagram residue; recoverable structure:]
Querying peer <-> Bridging Social and Data Networks
Social network side tables: T1 Register, T2 Request, T3 Friends, T4 Message, T5 Visitors, T6 PhotoComment, T7 Community, T8 CommunityMember
Data network side: Search Engine, Web, T1 HostDatabase (Request/Reply)
LIST OF MODULES
1. VIDEO SHARE
2. AUDIO SHARE
3. SOCIAL GRAPH REGULARIZATION
MODULE WISE DESIGN
1. VIDEO SHARE:
Dissemination in online communities is frequently carried out in a viral fashion. In particular, we send the content of interest to all our contacts in the social graph, who then repeat the same procedure with their own contacts, and so forth. Such a mode of delivery is very inefficient from a data communication perspective and can also lead to poor timeliness of the multimedia application comprising the content. In addition to their primary target of social interaction, we now also employ such applications to search for information online or to share multimedia content with our friends and families.
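The viral dissemination mode above can be sketched as a breadth-first flood over a toy social graph. Counting transmissions against users actually reached shows why flooding is inefficient for the data network; the graph below is an illustrative assumption.

```python
from collections import deque

def viral_spread(social_graph, source):
    """Flood content from source to all contacts, who forward in turn.

    Returns (set of users reached, total number of transmissions).
    """
    reached, transmissions = {source}, 0
    queue = deque([source])
    while queue:
        user = queue.popleft()
        for contact in social_graph.get(user, []):
            transmissions += 1          # every forward costs one transfer
            if contact not in reached:
                reached.add(contact)
                queue.append(contact)
    return reached, transmissions

graph = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"], "d": ["c"]}
users, sends = viral_spread(graph, "a")
```

Here three new users receive the content but eight transmissions occur, since every contact forwards to everyone, including peers that already hold the content.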
2. AUDIO SHARE:
Distributed network directory systems feature a collection of peer computers interested in exchanging information. Each peer hosts a set of information items, e.g., data files and audio files, that may be shared with other peers in the network upon demand. A registry server maintains a directory of all hosts in the system. When the system is queried for information, it returns a subset of hosts, which are then directly contacted for the item of interest. The procedure is repeated until the search is successfully concluded. Since the hosts to be queried are selected at random by the tracking server, they may not always have the desired information item. Therefore, multiple query rounds may need to be initiated before the item is found.
3. SOCIAL GRAPH REGULARIZATION:
The social graph can be regularized with data network information to enable efficient content file casting among its nodes. Without such regularization, the selected hosts may lie far from the querying peer in the topology of the underlying transport network, which in turn makes the exchange of information between the hosts and the peer inefficient and costly.
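One way to picture the regularization is to reweight each social edge by the transport distance between its two peers, so that content casting prefers nearby contacts. The combining rule (affinity divided by distance) and the sample values are illustrative assumptions, not the paper's formula.

```python
def regularize(social_edges, distances):
    """Weight each social edge by affinity divided by network distance.

    social_edges: {(u, v): social affinity between users u and v}
    distances:    {(u, v): transport distance d_uv in the data network}
    """
    return {
        (u, v): affinity / distances[(u, v)]
        for (u, v), affinity in social_edges.items()
    }

social_edges = {("a", "b"): 0.9, ("a", "c"): 0.9}   # equal social affinity
distances = {("a", "b"): 1.0, ("a", "c"): 10.0}     # but c is 10x farther
weights = regularize(social_edges, distances)
```

Two equally close friends in the social graph end up with very different casting weights once the data network topology is taken into account.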
SCHEME DIAGRAM
Social network site side database schema (diagram not preserved)
Tracker side database schema: HostInfo(IPaddress, ID, Distance)
ER-DIAGRAM
SOFTWARE AND HARDWARE REQUIREMENTS
HARDWARE REQUIREMENTS
System          : Minimum Pentium IV
Hard Disk Space : Minimum 20 GB HDD
RAM             : Minimum 1 GB DDR RAM
LAN             : Minimum 2 machines
SOFTWARE REQUIREMENTS
Operating System   : Windows XP / Windows 7
Coding Language    : J2EE, HTML
Server Side Coding : JSP
Client Side Coding : JavaScript
Tool Kit           : JDK 1.6 or above
Server             : Tomcat
IDE                : FrontPage / Dreamweaver
Back End           : MySQL
REFERENCES
[1] Jacob Chakareski, "Bridging Social and Data Networks," IEEE Signal Processing Magazine, p. 141, March 2012.
[2] Michael Beye, Arjan Jeckmans, Zekeriya Erkin, Pieter Hartel, Reginald Lagendijk, and Qiang Tang, "Literature Overview - Privacy in Online Social Networks."
[3] Elmasri and Navathe, Fundamentals of Database Systems, 5th Edition, Addison Wesley, 2007.