bsdn_2


Transcript of bsdn_2

  • 7/30/2019 bsdn_2

    AIM

    How social and data networks can be integrated to enable enhanced and cost-effective performance.

    PROBLEM STATEMENT

    We use online applications to search for information and to share multimedia content with our friends and families; these activities generate enormous quantities of data traffic that burden the underlying transport network.

    INTRODUCTION

    Social networking applications have emerged as the platform of choice for

    carrying out a number of different activities online. In addition to their primary target of social

    interaction, we now also employ such applications to search for information online or to share

    multimedia content with our friends and families. For instance, according to recent statistics,

    each of us spends on average 15 minutes on YouTube every day. Similarly noteworthy is the daily rate of queries that Twitter's search engine receives: 600 million. These activities generate enormous quantities of data traffic that burden the underlying transport network. This article describes how social and data networks can be integrated to enable enhanced and cost-effective performance. Such an integrated approach to systems design will become even more important in the near future due to the ongoing convergence of computer-communication technologies, which will further increase the diversity of data-intensive social networking activities.


    EXISTING SYSTEM

    Distributed network directory systems feature a collection of peer computers

    interested in exchanging information of interest. Each peer hosts a set of information items, e.g.,

    data files that may be shared with other peers in the network upon demand. There is a registry

    server that maintains a directory of all hosts in the system. When the system is queried for

    information, it returns a subset of hosts, which are then directly contacted for the item of interest.

    The procedure is repeated until the search is successfully concluded. Since the hosts to be queried are selected at random by the registry server, they may not always have the desired information item; therefore, multiple query rounds may be needed before the item is found. In addition, the selected hosts may lie at large distances from the querying peer in the topology of the underlying transport network, which in turn makes the exchange of information between the hosts and the peer inefficient and costly.
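The query procedure above can be sketched as a toy simulation. The `NaiveTracker` class and its API are hypothetical, meant only to illustrate why random, content-oblivious host selection forces repeated query rounds.

```python
import random

class NaiveTracker:
    """Toy registry server (hypothetical API): keeps a directory of all
    hosts and answers queries with a random subset of them, oblivious
    to what each host stores and to network topology."""

    def __init__(self, hosts):
        # hosts: dict host_id -> set of item ids stored at that host
        self.hosts = hosts

    def query(self, k=3):
        # random selection of k hosts, as in the traditional approach
        return random.sample(sorted(self.hosts), k)

def search(tracker, item, max_rounds=10):
    """Contact the returned hosts directly; repeat query rounds until
    a host that stores `item` is found, or give up."""
    for rounds in range(1, max_rounds + 1):
        for h in tracker.query():
            if item in tracker.hosts[h]:
                return h, rounds
    return None, max_rounds
```

With ten hosts each storing one distinct item, a query for a specific item typically takes several rounds, since each round inspects only a random three of the ten hosts.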

    Disadvantage:

    The traditional approach keeps track of neither the information preferences affiliated with each peer computer in the system nor the relative distances between the peers in the underlying network topology; as a result, host selection is blind to both content and transport cost.

    PROPOSED SYSTEM

    There is a growing understanding on the part of network operators and peer-to-peer

    application developers that mutual benefits can be achieved by collaboration. The methods and

    systems presented in this article can lead to similar synergies between online communities,

    networked systems applications, and ISPs. In particular, they simultaneously enable enhanced

    application performance and data transport efficiency, to the benefit of both the end user and the network operator.

    Advantage:

    They simultaneously enable enhanced application performance and data transport efficiency.


    ARCHITECTURE

    As described in the Existing System section, the registry server selects the hosts to be queried at random, so they may not store the desired information item and may lie far from the querying peer in the topology of the underlying transport network, making the exchange of information inefficient and costly. These disadvantages can be overcome by keeping track of the information preferences affiliated with each peer computer in the system and of the relative distances between the peers in the underlying network topology, as shown in the figure. The former quantities can be

    obtained from the membership profiles affiliated with the peer computers in the online

    community in which their corresponding users participate. A preference vector p_i characterizes the preferences of user i over a collection of information themes, e.g., types of movies, images,


    or audio content. The latter quantities, describing the distance d_ij between two nodes i and j in a

    network, are increasingly available today through network operators such as Internet service

    providers (ISPs). By exploiting the above context, an intelligent tracking server could provide a

    suitable subset of information hosts to be queried. In particular, the server could select the hosts

    that maximize the probability of information discovery at the smallest possible data transport

    cost, relative to the querying peer.
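The intelligent tracking server's selection rule can be sketched as follows. The dot-product match score discounted by distance is an illustrative assumption standing in for the article's actual discovery-probability/transport-cost objective, and all names are hypothetical.

```python
def select_hosts(query_pref, host_prefs, distance, k=3, alpha=1.0):
    """Rank candidate hosts for a querying peer.

    query_pref : preference vector p_i of the querying peer
    host_prefs : dict host_id -> preference vector p_j
    distance   : dict host_id -> transport distance d_ij to the querying peer

    Score = preference match (dot product, a proxy for the probability
    that the host stores matching content) discounted by transport cost;
    alpha trades discovery probability off against cost.
    """
    def score(j):
        match = sum(a * b for a, b in zip(query_pref, host_prefs[j]))
        return match / (1.0 + alpha * distance[j])
    return sorted(host_prefs, key=score, reverse=True)[:k]
```

A nearby host with matching preferences then ranks above an equally matching but distant one, which is exactly the behavior the intelligent tracker is meant to achieve.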


    LITERATURE SURVEY

    1.1) YouTube facts and figures and Twitter statistics

    According to data analysis released in May 2012 by the Bitly Blog, day and time affect the amount of attention a post will receive on social media sites.

    The Bitly Blog's analysis shows that traffic peaks mid-week (from 1:00-3:00 PM EST) for the social networking leader, Facebook. As a result, links posted from 1:00-3:00 PM experience the highest click-through rates. The data suggests avoiding posting on weekends and after 4:00 PM, as traffic decreases dramatically during those timeframes. Being cognizant of the days and times that drive the highest click-through rates can potentially lead to increased conversions for social media marketers, driving revenue for the brand.

    Twitter exhibits similar patterns regarding traffic and links; however, for Twitter, high traffic volume spans the majority of weekdays rather than just mid-week. From the blog: "For Twitter, posting in the afternoon earlier in the week is your best chance at achieving a high click count (1:00-3:00 PM Monday through Thursday). Posting after 8:00 PM should be avoided. Specifically, don't bother posting after 3:00 PM on a Friday since, as far as being a gateway to drive traffic to your content, it appears that Twitter doesn't work on weekends." These trends show that day and time impact the level of attention a post will receive on Facebook or Twitter. If they properly leverage this data, marketers can increase the chances of social media posts going viral, which will ultimately drive exposure and the opportunity for conversions.

    Jacob Chakareski

    We consider a generic scenario of content browsing where a client is presented

    with a catalogue of items of interest. Upon the selection of an item from a page of the catalogue, the client can choose the next item to browse from a list of related items presented on the same

    page. The system has limited resources to have all items available for immediate access by the

    browsing client. Therefore, it pays a penalty when the client selects an unavailable item.

    Conversely, there is a reward that the system gains when the client selects an immediately

    available item. We formulate the optimization problem of selecting the subset of items that the


    system should have for immediate access such that its profit is maximized, for the given system

    resources. We design two techniques for solving the optimization problem in linear time, as a

    function of the catalogue size. We examine their performance via numerical simulations that

    reveal their core properties. We also study their operation on actual YouTube data and compare

    their efficiency relative to conventional solutions. Substantial performance gains are

    demonstrated, due to accounting for the content graph imposed by the catalogue of items.
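The item-selection problem above can be illustrated with a simple reward-density greedy pass. This is only a sketch under stated assumptions, not the paper's two linear-time techniques, and it ignores the content graph that those techniques exploit.

```python
def select_cache(items, budget):
    """Pick items to keep immediately available so that the expected
    reward is large while total resource cost stays within `budget`.

    items  : dict item_id -> (expected_reward, resource_cost)
    budget : total resources available for immediate-access items
    """
    chosen, used = [], 0.0
    # consider items in order of reward per unit of resource
    by_density = sorted(items, key=lambda i: items[i][0] / items[i][1],
                        reverse=True)
    for item in by_density:
        reward, cost = items[item]
        if used + cost <= budget:
            chosen.append(item)
            used += cost
    return chosen
```

The system pays a penalty when the client selects an unavailable item, so favoring items with high expected reward per unit of resource is a natural heuristic baseline.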

    1.2) Enable ISP and P2P to work together

    Today this infrastructure is supporting a rapidly growing quantity of multimedia

    traffic. The increasing popularity of such content and its expected migration to higher definition

    formats presages an explosion in total multimedia data volumes. As libraries and access

    bandwidth requirements grow to meet this demand, content providers are exploring peer-to-peer

    (P2P) content distribution architectures that scale content-delivery system capacity and access

    bandwidth with growing demand, by utilizing consumer storage and bandwidth as peers work

    together to cooperatively download and share popular content. While such architectures can be effective in reducing server capital and access expense, they also raise efficiency and performance concerns for ISPs.

    Recent theoretical work from Yale and proof-of-concept trials of their scheme by Pando,

    Verizon, and Telefonica demonstrate that cooperative mechanisms can indeed guide peer

    selection to be network-aware, resulting in simultaneous improvements in network efficiency

    and application performance. In the tradition of TCP, these results strongly suggest that there is a

    new opportunity for collaborative mechanisms to support global efficiencies in multimedia

    content delivery. But this efficiency can best be achieved if common standards can be

    established on an industry-led, collaborative basis so that P2P providers and ISPs can effectively

    and efficiently coordinate.

    Another major concern of ISPs is the risk of adverse impact of P2P-CDN clients' operations on the performance of subscribers' computing endpoints (PCs/laptops) and home network elements (residential gateways, home routers). As P2P-CDNs proliferate in the current unmanaged fashion, there will likely be multiple P2P clients concurrently running on


    each of several active PCs/laptops on a home network, potentially consuming PC/laptop/RG/home-router resources and thus causing performance degradation of other networked applications (Web surfing, VoIP, IPTV). Subscribers could perceive this as an ISP network/service problem, resulting in costly customer-care overhead for ISPs. This points to the need for home-computing as well as home-network resource management standards and tools.


    APPLICATION MOTIVATION

    There are two general motivations for application developers to work with ISPs to

    optimize P2P traffic: to improve delivery speed and performance to end users, and to encourage

    ISPs to work collaboratively on shared solutions for managing the bandwidth demands of P2P.

    In addition, there is the opportunity to mitigate the potential negative impacts of P2P applications on end users' other networked applications.

    All P2P networks attempt to maximize download speed to their users. P2P

    software generally includes strategies for optimizing data throughput. If working with ISPs allows P2P networks to provide superior performance, this is a strong incentive toward partnership. As some P2P platforms start implementing open standards that provide superior performance, competitive pressures will drive other P2P companies to follow suit.

    Optimum branching

    Recently, directed information graphs have been proposed as concise graphical

    representations of the statistical dynamics amongst multiple random processes. A directed edge

    from one node to another indicates that the past of one random process statistically affects the

    future of another, given the past of all other processes. When the number of processes is large,

    computing those conditional dependence tests becomes difficult. Also, when the number of

    interactions becomes too large, the graph no longer facilitates visual extraction of relevant

    information for decision-making. This work considers approximating the true joint distribution

    on multiple random processes by another, whose directed information graph has at most one

    parent for any node. Under a Kullback-Leibler (KL) divergence minimization criterion, we show

    that the optimal approximate joint distribution can be obtained by maximizing a sum of directed

    informations. In particular, (a) each directed information calculation only involves statistics between a pair of processes and can be efficiently estimated; (b) given all pairwise directed informations, an efficient minimum-weight spanning directed tree algorithm can be run to

    find the best tree. We demonstrate the efficiency of this approach using simulated and

    experimental data. In both, the approximations preserve the relevant information for decision

    making.
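The minimum-weight spanning directed tree step, (b), is the classic Chu-Liu/Edmonds optimum-branching algorithm. A minimal recursive sketch follows; to maximize a sum of directed informations, negate the weights first. The node labels and edge format are assumptions for illustration, and parallel duplicate edges are not handled.

```python
def min_arborescence(nodes, root, edges):
    """Chu-Liu/Edmonds: minimum-weight spanning directed tree
    (arborescence) rooted at `root`. `edges` is a list of distinct
    (u, v, w) directed edges u -> v with weight w.
    Returns {child: parent} covering every non-root node, or None."""
    chosen = _solve(list(nodes), root,
                    [(u, v, w, (u, v, w)) for u, v, w in edges])
    return None if chosen is None else {v: u for u, v, w in chosen}

def _solve(nodes, root, edges):
    # 1) cheapest incoming edge for every non-root node
    best = {}
    for e in edges:
        u, v, w, _ = e
        if v != root and u != v and (v not in best or w < best[v][2]):
            best[v] = e
    if any(v != root and v not in best for v in nodes):
        return None  # some node is unreachable from the root
    # 2) do the chosen edges contain a cycle?
    cycle = None
    for start in nodes:
        seen, u = set(), start
        while u != root and u not in seen:
            seen.add(u)
            u = best[u][0]
        if u != root and u in seen:
            cycle, cur = set(), u
            while True:
                cycle.add(cur)
                cur = best[cur][0]
                if cur == u:
                    break
            break
    if cycle is None:
        return [e[3] for e in best.values()]  # already a tree
    # 3) contract the cycle into a supernode, reweight entering edges,
    #    solve the smaller problem, then expand the cycle again
    S = ("supernode", len(nodes))
    head_of, new_edges = {}, []
    for u, v, w, orig in edges:
        if u in cycle and v in cycle:
            continue
        if v in cycle:
            new_edges.append((u, S, w - best[v][2], orig))
            head_of[orig] = v  # which cycle node this edge really enters
        elif u in cycle:
            new_edges.append((S, v, w, orig))
        else:
            new_edges.append((u, v, w, orig))
    sub = _solve([n for n in nodes if n not in cycle] + [S], root, new_edges)
    if sub is None:
        return None
    entering = next(o for o in sub if o in head_of)
    # keep all cycle edges except the one displaced by the entering edge
    return sub + [best[v][3] for v in cycle if v != head_of[entering]]
```

On a graph where each node's cheapest parent choices form a cycle, the contraction step resolves the cycle by charging each entering edge the cost of the cycle edge it displaces.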


    DATA FLOW DIAGRAM

    The data flow diagram connects a querying peer to the social network and the data network through the bridging layer:

    Querying peer -> Bridging Social and Data Networks -> Social Network / Data Network

    Social network side tables: T1 Register, T2 Request, T3 Friends, T4 Message, T5 Visitors, T6 Photo comment, T7 Community, T8 Community member.

    Data network side: a search engine over the Web, backed by the T1 Host Database, exchanging Request/Reply messages with the querying peer.


    LIST OF MODULES

    1. VIDEO SHARE
    2. AUDIO SHARE
    3. SOCIAL GRAPH REGULARIZATION

    MODULE WISE DESIGN

    1. VIDEO SHARE:

    Dissemination in online communities is frequently carried out in a viral fashion: we send the content of interest to all our contacts in the social graph, who then repeat the same procedure with their own contacts, and so forth. Such a mode of delivery is very inefficient from a data communication perspective and can also lead to poor timeliness of the multimedia application comprising the content. In addition

    to their primary target of social interaction, we now also employ such applications to

    search for information online or to share multimedia content with our friends and

    families.
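The viral mode of delivery and its inefficiency can be illustrated with a small flooding simulation (names hypothetical): every node forwards the content to all of its contacts, so densely connected graphs incur far more transmissions than the number of new recipients.

```python
from collections import deque

def viral_spread(graph, seed):
    """Flood a content item over the social graph: the seed sends to all
    contacts, each recipient repeats with its own contacts, and so on.
    Returns (reached_nodes, transmissions); transmissions counts every
    send, including redundant sends to nodes that already hold the item.

    graph: dict node -> list of contacts
    """
    reached = {seed}
    sends = 0
    queue = deque([seed])
    while queue:
        u = queue.popleft()
        for v in graph.get(u, []):
            sends += 1  # u forwards to v even if v already has the content
            if v not in reached:
                reached.add(v)
                queue.append(v)
    return reached, sends
```

On a 3-node clique, two transmissions would suffice to reach everyone, yet flooding performs six, which is the waste the proposed system aims to avoid.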

    2. AUDIO SHARE:

    Distributed network directory systems feature a collection of peer

    computers interested in exchanging information of interest. Each peer hosts a set of

    information items, e.g., data files and audio files, that may be shared with other peers in the

    network upon demand. There is a registry server that maintains a directory of all hosts in

    the system. When the system is queried for information, it returns a subset of hosts,

    which are then directly contacted for the item of interest. The procedure is repeated until

    the search is successfully concluded. Since the hosts to be queried are selected at random

    by the tracking server, they may not always have the desired information item. Therefore,

    multiple rounds of query may need to be initiated for detection to occur.


    3. SOCIAL GRAPH REGULARIZATION:

    The social graph can be regularized with data network information to enable

    efficient content file casting among its nodes. Without such regularization, the selected hosts may lie at large distances from the querying peer in the topology of the underlying transport network, which in turn makes the exchange of information between the hosts and the peer inefficient and costly.
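One simple way to realize this regularization, sketched here under assumed names and an assumed functional form, is to discount each social edge's affinity by the transport distance d_ij reported by the network operator, so that content casting prefers contacts who are both socially relevant and topologically nearby.

```python
def regularize(social_affinity, distance, beta=0.5):
    """Combine social and data-network information into one edge weight.

    social_affinity : dict (i, j) -> social closeness of users i and j
    distance        : dict (i, j) -> transport distance d_ij between
                      their machines in the underlying network
    beta            : how strongly distance discounts affinity
                      (illustrative assumption, as is the 1/(1+x) form)
    """
    return {(i, j): a / (1.0 + beta * distance[(i, j)])
            for (i, j), a in social_affinity.items()}
```

Of two equally close friends, the one whose machine is nearer in the transport network receives the larger regularized weight.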


    SCHEME DIAGRAM

    Social network site side database schema

    Tracker side database schema

    HostInfo (IPaddress, ID, Distance)
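Given the MySQL back end named in the requirements, the tracker-side HostInfo relation above might be declared as follows; the column types and key choice are illustrative assumptions.

```sql
-- Tracker-side schema sketch (column types are assumptions)
CREATE TABLE HostInfo (
    ID        INT         NOT NULL AUTO_INCREMENT,
    IPaddress VARCHAR(45) NOT NULL,  -- dotted-quad or IPv6 text form
    Distance  INT         NOT NULL,  -- transport distance d_ij to this host
    PRIMARY KEY (ID)
);
```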


    ER-DIAGRAM


    SOFTWARE AND HARDWARE REQUIREMENTS

    HARDWARE REQUIREMENTS

    System : Minimum Pentium IV
    Hard Disk Space : Minimum 20 GB HDD
    RAM : Minimum 1 GB DDR RAM
    LAN : Minimum 2 machines

    SOFTWARE REQUIREMENTS

    Operating System : Windows XP / Windows 7
    Coding Language : J2EE, HTML
    Server Side Coding : JSP
    Client Side Coding : JavaScript
    Tool Kit : JDK 1.6 or above
    Server : Tomcat
    IDE : FrontPage / Dreamweaver
    Back End : MySQL


    REFERENCES

    [1] Jacob Chakareski, "Bridging Social and Data Networks," IEEE Signal Processing Magazine, p. 141, March 2012.

    [2] Michael Beye, Arjan Jeckmans, Zekeriya Erkin, Pieter Hartel, Reginald Lagendijk and Qiang Tang, "Literature Overview - Privacy in Online Social Networks."

    [3] Elmasri and Navathe, Fundamentals of Database Systems, 5th Edition, Addison Wesley, 2007.