Databases Portfolio - ssa.group .products categories, manufacturers and sellers, price changes...


  • date post

    03-Jul-2018

  • Databases Portfolio

  • Project

    Description

    Our Universal Web Crawler crawls more than 40 web portals of different trade companies and stores data about their products in the database. The record for each product contains its code, model, description, all technical features listed on the site, price (if it is shown on the site), unit of measure for the purchased quantity, etc.

    The interface is built with ASP.NET, and the programming language is C# (.NET Framework 3.5). The interface allows setting all crawling parameters, scheduling crawlers, and running them, and it shows information about each crawler's last run. Detailed logging is available for monitoring the process, so you can determine where a crawler worked incorrectly and rectify the situation.

    Our database contains information about several million products, and this number increases every day. You can view summary data for any category, manufacturer, and period, as well as detailed information about goods.

    SQL Server 2008 R2 Enterprise is used for data storage and processing. There are two databases: the first serves as an Online Transaction Processing (OLTP) database and the second as a Data Warehouse. Data is transferred to the Data Warehouse by transactional replication. SQL Server Agent jobs are used extensively for purposes such as database maintenance, filling summary tables, and notification about the current state of different processes. CLR stored procedures and functions are used for tasks that cannot be implemented in Transact-SQL, for example using regular expressions or downloading data from external sources (the Internet or local networks).

    SQL Server Reporting Services is used to generate a number of reports about both the current state of the system and the macro-level activity of different product categories, manufacturers, and sellers: price changes during any period, comparison of prices between companies, analysis of price index dynamics, etc. Multidimensional structures and data mining models in SQL Server Analysis Services are used to identify the main pricing trends for different product categories, manufacturers, and sellers. SQL Server Integration Services packages are used heavily for database maintenance and other tasks such as uploading data to the FTP server.
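The price reports mentioned above (price changes over a period, price comparison between sellers, price index dynamics) can be illustrated with a small sketch. It is written in Python for brevity rather than in the project's T-SQL or C#. The record layout, the sample data, and the function names are hypothetical. The source does not say which index formula the reports use, so an unweighted mean of price relatives is assumed here.

```python
from datetime import date
from statistics import mean

# Hypothetical observations: (product_code, seller, observation_date, price).
OBSERVATIONS = [
    ("P-1", "seller-a", date(2018, 1, 1), 100.0),
    ("P-1", "seller-a", date(2018, 2, 1), 110.0),
    ("P-2", "seller-a", date(2018, 1, 1), 50.0),
    ("P-2", "seller-a", date(2018, 2, 1), 45.0),
    ("P-1", "seller-b", date(2018, 2, 1), 105.0),
]


def price_on(rows, product, seller, day):
    """Return the latest known price on or before `day`, or None."""
    known = [(d, p) for code, s, d, p in rows
             if code == product and s == seller and d <= day]
    return max(known)[1] if known else None


def price_changes(rows, seller, start, end):
    """Relative price change per product for one seller between two dates."""
    changes = {}
    for product in sorted({code for code, s, _, _ in rows if s == seller}):
        before = price_on(rows, product, seller, start)
        after = price_on(rows, product, seller, end)
        if before and after:  # skip products without a price at both dates
            changes[product] = after / before - 1.0
    return changes


def price_index(rows, seller, start, end):
    """Assumed index: unweighted mean of price relatives, base 100 at `start`."""
    changes = price_changes(rows, seller, start, end)
    if not changes:
        return None
    return 100.0 * mean(1.0 + change for change in changes.values())


def compare_sellers(rows, product, day):
    """Price of one product at each seller that has a known price on `day`."""
    sellers = sorted({s for code, s, _, _ in rows if code == product})
    prices = {s: price_on(rows, product, s, day) for s in sellers}
    return {s: p for s, p in prices.items() if p is not None}
```

In a warehouse setting the same calculations would more likely live in summary tables or report queries; the sketch only shows the arithmetic.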

    Universal web crawling system


    Confidential Information. SSA LTD Databases Portfolio

    www.ssa-outsourcing.com

    [Architecture diagram. Components shown: web sites and portals 1 ... N; crawling core instances 1 ... M (XPath, XML, regular expressions, proxy servers, anonymizers, etc.); database; Reporting Services; admin panel; logs viewer.]

    Tools / Technologies

    ASP.NET is used to build the interface.
    The programming language is C# (.NET Framework 3.5).
    SQL Server 2008 R2 Enterprise is used for data storage and processing.

  • Techniques and Approaches

    Transforming malformed HTML markup into a well-formed XML document using SgmlReader
    Organizing the crawl of all pages of a website, or of a given part of it, to retrieve the necessary data
    Filling in required input fields where needed (e-mail, ZIP code, login/password, etc.)
    Managing the scanning of Internet information resources
    Automatically initiating crawler runs
    Configuring and managing crawlers through the user interface, and providing different types of reports based on the results of their work
    Selecting the necessary information from a given Internet information resource
    Handling HTML frames
    Handling complex AJAX constructions
    Handling custom JavaScript constructions
    Handling exception pages such as "404 Not Found", "Site under reconstruction", etc.
    Avoiding blacklists
    Using anonymizers
    Using custom proxy servers
    Using regular expressions and XPath to extract textual and graphical information
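The first and last techniques in the list (repairing malformed HTML into a well-formed tree, then extracting data with XPath and regular expressions) can be sketched as follows. The project uses SgmlReader in C#; this sketch swaps in Python's standard `html.parser` and `xml.etree.ElementTree` purely for illustration. The sample markup, the class names `product`, `name`, and `price`, and all function names are hypothetical.

```python
import re
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

VOID_TAGS = {"br", "hr", "img", "input", "meta", "link"}


class TreeRepairer(HTMLParser):
    """Builds a well-formed element tree from tag soup."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.root = ET.Element("document")
        self.stack = [self.root]  # chain of currently open elements

    def handle_starttag(self, tag, attrs):
        attributes = {key: value or "" for key, value in attrs}
        element = ET.SubElement(self.stack[-1], tag, attributes)
        if tag not in VOID_TAGS:  # void tags never stay open
            self.stack.append(element)

    def handle_endtag(self, tag):
        # Close everything up to the nearest matching open tag;
        # a stray end tag with no match is ignored.
        for index in range(len(self.stack) - 1, 0, -1):
            if self.stack[index].tag == tag:
                del self.stack[index:]
                break

    def handle_data(self, data):
        current = self.stack[-1]
        if len(current):  # text after a child belongs to that child's tail
            last = current[-1]
            last.tail = (last.tail or "") + data
        else:
            current.text = (current.text or "") + data


def html_to_tree(html):
    parser = TreeRepairer()
    parser.feed(html)
    parser.close()
    return parser.root


NUMBER_RE = re.compile(r"\d+(?:\.\d+)?")


def extract_products(html):
    """Select product nodes with XPath, then pull the price out with a regex."""
    products = []
    for node in html_to_tree(html).findall(".//div[@class='product']"):
        name = node.find(".//span[@class='name']")
        price = node.find(".//span[@class='price']")
        match = NUMBER_RE.search("".join(price.itertext())) if price is not None else None
        products.append({
            "name": "".join(name.itertext()).strip() if name is not None else None,
            "price": float(match.group()) if match else None,
        })
    return products


# Malformed on purpose: an unclosed <span>, a bare <br>, a stray </p>.
SAMPLE_HTML = """
<div class="product"><span class="name">Drill X100</span><br><span class="price">Price: 49.90 USD</div>
<div class="product"><span class="name">Saw S2</span><span class="price">n/a</span></div></p>
"""
```

Calling `extract_products(SAMPLE_HTML)` yields one record per product, with `None` as the price when the site does not show one. `ElementTree` supports only a subset of XPath, so a production crawler would need a fuller XPath engine.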

    Duration: More than 5 years (current)
    Team Size: PM, 1 .NET developer, 1 database developer
    Cooperation Model: Dedicated Team

    Customer's Feedback

    Excellent provider. Very happy with their service, professionalism, and support. Highly recommended.
    Nathan Krol, Stanley, USA


  • Project

    Description

    The system organizes a fully automated cycle of placing bets on any sport on the betfair.com site. Its functionality is described here using horse racing as the example.

    HRBS (the Horse Racing Betting System) searches for and fetches data on all races in Great Britain and Ireland for the next few days. This is done by crawlers, also known as search robots.

    The system fetches both race cards and statistical information about horses, trainers, and jockeys. It also takes into account the pedigree and current live odds of every runner, the going conditions, and the courses.

    HRBS includes a parser that prepares all the statistical information for decision-making by the predictor. The predictor evaluates a few dozen parameters for every runner and builds an internal rating for each runner, which allows it to generate predictions for both likely winners and the runners most likely to lose.
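The idea of an internal rating can be illustrated with a minimal sketch. The source does not describe how the predictor combines its parameters, so a weighted sum of normalized features is assumed here. The feature names, weights, and function names are hypothetical, and the real predictor evaluates far more parameters.

```python
from dataclasses import dataclass


@dataclass
class Runner:
    name: str
    features: dict  # parameter name -> value normalized to [0, 1]


# Hypothetical weights; the actual parameters are not given in the source.
WEIGHTS = {"recent_form": 0.5, "jockey_win_rate": 0.3, "going_suitability": 0.2}


def rating(runner, weights=WEIGHTS):
    """Assumed rating: weighted sum of features; missing features count as 0."""
    return sum(weight * runner.features.get(key, 0.0)
               for key, weight in weights.items())


def predict(runners):
    """Return (likely winner, likely loser) by rating, or None for no runners."""
    if not runners:
        return None
    ranked = sorted(runners, key=rating, reverse=True)
    return ranked[0].name, ranked[-1].name
```

Ranking the whole field in one pass gives both predictions at once: the top of the list is the candidate to back and the bottom is the candidate to lay.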

    Once the predictions are generated, the Free Betfair API can be used to place bets automatically on betfair.com. There are many strategies for generating predictions and placing bets; each strategy is implemented in a so-called Betfair bot. The Bots controller keeps the bots running stably, monitors the bet-placing process, and warns about faults and errors.

    The bots are highly configurable. For each bot the user can set a stop loss and a stop win, as well as limits on the number of runners, race types, distance, going conditions, etc.
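The bot settings described above can be pictured as a small configuration object that is checked before each bet. This is a sketch under assumptions: the field names, default values, and units (furlongs) are hypothetical, and the real bots are configured through the system's own interface.

```python
from dataclasses import dataclass


@dataclass
class BotConfig:
    stop_loss: float   # stop once the cumulative loss reaches this amount
    stop_win: float    # stop once the cumulative profit reaches this amount
    min_runners: int = 2
    max_runners: int = 20
    min_distance: float = 5.0    # furlongs (assumed unit)
    max_distance: float = 32.0
    race_types: frozenset = frozenset({"flat", "hurdle", "chase"})
    goings: frozenset = frozenset({"firm", "good", "soft", "heavy"})


@dataclass
class Race:
    race_type: str
    runners: int
    distance: float
    going: str


class Bot:
    def __init__(self, config):
        self.config = config
        self.profit = 0.0  # running profit and loss for this bot

    def is_stopped(self):
        """True once either the stop loss or the stop win has been hit."""
        return (self.profit <= -self.config.stop_loss
                or self.profit >= self.config.stop_win)

    def accepts(self, race):
        """Check the stop limits first, then every race filter."""
        cfg = self.config
        return (not self.is_stopped()
                and cfg.min_runners <= race.runners <= cfg.max_runners
                and cfg.min_distance <= race.distance <= cfg.max_distance
                and race.race_type in cfg.race_types
                and race.going in cfg.goings)

    def record_result(self, pnl):
        """Add the profit or loss of a settled bet to the running total."""
        self.profit += pnl
```

Keeping the limits in data rather than in strategy code means one controller can supervise many bots with different risk settings.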

    Horse racing betting system


    [Architecture diagram. Components shown: web data sources; Betfair.com; crawling control panel; horse pedigree crawler; racing cards crawler; statistics crawler; results crawler / RRS reader; database; data analyser; results predictor; web monitor; Betfair API; Betfair bots / robots 1 ... N; Bots controller.]

  • The system also contains a component that checks race results in real time and reports whether each stake was won or lost.

    The work of the whole system is logged and can be monitored in real time through the Web monitor, which collects statistics on the performance of each bot and calculates the profit of each one. HRBS also includes a web service for connecting to other applications such as Secret Horse, Horse Reminder, etc.
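Settling stakes and totalling profit per bot, as described above, comes down to simple arithmetic on decimal odds. The sketch below makes several assumptions: all bets are on the same race with a single winner, exchange commission is ignored, and for a lay bet `stake` means the backer's stake being accepted. The `Bet` record and the function names are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Bet:
    bot: str
    runner: str
    side: str     # "back" (runner to win) or "lay" (runner to lose)
    stake: float
    odds: float   # decimal odds


def settle(bet, winner):
    """Profit (positive) or loss (negative) of one bet, commission ignored."""
    runner_won = bet.runner == winner
    if bet.side == "back":
        return bet.stake * (bet.odds - 1.0) if runner_won else -bet.stake
    if bet.side == "lay":
        return -bet.stake * (bet.odds - 1.0) if runner_won else bet.stake
    raise ValueError(f"unknown bet side: {bet.side!r}")


def profit_by_bot(bets, winner):
    """Total settled profit per bot for one race."""
    totals = defaultdict(float)
    for bet in bets:
        totals[bet.bot] += settle(bet, winner)
    return dict(totals)
```

For example, a 10-unit back bet at odds 3.0 on the winner returns a profit of 20, while a 4-unit lay at odds 3.0 on the same runner loses 8.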


    Tools / Technologies: C++, C#.NET, ASP.NET, MS SQL Server 2008, DotNetNuke
    Duration: 2 years
    Team Size: PM, database developer, ASP.NET developer, QA engineer
    Cooperation Model: Time and Materials

    Customer's Feedback

    "We have been very pleased with SSA. They listen to our requirements and provide solutions that meet our needs. We have been very impressed with their technical knowledge, attention to detail and ability to deliver on schedule."
    Company name under NDA