GS1 – Design Specification



Thai Stock Data Web Scraping

1 System Architecture

Figure 1. System Architecture

First, a script is developed to scrape stock data from several stock websites. The script begins by scraping the symbols and full names of all Thai stocks from the SET (Stock Exchange of Thailand) website. It then scrapes the historical data of each stock from Yahoo's finance website: it reads the table containing the historical data and loops through the pages until there is no table left to scrape. This is done for every existing Thai stock.
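The page loop described above can be sketched as follows. This is a minimal illustration, not the project's actual script: `fetchPage`, `scrapeAllPages`, and the `maxPages` safety limit are hypothetical names, and the page fetcher is kept synchronous and in-memory so that the loop runs without a network connection.

```javascript
// Sketch only. fetchPage is a hypothetical callback that returns the rows of
// the historical-data table on the given page, or null when the page has no
// table. A real fetcher would be asynchronous and would parse HTML.
function scrapeAllPages(fetchPage, maxPages = 1000) {
  const rows = [];
  for (let page = 1; page <= maxPages; page++) {
    const table = fetchPage(page);
    if (!table || table.length === 0) break; // no table left to scrape
    rows.push(...table);
  }
  return rows;
}

// Usage with an in-memory stand-in for the website (made-up data).
const fakeSite = {
  1: [{ date: '2016-03-03', close: 10.5 }, { date: '2016-03-02', close: 10.4 }],
  2: [{ date: '2016-03-01', close: 10.2 }],
};
const history = scrapeAllPages((page) => fakeSite[page] || null);
console.log(history.length);
```

The `maxPages` limit is an added safeguard so that a site which never returns an empty page cannot keep the loop running forever.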

However, once data for a Thai stock already exists, the script scrapes only the latest data (available after the market has reached its closing price), which is the first row of the table. In addition, the script runs automatically once a day and updates the website without back-end changes affecting users. The adjusted prices are then calculated from the scraped data.
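The choice between a full scrape and a daily update could look like the sketch below. The function name `rowsToStore`, the `mode` field, and the row shape are assumptions made for illustration; the only behaviour taken from the text is that an existing stock needs just the first row of the table.

```javascript
// Hypothetical helper. storedDates is the set of dates already stored for one
// symbol; tableRows is the first page of the scraped table, newest row first.
function rowsToStore(storedDates, tableRows) {
  if (storedDates.size === 0) {
    // Nothing stored yet: the caller should continue through every page.
    return { mode: 'full', rows: tableRows };
  }
  const latest = tableRows[0];
  if (!latest || storedDates.has(latest.date)) {
    return { mode: 'incremental', rows: [] }; // already up to date
  }
  return { mode: 'incremental', rows: [latest] };
}

// Usage with made-up data.
const firstPage = [
  { date: '2016-03-03', close: 10.5 },
  { date: '2016-03-02', close: 10.4 },
];
const update = rowsToStore(new Set(['2016-03-02', '2016-03-01']), firstPage);
console.log(update.mode, update.rows.length);
```

Checking the stored dates before inserting keeps the daily job safe to re-run, for example after a market holiday when the first row has not changed.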

The scraped data also contains corporate actions, namely dividends and stock splits. The adjusted open, close, high, and low prices are calculated and stored in the database along with the unadjusted prices. Calculating them beforehand means the system performs the calculation only once, which helps avoid mistakes and repeated work, since the adjusted price will be requested as often as the unadjusted price.
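The specification does not state the adjustment formula, so the sketch below assumes a common back-adjustment convention: prices before a split are divided by the split ratio, and prices before a dividend are scaled by one minus the dividend divided by the previous close. The function name `adjustPrices` and the row and action shapes are hypothetical.

```javascript
// rows: daily prices sorted oldest first, e.g. { date, open, high, low, close }.
// actions: corporate actions keyed by ex-date (assumed shapes):
//   { date, type: 'split', value: 2 }      // 2 new shares per old share
//   { date, type: 'dividend', value: 1.5 } // cash dividend per share
function adjustPrices(rows, actions) {
  const actionsByDate = new Map();
  for (const action of actions) {
    if (!actionsByDate.has(action.date)) actionsByDate.set(action.date, []);
    actionsByDate.get(action.date).push(action);
  }

  const adjusted = new Array(rows.length);
  let factor = 1; // cumulative factor from all actions dated after the current row

  // Walk from newest to oldest so each row picks up every later action.
  for (let i = rows.length - 1; i >= 0; i--) {
    const row = rows[i];
    adjusted[i] = {
      date: row.date,
      open: row.open * factor,
      high: row.high * factor,
      low: row.low * factor,
      close: row.close * factor,
    };
    // Actions on this date change the factor for all earlier rows.
    for (const action of actionsByDate.get(row.date) || []) {
      if (action.type === 'split') {
        factor /= action.value;
      } else if (action.type === 'dividend' && i > 0) {
        factor *= 1 - action.value / rows[i - 1].close;
      }
    }
  }
  return adjusted;
}

// Usage with made-up data: a 2-for-1 split effective on the second day.
const prices = [
  { date: '2016-01-04', open: 100, high: 104, low: 98, close: 100 },
  { date: '2016-01-05', open: 50, high: 52, low: 49, close: 51 },
];
const adjustedPrices = adjustPrices(prices, [{ date: '2016-01-05', type: 'split', value: 2 }]);
console.log(adjustedPrices[0].close);
```

Because the result is stored next to the unadjusted rows, this function would only need to run when new rows or new corporate actions arrive.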

All the necessary data will be available in the database, so when a user wishes to export or view any data, the system can get it directly from the database. There is no need to scrape or calculate anything new at that point. The additional information on each corporate action is scraped from SET. The INDEX data, which measures the value of the stock market, is scraped from Google.


[Figure 1 diagram labels: Scraping Script, Stock Websites, Extract and Calculate, Database, Export, Display, Scraped Data, Browser]


2 Detailed Design

The system is developed using Meteor, a full-stack JavaScript framework. It is built on top of Node.js and allows the client side and server side to communicate with each other using JavaScript. This allows the system to update in real time without disrupting the user experience or the front end while the back end is being updated. The website does not use any theme and is built from the ground up.

2.1 Database

The system uses MongoDB, a document-oriented NoSQL database. Data is stored as JSON-like documents. This makes the data easy to index and fast to update, and the database can scale as the data grows. The current database is structured as below:

Figure 2. System Database

The database stores the adjusted and unadjusted prices of a stock separately. The details of corporate actions are stored separately as well. However, they all share a unique identifier, the stock symbol. A stock's data is identified by its symbol, and this structure allows the system to quickly retrieve only the stock data that the user asks for.
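To make the symbol-keyed layout concrete, the sketch below models the three kinds of documents as plain arrays and looks them up by symbol. All field names, symbols, and values are assumptions for illustration, since the actual layout is defined in Figure 2. The helper `findBySymbol` stands in for a query with the selector `{ symbol: 'AAA' }` on a collection.

```javascript
// Made-up documents; the field names are assumed, not taken from Figure 2.
const unadjustedDocs = [
  { symbol: 'AAA', date: '2016-03-01', open: 10.0, high: 10.4, low: 9.9, close: 10.2 },
  { symbol: 'BBB', date: '2016-03-01', open: 55.0, high: 56.0, low: 54.5, close: 55.5 },
];
const adjustedDocs = [
  { symbol: 'AAA', date: '2016-03-01', open: 5.0, high: 5.2, low: 4.95, close: 5.1 },
  { symbol: 'BBB', date: '2016-03-01', open: 55.0, high: 56.0, low: 54.5, close: 55.5 },
];
const corporateActionDocs = [
  { symbol: 'AAA', date: '2016-03-02', type: 'split', value: 2 },
];

// In-memory stand-in for a query by symbol.
function findBySymbol(collection, symbol) {
  return collection.filter((doc) => doc.symbol === symbol);
}

// Gather everything stored for one stock using only its symbol.
function loadStock(symbol) {
  return {
    symbol,
    unadjusted: findBySymbol(unadjustedDocs, symbol),
    adjusted: findBySymbol(adjustedDocs, symbol),
    actions: findBySymbol(corporateActionDocs, symbol),
  };
}

const stock = loadStock('AAA');
console.log(stock.unadjusted.length, stock.adjusted.length, stock.actions.length);
```

In a real deployment an index on the symbol field would keep these lookups fast as the collections grow.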


2.2 User Flowchart

To give a broad overview of user actions and how the system responds, a flowchart is drawn.

Figure 3. User Flowchart

The current system allows four main actions from the user. The first action is to click on any stock symbol to view its historical data. Users can then subscribe, filter, or export. Subscribing is possible only if the email address the user enters exists. Filtering and exporting are possible only if the data the user wishes to filter or export exists in the database. If the data the user wants does not exist, nothing happens at the moment. The improved system will notify users of the result and may be able to recommend existing data instead.
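One way the planned notification could work is for the filter to return a status and a message together with the rows, instead of silently returning nothing. The sketch below assumes ISO-formatted date strings, which compare correctly as plain strings; `filterByDateRange` and the shape of its result are hypothetical.

```javascript
// Hypothetical filter that always tells the caller what happened.
// Dates are assumed to be ISO strings (YYYY-MM-DD), so string comparison works.
function filterByDateRange(rows, from, to) {
  const matches = rows.filter((row) => row.date >= from && row.date <= to);
  if (matches.length === 0) {
    return { ok: false, rows: [], message: `No data between ${from} and ${to}.` };
  }
  return { ok: true, rows: matches, message: `${matches.length} row(s) found.` };
}

// Usage with made-up data.
const stored = [
  { date: '2016-03-01', close: 10.2 },
  { date: '2016-03-02', close: 10.4 },
  { date: '2016-03-03', close: 10.5 },
];
const found = filterByDateRange(stored, '2016-03-02', '2016-03-31');
const missing = filterByDateRange(stored, '2015-01-01', '2015-12-31');
console.log(found.message);
console.log(missing.message);
```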


The system's export feature can filter the data to be exported by market, price type, multiple symbols at once, and so on. To allow better comparison between two stocks, the system implements a comparison feature in which users choose two stocks and check their performance, that is, whether a stock shows weakness or strength when the market rises or falls. The system also calculates the total return on investment of a stock.
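The two calculations mentioned above can be sketched as follows, assuming the usual definitions: a price relative is the close of one stock divided by the close of the other on the same date, and total return is the price change plus dividends received, divided by the starting price. The function names and input shapes are hypothetical, and the document does not confirm that the system uses exactly these formulas.

```javascript
// closesA, closesB: arrays of { date, close }. Only dates present in both
// series (with a non-zero denominator) produce a ratio.
function priceRelative(closesA, closesB) {
  const bByDate = new Map(closesB.map((row) => [row.date, row.close]));
  return closesA
    .filter((row) => bByDate.has(row.date) && bByDate.get(row.date) !== 0)
    .map((row) => ({ date: row.date, ratio: row.close / bByDate.get(row.date) }));
}

// Total return = (end price - start price + dividends received) / start price.
function totalReturn(startPrice, endPrice, dividendsPerShare = 0) {
  return (endPrice - startPrice + dividendsPerShare) / startPrice;
}

// Usage with made-up data.
const ratios = priceRelative(
  [{ date: '2016-03-01', close: 20 }, { date: '2016-03-02', close: 30 }],
  [{ date: '2016-03-01', close: 10 }, { date: '2016-03-02', close: 10 }]
);
console.log(ratios.map((r) => r.ratio));
console.log(totalReturn(100, 110, 5));
```

A rising ratio means the first stock is outperforming the second over that period, and a falling ratio means it is underperforming.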

The system allows users to search with different combinations of criteria to find a stock. Also, users who create an account are able to manage their subscriptions. In addition, logged-in users can favorite stocks and personalize their list to view the stocks that matter most to them, and export the list as a CSV file as well.
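A CSV export of a favorites list might be built as in the sketch below. The function `toCsv`, the column names, and the sample row are assumptions; the quoting follows the common CSV convention of wrapping fields that contain commas, quotes, or line breaks in double quotes and doubling any embedded quotes.

```javascript
// Convert an array of objects to CSV text using the given column order.
function toCsv(rows, columns) {
  const escapeField = (value) => {
    const text = value === null || value === undefined ? '' : String(value);
    // Quote fields containing a comma, quote, or line break; double inner quotes.
    return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
  };
  const lines = [columns.map(escapeField).join(',')];
  for (const row of rows) {
    lines.push(columns.map((column) => escapeField(row[column])).join(','));
  }
  return lines.join('\r\n');
}

// Usage with a made-up favorites list.
const favorites = [{ symbol: 'AAA', name: 'Alpha, Inc.', close: 10.2 }];
const csv = toCsv(favorites, ['symbol', 'name', 'close']);
console.log(csv);
```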

2.3 User Interface

Figure 4. Homepage


Figure 5. Company/Stock List Page

Figure 6. Stock Data Page (Filter Date, Export, Graph)


Figure 7. Stock Data Page (Historical Price Data)

Figure 8. Stock Data Page (Corporate Actions)


Figure 9. Stock Data Page (Total Return on Investment Calculator)

Figure 10. Stock Data Page (Dividend Reinvestment Calculator)


Figure 11. Bulk Export Page

Figure 12. Custom Export Page


Figure 13.1. Price Relative Indicator (Form)

Figure 13.2. Price Relative Indicator (Close Price)


Figure 13.3. Price Relative Indicator (Ratio)

Figure 14.1. Search (criteria all checked)


Figure 14.2. Search (example)

Figure 15.1. Subscribe Form


Figure 15.2. Subscribe Confirmation Email

Figure 15.3. Subscribe Daily Email


Figure 16.1. Logged in user view (Additional menu and favorite button)

Figure 16.2. Logged in user view (Manage subscriptions)


Figure 16.3. Logged in user view (Manage and export favorites)
