Students vs. Engineers
From a Student to an Engineer
http://imgur.com/gallery/yvBNw
ITzu Chen, PillPack 11/22/2016
About Me
• Senior software engineer
• Worked at several start-ups, in cloud computing, financial tech, and now healthcare tech
• PillPack - a full-service online pharmacy
PillPack
Engineering Challenges
• Unstructured data, including prescriptions, insurance claims, and drug information
• Manual processes, like chasing prescription refills, packing medications, and verifying prescriptions
• Controlled substances, insurance, privacy regulations, etc.
• Robots to help pack medications
• Prescription verification tool
• Pre-checked drug conflicts
• Automatic insurance claim processing
Roadmap
• Good coding style
• Business requirements
• Software architecture
• Tool selection
Good code?
- XKCD
Good code?
https://www.amazon.com/dp/0132350882/?tag=stackoverfl08-20
Coding style
• Meaningful variable names
• Well-organized code structure
• Smaller methods
• Comments
• Unit tests
https://en.wikipedia.org/wiki/Best_coding_practices#Design
If I follow every coding style guide, will I be a “good” programmer?
“A good programmer can be 10X more productive than a mediocre one” - Steve McConnell, author of software engineering books such as Code Complete, Rapid Development, and Software Estimation
http://www.construx.com/10x_Software_Development/Origins_of_10X_–_How_Valid_is_the_Underlying_Research_/
Not really
• A lot of planning happens before the first line of code is written:
• Business requirements
• Software architecture
• Tool selection
Roadmap
• Good coding style
• Business requirements
• Software architecture
• Select the right tools
Business Requirements
• What’s the purpose of the application?
• What are the features of the application?
• What might be the bottleneck?
Wikipedia: online encyclopedia
Features
1. Show static pages filled with text and pictures
2. Let users add new content
3. Reference other pages
4. Search articles
Bottlenecks
1. More reads than writes
2. Massive unstructured data
3. Search efficiency
Netflix: on-demand content streaming
Features
1. Provide on-demand content
2. Suggest related content
3. Search content
Bottlenecks
1. Latency has to be low
2. Lots of content
Roadmap
• Good coding style
• Business requirements
• Software architecture
• Select the right tools
Software Architecture
In Patterns of Enterprise Application Architecture
https://www.amazon.com/Patterns-Enterprise-Application-Architecture-Martin/dp/0321127420
Adaptable, Maintainable, Minimize complexity
Analogy: House Architecture
http://www.roomsketcher.com/features/2d-floor-plans/
Break Systems into Components
• Separation of concerns: as little overlap between components as possible (low coupling)
• Single responsibility principle: each component is responsible for only one thing
• Least knowledge principle: a component should not know the details of other components
• DRY (don’t repeat yourself): the same functionality should exist in only one place
https://msdn.microsoft.com/en-us/library/ee658124.aspx
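The four principles can be sketched in a few lines of Python. This is only an illustration: ArticleStore, ArticleRenderer, ArticleController and normalize_title are hypothetical names made up for this example, not part of any system mentioned in the talk.

```python
def normalize_title(title):
    """DRY: the one place that decides how article titles are normalized."""
    return title.strip().lower().replace(" ", "_")


class ArticleStore:
    """Single responsibility: storing and fetching article text."""

    def __init__(self):
        self._articles = {}

    def save(self, title, body):
        self._articles[normalize_title(title)] = body

    def load(self, title):
        return self._articles[normalize_title(title)]


class ArticleRenderer:
    """Single responsibility: turning article text into a page."""

    def render(self, title, body):
        return "<h1>{0}</h1><p>{1}</p>".format(title, body)


class ArticleController:
    """Least knowledge: uses the store and renderer only through their public methods."""

    def __init__(self, store, renderer):
        self._store = store
        self._renderer = renderer

    def show(self, title):
        return self._renderer.render(title, self._store.load(title))


store = ArticleStore()
store.save("Hello World", "First article")
controller = ArticleController(store, ArticleRenderer())
page = controller.show("hello world")
```

Because the controller never touches the store's internal dictionary, the storage could later move to a real database without changing the controller or the renderer.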
House Architecture Analogy
• Separation of concerns
• Single responsibility principle
• Least knowledge principle
• DRY
Trade-off
• Cost/performance
• Business requirements
• Implementation complexity
Resource Constraints
Studio: http://www.rukle.com/at/4605/ks-studio-floor/1603/
Dorm: http://www.friv5games.com/e59807d8a38af8b8-college-dorm-room-floor-plans.html
Caveat
“Premature optimization is the root of all evil” - Donald Knuth, author of The Art of Computer Programming and creator of TeX
Waterfall development is less favored than agile development nowadays.
• Waterfall development: a sequential development process
• Agile development: an iterative, short-cycle development process
Agile Development
• Bear in mind the design principles
• Based on limited resources, design a flexible system architecture that can be maintained and adapted easily
Wikipedia: online encyclopedia
Features
1. Show static pages filled with text and pictures
2. Let users add new content
3. Reference other pages
4. Search articles
Bottlenecks
1. More reads than writes
2. Massive unstructured data
3. Search efficiency
Wikipedia
System diagram (web framework):
• Frontend: display info
• Controller: accept frontend requests, get data from DB, process data
• Database: data storage, search
Which part of the system is most likely to be the bottleneck?
Wikipedia
• Frontend: display info
• Controller: accept frontend requests, get data from DB, process data
• Database cluster: data storage, search
Requirement changed! What if the website receives more write traffic than we expected?
Wikipedia
• Frontend: display info
• Controller: accept frontend requests, get data from DB
• Workers: process data
• Database cluster: data storage, search
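One way to picture the move from "the controller processes data" to "workers process data" is a job queue. The sketch below uses Python's standard queue and threading modules. The names handle_write, process and articles are hypothetical stand-ins, and the in-memory dictionary only imitates the database cluster.

```python
import queue
import threading

jobs = queue.Queue()             # work handed from the controller to the workers
articles = {}                    # stand-in for the database cluster
articles_lock = threading.Lock()


def process(text):
    """Placeholder for the expensive 'process data' step."""
    return text.strip().capitalize()


def handle_write(title, text):
    """Controller: accept the request, queue the work, and return right away."""
    jobs.put((title, text))
    return "accepted"


def worker():
    """Worker: pull jobs off the queue and do the processing."""
    while True:
        job = jobs.get()
        if job is None:          # sentinel value: shut this worker down
            jobs.task_done()
            break
        title, text = job
        with articles_lock:
            articles[title] = process(text)
        jobs.task_done()


workers = [threading.Thread(target=worker, daemon=True) for _ in range(2)]
for w in workers:
    w.start()

handle_write("python", "  a programming language ")
handle_write("tex", " a typesetting system")
jobs.join()                      # wait until every queued write is processed

for _ in workers:
    jobs.put(None)               # ask each worker to stop
```

If write traffic grows, only the number of workers needs to change; the controller and frontend stay as they are.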
Components Inside Components
• Break components into smaller components/modules/classes
• Fill in more details while you break it down
Design Practice
• Keep the design consistent on each layer
• Understand how components will communicate with each other
• Keep the data format consistent
• Establish a coding style and naming convention for development
https://msdn.microsoft.com/en-us/library/ee658124.aspx
Keep Design Consistent
• Suppose you put a microwave, stove, and fridge in your bedroom because you’re sometimes hungry at night.
• The electricity in the house becomes unstable, and your roommate can’t figure out why.
How do Components Communicate?
• You wouldn’t want 10 doors in your room
Keep Data Format Consistent
• Like 110 V in every room in the house
• Understanding requirements and software architecture helps you write “good code”
• For example, let’s say you join Facebook and your first task is to play a sound when a user expresses an emotion. How do you find the right place to do it in millions of lines of code?
http://www.construx.com/10x_Software_Development/Origins_of_10X_–_How_Valid_is_the_Underlying_Research_/
Example: Trading System
• Project at a financial-tech company
• Trading systems buy and sell stocks based on incoming data and have to show real-time reports
Trading System
Components: trading strategy, real-time data streaming, financial broker, real-time report
Steps:
• Receiving data
• Processing data
• Making trading decision
• Send trading request
• Receive trading result
• Generate report
First breakdown
Trading System
• Trading strategy: making trading decision, generate report
• Order system: send trading request, receive trading result
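class TradingServer(object):
    # process_event is a helper that is not shown on the slide.

    def __init__(self, trading_server, config):
        super(TradingServer, self).__init__()
        self.order_system = OrderSystem()
        # Assumption: the threshold and contract come from the config.
        self.threshold = config["threshold"]
        self.contract = config["contract"]
        self.current_trade = None

    def eventOccurred(self, event):
        msg = self.process_event(event)
        self.makeDecision(msg)

    def makeDecision(self, msg):
        # Buy above the threshold, sell otherwise.
        if msg["value"] > self.threshold:
            order = {"action": "buy"}
        else:
            order = {"action": "sell"}
        self.current_trade = self.order_system.placeOrder(self.contract, order)

    def updateOrderStatus(self):
        return self.order_system.getOrderStatus(self.current_trade)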
What if we want to get some extra order information?
Don’t get the information from the third-party server directly. Always go through the order system.
In the TradingServer class above, eventOccurred serves as the API for real-time data streaming.
class OrderSystem(object):
    # initOrder, now, sendRequest, setStatus and Log are not shown on the slide.

    def __init__(self):
        super(OrderSystem, self).__init__()
        self._order = {}        # order_id -> order details
        self._orderStatus = {}  # order_id -> status object, created by initOrder

    def placeOrder(self, contract, order, wait=True, account=None, client=None, options=None):
        order_id = self.initOrder()
        # Merge optional settings into the contract; contract values win on conflict.
        if options is None:
            Ordering = contract
        else:
            Ordering = {**options, **contract}
        self._order[order_id] = {
            'contract': contract,
            'order': order,
            'Ordering': Ordering
        }
        Log.info("Place Order: {0}".format(Ordering))
        placed_time = self.now()
        self.sendRequest('NewOrderSingle', Ordering)
        self._orderStatus[order_id].setAttr('placedTime', placed_time)
        return order_id

    def updateOrderStatus(self, id, msg):
        self.setStatus(id, msg)
        Log.info("Order Statu ...")

    def getOrderStatus(self, order_id):
        return self._orderStatus[order_id]
• placeOrder, getOrderStatus: API for Trading System
• updateOrderStatus: API for financial broker
Translate broker response to desired data format
Real-time report
Whoever works on the report component can focus on making the report pretty
self.fields = {
    'OrderQty': 50.0,
    'CumQty': 50.0,
    'OrdType': '1',
    'ExecType': 'F',
    'OrdStatus': '2',
    'Symbol': 'CL',
    'LastQty': 0,
    'LeavesQty': 0.0,
    'OrderID': '208552373',
    'Account': '3353084QAFU',
    'TargetCompID': '3353084',
    'LastPx': 92.56,
    'SenderCompID': 'TRAD',
    'MsgType': '8',
    'AvgPx': 92.5414,
    'MaturityMonthYear': '201212',
    'Side': '1',
    'SecurityType': 'FUTURE',
    'ClOrdID': None
}
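Translating the broker response into one agreed format might look like the sketch below. The function to_report_row and the output keys are hypothetical. The meanings given to the coded values are an assumption: the field names resemble FIX protocol execution-report fields, where Side '1' conventionally means buy and OrdStatus '2' means filled, but the broker's own specification is the authority.

```python
# Assumed meanings of the coded values (FIX-style conventions);
# verify against the broker's own specification before relying on them.
SIDE = {"1": "buy", "2": "sell"}
ORDER_STATUS = {"0": "new", "1": "partially filled", "2": "filled"}


def to_report_row(fields):
    """Translate a raw broker response into the format the report expects."""
    return {
        "order_id": fields["OrderID"],
        "symbol": fields["Symbol"],
        "side": SIDE.get(fields["Side"], "unknown"),
        "status": ORDER_STATUS.get(fields["OrdStatus"], "unknown"),
        "filled_qty": fields["CumQty"],
        "avg_price": fields["AvgPx"],
    }


# A subset of the broker fields shown above.
broker_fields = {
    "OrderQty": 50.0, "CumQty": 50.0, "OrdStatus": "2", "Symbol": "CL",
    "OrderID": "208552373", "AvgPx": 92.5414, "Side": "1",
}
row = to_report_row(broker_fields)
```

With the translation kept in one place, the report component only ever sees readable keys and values and does not need to know the broker's codes.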
Roadmap
• Good coding style
• Business requirements
• Software architecture
• Select the right tools
Select the Right Tool
• Language
• Database
• Web framework
• No single correct answer
• Trade-offs, trade-offs, trade-offs
Language
https://twitter.com/gerardolsj/status/634126156501876737
Language
• Static typing / Dynamic typing
• Client side / Server side
• Functional / Imperative
• Dynamic typing: less verbose; types are checked at run time
• Static typing: better optimization; fewer runtime errors
https://www.sitepoint.com/typing-versus-dynamic-typing/
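A small Python sketch shows what "checked at run time" means in practice. The function total_quantity is a made-up example. Python's type hints document the intended types but are not enforced when the program runs, so the mistake below surfaces only when the addition executes. A separate static checker such as mypy could flag the bad call without running the code.

```python
from typing import List


def total_quantity(quantities: List[float]) -> float:
    """Sum order quantities; the hints state intent but are not enforced at run time."""
    total = 0.0
    for quantity in quantities:
        total += quantity
    return total


good_total = total_quantity([50.0, 25.0])

# Nothing stops a caller from passing strings. The error appears only
# when the addition actually runs.
try:
    total_quantity(["50", "25"])
    error = None
except TypeError as exc:
    error = exc
```

In a statically typed language the second call would be rejected at compile time, which is the "fewer runtime errors" side of the trade-off.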
Client side: request data from server, display info
- JavaScript
- UI related: CSS, HTML
Server side: fulfill user requests, data processing
- PHP
- Python
- Ruby
- Java
- C++/C
https://www.sitepoint.com/typing-versus-dynamic-typing/
• Functional: what task to perform
• Imperative: how to perform the task
http://wiki.c2.com/?QuickSortInHaskell
http://www.algolist.net/Algorithms/Sorting/Quicksort
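The two links compare quicksort written in each style. The same contrast can be sketched in Python; these are illustrative versions written for this comparison, not the code from the linked pages.

```python
def quicksort_functional(items):
    """Describe the result: smaller items, then the pivot, then larger items."""
    if len(items) <= 1:
        return list(items)
    pivot, rest = items[0], items[1:]
    smaller = [x for x in rest if x < pivot]
    larger = [x for x in rest if x >= pivot]
    return quicksort_functional(smaller) + [pivot] + quicksort_functional(larger)


def quicksort_imperative(items, low=0, high=None):
    """Spell out the steps: move two indices and swap elements in place."""
    if high is None:
        high = len(items) - 1
    if low >= high:
        return
    pivot = items[(low + high) // 2]
    i, j = low, high
    while i <= j:
        while items[i] < pivot:
            i += 1
        while items[j] > pivot:
            j -= 1
        if i <= j:
            items[i], items[j] = items[j], items[i]
            i += 1
            j -= 1
    quicksort_imperative(items, low, j)
    quicksort_imperative(items, i, high)


data = [3, 1, 4, 1, 5, 9, 2, 6]
sorted_copy = quicksort_functional(data)   # returns a new list
in_place = list(data)
quicksort_imperative(in_place)             # modifies the list it is given
```

The functional version is shorter and reads like a definition, while the imperative version avoids building new lists at every step.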
Example
• Splitting a string: which language would you choose for the task?
#include <ctime>
#include <iostream>
#include <string>
#include <vector>

using namespace std;

// Non-owning view of a substring: a pointer into the original string plus a length.
class StringRef {
private:
    char const* begin_;
    int size_;

public:
    int size() const { return size_; }
    char const* begin() const { return begin_; }
    char const* end() const { return begin_ + size_; }

    StringRef(char const* const begin, int const size)
        : begin_(begin), size_(size) {}
};

// Split str on delimiter without copying any characters.
vector<StringRef> split3(string const& str, char delimiter = ' ') {
    vector<StringRef> result;

    enum State { inSpace, inToken };

    State state = inSpace;
    char const* pTokenBegin = 0;  // Init to satisfy compiler.
    for (auto it = str.begin(); it != str.end(); ++it) {
        State const newState = (*it == delimiter ? inSpace : inToken);
        if (newState != state) {
            switch (newState) {
            case inSpace:
                result.push_back(StringRef(pTokenBegin, &*it - pTokenBegin));
                break;
            case inToken:
                pTokenBegin = &*it;
            }
        }
        state = newState;
    }
    if (state == inToken) {
        // str.data() + str.size() avoids dereferencing the end iterator.
        result.push_back(StringRef(pTokenBegin, str.data() + str.size() - pTokenBegin));
    }
    return result;
}

int main() {
    string input_line;
    long count = 0;
    int sec, lps;
    time_t start = time(NULL);

    cin.sync_with_stdio(false);  // disable synchronous IO

    // Testing getline directly avoids counting a final failed read.
    while (getline(cin, input_line)) {
        vector<StringRef> const v = split3(input_line);
        count++;
    }

    sec = (int)(time(NULL) - start);
    cerr << "C++ : Saw " << count << " lines in " << sec << " seconds.";
    if (sec > 0) {
        lps = count / sec;
        cerr << " Crunch speed: " << lps << endl;
    } else
        cerr << endl;
    return 0;
}
Task: Split a string
Pros: performance, less error-prone
http://stackoverflow.com/questions/9378500/why-is-splitting-a-string-slower-in-c-than-python
#!/usr/bin/env python
from __future__ import print_function
import time
import sys

count = 0
start_time = time.time()
dummy = None

for line in sys.stdin:
    dummy = line.split()
    count += 1

delta_sec = int(time.time() - start_time)
print("Python: Saw {0} lines in {1} seconds. ".format(count, delta_sec), end='')
if delta_sec > 0:
    lps = int(count/delta_sec)
    print(" Crunch Speed: {0}".format(lps))
else:
    print('')
Task: Split a string
Pros: Less dev time, easy to read
http://stackoverflow.com/questions/9378500/why-is-splitting-a-string-slower-in-c-than-python
Database
• SQL:
  • the schema has to be defined first
  • suitable for tightly related data
  • scales vertically
• NoSQL:
  • good for unstructured data
  • scales horizontally
https://www.sitepoint.com/sql-vs-nosql-differences/
http://www.thegeekstuff.com/2014/01/sql-vs-nosql-db/?utm_source=tuicool
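The "schema first" difference can be seen with the Python standard library alone. In this sketch sqlite3 plays the SQL side, and a plain list of JSON documents stands in for a document-style NoSQL store. The table and field names are invented for the example, and a real NoSQL database offers far more than a list does.

```python
import json
import sqlite3

# SQL: the schema is declared before any row is stored.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, symbol TEXT NOT NULL, qty REAL NOT NULL)"
)
conn.execute("INSERT INTO orders (symbol, qty) VALUES (?, ?)", ("CL", 50.0))
sql_rows = conn.execute("SELECT symbol, qty FROM orders").fetchall()

# Document style: each record carries its own structure, so two records
# in the same collection can have different fields.
documents = [
    json.dumps({"title": "Python", "body": "A programming language", "tags": ["language"]}),
    json.dumps({"title": "TeX", "body": "A typesetting system", "infobox": {"creator": "Donald Knuth"}}),
]
titles = [json.loads(doc)["title"] for doc in documents]
```

Encyclopedia articles with varying sections fit the second shape naturally, which is one reason unstructured data points toward NoSQL.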
Wikipedia
Web framework: Django, Flask (Python); Rails (Ruby)
• Frontend: display info
• Controller: accept frontend requests, get data from DB, process data
NoSQL: MongoDB, ElasticSearch
• Database: data storage, search
Recap
• Good coding style is necessary for a good software engineer
• Understanding the big picture of the system and its trade-offs helps you make better coding decisions