EuroPython 2017 - Protocols and Practices Enforcing in Python through bytecode and inspection

PROTOCOLS AND PRACTICES ENFORCING THROUGH INSPECTION

Alessandro Molina, @__amol__

https://github.com/[email protected]


Who am I

● Currently maintaining TurboGears2 web framework and Beaker caching/session framework.

● Author of DukPy JS env for Python and DEPOT file storage framework

● Contributor to Ming ORM for MongoDB, Kajiki Template Engine, ToscaWidgets2 web widgets framework, etc…

Why?

● Being more the “library” kind of developer I tend to write a lot of independent pieces.

● When you put those together to do the real job it’s not always easy to communicate their design and philosophy

● Developers tend to do the best they can with what they have.

Not really how it was meant to be...

Then comes documentation

● To avoid misuses you try to cover examples for most reasonable use cases in documentation.

● You quickly discover that your definition of “reasonable” is not as common as you thought

Don’t know what to do?!

Defensive Programming

● “Defend against the impossible, because the impossible will happen”.

● “Defensive programming is a form of defensive design intended to ensure the continuing function of a piece of software under unforeseen circumstances.”

Protocols & Expectations

● Protocols define how components interact with the rest of the world. Invest time in enforcing them and refuse violations.

● Developers have expectations of your libraries; your libraries should have expectations too.

Enforcing Protocols

● Interfaces, Signatures, Types, Assertions are all ways to express a protocol.

● They can provide expectations about the joints between your code and your users' code.

● But they can do little about expectations on the “context” your library runs in.

The Context

● Python is a Dynamic language with powerful inspection techniques.

● Inspection is often used for Debugging, but it’s a powerful tool to check expectations.

● Your library can inspect its surroundings to check that its expectations are met.

Case #1: Import Time

● In Python a common anti-pattern is to rely on import-time side effects to register with something, e.g. event handlers.

REGISTERED = {}

def onevent(event):
    def onevent_deco(f):
        REGISTERED.setdefault(event, []).append(f)
        return f
    return onevent_deco

def fire(event):
    for f in REGISTERED.get(event, tuple()):
        f()

Case #1: Import Time

@onevent('someevent')
def listener():
    print('SOME EVENT!')

fire('someevent')

def factory(what):
    @onevent('otherevent')
    def f():
        print(what)
    return f

factory('HI')

fire('otherevent')


● What if factory never gets called?

● Uh? Where did my event go?

● Your event handling library can assert that it only gets used in a global context.


import inspect

def onevent(event):
    def onevent_deco(f):
        # The frame applying the decorator is our direct caller.
        ctx = inspect.currentframe().f_back
        if ctx.f_code.co_name != '<module>':
            # Note the trailing space: adjacent string literals are concatenated as-is.
            raise RuntimeError('Registering an event handler '
                               'into a transient scope!')
        REGISTERED.setdefault(event, []).append(f)
        return f
    return onevent_deco


Traceback (most recent call last):
  File "03_global_only.py", line 55, in <module>
    factory('HI')
  File "03_global_only.py", line 50, in factory
    @onevent('otherevent')
  File "03_global_only.py", line 36, in onevent_deco
    raise RuntimeError('Registering an event handler into a transient scope!')
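The check above can be exercised in one self-contained script. This is a minimal sketch that reuses the names from the slides; only the `try`/`except` wrapper and the `error` variable are added here for illustration.

```python
import inspect

REGISTERED = {}

def onevent(event):
    def onevent_deco(f):
        # The frame that applies the decorator is our direct caller.
        caller = inspect.currentframe().f_back
        if caller.f_code.co_name != '<module>':
            raise RuntimeError('Registering an event handler '
                               'into a transient scope!')
        REGISTERED.setdefault(event, []).append(f)
        return f
    return onevent_deco

def factory(what):
    # The decorator is applied inside factory(), so the caller frame
    # is named 'factory' rather than '<module>'.
    @onevent('otherevent')
    def f():
        print(what)
    return f

try:
    factory('HI')
except RuntimeError as exc:
    error = str(exc)
else:
    error = None
```

The handler is rejected before it reaches `REGISTERED`, so a misuse fails loudly at the point of registration instead of silently losing the event later.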

The Context

● Checking for anti-patterns is something static code analysis tools usually do

● But they are one more dependency and one more piece to integrate into the build pipeline.

● Adapting them with custom checks is usually pretty complex, if it is possible at all.

● They can check your code only

Inspection

● Inspection can be easily integrated into any pure-Python test suite and doesn’t require any dependency.

● It can test other people's code too, if they use yours.

● It can be expensive, so make sure you only enable it at test-time.
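One way to keep the cost out of production is to gate the check behind a switch. The sketch below is an assumption, not something from the talk: the `MYLIB_STRICT` environment variable and the helper names `checks_enabled`, `ensure_module_scope` and `inside_function` are all hypothetical.

```python
import inspect
import os

def checks_enabled(environ=None):
    """Return True when the hypothetical MYLIB_STRICT flag is set to '1'."""
    environ = os.environ if environ is None else environ
    return environ.get('MYLIB_STRICT') == '1'

def ensure_module_scope(enabled):
    """Raise unless the direct caller is module-level code; no-op when disabled."""
    if not enabled:
        return  # production path: no frame inspection at all
    caller = inspect.currentframe().f_back
    if caller.f_code.co_name != '<module>':
        raise RuntimeError('Expected to be called at module level')

def inside_function(enabled):
    # Calls the check from a function body, i.e. from a transient scope.
    ensure_module_scope(enabled)
    return 'ok'
```

A test runner would export the flag (or pass `enabled=True` directly), while a normal deployment leaves it unset and pays only for one boolean test.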

Code

● Inspection does not stop at objects, modules and classes.

● You can actually inspect the code itself.

● And it is usually a great way to understand what’s going on.

Understanding Code

# According to
# https://docs.python.org/3/reference/expressions.html#operator-precedence
# the two should evaluate the same, as evaluation order is from
# left to right and precedence is the same.

def func():
    return True == False in [False, 5]

def func2():
    return (True == False) in [False, 5]

print(func())
False

print(func2())
True

What the heck?!

Syntax Tree and ByteCode

● The AST allows us to understand what’s going on at compile time.

● ByteCode allows us to understand what’s going on at run time.

● Both are provided out of the box through the dis and ast modules.

Understanding Code - Execution

def func():
    return True == False in [False, 5]

import dis
dis.dis(func)

Understanding Code - Execution

  5           0 LOAD_GLOBAL              0 (True)
              3 LOAD_GLOBAL              1 (False)
              6 DUP_TOP
              7 ROT_THREE
              8 COMPARE_OP               2 (==)
             11 JUMP_IF_FALSE_OR_POP    27
             14 LOAD_GLOBAL              1 (False)
             17 LOAD_CONST               1 (5)
             20 BUILD_LIST               2
             23 COMPARE_OP               6 (in)
             26 RETURN_VALUE
        >>   27 ROT_TWO
             28 POP_TOP
             29 RETURN_VALUE

Understanding Code - Parsing

>>> ast.dump(ast.parse('True == False in [False, 5]'))
Module(body=[Expr(value=Compare(
    left=Name(id='True', ctx=Load()),
    ops=[Eq(), In()],
    comparators=[Name(id='False', ctx=Load()),
                 List(elts=[Name(id='False', ctx=Load()),
                            Num(n=5)],
                      ctx=Load())]
))])

Understanding Code - Parsing

● Someone already got what’s happening

True == False in [False, 5]

● It’s easy to guess what’s happening if we change it a little bit...

True == False == [False, 5]

Thanks code! I got it!
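What the bytecode and the AST both reveal is a chained comparison: a single `Compare` node with two operators, evaluated as two comparisons joined by `and` that share the middle operand. A small sketch to confirm it; the function names `chained`, `expanded` and `grouped` are made up for this example.

```python
import ast

def chained():
    return True == False in [False, 5]

def expanded():
    # What the chained form means: both comparisons joined by "and",
    # with the middle operand (False) shared between them.
    return (True == False) and (False in [False, 5])

def grouped():
    # Explicit parentheses turn it into two separate, nested comparisons.
    return (True == False) in [False, 5]

# A chain parses into ONE Compare node carrying both operators.
compare = ast.parse('a == b in c', mode='eval').body
op_names = [type(op).__name__ for op in compare.ops]
```

`chained()` and `expanded()` agree, while `grouped()` gives a different answer because the parentheses break the chain.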

Case #2: Cyclomatic Complexity

● Measures the complexity of a program: it is a quantitative measure of the number of linearly independent paths through the code.

● Can be simplified as the number of IF/LOOP statements + 1 (the main path)

● A good limit is usually ~7

Case #2: Cyclomatic Complexity

def dosomething(x):
    if x == 5:
        print('Fifth')
    else:
        print('Hell')

    if x == 7:
        print('Seventh')
    else:
        print('Heaven')

Case #2: Cyclomatic Complexity

import dis

complexity = 1
for i in dis.get_instructions(dosomething):
    complexity += int('JUMP_IF' in i.opname or
                      'FOR_ITER' == i.opname)

# 7 is usually considered a threshold over which we should split the function
if complexity > 7:
    print('You should refactor!')

# Complexity was 3, so we are fine!
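Opcode names are a CPython implementation detail and have changed between releases, so matching on `opname` may need adjusting per interpreter version. As an alternative sketch (not the approach shown in the talk), the same rough count can be taken from the AST. The `BRANCH_NODES` tuple and the `complexity_of` helper are assumptions made for this example, and the count is only the simplified "branches + 1" approximation.

```python
import ast
import textwrap

# Node types treated as one extra path each (a simplification).
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.IfExp)

def complexity_of(source):
    """Approximate cyclomatic complexity: branching nodes + 1."""
    tree = ast.parse(textwrap.dedent(source))
    return 1 + sum(isinstance(node, BRANCH_NODES) for node in ast.walk(tree))

SOURCE = '''
def dosomething(x):
    if x == 5:
        print('Fifth')
    else:
        print('Hell')
    if x == 7:
        print('Seventh')
    else:
        print('Heaven')
'''

result = complexity_of(SOURCE)
```

For a live function, the source string could be obtained with `inspect.getsource`, which only works when the source file is available.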

I see the code!

Then?

● Once you are able to inspect the context your code runs in, what you need to check pretty much depends on what your code does and needs.

● I have often used inspection successfully to ensure that some properties are retained over the long term in code I work on with a team.

For Example?

for f in get_methods_called_by(Resource.destroy):
    with mock.patch.object(Resource, f, spec=True,
                           side_effect=RuntimeError('Error')):
        Resource().destroy()
        # Assert no files are left behind
        # even when a function used by delete_attachments fails

● I used code inspection to ensure no files are left behind due to a failure in any method called when a resource is destroyed.
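The slides do not show how `get_methods_called_by` is implemented. The sketch below is one possible way to build something similar and is purely an assumption: the helper `methods_referenced_by`, its `(cls, method_name)` signature and the toy `Resource` class are invented for illustration. It reads the names referenced by the method's code object and keeps those that are callables on the class.

```python
def methods_referenced_by(cls, method_name):
    """List callables of cls whose names appear in the given method's code.

    Hypothetical helper: it only sees direct references, it does not
    follow calls made by the referenced methods themselves.
    """
    func = getattr(cls, method_name)
    found = []
    # co_names holds every global and attribute name used by the bytecode.
    for name in func.__code__.co_names:
        if name != method_name and callable(getattr(cls, name, None)):
            if name not in found:
                found.append(name)
    return found

class Resource:
    # Toy stand-in for the class used on the slide.
    def delete_attachments(self):
        pass

    def notify(self):
        pass

    def destroy(self):
        self.delete_attachments()
        self.notify()

called = methods_referenced_by(Resource, 'destroy')
```

Each returned name can then be patched in turn with `mock.patch.object`, as in the loop above, to simulate a failure at every step of `destroy`.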

How to use it, it’s your choice

You didn’t really expect me to have a talk about byte code without citing Matrix, right?

But seriously… only do this during tests or setup phases, it’s expensive and complex!

Questions?