Python profiling
-
Upload
dreampuf -
Category
Technology
-
view
1.023 -
download
4
description
Transcript of Python profiling
Introducton
性能分析(Performance analysis 或 profiling)是收集程序运行时信息为手段研究程序行为的分析方法。
-- Wikipedia
Introductondef qks(ls): if not ls : return ls p, ls = ls[0], ls[1:] return qks([i for i in ls if i < p]) + [p] + qks([i for i in ls if i >= p]) if __name__ == "__main__": import random ls = [random.randint(0, 100) for i in xrange(10)]
Agenda
time utility, time module, timeit, sys.getsizeof
Module: profile, cProfile, hotshot, pystats
Line Profiler, Memory Profiler, Objgraph
RunSnakeRun
In Action
Conclusion
绿曼巴蛇,最快的蛇,每小时11公里。
time utility
dreampuf@HX:~/hub/fortest$ time python test_timeit.py0.143877983093
real 0m0.184s #实际总耗时user 0m0.166s #应用耗时sys 0m0.015s #应用期间执行系统指令耗时
time utility
dreampuf@HX:~/hub/fortest$ time sleep 5 real 0m5.036s user 0m0.001s sys 0m0.003s
time utility
系统级别耗时(⼀一般而言 real > user + sys)应用耗时和系统调用耗时比率
time module dreampuf@HX:~/hub/fortest$ python test_timeit.py 2.31266021729e-05
import time def time_module(): start = time.time() qks(ls) return time.time() - start COUNT = 100000 print sum([time_module() for i in xrange(COUNT)])/COUNT
time module
平台精度灵活
time module class TimerIt(object): def __enter__(self): self.start = time.time() def __exit__(self, *args): print time.time() - self.start
with t: qks(ls*10) with t: qks(ls*100)
timeit
@Timer def caller(): qks(ls) print caller.timeit(10000) #0.14427113533
timeit
In [1]: from timeit import timeitIn [2]: timeit('s.appendleft(3)', ...: 'import collections; s= collections.deque()', number=1000000)Out[2]: 0.1150519847869873In [3]: timeit('s.insert(0, 3)', 's=[]', number=1000000)Out[3]: 392.4638919830322
timeit
不同平台使用的time不同(Xinux 使用 `time.time()`, Windows 使用`time.clock()`)
python -m timeit -s "import collections; s = collections.deque()" -n 1000000 "s.appendleft(3)"
sys.getsizeof import sys def sys_getsizeof(): a = range(1000000) b = xrange(1000000) c = [i for i in xrange(1000000)] print sys.getsizeof(a) print sys.getsizeof(b) print sys.getsizeof(c)
sys.getsizeof import sys def sys_getsizeof(): a = range(1000000) b = xrange(1000000) c = [i for i in xrange(1000000)] print sys.getsizeof(a) #8000072 print sys.getsizeof(b) #40 print sys.getsizeof(c) #8697472
sys.getsizeof
class Entity(object): def __init__(self, name, age): self.name = name self.age = age es = [Entity("dreampuf", 100) for i in xrange(1000000)] print sys.getsizeof(es) #8697472 print sum([sys.getsizeof(i) for i in es]) #64000000
sys.getsizeof
和实际有所出入(Python对象缓存池)
容器还是对象
眼镜蛇王,聪明
Module: profile
from cProfile import Profile as profile def profile_module(): p = profile() p.enable() qks(ls*100) p.disable() print p.getstats()
Module: profile
dreampuf@HX:~/hub/fortest$ python test_timeit.py[_lsprof.profiler_entry(code="<method 'disable' of '_lsprof.Profiler' objects>", callcount=1....
Module: profile from cProfile import Profile as profile from pstats import Stats def profile_module(): p = profile() p.snapshot_stats() p.enable() dirs(".") p.disable() p.print_stats(2)
Module: profile dreampuf@HX:~/hub/fortest$ python test_timeit.py 135259 function calls in 0.071 seconds Ordered by: cumulative time ncalls tottime percall cumtime percall filename:lineno(function) 1 0.002 0.002 0.071 0.071 test_timeit.py:101(dirs) 22 0.005 0.000 0.068 0.003 test_timeit.py:74(main) 10519 0.008 0.000 0.047 0.000 ast.py:203(walk) 10498 0.004 0.000 0.035 0.000 {method 'extend' of 'c... 20975 0.015 0.000 0.031 0.000 ast.py:173(iter_child_nodes) 21 0.000 0.000 0.014 0.001 ast.py:32(parse) 21 0.014 0.001 0.014 0.001 {compile} 26234 0.009 0.000 0.011 0.000 ast.py:161(iter_fields) 39368 0.008 0.000 0.008 0.000 {isinstance}
.....
Module: profile
行级别调用信息(时间,次数)可以保存执行结果友好的输出信息
profile vs. cProfile vs. hotshot实现 开销 维护 平台
profile Python 高 持续 Python Like
cProfile Python C API 中 持续 CPython
hotshot C module 低 停止 CPython
Line Profiler
from line_profiler import LineProfiler def line_profiler(): p = LineProfiler() dirs_lineprofiler = p(dirs) dirs_lineprofiler(".") p.print_stats()
Line Profiler dreampuf@HX:~/hub/fortest$ python test_timeit.py Timer unit: 1e-06 s File: test_timeit.py Function: dirs at line 101 Total time: 0.108716 s Line # Hits Time Per Hit % Time Line Contents ============================================================== 101 def dirs(dirpath): 102 1 1129 1129.0 1.0 for current, ds, fs in os.walk(dirpath): 103 49 56 1.1 0.1 for f in fs: 104 48 50 1.0 0.0 if f[-3:] not in ('tml', '.py'): 105 26 16 0.6 0.0 continue 106 22 107461 4884.6 98.8 main(os.path.join(current, f)) 107 1 4 4.0 0.0 break
Line Profiler
直观准确(Python C API 实现的trace)
Memory Profiler
dreampuf@HX:~/hub/fortest$ python test_timeit.py Filename: test_timeit.py Line # Mem usage Increment Line Contents ================================================ 101 8.68 MB 0.00 MB def dirs(dirpath): 102 8.68 MB 0.01 MB for current, ds, fs in os.walk(dirpath): 103 11.59 MB 2.91 MB for f in fs: 104 11.59 MB 0.00 MB if f[-3:] not in ('tml', '.py'): 105 11.59 MB 0.00 MB continue 106 11.59 MB 0.00 MB main(os.path.join(current, f)) 107 11.59 MB 0.00 MB break
Objgraph
import objgraph def objgraph_profile(): c = (i for i in xrange(1000000)) objgraph.show_refs([c], filename='simple-c.png')
Objgraph
a = range(1000000)
Objgraph
b = xrange(1000000)
Objgraph
c = (i for i in xrange(1000000))
Objgraph
c = (i for i in xrange(1000000))
Objgraph
es = [Entity("dreampuf", 100) for i in xrange(1000)]
Objgraph
es = [Entity("dreampuf", 100) for i in xrange(1000)]
Run Snake Run
In Action File: /home/dreampuf/hub/guokr/dev_guokr/nutshell/tag/views_tag.py Function: tag_index at line 50 Total time: 0.144165 s Line # Hits Time Per Hit % Time Line Contents ============================================================== 50 @profile.profile 51 def tag_index(req, tag_name): 73 #标签页问答动态 74 1 4536 4536.0 3.1 question_list, page_next = qa_question_list(tag_name, TAG_ACTIVITY, page_len*(page_num-1), page_len) 75 2 4237 2118.5 2.9 activity_info_list = [qa_item(question_id) for question_id in question_list]
79 #取出100条做合并,不⼀一定能合并出5个 80 1 544 544.0 0.4 tag_hot_answer_id = cacheutil.mccallcache(600)(pgrpc.backend.get_tag_hot_qa)(tag_name, count=20) 81 1 23148 23148.0 16.1 tag_hot_answer = contents.get_list(tag_hot_answer_id) 82 #tag_hot_qa_id = list(set([i.related_content_id for i in tag_hot_answer])) 93 #标签页侧边栏 94 1 63515 63515.0 44.1 side = _tag_side(tag_name)
In Action Filename: /home/dreampuf/hub/guokr/dev_guokr/nutshell/tag/views_tag.py Line # Mem usage Increment Line Contents ================================================ 50 @profile.mem_profile 51 def tag_index(req, tag_name): 55 26.09 MB 1.68 MB if req.method != 'GET': 56 raise Http404
73 #标签页问答动态 74 25.26 MB -0.82 MB question_list, page_next = qa_question_list(tag_name, TAG_ACTIVITY, page_len*(page_num-1), page_len) 75 26.09 MB 0.82 MB activity_info_list = [qa_item(question_id) for question_id in question_list] 83 #有序且不重复 84 24.84 MB -1.25 MB tag_hot_qa_id = [] 85 26.09 MB 1.25 MB for i in tag_hot_answer:
In Action File: /home/dreampuf/hub/guokr/dev_guokr/nutshell/api/views_tag.py Function: tag_logo at line 29 Total time: 0.32325 s Line # Hits Time Per Hit % Time Line Contents ============================================================== 31 @profile.profile 32 def tag_logo(req, tag_name): 33 1 13 13.0 0.0 ukey = req.session.get('ukey') 36 1 32441 32441.0 10.0 tag_info = services.minerva.get_tag(tag=tag_name) 46 1 17087 17087.0 5.3 form = TagLogoForm(req.POST, req.FILES) 47 1 274 274.0 0.1 if not form.is_valid(): 48 return sessauth.api_error(2,u'参数格式错误') 51 1 8438 8438.0 2.6 data,request_id, filename = get_upload_file(req, FILE_UPLOAD_USER_MAXSIZE) 55 1 5 5.0 0.0 try: 56 1 104795 104795.0 32.4 gkimage = imageutil.GKImage(data).resize((48, 48)) 57 1 395 395.0 0.1 new_data = gkimage.str_value() 58 except Exception, ex: 59 print ex
In Action Filename: /home/dreampuf/hub/guokr/dev_guokr/nutshell/api/views_tag.py Line # Mem usage Increment Line Contents ================================================ 31 @profile.mem_profile 32 25.62 MB 0.00 MB def tag_logo(req, tag_name): 33 25.62 MB 0.01 MB ukey = req.session.get('ukey') 46 26.31 MB 0.68 MB form = TagLogoForm(req.POST, req.FILES) 47 26.31 MB 0.00 MB if not form.is_valid(): 48 return sessauth.api_error(2,u'参数格式错误') 50 26.31 MB 0.00 MB try: 51 26.95 MB 0.64 MB data,request_id, filename = get_upload_file(req, FILE_UPLOAD_USER_MAXSIZE) 52 except UploadError,ex: 53 return sessauth.api_error(ex.errnum,ex.errmsg) 55 26.95 MB 0.00 MB try: 56 27.11 MB 0.16 MB gkimage = imageutil.GKImage(data).resize((48, 48)) 57 27.11 MB 0.00 MB new_data = gkimage.str_value() 58 except Exception, ex: 59 print ex 79 27.29 MB 0.18 MB pgrpc.cache.kill('tag', tag_name) 87 26.66 MB -0.64 MB 'context': tag_info['logo'], 89 26.68 MB 0.02 MB services.juno.audit(**data)
How to
定位问题,缓存
优化算法,使用trick
理清逻辑,缓存,分时计算
Conclusion
理解你的程序不要过度抽象使用合适的Entity
使用`__slots__`
避免重复引用不常见情况使用异常,避免对常见情况使用异常推崇函数式编程和迭代,使用装饰器和元类
Reference- [究竟是什么使Linux滴答地响](http://article.yeeyan.org/view/20180/5371)- [The Python Profilers](http://docs.python.org/library/profile.html)- [Profiling threads](http://code.activestate.com/recipes/465831-profiling-threads/)- [High Performance Python tutorial](http://ianozsvald.com/HighPerformancePythonfromTrainingatEuroPython2011_v0.2.pdf)- [RunSnakeRun](http://www.vrplumber.com/programming/runsnakerun/) 图形性能分析工具- [用profile协助程序性能优化](http://blog.csdn.net/lanphaday/article/details/1483728)- [profile,cProfile, and stats - Performance analysis of Python programs](http://www.doughellmann.com/PyMOTW/profile/)- [Line Profiler](http://packages.python.org/line_profiler/)- [Memory Profiler](https://github.com/fabianp/memory_profiler)- [Objgraph](http://mg.pov.lt/objgraph/)- [A guide to analyzing Python performance](http://www.huyng.com/posts/python-performance-analysis/)- [性能分析](http://zh.wikipedia.org/wiki/性能分析)