This is an old revision of the document!
Table of Contents
Python
Man, there's too much to do and note. Logging some stuff to investigate later…
- 2011 PyCon tutorials announcement further details at the tutorial list.
- They mention using IPython enhanced shell. Hmm.
- Of course, there's getting Python running under Xampp.
- Look into the microframework Flask. Here's a tutorial on making a blog with it.
- dataset, an abstraction layer above relational databases.
- docopt for a stupidly easy and correct command-line interface description language.
- Best practices by Vladimir Keleshev.
- You can ask a new-style class for its .__subclasses__() It uses weak references. Look at this interesting example with a child class inside a method.
- A Power Point deck by Alex Martelli describing Dependency Injection vs. Template Method (override base method) vs. Monkey Patching.
Would be nice to create a binary search in text files in Python. Maybe based on an answer deep in Reading Huge File in Python.
This Explanation of Python "Yield" also mentions (at the bottom) explanations for decorators and metaclasses.
Would be good to experiment with least-squares polynomial fitting in Python.
Data Analysis
Template Files to Start With
You have some template files in svn:
/Code/python_templates/trunk/make_standalone_application.bat /Code/python_templates/trunk/setup.py /Code/python_templates/trunk/template_cron_job.py /Code/python_templates/trunk/template_gui_app.py /Code/python_templates/trunk/template_gui_app_with_worker_thread.py
For / Else (Nobreak)
Python has a For/Else keyword that should have been called, “nobreak.”
Sorting
From an old note-to-self…
import operator rows.sort(key=operator.itemgetter(4)) # or rows.sort(lambda x, y : x[4] == y[4] and cmp(x[2],y[2]) or cmp(x[4], y[4]))
…not like I could just find the same info at the Python wiki or anything.
Prepopulating lists with objects
Remember when you lost a couple of hours thinking that the following line created a list of objects.
l = [Obj()] * n
It doesn't. It creates a list of references to one object.
What you meant to write was this:
l = [Obj() for _ in range(n)]
Linux script that takes either stdin or files
if __name__=='__main__': if len(sys.argv) < 2: # Process lines coming from stdin. while 1: line = sys.stdin.readline() if not line: break my_process_line( line.rstrip() ) else: # Process lines of the files specified. for fname in sys.argv[1:]: if not os.path.exists( fname ): treat_argument_as_literal( fname ) continue with open( fname, 'r' ) as f: while 1: line = f.readline() if not line: break my_process_line( line.rstrip() )
The With statement
Making code more beautiful with "with". (Also mentions yield.)
cProfile vs. line_profiler and kernprof
Raymond Hettinger likes Robert Kern's line_profiler much better than profile and cProfile.
Here it is: line_profiler and kernprof
timeit
import timeit def Use_a(): pass def Use_b(): pass def Run_all_tests(): my_setup() Use_a() Use_b() if __name__ == '__main__': # t = timeit.timeit( 'Run_all_tests()', 'from __main__ import Run_all_tests', number=1 ) # print dir( t ) # print t t = timeit.Timer( 'Run_all_tests()', 'from __main__ import Run_all_tests' ) print t.timeit()
When diving in, cProfile may come in handy.
import cProfile def my_function(): # Complicated stuff pass if __name__ == '__main__': cProfile.run( "my_function()" )
Dynamically Calculating Column Size
data = '''\ 234 127 34 23 45567 23 12 4 4 45 23456 2 1 444 567''' # Split input data by row and then on spaces rows = [ line.strip().split(' ') for line in data.split('\n') ] # Reorganize data by columns cols = zip(*rows) # Compute column widths by taking maximum length of values per column col_widths = [ max(len(value) for value in col) for col in cols ] # Create a suitable format string format = ' '.join(['%%%ds' % width for width in col_widths ]) # Print each row using the computed format for row in rows: print format % tuple(row)
Which outputs:
234 127 34 23 45567 23 12 4 4 45 23456 2 1 444 567
Also, here's a nice summary of string formatting in Python.
Different Types of Objects
class A: """ Old, obsolete. """ def __init__(self): self.__m_x = 0 def getx(self): return self.__m_x def setx(self, x): if x < 0: x = 0 self.__m_x = x x = property( getx, setx ) class B: """ Old; very small, was for for multitudes of objects. """ __slots__ = [ "__m_x" ] def __init__(self): self.__m_x = 0 def getx(self): return self.__m_x def setx(self, x): if x < 0: x = 0 self.__m_x = x x = property( getx, setx ) class C(object): """ New, reccommended. """ def __init__(self): self.__x = 0 def getx(self): return self.__x def setx(self, x): if x < 0: x = 0 self.__x = x x = property( getx, setx )
filelock
Evan Fosmark has a filelock module. But here's a quick and dirty implementation of a lock that uses the current file:
import os import fcntl import inspect # Maybe use os.path.abspath(__file__) ? with open(os.path.abspath inspect.getfile(inspect.currentframe())), 'r') as f: try: fcntl.flock(f, fcntl.LOCK_EX) call_that_cannot_be_concurrent() finally: fcntl.flock(f, fcntl.LOCK_UN)
Fibonacci Generator with Itertools
import itertools def fib(n): """Print a Fibonacci series up to n.""" a, b = 0, 1 while True: yield a b = a + b yield b a = a + b if __name__ == '__main__': for x in itertools.islice(fib(), 5): print x # for i in range( 5 ): # print i, fib( i )
TODO
TODO Link to my tips from LiveJournal and GMail, and why I chose which timing modules.