====== General Python Notes ====== Man, there's too much to do and note. Logging some stuff to investigate later... * [[http://pycon.blogspot.com/2011/12/announcing-pycon-2012-tutorials.html|2011 PyCon tutorials announcement]] further details at the [[https://us.pycon.org/2012/schedule/lists/tutorials/|tutorial list]]. * They mention using [[http://ipython.org/|IPython]] enhanced shell. Hmm. * [[http://pytools.codeplex.com/releases/view/76089|Python Tools for Visual Studio 2010]]. * [[http://wiki.python.org/moin/PythonSpeed/PerformanceTips|Python Performance Tips at the Python wiki]]. * Of course, there's [[:xampp|getting Python running under Xampp]]. * There's [[http://www.infoworld.com/d/application-development/pillars-python-six-python-web-frameworks-compared-169442|A comparison of CubicWeb, Django, Pyramid, Web.py, Web2py, and Zope 2]] * Look into the microframework [[http://flask.pocoo.org/|Flask]]. Here's a [[http://blog.miguelgrinberg.com/post/the-flask-mega-tutorial-part-i-hello-world|tutorial on making a blog with it]]. Consider flask with gunicorn and nginx. * [[https://dataset.readthedocs.org/en/latest/|dataset]], an abstraction layer above relational databases. * [[http://docopt.org/|docopt]] for a stupidly easy and correct command-line interface description language. * [[http://stevenloria.com/python-best-practice-patterns-by-vladimir-keleshev-notes/|Best practices]] by Vladimir Keleshev. * [[https://twitter.com/brandon_rhodes/status/449938490516860928|You can ask a new-style class for its .__subclasses__()]] It uses weak references. Look at [[https://gist.github.com/etrepum/9862280|this interesting example with a child class inside a method]]. * A Power Point deck by Alex Martelli describing [[http://www.aleax.it/yt_pydi.pdf|Dependency Injection vs. Template Method (override base method) vs. Monkey Patching]]. * [[http://stackoverflow.com/questions/101268/hidden-features-of-python|Hidden Features of Python]] at StackOverflow * [[http://chriskiehl.com/article/parallelism-in-one-line|Parallelism in one line]] Would be nice to create a binary search in text files in Python. Maybe based on an answer deep in [[http://stackoverflow.com/questions/744256/reading-huge-file-in-python|Reading Huge File in Python]]. This [[http://stackoverflow.com/questions/231767/the-python-yield-keyword-explained/231855#231855|Explanation of Python "Yield"]] also mentions (at the bottom) explanations for decorators and metaclasses. Would be good to experiment with [[http://pingswept.org/2009/01/24/least-squares-polynomial-fitting-in-python/|least-squares polynomial fitting in Python]]. ===== Data Analysis ===== Looks like I should dive into: * [[http://docs.scipy.org/doc/|NumPy]] * [[http://pandas.pydata.org/|Pandas]] * [[http://ipython.org/|IPython]]. (As seen above.) ===== Template Files to Start With ===== You have some template files in svn: /Code/python_templates/trunk/make_standalone_application.bat /Code/python_templates/trunk/setup.py /Code/python_templates/trunk/template_cron_job.py /Code/python_templates/trunk/template_gui_app.py /Code/python_templates/trunk/template_gui_app_with_worker_thread.py ===== For / Else (Nobreak) ===== Python has a [[http://nedbatchelder.com/blog/201110/forelse.html|For/Else keyword]] that should have been called, "nobreak." ===== Sorting ===== From an old [[http://dblume.livejournal.com/28319.html|note-to-self]]... import operator rows.sort(key=operator.itemgetter(4)) # or rows.sort(lambda x, y : x[4] == y[4] and cmp(x[2],y[2]) or cmp(x[4], y[4])) ...not like I could just find the same info at the [[http://wiki.python.org/moin/HowTo/Sorting|Python wiki]] or anything. :-P ===== Prepopulating lists with objects ===== Remember when you lost a couple of hours thinking that the following line created a list of objects. l = [Obj()] * n It doesn't. It creates a list of references to one object. What you meant to write was this: l = [Obj() for _ in range(n)] ===== Linux script that takes either stdin or files ===== Newer way, **use [[https://docs.python.org/3/library/fileinput.html|module fileinput]]**. Older way: if __name__=='__main__': if len(sys.argv) < 2: # Process lines coming from stdin. while 1: line = sys.stdin.readline() if not line: break my_process_line(line.rstrip()) else: # Process lines of the files specified. for fname in sys.argv[1:]: if not os.path.exists(fname): treat_argument_as_literal(fname) continue with open(fname, 'r') as f: while 1: line = f.readline() if not line: break my_process_line(line.rstrip()) ===== The With statement ===== [[http://amix.dk/blog/post/19663#Making-ugly-code-more-beautiful-using-Pythons-with-statement|Making code more beautiful with "with"]]. (Also mentions yield.) ===== cProfile vs. line_profiler and kernprof ===== Raymond Hettinger [[https://twitter.com/raymondh/status/341010756101341187|likes Robert Kern's line_profiler much better than profile and cProfile]]. Here it is: [[http://pythonhosted.org/line_profiler/|line_profiler and kernprof]] ===== timeit ===== import timeit def Use_a(): pass def Use_b(): pass def Run_all_tests(): my_setup() Use_a() Use_b() if __name__ == '__main__': # t = timeit.timeit( 'Run_all_tests()', 'from __main__ import Run_all_tests', number=1 ) # print dir( t ) # print t t = timeit.Timer( 'Run_all_tests()', 'from __main__ import Run_all_tests' ) print t.timeit() When diving in, cProfile may come in handy. import cProfile def my_function(): # Complicated stuff pass if __name__ == '__main__': cProfile.run("my_function()") ===== Dynamically Calculating Column Size ====== [[http://stackoverflow.com/questions/3685195/line-up-columns-of-numbers-print-output-in-table-format|Line up columns]] data = '''\ 234 127 34 23 45567 23 12 4 4 45 23456 2 1 444 567''' # Split input data by row and then on spaces rows = [line.strip().split(' ') for line in data.split('\n')] # Reorganize data by columns cols = zip(*rows) # Compute column widths by taking maximum length of values per column col_widths = [max(len(value) for value in col) for col in cols] # Create a suitable format string format = ' '.join(['%%%ds' % width for width in col_widths]) # Print each row using the computed format for row in rows: print format % tuple(row) Which outputs: 234 127 34 23 45567 23 12 4 4 45 23456 2 1 444 567 Also, here's a [[http://knowledgestockpile.blogspot.com/2011/01/string-formatting-in-python_09.html|nice summary of string formatting in Python]]. ===== Different Types of Objects ===== class A: """ Old, obsolete. """ def __init__(self): self.__m_x = 0 def getx(self): return self.__m_x def setx(self, x): if x < 0: x = 0 self.__m_x = x x = property(getx, setx) class B: """ Old; very small, was for for multitudes of objects. """ __slots__ = ["__m_x"] def __init__(self): self.__m_x = 0 def getx(self): return self.__m_x def setx(self, x): if x < 0: x = 0 self.__m_x = x x = property(getx, setx) class C(object): """ New, recommended. """ def __init__(self): self.__x = 0 def getx(self): return self.__x def setx(self, x): if x < 0: x = 0 self.__x = x x = property(getx, setx) ====== filelock ====== Evan Fosmark has a filelock module. But here's a quick and dirty implementation of a lock that uses the current file: import os import fcntl import inspect # Maybe use os.path.abspath(__file__) ? with open(os.path.abspath inspect.getfile(inspect.currentframe())), 'r') as f: try: fcntl.flock(f, fcntl.LOCK_EX) call_that_cannot_be_concurrent() finally: fcntl.flock(f, fcntl.LOCK_UN) ====== Various Approaches to threaded URL Requests ====== * [[https://stackoverflow.com/questions/2632520/what-is-the-fastest-way-to-send-100-000-http-requests-in-python|Use Queue and threading's Thread]] * [[https://www.shanelynn.ie/using-python-threading-for-multiple-results-queue/|Use threading and store results in a pre-allocated list]]. Then use Queue for lots of URLs. * [[https://dev.to/rhymes/how-to-make-python-code-concurrent-with-3-lines-of-code-2fpe|Use concurrent.futures' ThreadPoolExecutor and map()]]. * Or, use the doc's [[https://docs.python.org/3/library/concurrent.futures.html#threadpoolexecutor-example|ThreadPoolExecutor Example]]. * And, as mentioned in Parallelism in One Line, [[https://chriskiehl.com/article/parallelism-in-one-line|multiprocessing.dummy's Pool and its own map()]]. ====== Fibonacci Generator with Itertools ====== import itertools def fib(n): """Print a Fibonacci series up to n.""" a, b = 0, 1 while True: yield a b = a + b yield b a = a + b if __name__ == '__main__': for x in itertools.islice(fib(), 5): print x # for i in range( 5 ): # print i, fib( i ) ====== Parse C++ Code with nm and graphviz's dot ====== sudo apt-get install graphviz # Read `nm` output output = subprocess.check_output(['nm', '--demangle', entry.path]).decode('utf-8') for line in output.split('\n'): if 'Singleton' in line: # Make a .dot file with open(filename + '.dot', 'w') as file: file.write('digraph {\n') for f, destinations in self.graph.items(): for d in destinations: file.write(f' "{f}" -> "{d}"\n') def create_png_image(filename): with open(filename + '.png', 'wb') as fout: p1 = subprocess.run(['dot', filename + '.dot', '-Tpng'], stdout=fout) ====== TODO ====== TODO Link to my tips from LiveJournal and GMail, and why I chose which timing modules.