Man, there's too much to do and note. Logging some stuff to investigate later…

Would be nice to create a binary search in text files in Python. Maybe based on an answer deep in Reading Huge File in Python.
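
A first stab at that binary search might look like the sketch below. It is not taken from the linked answer. It assumes the file is sorted bytewise, one key per line, and UTF-8 encoded; find_in_sorted_file and _next_full_line are names made up here.

```python
import os

def _next_full_line(f, offset):
    """Return the first whole line at or after offset (b'' at end of file)."""
    f.seek(offset)
    if offset > 0:
        f.readline()  # discard the rest of the line that offset landed in
    return f.readline()

def find_in_sorted_file(path, target):
    """Return True if target is a line in the bytewise-sorted file at path."""
    key = target.encode("utf-8")  # assumption: the file is UTF-8 text
    with open(path, "rb") as f:
        lo, hi = 0, os.path.getsize(path)
        # Bisect on byte offsets; each probe reads the next whole line.
        while lo < hi:
            mid = (lo + hi) // 2
            line = _next_full_line(f, mid)
            if line and line.rstrip(b"\r\n") < key:
                lo = mid + 1
            else:
                hi = mid
        line = _next_full_line(f, lo)
        return bool(line) and line.rstrip(b"\r\n") == key
```

Each probe costs one seek and at most two readline calls, so a lookup touches only a logarithmic number of lines instead of scanning the whole file.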

This Explanation of Python "Yield" also mentions (at the bottom) explanations for decorators and metaclasses.

Would be good to experiment with least-squares polynomial fitting in Python.

Data Analysis

Looks like I should dive into:

Template Files to Start With

You have some template files in svn:


Linux or Bash Tips

Useful bash command for finding strings within python files…

find . -name \*.py -type f -print0 | xargs -0 grep -nI "timeit"

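Interesting way to use grep -v to remove paths from a list generated by find. The escaped \| is alternation in GNU grep's basic regular expressions, so the pattern drops any path that contains a space or one of the unwanted paths…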

find $PWD -regex ".*\.[hcHC]\(pp\|xx\)?" | \
    grep -v " \|unwantedpath/unwantedpath2\|unwantedpath3" > cscope.files
cscope -q -b

And these two have nothing to do with Python. Here's how to find if a symbol is in a library, and how to search lots of object files and print the filename above the search…

nm obj-directory/libmyobject.a | c++filt | grep Initialize_my_obj
find bindirectory/ -name \*.a -exec nm /dev/null {} \; 2>/dev/null | \
    c++filt | grep -P "(^bindirectory.*\.a|T Initialize_my_obj)"

Also handy to merge two streams together…

( cat file1 && cat file2 ) | sort

When a little quick math is needed, use bc

$ bc <<< "obase=16;ibase=10;15"
$ bc -l <<< 1/3
$ bc <<< "scale=2; 1/3"
$ bc <<< "obase=10;ibase=16;B"

and, when converting from hex to dec…

echo $((0x2dec))

But, then again, does that really seem easier than,

python -c "print(int('B', 16))"

There's a bash way to calculate how many days ago a date was:

$ echo $(( ($(date +%s) - $(date -d "2012-4-16" +%s)) / 86400 ))

And a Python way…

python -c "import datetime; print((datetime.date.today() - datetime.date(2012, 4, 16)).days)"

And for cutting long lines at the terminal width instead of letting them wrap:

cat_one_line_per_row() {
  cat "$@" | expand | cut -b1-$COLUMNS
}

ctags's man page says that one of its bugs is that it has too many options. Ain't that the truth. Make note of the obscure flag here, --c++-kinds=+p, that tells ctags to process prototypes and method declarations.

ctags -n --if0=yes --c++-kinds=+p --langmap=c++:+.inl.lst \
    --file-tags=yes -R --extra=fq \
    --exclude=unwanted_file.lst \
    --exclude='*unwanted-directory*/*'

When it's desirable to clip output to exactly the width of the window:

alias clip="expand | cut -b1-\$COLUMNS"

When you want to repeat a command a few times…

seq 1 50 | xargs -I{} -n1 echo '{} Hello World!'

For / Else (Nobreak)

Python's for loop takes an else clause that runs only when the loop finishes without hitting break. It should have been called “nobreak.”
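
A minimal sketch of the "nobreak" reading. The function name first_negative is made up for illustration.

```python
def first_negative(values):
    """Return the first negative value, or None if there is none."""
    for value in values:
        if value < 0:
            result = value
            break
    else:
        # Reached only when the loop ran to completion without break,
        # which includes the case of an empty sequence.
        result = None
    return result
```

The else branch replaces the usual "found" flag that would otherwise be set before the break and checked after the loop.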


From an old note-to-self

import operator
# Sort by column 4, then by column 2.
rows.sort(key=operator.itemgetter(4, 2))
# or
rows.sort(key=lambda row: (row[4], row[2]))

…not like I could just find the same info at the Python wiki or anything. :-P

Prepopulating lists with objects

Remember when you lost a couple of hours thinking that the following line created a list of n separate objects?

    l = [Obj()] * n

It doesn't. It creates a list of references to one object.

What you meant to write was this:

    l = [Obj() for _ in range(n)]
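
A quick way to see the difference, assuming a throwaway Obj class with one attribute (the original note does not say what Obj was):

```python
class Obj(object):
    """Stand-in class for illustration only."""
    def __init__(self):
        self.value = 0

n = 3

shared = [Obj()] * n                  # n references to ONE object
shared[0].value = 42
shared_values = [o.value for o in shared]

distinct = [Obj() for _ in range(n)]  # n separate objects
distinct[0].value = 42
distinct_values = [o.value for o in distinct]

print(shared_values)    # [42, 42, 42]
print(distinct_values)  # [42, 0, 0]
```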

Linux script that takes either stdin or files

import os
import sys

# my_process_line() and treat_argument_as_literal() are placeholders
# that are defined elsewhere.
if __name__=='__main__':
    if len(sys.argv) < 2:
        # Process lines coming from stdin.
        while 1:
            line = sys.stdin.readline()
            if not line:
                break
            my_process_line( line.rstrip() )
    else:
        # Process lines of the files specified.
        for fname in sys.argv[1:]:
            if not os.path.exists( fname ):
                # Not a file, so hand the argument over as text.
                treat_argument_as_literal( fname )
                continue
            with open( fname, 'r' ) as f:
                while 1:
                    line = f.readline()
                    if not line:
                        break
                    my_process_line( line.rstrip() )
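
The standard library's fileinput module covers the same stdin-or-files pattern. The sketch below is an alternative, not a drop-in replacement: it does not do the "treat a missing file name as literal text" step, and a missing file simply raises an error. read_lines is a made-up name.

```python
import fileinput

def read_lines(files=None):
    """Return stripped lines from the given files.

    With files=None, fileinput falls back to sys.argv[1:], and to
    stdin when no arguments were given on the command line.
    """
    lines = []
    with fileinput.input(files=files) as stream:
        for line in stream:
            lines.append(line.rstrip())
    return lines

if __name__ == '__main__':
    for text in read_lines():
        print(text)
```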

The With statement

Making code more beautiful with "with". (Also mentions yield.)
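
To tie "with" and yield together: contextlib.contextmanager turns a generator into a context manager. A small sketch, with the tagged helper invented for illustration (it is not from the linked article):

```python
import contextlib

@contextlib.contextmanager
def tagged(log, name):
    """Record entry to and exit from a with block in the list log."""
    log.append("enter " + name)
    try:
        yield log  # the body of the with block runs here
    finally:
        # Runs on normal exit and when the body raises.
        log.append("exit " + name)

log = []
with tagged(log, "demo"):
    log.append("body")
print(log)  # ['enter demo', 'body', 'exit demo']
```

Everything before the yield plays the role of __enter__, and the finally block plays the role of __exit__.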

cProfile vs. line_profiler and kernprof


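import timeit

def Use_a():
    pass  # body not recorded in this note

def Use_b():
    pass  # body not recorded in this note

def Run_all_tests():
    pass  # body not recorded in this note

if __name__ == '__main__':
#    t = timeit.timeit( 'Run_all_tests()', 'from __main__ import Run_all_tests', number=1 )
#    print( dir( t ) )
#    print( t )
    t = timeit.Timer( 'Run_all_tests()', 'from __main__ import Run_all_tests' )
    # timeit() runs the statement 1000000 times unless number=... is given.
    print( t.timeit() )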

When diving in, cProfile may come in handy.

import cProfile
def my_function():
    # Complicated stuff
    pass
if __name__ == '__main__':
    cProfile.run( "my_function()" )
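
When the full cProfile.run() dump is too noisy, the stats can be sorted and trimmed with pstats. A minimal sketch. profile_report and busy are made-up names, and sorting by cumulative time is just one reasonable choice.

```python
import cProfile
import io
import pstats

def profile_report(func, limit=5):
    """Profile func() and return the top entries by cumulative time."""
    profiler = cProfile.Profile()
    profiler.runcall(func)
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    stats.sort_stats("cumulative").print_stats(limit)
    return out.getvalue()

def busy():
    # Stand-in workload for illustration.
    return sum(i * i for i in range(1000))

if __name__ == '__main__':
    print(profile_report(busy))
```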

Dynamically Calculating Column Size

Line up columns

data = '''\
234 127 34 23 45567
23 12 4 4 45
23456 2 1 444 567'''
# Split input data by row and then on spaces
rows = [ line.strip().split(' ') for line in data.split('\n') ]
# Reorganize data by columns
cols = zip(*rows)
# Compute column widths by taking maximum length of values per column
col_widths = [ max(len(value) for value in col) for col in cols ]
# Create a suitable format string
fmt = ' '.join(['%%%ds' % width for width in col_widths ])
# Print each row using the computed format
for row in rows:
  print(fmt % tuple(row))

Which outputs:

  234 127 34  23 45567
   23  12  4   4    45
23456   2  1 444   567

Also, here's a nice summary of string formatting in Python.

Different Types of Objects

class A:
    """ Old, obsolete. """
    def __init__(self):
        self.__m_x = 0
    def getx(self):
        return self.__m_x
    def setx(self, x):
        if x < 0: x = 0
        self.__m_x = x
    x = property( getx, setx )
class B:
    """ Old; very small, was for multitudes of objects. """
    __slots__ = [ "__m_x" ]
    def __init__(self):
        self.__m_x = 0
    def getx(self):
        return self.__m_x
    def setx(self, x):
        if x < 0: x = 0
        self.__m_x = x
    x = property( getx, setx )        
class C(object):
    """ New, recommended. """
    def __init__(self):
        self.__x = 0
    def getx(self):
        return self.__x
    def setx(self, x):
        if x < 0: x = 0
        self.__x = x
    x = property( getx, setx )
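
For comparison, the same clamped attribute as class C can be written with the decorator form of property. Class D is an addition for illustration and was not part of the original list.

```python
class D(object):
    """ Same behaviour as C, using @property. """
    def __init__(self):
        self._x = 0

    @property
    def x(self):
        return self._x

    @x.setter
    def x(self, value):
        # Clamp negative values to zero, as setx() does in C.
        if value < 0:
            value = 0
        self._x = value
```

Usage is unchanged: d = D(); d.x = -5 leaves d.x at 0.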


TODO Link to my tips from LiveJournal and GMail, and why I chose which timing modules.

python/python.1380328097.txt.gz · Last modified: 2023/04/12 20:44 (external edit)