shell tips
Here's a tutorial: Advanced Bash-Scripting Guide.
Quick Tips
The History Expansion character is “!”. To search the history for a previous “scp” command and only print it, try the first line below. To find that command interactively instead, type <Ctrl>+r followed by scp (shown below as ^rscp).
$ !?scp?:p
$ ^rscp
bash expansion
$ cp file{,.bk}
expands to
$ cp file file.bk
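To preview what a brace expansion will produce before running the real command, one option is to put echo in front of it. A minimal sketch, assuming bash; the function name preview_backup and the file name notes.txt are made up for illustration:

```shell
# Print the cp command that brace expansion would build, without running it.
# preview_backup is a hypothetical helper name.
preview_backup() {
    echo cp "$1"{,.bk}
}

preview_backup notes.txt
```

The empty first alternative in {,.bk} is what yields the unchanged file name as the first argument.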
Rename all files that end with .JPG to .jpeg:
for file in *.JPG; do mv "$file" "${file%.JPG}.jpeg"; done
for file in *.JPG; do mv "$file" "${file/JPG/jpeg}"; done
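The two loops are not quite equivalent: ${file%.JPG} strips the suffix only from the end of the name, while ${file/JPG/jpeg} replaces the first JPG it finds anywhere in the name. A small bash sketch of the difference, using a made-up file name:

```shell
# Compare suffix removal (%) with first-match substitution (/) in bash.
# The file name is hypothetical and chosen to contain "JPG" twice.
file="JPG_scan.JPG"

echo "${file%.JPG}.jpeg"    # suffix removal touches only the end of the name
echo "${file/JPG/jpeg}"     # substitution hits the first match, here the prefix
```

For names that contain the extension text only once, both forms give the same result.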
Then there are two different “rename” commands:
rename .JPG .jpg *.JPG          # util-linux rename: from, to, files
rename "s/JPG/jpg/" *.JPG       # Perl rename: expression, files
Command Template
Here's a template for shell commands that demonstrates a number of arguments, length of argument, etc. It could still stand a bit of clean-up according to the Google Shell Style Guide.
Another good resource is Better Bash Scripting in 15 minutes.
#!/usr/bin/env bash
set -eu -o pipefail # See: https://sipb.mit.edu/doc/safe-shell/

declare -r SCRIPT_NAME=$(basename "$BASH_SOURCE")

## exit the shell (default status code: 1) after printing the message to stderr
die() {
    echo >&2 "$1"
    exit ${2-1}
}

## the options used by this script
DISK=e
declare -i VERBOSE=0

## exit the shell (with status 2) after printing the message
usage() {
    echo "\
$SCRIPT_NAME -hv [Drive Letter] (default: $DISK)
    -h      Print this help text
    -v      Enable verbose output
"
    exit 2;
}

## Process the options
while getopts "hv" OPTION
do
    case $OPTION in
        h) usage;;
        v) VERBOSE=1;;
        \?) usage;;
    esac
done

## Process the arguments
shift $(($OPTIND - 1))

if [ $# -eq 0 ]; then
    : # Let the default be used
elif [ $# -eq 1 ]; then
    if [ ${#1} -eq 1 ]; then
        DISK=$1
    else
        # 64 is EX_USAGE from sysexits.h
        die "$SCRIPT_NAME: Drive Letter can only be one character long." 64
    fi
else
    usage;
fi

## Lock this if only one instance can run at a time
# UNIQUE_BASE=${TMPDIR:-/tmp}/"$SCRIPT_NAME".$$
LOCK_FILE=${TMPDIR:-/tmp}/"$SCRIPT_NAME"_"$DISK".lock
if [ -f "$LOCK_FILE" ]; then
    die "$SCRIPT_NAME is already running. ($LOCK_FILE was found.)"
fi
trap "rm -f $LOCK_FILE" EXIT
touch $LOCK_FILE

## The main work of this script
if [ ! -d /cygdrive/"$DISK"/backup/Users ]; then
    mkdir -p /cygdrive/"$DISK"/backup/Users
fi

((VERBOSE==1)) && echo "Starting at $(date)"

rsync /cygdrive/c/Users/me /cygdrive/"$DISK"/backup/Users

# We add "|| true" because we don't want to stop
# if the directory was already empty
rm -r /cygdrive/c/Users/me/tmp/* || true

# Note how we find the number of cores to use
make -C build_subdirectory all -j$(grep -c ^processor /proc/cpuinfo)
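The die function in the template relies on the ${2-1} expansion, which falls back to 1 when no second argument is given (and stays safe under set -u). A minimal sketch of that behavior in isolation; exit_status_for is a hypothetical name used only here:

```shell
# Echo the status that die() would exit with for a given argument list.
# exit_status_for is a hypothetical helper, not part of the template.
exit_status_for() {
    echo "${2-1}"
}

exit_status_for "some message"        # no second argument: falls back to 1
exit_status_for "some message" 64     # explicit status is used as given
```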
Miscellaneous Shell Tips
If you want a single column of just the file and path names, you can get it like so:
ls --format=single-column
But if you don't know what you're doing, you might construct something like so:
ls -Al | tr -s ' ' | cut -d ' ' -f10-
- List “almost all” items in “long” format (one line per item)
- Squeeze repeats of the space character
- Cut out everything from before the 10th column and show everything afterwards.
Of course, if you could assert the following:
- none of the first columns were repeats (awk would only identify the first repeated column)
- the desired column didn't have delimiters in it (filenames with spaces)
…you could use awk
... | awk '{print $10}'
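The second caveat is easy to see with a file name that contains a space. A small sketch using a made-up listing line; in this sample the name happens to start at field 9 rather than 10, so the field number is an assumption to adjust for your own ls output:

```shell
# A made-up "ls -l" style line whose file name contains a space.
# In this sample the name starts at field 9; adjust the number for your ls output.
line='-rw-r--r-- 1 me users 0 Jan  1 12:00 my file.txt'

# cut keeps everything from the chosen field onward, spaces included
printf '%s\n' "$line" | tr -s ' ' | cut -d ' ' -f9-

# awk prints only the single field, so the name is truncated at the space
printf '%s\n' "$line" | awk '{print $9}'
```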
Anyway, given a list of directories, they can be inserted into a cp command with xargs if you need.
cat list_of_directories_at_one_level.txt | xargs -I {} cp -r $SOURCEDIRPREFIX:{} $DEST
Useful bash commands for finding strings within Python (or C/C++) source files:
find . -name \*.py -type f -print0 | xargs -0 grep -nI "timeit"
find . -type f \( -name \*.[ch]pp -or -name \*.[ch] \) -print0 | xargs -0 grep -nI printf
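An interesting way to use grep -v to remove paths from a list generated by find. The escaped | character (\|) is alternation in GNU grep's basic regular expressions, so any path matching one of the alternatives is dropped; the first alternative, a single space, drops paths that contain spaces.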
#!/bin/bash
find $PWD -regex ".*\.[hcHC]\(pp\|xx\)?" | \
    grep -v " \|unwantedpath/unwantedpath2\|unwantedpath3" > cscope.files
cscope -q -b
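To see what the grep -v filter does without running find, it can be tried on a few made-up paths. A short sketch, assuming GNU grep for the \| form; the second command spells the same filter with one -e per pattern, which does not depend on that extension:

```shell
# Hypothetical paths standing in for the output of find.
paths='src/a.cpp
unwantedpath3/b.cpp
my dir/c.cpp'

# GNU grep: \| is alternation in a basic regular expression
printf '%s\n' "$paths" | grep -v ' \|unwantedpath3'

# Portable spelling of the same filter: one -e per pattern
printf '%s\n' "$paths" | grep -v -e ' ' -e 'unwantedpath3'
```

Both commands keep only the path that matches none of the patterns.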
Here's how to find if a symbol is in a library, and how to search lots of object files and print the filename above the search…
nm obj-directory/libmyobject.a | c++filt | grep Initialize_my_obj
find bindirectory/ -name \*.a -exec nm /dev/null {} \; 2>/dev/null | \
    c++filt | grep -P "(^bindirectory.*\.a|T Initialize_my_obj)"
Also handy to merge two streams together…
( cat file1 && cat file2 ) | sort
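The same grouping works for any commands, not only cat. A tiny sketch in which printf stands in for the two files (the fruit names are made up):

```shell
# Concatenate the output of two commands in a subshell and sort the result
# as one stream. printf stands in for "cat file1" and "cat file2".
merged=$( ( printf 'pear\napple\n' && printf 'fig\nbanana\n' ) | sort )
echo "$merged"
```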
When a little quick math is needed, use bc
$ bc <<< "obase=16;ibase=10;15"
F
$ bc -l <<< 1/3
.33333333333333333333
$ bc <<< "scale=2; 1/3"
.33
$ bc <<< "obase=10;ibase=16;B"
11
and, when converting from hex to dec…
echo $((0x2dec))
But, then again, does that really seem easier than,
python3 -c "print(int('B', 16))"
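For conversions in either direction without leaving the shell, printf is another option. A short sketch reusing the 0x2dec value from above:

```shell
# Hex to decimal with shell arithmetic
echo $((0x2dec))

# The same conversion with printf, and the reverse (decimal to hex)
printf '%d\n' 0x2dec
printf '%x\n' 11756
```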
There's a bash way to calculate how many days ago a date was:
$ echo $(( ($(date +%s) - $(date -d "2012-4-16" +%s)) / 86400 ))
And a Python way…
python3 -c "import datetime; print((datetime.date.today() - datetime.date(2012, 4, 16)).days)"
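The shell version generalizes to any two dates. A minimal sketch, assuming GNU date (for -d) and a hypothetical function name days_between; -u is used so that daylight-saving changes do not shave an hour off a day:

```shell
# Whole days from the first date to the second (GNU date assumed).
# days_between is a hypothetical helper name.
days_between() {
    local start end
    start=$(date -u -d "$1" +%s)
    end=$(date -u -d "$2" +%s)
    echo $(( (end - start) / 86400 ))
}

days_between 2012-04-16 2012-05-16
```

Passing "$(date +%F)" as the second argument gives the "days ago" figure from the one-liner above.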
And for displaying lines clipped at the right edge of the window instead of wrapped:
cat_one_line_per_row() {
    cat "$@" | expand | cut -b1-$COLUMNS
}
or a “clip” command like so:
alias clip="expand | cut -b1-\$COLUMNS"
ctags's man page says that one of its bugs is that it has too many options. Ain't that the truth. Make note of the obscure flag here, --c++-kinds=+p, that tells ctags to process prototypes and method declarations.
ctags -n --if0=yes --c++-kinds=+p --langmap=c++:+.inl.lst \
    --langmap=asm:+.inc --file-tags=yes -R --extra=fq \
    --exclude=unwanted_file.lst \
    --exclude='*unwanted-directory*/*' \
    --regex-C++='/^.*CINIT.(.+),.*,.*,.*/CURLOPT_\1/'
When you want to repeat a command a few times…
seq 1 50 | xargs -I{} -n1 echo '{} Hello World!'
When you've set up Perforce to use an application for diff with export P4DIFF='vim -d', you can still do a regular diff like so:
$ P4DIFF= p4 diff hello-world.cpp
It's hard to be sure which Perforce changelist you sync'ed if you didn't explicitly sync to a changelist.
So, use p4_sync to sync to a specific changelist, and update a source file too.
p4_sync() {
    p4 changes -s submitted -m1 ... | tee p4_sync_to_change.txt
    changelist=`cut -d " " -f 2 p4_sync_to_change.txt`
    changelist_filename=changelist.h
    p4 sync ...@$changelist
    if [ -w $changelist_filename ]
    then
        sed -i 's/"[0-9]\+";/"'$changelist'";/' $changelist_filename
    fi
}
Keywords: bash shell sh zsh
.vimrc tips
Here's an alternative way to automatically save backups (with dates in the filename) every time you save a file.
set backup
set backupdir=~/.vim/backup/
au BufWritePre * let &bex = '-' . strftime( "%Y%m%d-%H%M%S" )
That makes a lot of files, so you can clean out the backups with a cron job like this:
# at 3 in the morning on Mondays, delete files older than 30 days
0 3 * * 1 find $HOME/.vim/backup/ -type f -mtime +30 -delete
Calculate web server bandwidth
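Use awk to add up a column from Apache logs. In the common and combined log formats, field 10 is the response size in bytes, so the result below is in MiB: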
$ sudo cat /var/log/httpd/access_log | awk '{SUM+=$10}END{print SUM/1024/1024}'
0.855597
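The field counting can be checked on a few made-up log lines before pointing the command at a real log. A minimal sketch; sum_mib is a hypothetical function name and the addresses, paths, and sizes are invented:

```shell
# Sum field 10 (response size in bytes) and report the total in MiB.
# Real logs may show "-" for the size, which awk treats as 0 in arithmetic.
sum_mib() {
    awk '{ sum += $10 } END { printf "%.2f\n", sum / 1024 / 1024 }'
}

sum_mib <<'EOF'
192.0.2.1 - - [13/Oct/2016:19:54:44 +0000] "GET /a HTTP/1.1" 200 1048576
192.0.2.2 - - [13/Oct/2016:19:54:45 +0000] "GET /b HTTP/1.1" 200 524288
192.0.2.3 - - [13/Oct/2016:19:54:46 +0000] "GET /c HTTP/1.1" 304 -
EOF
```

Note that a request path containing a space would shift the fields, so the count is only as good as the log is regular.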
expect tips
What to do when you're not sure you're going to make a connection?
set times 0
set made_connection 0
set timeout 120
while { $times < 2 && $made_connection == 0 } {
    spawn nc $SERVER
    send "\r"
    expect {
        "login:" {
            send "john.doe\r"
            set made_connection 1
        }
        eof {
            sleep 1
            set times [ expr $times + 1 ]
        }
        timeout {
            puts "Didn't expect to timeout."
            exit
        }
    }
}
I think the following is wrong-headed. It's not usually the case that spawn will fail.
set times 0
set made_connection 0
while { $times < 2 && $made_connection == 0 } {
    if { [ catch { spawn nc $SERVER } pid ] } {
        set times [ expr $times + 1 ]
        sleep 1
    } else {
        set made_connection 1
    }
}
Perl tips
The module Search::Dict has a “look” function that can be used to do a binary search in an ordered dictionary file (a log file whose lines start with timestamps works). File::SortedSeek might also be recommended.
Application Memory Usage
Use VM Resident Set Size. See VmRSS below. (Note the difference between RSS and VmRSS. If one process has memory mapped, it's not usable by any other process.)
host:# ps -ef | grep etflix
default   1532  1081  6 22:06 ?        00:01:21 pkg_/metflix
root      2108  1046  0 22:26 ?        00:00:00 grep etflix
host:# pidof netflix
1532
host:# cat /proc/1532/status
Name:   MAIN
...
Groups:
VmPeak:   220776 kB
VmSize:   210096 kB
VmLck:         0 kB
VmHWM:     95168 kB
VmRSS:     74488 kB
...
Or, while running an application, to see how much is free over time, do this from another shell:
while [ 1 ]
do
    free -m | grep Mem
    sleep 3
done
Alternatively, to see the RSS use of that process alone:
PID=$(pidof yourprocess); while true; do sync; grep VmRSS /proc/$PID/status; sleep 1; done
Measuring Available Memory
This note doesn't entirely make sense to me. Maybe need to study up on “cat /proc/meminfo” vs. “cat /proc/vmstat” vs. “vmstat”.
The best measure I've found for “available memory” is nr_inactive_file + nr_active_file + nr_free_pages from /proc/vmstat (these counters are in pages). And then you have to subtract out some heuristically determined value which is the base system working set. (That heuristically determined value can be 30-40MB.)
The command free just isn't a great indicator in general of how much memory is available because it doesn't account for the cached file-backed pages that could be dumped to make more memory available.
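The sum described above can be scripted. This is only a sketch under assumptions: it takes the counters to be named nr_free_pages, nr_inactive_file, and nr_active_file in /proc/vmstat, the function name reclaimable_mib is made up, and it does not subtract the heuristic working-set value:

```shell
# Sum the free and file-backed page counters from vmstat-style input and
# print the total in MiB. The first argument is the page size in bytes.
# reclaimable_mib is a hypothetical helper name.
reclaimable_mib() {
    awk -v page="$1" '
        $1 == "nr_free_pages" || $1 == "nr_inactive_file" || $1 == "nr_active_file" {
            pages += $2
        }
        END { printf "%d\n", pages * page / 1024 / 1024 }
    '
}

# On a Linux system:
#   reclaimable_mib "$(getconf PAGESIZE)" < /proc/vmstat
# Worked example with made-up counters (1024 pages of 4096 bytes):
printf 'nr_free_pages 256\nnr_inactive_file 512\nnr_active_file 256\n' | reclaimable_mib 4096
```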
Shared Memory Usage
To increase limit to 256MB from command line:
echo "268435456" > /proc/sys/kernel/shmmax
echo "268435456" > /proc/sys/kernel/shmall
Or, edit /etc/sysctl.conf:
kernel.shmmax= 268435456
kernel.shmall= 268435456
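One thing to keep in mind: shmmax is a byte count, while shmall is counted in pages, so the same number means very different amounts of memory in the two settings. A quick check of the arithmetic, assuming 4096-byte pages:

```shell
# 256 MB expressed in bytes (the unit of shmmax)
echo $((256 * 1024 * 1024))

# The same amount expressed in pages (the unit of shmall);
# 4096 is an assumed page size, check yours with: getconf PAGE_SIZE
echo $((256 * 1024 * 1024 / 4096))
```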
Performance Metrics
- Use perf-timechart
And you can scrape logs that start with timecodes to create Spreadsheet charts. Given logs like:
2016-10-13 19:54:44 memory 22a4
On a Macintosh:
grep memory devicelogs.txt | tr -s ' ' | cut -d " " -f 1,2,4 | \
    sed 's/\([0-9\-]\+\) \([0-9:]\+\).[0-9]\+ \([0-9a-f]\+\)/\1,\2,=DATEVALUE("\1")+TIMEVALUE("\2"),=HEX2DEC("\3")/' > heapinfo.csv; \
    open heapinfo.csv -a "Microsoft Excel"
And on Linux, instead of opening Microsoft Excel, that last line would be:
libreoffice --calc heapinfo.csv
Cron
Keep tasks serialized with flock(1):
(
    flock -n 9 || exit 1
    # ... commands executed under lock ...
) 9>/var/lock/mylockfile
Retrieving Symbols with addr2line
You can gather a backtrace (stacktrace) with this piped command to addr2line.
$ cat << EOF | cut -d " " -f 3 | tr -d "[]" | \
    addr2line -e builds/austin/src/platform/gibbon/netflix | \
    xargs -d '\n' realpath --relative-to=.
> 7/22 app() [0xf7878] (0xf7878)
> 8/22 app() [0x39c2f8] (0x39c2f8)
> 9/22 app() [0xe1964] (0xe1964)
> EOF
src/Application.h:106 (discriminator 3)
src/platform/main.cpp:521
src/Application.cpp:95
Sort by Frequency
I ran the following P4 command to find out who's been editing a file recently:
$ find . -name fname.cpp | xargs p4 filelog -s -m 10 | \
    awk '/^\.\.\. #/ {print $9}' | cut -d @ -f 1 | sort | uniq -c | sort -nr
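The tail of that pipeline (sort | uniq -c | sort -nr) is the reusable part: it counts duplicates and lists the most frequent first. A small sketch on made-up names; by_frequency is a hypothetical function name:

```shell
# Count how often each input line appears, most frequent first.
# uniq -c only merges adjacent duplicates, hence the first sort.
by_frequency() {
    sort | uniq -c | sort -nr
}

# Made-up user names standing in for the output of the p4/awk/cut stages
printf 'alice\nbob\nalice\ncarol\nalice\nbob\n' | by_frequency
```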
jq Tips
jq is really handy. Here's a tip for some processing I often do:
- fruits.txt
{
  "fruits": {
    "apple": { "name": "Apple", "price" : 2 },
    "banana": { "name": "Banana", "price" : 3 },
    "count": 2,
    "open": true
  }
}
$ jq '.fruits|del(.count,.open)|with_entries(.value |= .price)' fruits.txt
{
  "apple": 2,
  "banana": 3
}
# with_entries(f) is an alias for to_entries | map(f) | from_entries
$ jq '.fruits|del(.count,.open)|to_entries|map(.value |= .price)|from_entries' f
{
  "apple": 2,
  "banana": 3
}
$ jq '.fruits|del(.count,.open)|[to_entries[]|{(.key): .value.price}]|add' fruits.txt
{
  "apple": 2,
  "banana": 3
}
- fruit_ip.txt
{
  "192.168.144.52": {
    "ipAddress": "192.168.144.52",
    "attributes": { "model": "apple", "name": "David's apple" }
  },
  "192.168.144.40": {
    "ipAddress": "192.168.144.40",
    "attributes": { "model": "banana", "name": "David's banana" }
  }
}
$ jq '[to_entries[]|{"key":.value.attributes.name,"value":.key}]|from_entries' fruit_ip.txt
{
  "David's apple": "192.168.144.52",
  "David's banana": "192.168.144.40"
}
$ jq '[.[]|{(.attributes.name):.ipAddress}]|add' fruit_ip.txt
{
  "David's apple": "192.168.144.52",
  "David's banana": "192.168.144.40"
}
$ jq -r "to_entries|map(\"\(.value.attributes.name) = \(.key)\")|.[]" fruit_ip.txt
David's apple = 192.168.144.52
David's banana = 192.168.144.40
XML XPath Tips
Here's how to use xmllint to take an XPath path to extract info from the Roku ECP Apps list.
http 192.168.1.128:8060/query/apps | xmllint --xpath '//app[text()="dxb"]/@id' - | cut -d\" -f 2
http 192.168.1.128:8060/query/apps | xmllint --xpath '//app[contains(text(), "Netflix")]/@id' - | cut -d\" -f 2
Protips for find
How to use "sh -c" without {} in find's -exec.
Given a lib directory, I wanted to find all the actual .so files that needed libz.
find lib -type f -name \*.so\* -exec sh -c 'objdump -p "$1" | grep "NEEDED.*libz"' - {} \; -print
Note that you can pass -print (or -and -print) after a -exec argument. You can also use -printf, e.g., -printf "\t%f\n". Also, the “-” is just a placeholder for $0 (usually the command name, in this case “sh”); we want $1 to be {}. It outputs results like:
NEEDED libz.so.1
lib/libprotoc.so.13.0.2
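The role of the “-” placeholder can be seen without find at all. A one-line sketch in which the path lib/libexample.so is made up and stands in for the {} that find would substitute:

```shell
# The first argument after the command string becomes $0, the next becomes $1.
sh -c 'echo "0=$0 1=$1"' - lib/libexample.so
```

Without the placeholder, the file name itself would land in $0 and "$1" would be empty.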
Tip: tar up the last day's worth of log files:
find . -mtime -1 -name \*.log -print0 | tar -jcvf logs.tar.bz2 --null -T -
Additional Keywords
Linux, Unix, *nix