====== shell tips ======

Here's a tutorial: [[http://tldp.org/LDP/abs/html/|Advanced Bash-Scripting Guide]].

===== Quick Tips =====

The [[http://www.reddit.com/r/bashtricks/comments/hdfzc/execute_previous_command_as_root/|History Expansion character]] is "!". To search the history for a previous "scp" command and only print it, try the first line below. But if you want to interactively find that command, type ''Ctrl+r'', then ''scp'' (shown as ''^rscp'' on the second line).

  $ !?scp?:p
  $ ^rscp

===== bash expansion =====

  $ cp file{,.bk}

expands to

  $ cp file file.bk

Rename all files that end with .JPG to .jpeg:

  for file in *.JPG; do mv -- "$file" "${file%.JPG}.jpeg"; done
  for file in *.JPG; do mv -- "$file" "${file/%JPG/jpeg}"; done

Then there are two different "rename" commands (the util-linux one and the Perl one, respectively):

  rename .JPG .jpg *.JPG
  rename "s/JPG/jpg/" *.JPG

===== Command Template =====

Here's a template for shell commands that demonstrates checking the number of arguments, the length of an argument, etc. It could still stand a bit of clean-up according to the [[http://google-styleguide.googlecode.com/svn/trunk/shell.xml|Google Shell Style Guide]]. Another good resource is [[http://robertmuth.blogspot.com/2012/08/better-bash-scripting-in-15-minutes.html|Better Bash Scripting in 15 minutes]].

  #!/usr/bin/env bash
  set -eu -o pipefail # See: https://sipb.mit.edu/doc/safe-shell/

  declare -r SCRIPT_NAME=$(basename "$BASH_SOURCE")

  # exit (default status code: 1) after printing the message to stderr
  die() {
      echo >&2 "$1"
      exit "${2-1}"
  }

  ## the options used by this script
  DISK=e
  declare -i VERBOSE=0

  ## exit the shell (with status 2) after printing the message
  usage() {
      echo "\
  $SCRIPT_NAME -hv [Drive Letter] (default: $DISK)
      -h      Print this help text
      -v      Enable verbose output
  "
      exit 2;
  }

  ## Process the options
  while getopts "hv" OPTION
  do
      case $OPTION in
          h) usage;;
          v) VERBOSE=1;;
          \?)
              usage;;
      esac
  done

  ## Process the arguments
  shift $(($OPTIND - 1))
  if [ $# -eq 0 ]; then
      : # Let the default be used
  elif [ $# -eq 1 ]; then
      if [ ${#1} -eq 1 ]; then
          DISK=$1
      else
          # 64 is EX_USAGE from sysexits.h
          die "$SCRIPT_NAME: Drive Letter can only be one character long." 64
      fi
  else
      usage;
  fi

  ## Lock this if only one instance can run at a time
  # UNIQUE_BASE=${TMPDIR:-/tmp}/"$SCRIPT_NAME".$$
  LOCK_FILE=${TMPDIR:-/tmp}/"$SCRIPT_NAME"_"$DISK".lock
  if [ -f "$LOCK_FILE" ]; then
      die "$SCRIPT_NAME is already running. ($LOCK_FILE was found.)"
  fi
  trap 'rm -f "$LOCK_FILE"' EXIT
  touch "$LOCK_FILE"

  ## The main work of this script
  if [ ! -d /cygdrive/"$DISK"/backup/Users ]; then
      mkdir -p /cygdrive/"$DISK"/backup/Users
  fi

  ((VERBOSE==1)) && echo "Starting at $(date)"
  # -a recurses into the directory; without it rsync skips directories
  rsync -a /cygdrive/c/Users/me /cygdrive/"$DISK"/backup/Users

  # We add "|| true" because we don't want to stop
  # if the directory was already empty
  rm -r /cygdrive/c/Users/me/tmp/* || true

  # Note how we find the number of cores to use
  make -C build_subdirectory all -j$(grep -c ^processor /proc/cpuinfo)

===== Miscellaneous Shell Tips =====

If you want a single column of just the file and path names, you can get it like so:

  ls --format=single-column

But if you don't know what you're doing, you might construct something like so:

  ls -Al | tr -s ' ' | cut -d ' ' -f10-

  - List "almost all" items in "long" format (one line per item)
  - Squeeze repeats of the space character
  - Cut out everything from before the 10th column and show everything afterwards.

Of course, if you could assert the following:

  * none of the first columns were repeats (awk would only identify the first repeated column)
  * the desired column didn't have delimiters in it (filenames with spaces)

...you could use awk

  ... | awk '{print $10}'

Anyway, given a list of directories, they can be inserted into a cp command with xargs if you need.
  cat list_of_directories_at_one_level.txt | xargs -I {} cp -r "$SOURCEDIRPREFIX:{}" "$DEST"

Useful bash command for finding strings within python files...

  find . -name \*.py -type f -print0 | xargs -0 grep -nI "timeit"
  find . -type f \( -name \*.[ch]pp -or -name \*.[ch] \) -print0 | xargs -0 grep -nI printf

Interesting way to use ''grep -v'' to remove paths from a list generated by ''find''. The escaped ''|'' character is the alternation operator in GNU basic regular expressions.

  #!/bin/bash
  find "$PWD" -regex ".*\.[hcHC]\(pp\|xx\)?" | \
      grep -v " \|unwantedpath/unwantedpath2\|unwantedpath3" > cscope.files
  cscope -q -b

Here's how to find if a symbol is in a library, and how to search lots of object files and print the filename above the search...

  nm obj-directory/libmyobject.a | c++filt | grep Initialize_my_obj
  find bindirectory/ -name \*.a -exec nm /dev/null {} \; 2>/dev/null | \
      c++filt | grep -P "(^bindirectory.*\.a|T Initialize_my_obj)"

Also handy to merge two streams together...

  ( cat file1 && cat file2 ) | sort

When a little quick math is needed, use ''bc''

  $ bc <<< "obase=16;ibase=10;15"
  F
  $ bc -l <<< 1/3
  .33333333333333333333
  $ bc <<< "scale=2; 1/3"
  .33
  $ bc <<< "obase=10;ibase=16;B"
  11

and, when converting from hex to dec...

  echo $((0x2dec))

But, then again, does that really seem easier than,

  python -c "print(int('B', 16))"

There's a bash way to calculate how many days ago a date was:

  $ echo $(( ($(date +%s) - $(date -d "2012-4-16" +%s)) / 86400 ))

And a Python way...

  python -c "import datetime; print((datetime.date.today() - datetime.date(2012, 4, 16)).days)"

And for displaying lines clipped at the right edge of the window instead of wrapped:

  cat_one_line_per_row() {
      cat "$@" | expand | cut -b1-$COLUMNS
  }

or a "clip" command like so:

  alias clip="expand | cut -b1-\$COLUMNS"

ctags's man page says that one of its bugs is that it has too many options. Ain't that the truth. Make note of the obscure flag here, ''--c++-kinds=+p'', that tells ctags to process prototypes and method declarations.
  ctags -n --if0=yes --c++-kinds=+p --langmap=c++:+.inl.lst \
      --langmap=asm:+.inc --file-tags=yes -R --extra=fq \
      --exclude=unwanted_file.lst \
      --exclude='*unwanted-directory*/*' \
      --regex-C++='/^.*CINIT.(.+),.*,.*,.*/CURLOPT_\1/'

When you want to repeat a command a few times...

  seq 1 50 | xargs -I{} echo '{} Hello World!'

When you've set up Perforce to use an application for diff with ''export P4DIFF='vim -d' '', you can still do a regular diff by clearing the variable for a single command:

  $ P4DIFF= p4 diff hello-world.cpp

It's [[http://stackoverflow.com/questions/47007/determining-the-last-changelist-synced-to-in-perforce|hard to be sure which Perforce changelist you sync'ed if you didn't explicitly sync to a changelist]]. So, use ''p4_sync'' to sync to a specific changelist, and update a source file too.

  p4_sync() {
      # Record the latest submitted changelist, then sync to exactly that one
      p4 changes -s submitted -m1 ... | tee p4_sync_to_change.txt
      changelist=$(cut -d " " -f 2 p4_sync_to_change.txt)
      changelist_filename=changelist.h
      p4 sync ...@"$changelist"
      if [ -w "$changelist_filename" ]
      then
          sed -i 's/"[0-9]\+";/"'"$changelist"'";/' "$changelist_filename"
      fi
  }

Keywords: bash shell sh zsh

====== .vimrc tips ======

Here's an alternative way to automatically save backups (with dates in the filename) every time you save a file.

  set backup
  set backupdir=~/.vim/backup/
  au BufWritePre * let &bex = '-' . strftime( "%Y%m%d-%H%M%S" )

That makes a lot of files, so you can clean out the backups with a cron job like this:

  # at 3 in the morning on Mondays, delete files older than 30 days
  0 3 * * 1 find $HOME/.vim/backup/ -type f -mtime +30 -delete

====== Calculate web server bandwidth ======

Use ''awk'' to add up a column from Apache logs (field 10 is the response size in bytes, so the result is in MiB):

  $ sudo cat /var/log/httpd/access_log | awk '{SUM+=$10}END{print SUM/1024/1024}'
  0.855597

====== expect tips ======

What to do when you're not sure you're going to make a connection?
  set times 0
  set made_connection 0
  set timeout 120
  while { $times < 2 && $made_connection == 0 } {
      spawn nc $SERVER
      send "\r"
      expect {
          "login:" {
              send "john.doe\r"
              set made_connection 1
          }
          eof {
              sleep 1
              incr times
          }
          timeout {
              puts "Didn't expect to timeout."
              exit
          }
      }
  }

I think the following is wrong-headed. It's not usually the case that spawn will fail.

  set times 0
  set made_connection 0
  while { $times < 2 && $made_connection == 0 } {
      if { [ catch { spawn nc $SERVER } pid ] } {
          incr times
          sleep 1
      } else {
          set made_connection 1
      }
  }

====== Perl tips ======

The module ''Search::Dict'' has a "''look''" function that can be used to do a binary search in an ordered dictionary file (a log file whose lines start with timestamps works). ''File::SortedSeek'' is another option.

====== Application Memory Usage ======

Use VM Resident Set Size. See VmRSS below. (Note the [[http://stackoverflow.com/questions/10400751/how-do-vmrss-and-resident-set-size-match|difference between RSS and VmRSS]]. If one process has memory mapped, it's not usable by any other process.)

  host:# ps -ef | grep etflix
  default   1532  1081  6 22:06 ?        00:01:21 pkg_/metflix
  root      2108  1046  0 22:26 ?        00:00:00 grep etflix
  host:# pidof netflix
  1532
  host:# cat /proc/1532/status
  Name:   MAIN
  ...
  Groups:
  VmPeak:   220776 kB
  VmSize:   210096 kB
  VmLck:         0 kB
  VmHWM:     95168 kB
  VmRSS:     74488 kB
  ...

Or, while running an application, to see how much is free over time, do this from another shell:

  while [ 1 ]
  do
      free -m | grep Mem
      sleep 3
  done

Alternatively, to see the RSS use of that process alone:

  PID=$(pidof yourprocess); while true; do sync; grep VmRSS /proc/$PID/status; sleep 1; done

====== Measuring Available Memory ======

This note doesn't entirely make sense to me. Maybe need to study up on "cat /proc/meminfo" vs. "cat /proc/vmstat" vs. "vmstat".

The best measure I've found for "available memory" is ''nr_inactive_file'' + ''nr_active_file'' + ''nr_free_pages'' from /proc/vmstat (each is a count of pages).
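
A minimal sketch of that sum, assuming the counters are spelled ''nr_free_pages'', ''nr_inactive_file'' and ''nr_active_file'' in /proc/vmstat on your kernel (check with ''grep'' first) and that each one is a page count. The function name ''vmstat_available_mb'' and its optional file argument are made up for illustration.

```shell
# Hypothetical helper: add up the free and file-backed page counters from a
# vmstat-format file (default: /proc/vmstat) and print the total in whole MB.
# Assumes counters named nr_free_pages, nr_inactive_file and nr_active_file.
# The body runs in a subshell so its variables do not leak into the caller.
vmstat_available_mb() (
    vmstat_file=${1:-/proc/vmstat}
    page_size=$(getconf PAGESIZE)   # bytes per page, commonly 4096
    awk -v page_size="$page_size" '
        $1 == "nr_free_pages" || $1 == "nr_inactive_file" || $1 == "nr_active_file" {
            pages += $2
        }
        END { printf "%d\n", pages * page_size / 1024 / 1024 }
    ' "$vmstat_file"
)
```

Calling ''vmstat_available_mb'' with no argument reads the live /proc/vmstat; any working-set baseline you settle on can then be subtracted from the printed number with ordinary shell arithmetic.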
You then have to subtract out a heuristically determined value for the base system working set. (That value can be 30-40MB.) The command ''free'' just isn't a great indicator in general of how much memory is available because it doesn't account for the cached file-backed pages that could be dumped to make more memory available.

====== Shared Memory Usage ======

To increase the limit to 256MB from the command line:

  echo "268435456" > /proc/sys/kernel/shmmax
  echo "268435456" > /proc/sys/kernel/shmall

Or, edit /etc/sysctl.conf:

  kernel.shmmax= 268435456
  kernel.shmall= 268435456

(Note that ''shmmax'' is in bytes while ''shmall'' is counted in pages, so the second value allows far more than 256MB in total.)

====== Performance Metrics ======

  * Use [[http://man7.org/linux/man-pages/man1/perf-timechart.1.html|perf-timechart]]
  * [[https://github.com/gperftools/gperftools|gperftools]]

And you can scrape logs that start with timecodes to create spreadsheet charts. Given logs like:

  2016-10-13 19:54:44 memory 22a4

On a Macintosh:

  grep memory devicelogs.txt | tr -s ' ' | cut -d " " -f 1,2,4 | \
      sed 's/\([0-9\-]\+\) \([0-9:]\+\).[0-9]\+ \([0-9a-f]\+\)/\1,\2,=DATEVALUE("\1")+TIMEVALUE("\2"),=HEX2DEC("\3")/' > heapinfo.csv; \
      open heapinfo.csv -a "Microsoft Excel"

And on Linux, instead of opening Microsoft Excel, that last line would be:

  libreoffice --calc heapinfo.csv

====== Cron ======

Keep tasks serialized with [[https://linux.die.net/man/1/flock|flock(1)]]:

  (
      flock -n 9 || exit 1
      # ... commands executed under lock ...
  ) 9>/var/lock/mylockfile

====== Retrieving Symbols with addr2line ======

You can turn the addresses in a backtrace (stack trace) into file names and line numbers by piping them to addr2line.

  $ cat << EOF | cut -d " " -f 3 | tr -d "[]" | \
      addr2line -e builds/austin/src/platform/gibbon/netflix | \
      xargs -d '\n' realpath --relative-to=.
  > 7/22 app() [0xf7878] (0xf7878)
  > 8/22 app() [0x39c2f8] (0x39c2f8)
  > 9/22 app() [0xe1964] (0xe1964)
  > EOF
  src/Application.h:106 (discriminator 3)
  src/platform/main.cpp:521
  src/Application.cpp:95

====== Sort by Frequency ======

I ran the following P4 command to find out who's been editing a file recently:

  $ find . -name fname.cpp | xargs p4 filelog -s -m 10 | \
      awk '/^\.\.\. #/ {print $9}' | cut -d @ -f 1 | sort | uniq -c | sort -nr

====== jq Tips ======

jq is really handy. Here's a tip for some processing I often do:

  {
    "fruits": {
      "apple": { "name": "Apple", "price" : 2 },
      "banana": { "name": "Banana", "price" : 3 },
      "count": 2,
      "open": true
    }
  }

  $ jq '.fruits|del(.count,.open)|with_entries(.value |= .price)' fruits.txt
  {
    "apple": 2,
    "banana": 3
  }

  # with_entries(f) is an alias for to_entries | map(f) | from_entries
  $ jq '.fruits|del(.count,.open)|to_entries|map(.value |= .price)|from_entries' f
  {
    "apple": 2,
    "banana": 3
  }

  $ jq '.fruits|del(.count,.open)|[to_entries[]|{(.key): .value.price}]|add' fruits.txt
  {
    "apple": 2,
    "banana": 3
  }

  {
    "": {
      "ipAddress": "",
      "attributes": { "model": "apple", "name": "David's apple" }
    },
    "": {
      "ipAddress": "",
      "attributes": { "model": "banana", "name": "David's banana" }
    }
  }

  $ jq '[to_entries[]|{"key":.value.attributes.name,"value":.key}]|from_entries' fruit_ip.txt
  {
    "David's apple": "",
    "David's banana": ""
  }

  $ jq '[.[]|{(.attributes.name):.ipAddress}]|add' fruit_ip.txt
  {
    "David's apple": "",
    "David's banana": ""
  }

  $ jq -r "to_entries|map(\"\(.value.attributes.name) = \(.key)\")|.[]" fruit_ip.txt
  David's apple =
  David's banana =

====== XML XPath Tips ======

Here's how to use xmllint with an XPath expression to extract info from the Roku ECP Apps list.

  http | xmllint --xpath '//app[text()="dxb"]/@id' - | cut -d\" -f 2
  http | xmllint --xpath '//app[contains(text(), "Netflix")]/@id' - | cut -d\" -f 2

====== Protips for find ======

[[https://stackoverflow.com/a/2962015/9181|How to use "sh -c" without {} in find's -exec]].
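
The gist of that answer, as a minimal sketch: hand the file name to the inline script as a positional parameter instead of pasting ''{}'' into the script text. The throwaway directory and its two files are created here purely for illustration.

```shell
# Throwaway directory and files, created only for this illustration.
demo=$(mktemp -d)
printf 'needs libz\n' > "$demo/a.txt"
printf 'nothing here\n' > "$demo/b.txt"

# The word after the script ("sh") fills $0; find substitutes each path
# for {} and the inline script receives it as $1.
# -print only runs for files where the inline script exited successfully.
matches=$(find "$demo" -type f -exec sh -c 'grep -q libz "$1"' sh {} \; -print)
echo "$matches"

rm -rf "$demo"
```

Quoting ''"$1"'' inside the script keeps file names with spaces or shell metacharacters intact, which is the main reason to avoid embedding ''{}'' directly.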
Given a lib directory, I wanted to find all the actual .so files that needed libz.

  find lib -type f -name \*.so\* -exec sh -c 'objdump -p "$1" | grep "NEEDED.*libz"' - {} \; -print

Note that you can pass -print (or -and -print) after a -exec argument. You can also use -printf, e.g., ''-printf %%"%%\t%f\n%%"%%''. Also, the " - " is just a placeholder for $0 (usually the command name, in this case "sh"); we want $1 to be {}. It outputs results like:

    NEEDED               libz.so.1
  lib/libprotoc.so.13.0.2

Tip: tar up the last day's worth of log files:

  find . -mtime -1 -name \*.log -print0 | tar -jcvf logs.tar.bz2 --null -T -

====== Additional Keywords ======

Linux, Unix, *nix