Saturday, May 18, 2013

Using Gnuplot in Bash Shell Scripts

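This example shows how to embed a Gnuplot script in a Bash script using a Bash here document (heredoc). The script extracts batch timings from a log file, computes a running average, plots both alongside a fitted cubic trend line, and then prints a progress estimate. Because the heredoc delimiter is unquoted, Bash expands variables and backtick command substitutions inside the Gnuplot script before Gnuplot sees it, so Gnuplot's own column references must be escaped as \$2: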

#!/bin/bash

filename="log/load_me.log.1368725550"

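# Skip the first 46 matching lines, then write "<batch number>\t<time in ms>" for each later one.
awk '/for 10000000/ {c+=1; if(c > 46){n+=1; printf "%s\t%s\n", n, $2} }' "$filename" > log/values.dat

# Compute a running average of the timings with a window of $window_size values.
window_size=50
ruby running_average.rb log/values.dat "$window_size" > log/averages.dat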

gnuplot -p <<EOSCRIPT

set title 'Time to Insert Relationships in Batches of 10M in Neo4j batch-import'
set xlabel 'Batches of 10M Rels on `date "+%Y-%m-%d at %H:%M:%S %Z"`'
set ylabel 'Time Taken (s)'
set grid

# Draw trend line
f(x) = a*x**3 + b*x**2 + c*x + d

set fit quiet
fit f(x) 'log/values.dat' using 1:(\$2/1000) via a,b,c,d
unset fit

# Divide column 2 by 1000 to convert ms to s.
plot 'log/values.dat' using 1:(\$2/1000) title 'Time to Insert Rels' with lines, f(x) title 'Fit', 'log/averages.dat' using 1:5 title "Average $window_size" with lines

pause -1 "\n\nHit return to continue\n\n"

EOSCRIPT

num_stats=$(wc -l log/values.dat | awk '{print $1}')

tail -n "$window_size" log/values.dat | awk -v num_stats="$num_stats" '{n+=1; s+=$2}
END{
  avg=s/(n*1000);
  print "Average of last", n, "is", avg, "(s)";
  print "Num Rels Stats =", num_stats;
  print "Hours Remaining =", (3300-num_stats+46)*avg/3600;
  printf "Percentage complete = %.2f%%\n", (num_stats * 100 / 3300.0); 
}'

echo -e "Completed at `date`\n\n"
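
The escaping of \$2 in the script above follows from how Bash treats an unquoted heredoc delimiter. As a minimal sketch, this compares an unquoted delimiter with a quoted one. The file name values.dat and the variable names unquoted and quoted are illustrative only, and cat stands in for gnuplot so that the resulting text can be inspected:

```shell
#!/bin/bash
# Illustrative value; it mirrors the variable used in the script above.
window_size=50

# Unquoted delimiter: Bash expands $window_size and turns \$2 into a literal $2.
unquoted=$(cat <<EOSCRIPT
plot 'values.dat' using 1:(\$2/1000) title "Average $window_size"
EOSCRIPT
)

# Quoted delimiter: the body is passed through verbatim, with no expansion.
quoted=$(cat <<'EOSCRIPT'
plot 'values.dat' using 1:($2/1000) title "Average $window_size"
EOSCRIPT
)

printf '%s\n' "$unquoted" "$quoted"
```

With the quoted form, $2 needs no backslash, but shell variables such as $window_size are no longer substituted. The script above depends on those substitutions, which is presumably why it uses the unquoted form.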

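running_average.rb is not shown in this post. As a rough stand-in, a trailing moving average can be sketched in awk. Everything here is an assumption rather than the original helper: the function name running_average, its arguments, and the two-column output (batch number, average), which differs from the layout that the plot command above reads with using 1:5.

```shell
#!/bin/bash
# Hypothetical helper: trailing moving average over column 2 of a
# tab-separated file. Usage: running_average FILE WINDOW
running_average() {
  local file=$1 window=$2
  awk -v w="$window" '
  {
    i = NR % w                   # slot in the circular buffer
    if (NR > w) sum -= buf[i]    # drop the value that leaves the window
    buf[i] = $2
    sum += $2
    n = (NR < w) ? NR : w        # the window is shorter until w rows are seen
    printf "%s\t%.3f\n", $1, sum / n
  }' "$file"
}
```

Called as running_average log/values.dat "$window_size" > log/averages.dat, the result would be plotted with using 1:2 rather than 1:5, divided by 1000 as above to get seconds.
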
References

Gnuplot fit command
http://www.manpagez.com/info/gnuplot/gnuplot-4.6.0/gnuplot_263.php
Bash Here Documents
https://www.tldp.org/LDP/abs/html/here-docs.html
Wikipedia Here Document
https://en.wikipedia.org/wiki/Here_document