Tim Cuthbertson edited this page Jan 10, 2019 · 11 revisions

Examples

These are all pretty specific use cases, but that's the point: if your problem is common, you will probably find a more specific tool to address it. But when you have that weird requirement, a very general tool like piep can save the day:

List the total size of all root directories (ignoring mounts)

ls -1 / | piep 'root=set(pp) | pp = sh("mount").splitlines() \
  | p.split() | p[2] | p[1:] | "/" not in p | bool \
  | root.difference(set(pp)) | "/"+p \
  | sh("du","-hs", *pp, stderr=open(os.devnull), stdout=sys.stdout)'

Explanation:

  • root=set(pp) : convert the input (ls -1 /) into a set, for later
  • | pp = sh("mount").splitlines() : run mount, and save its lines as the new pp
  • | p.split() | p[2] | p[1:] | "/" not in p : grab the path component, trim off the leading slash and remove any paths that have more than one slash
  • | bool : filter out the empty result
  • | root.difference(set(pp)) : set pp to the difference of the stored root variable and the current pp value
  • | "/"+p : add back leading slash
  • | sh("du","-hs", *pp, stderr=open(os.devnull), stdout=sys.stdout) : run du -hs on all paths. Print directly to stdout, and ignore stderr
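The set logic in the middle of this pipeline may be easier to follow as plain Python. The following is a minimal sketch rather than piep code: `unmounted_roots` is a hypothetical helper, the length check is an added guard that the pipeline does not have, and the sample lines assume the Linux `device on /path type ...` layout of `mount` output, where the mount point is the third field.

```python
def unmounted_roots(root_entries, mount_lines):
    """Return top-level directories that are not themselves mount points."""
    root = set(root_entries)
    mounted = set()
    for line in mount_lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # added guard: skip lines that don't look like mount output
        path = fields[2][1:]          # mount point, minus the leading slash
        if path and "/" not in path:  # drop "/" itself and nested mount points
            mounted.add(path)
    # Set difference, then add the leading slash back (sorted for stable output)
    return sorted("/" + name for name in root - mounted)


# Made-up sample data, for illustration only
sample_mounts = [
    "/dev/sda1 on / type ext4 (rw)",
    "proc on /proc type proc (rw)",
    "tmpfs on /run/lock type tmpfs (rw)",
    "/dev/sdb1 on /home type ext4 (rw)",
]
print(unmounted_roots(["etc", "home", "proc", "usr"], sample_mounts))
```

With this sample data only `/etc` and `/usr` remain, which are the paths that would be handed to du.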

Rename files in a directory so their alphanumeric order matches date order

ls -1rt | piep 'enumerate(pp, 1) | list(pp) | idx, name = p \
  | newname = "%03d-%s" % p | os.rename(name, newname) or newname'

Explanation:

  • ls -1rt : list files by modification date
  • enumerate(pp, 1) : turn each filename into a pair of (index, filename) with indexes starting from 1
  • list(pp) : make sure the input sequence is fully consumed before proceeding (you could run into issues if renaming starts before ls is complete)
  • idx, name = p : extract the index and name from the tuple
  • newname = "%03d-%s" % p : format a new filename (using 3 digits for the index, increase this if you have more than 999 files)
  • os.rename(name, newname) or newname : perform the rename (since rename returns None, also return newname so we can see it in the output)
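Before renaming anything for real, it can help to preview the names the pipeline would produce. The sketch below is a dry run in plain Python, not piep code: `numbered_names` and its `width` parameter are hypothetical, and the file names are made up.

```python
def numbered_names(names, start=1, width=3):
    """Pair each name with its index-prefixed version, keeping input order."""
    return [
        # "%0*d" takes the zero-padded width from the argument tuple
        (name, "%0*d-%s" % (width, idx, name))
        for idx, name in enumerate(names, start)
    ]


# Names as they might arrive from `ls -1rt` (oldest first)
for old, new in numbered_names(["notes.txt", "draft.txt", "final.txt"]):
    print(old, "->", new)
```

Nothing is touched on disk here. Passing a larger `width` covers directories with 1000 or more files.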

Live memory usage graph (for a single process)

Requires the xmgrace program

env PYTHONUNBUFFERED=1 piep -m itertools -m time \
-e PID=$(pgrep 'process-name') \
-e 'START=time.time()' \
-e 'def delay(n,x): time.sleep(n); return x' -n \
'pp = itertools.repeat(None) | delay(1,time.time()) \
  | x = int(p - START) | y = sh("ps","-o","rss=","-p", str(PID)) \
  | y = float(str(y)) / 1000 \
  | ["g0.s0 point {}, {}".format(x,y), "autoscale", "redraw"] \
  | pp.merge() \
  | itertools.chain(["yaxis label \"RSS (mb)\"", "xaxis label \"time (s)\""], pp)' \
| xmgrace -barebones -dpipe -

Explanation:

  • Command line:
  • env PYTHONUNBUFFERED=1: write each output line immediately
  • piep -m itertools -m time: import itertools and time modules
  • -e PID=$(pgrep 'process-name'): look up the PID for the given process name (you can also just use a raw number if you know it)
  • -e 'START=time.time()': set START time (seconds)
  • -e 'def delay(n,x): time.sleep(n); return x': create a small helper function which sleeps before returning the given value
  • -n: no-input (this is a self-constructing pipe)
  • Pipeline:
  • pp = itertools.repeat(None): create an infinite stream of None elements
  • | delay(1,time.time()): Sleep for 1 second and then return the current time
  • | x = int(p - START): Set x to the seconds since monitoring started
  • | y = sh("ps","-o","rss=","-p", str(PID)): Set y to the current memory usage (rss) of the target pid
  • | y = float(str(y)) / 1000: convert ps output (RSS in kilobytes) into a number, and divide it by 1000 to get approximate mb
  • | ["g0.s0 point {}, {}".format(x,y), "autoscale", "redraw"]: format each x & y value for the xmgrace program, including instructions to rescale & redraw the graph after each point
  • | pp.merge(): Merge each array value into one long sequence
  • | itertools.chain(["yaxis label \"RSS (mb)\"", "xaxis label \"time (s)\""], pp)': prefix the output with a header to set axis labels
  • Graphing:
  • | xmgrace -barebones -dpipe -: pipe everything into xmgrace, which will do the actual graphing
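The text that ends up on xmgrace's stdin can be reproduced without piep or a running process. The following is a rough sketch under stated assumptions: `rss_kb_to_mb`, `grace_commands` and `grace_stream` are hypothetical helper names, and the sample is a made-up `ps` reading instead of a live one.

```python
def rss_kb_to_mb(ps_output):
    """Convert the text printed by `ps -o rss=` (kilobytes) to approximate mb."""
    return float(ps_output.strip()) / 1000


def grace_commands(x, y):
    """Commands for one data point, followed by a rescale and a redraw."""
    return ["g0.s0 point {}, {}".format(x, y), "autoscale", "redraw"]


def grace_stream(samples):
    """Yield the axis-label header, then the commands for each (seconds, ps text) sample."""
    yield 'yaxis label "RSS (mb)"'
    yield 'xaxis label "time (s)"'
    for x, raw in samples:
        for command in grace_commands(x, rss_kb_to_mb(raw)):
            yield command


# One made-up sample: 1 second after start, ps reported 2048 kilobytes
for line in grace_stream([(1, " 2048\n")]):
    print(line)
```

Each sample produces three lines, which corresponds to what the `pp.merge()` step flattens into a single stream in the pipeline above.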

Show TODOs in a git codebase, ordered by commit date:

Requires rg (ripgrep) and git

rg --json TODO | piep \
	-m time -m pygments -m pygments.formatters -m json \
	-e 'text = lambda d: d.get("text", None)' \
	-e 'devnull = open(os.devnull, "w")' \
	-e 'from pygments.token import *' \
	'json.loads | p["type"] == "match" | p["data"] | path = text(p["path"]) | line = p["line_number"]
	| contents = text(p["lines"]) | path is not None | (line, path, contents)
	| blame = (sh("git", "blame", "-L%d,%d" % (line,line), "--line-porcelain", path, stderr=devnull) or "").splitlines()
	| blame = next(iter(filter(lambda x: x.startswith("committer-time "), blame)))
	| ts = int(blame.split()[1]) | (ts, time.strftime("%Y-%m-%d", time.localtime(ts)), path, line, contents.strip())
	| list(pp)
	| pp.sortby(lambda x: x[0])
	| maxfile = max([len(x[2]) for x in pp])+1
	| maxlineno = max([len(str(x[3])) for x in pp])
	| pygments.format([
		(Keyword, p[1]),
		(Text, " "),
		(String, ("%%%ds" % maxfile) % (p[2] + ":")),
		(Literal.Number, ("%%-%dd" % maxlineno) % (p[3])),
		(Whitespace, " "),
		(Text, p[4])
	], pygments.formatters.get_formatter_by_name("console"))'

Explanation:

  • Command line:
  • rg --json TODO: grep for TODO lines, outputting in JSON format
  • piep -m time -m pygments -m pygments.formatters -m json: import modules
  • -e 'text = lambda d: d.get("text", None)': helper to extract text from ripgrep JSON
  • -e 'devnull = open(os.devnull, "w")': open /dev/null, used to hide git blame errors
  • -e 'from pygments.token import *': import token objects, for output colouring
  • Pipeline:
  • 'json.loads | p["type"] == "match": load JSON input, and restrict to match objects
  • | p["data"] | path = text(p["path"]) | line = p["line_number"] | contents = text(p["lines"]): extract the parts we care about from the JSON object
  • | path is not None: filter out anything that doesn't have a valid path
  • | blame = (sh("git", "blame", "-L%d,%d" % (line,line), "--line-porcelain", path, stderr=devnull) or "").splitlines(): run git blame on just this specific file/line, ignoring errors (rg might have found matches that aren't tracked in git)
  • | blame = next(iter(filter(lambda x: x.startswith("committer-time "), blame))): get just the committer-time line from git blame output
  • | ts = int(blame.split()[1]): extract just the timestamp from the blame output
  • | (ts, time.strftime("%Y-%m-%d", time.localtime(ts)), path, line, contents.strip()): collect all the information we'll need - timestamp, formatted time, path, line and contents
  • | list(pp) : force pp to be a list, so we can iterate over it multiple times below
  • | pp.sortby(lambda x: x[0]): sort by timestamp (a unix date, i.e. number of seconds)
  • | maxfile = max([len(x[2]) for x in pp])+1: get the maximum filename length, for aligning output
  • | maxlineno = max([len(str(x[3])) for x in pp]): get the maximum line number width (in digits), for aligning output
  • | pygments.format( ... ): use pygments to format with nice colours (pygments is already used for colourized output in piep, so we know it's available)
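The two less obvious steps here, picking the timestamp out of the blame output and aligning the columns, can be tried in isolation. This is a minimal sketch without colours and without running git: `committer_time` and `format_rows` are hypothetical helpers, the rows are made up, and `time.gmtime` is used instead of `time.localtime` only so the dates do not depend on the local timezone.

```python
import time


def committer_time(porcelain_lines):
    """Return the committer timestamp from blame porcelain lines, or None."""
    for line in porcelain_lines:
        # The trailing space keeps "committer-tz" lines from matching
        if line.startswith("committer-time "):
            return int(line.split()[1])
    return None


def format_rows(rows):
    """Sort (timestamp, path, line, contents) rows by timestamp and align them."""
    if not rows:
        return []
    rows = sorted(rows, key=lambda r: r[0])
    maxfile = max(len(r[1]) for r in rows) + 1   # +1 for the trailing colon
    maxlineno = max(len(str(r[2])) for r in rows)
    return [
        # Path is right-aligned, line number is left-aligned
        "%s %*s%-*d %s" % (
            time.strftime("%Y-%m-%d", time.gmtime(ts)),
            maxfile, path + ":",
            maxlineno, line,
            text,
        )
        for ts, path, line, text in rows
    ]


# Made-up rows, for illustration only
rows = [
    (86400, "src/main.py", 120, "# TODO: b"),
    (0, "a.py", 7, "# TODO: a"),
]
print("\n".join(format_rows(rows)))
```

Unlike the `next(...)` call in the pipeline, `committer_time` returns None when no committer-time line is present, which is one way to handle files that git does not track.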