Wednesday, July 17, 2013

Parsing an Apache log and splitting the output into different files

While debugging why some requests in a project at work ended up with a 503 HTTP status code, I needed to split up Apache logs so that I could plot them with gnuplot.

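This snippet reads multiple Apache logfiles, drops everything except the timestamp and the status code, and then splits the output into different files depending on the status code class. It needs bash (for the >(...) process substitution) and GNU grep (for -P), and it expects LOG_DIR and STAT_DIR to be set. Note that sort compares the timestamps as plain text, so the result is only chronological when all entries fall within the same month.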

cat "$LOG_DIR/"* |\
cut -f 4,5,9 -d ' ' | `#Keep only the timestamp (fields 4 and 5) and the HTTP status code (field 9)`\
tr -d "[]" | `#Remove the [] that Apache puts around timestamps`\
sort | `#Input comes from many files, so sort the lines by timestamp`\
tee \
>(grep -P '2\d\d$' > "$STAT_DIR/2xx.log") `#Write lines with a 2xx status code to their own file`\
>(grep -P '4\d\d$' > "$STAT_DIR/4xx.log") `#Write lines with a 4xx status code to their own file`\
>(grep -P '5\d\d$' > "$STAT_DIR/5xx.log") `#Write lines with a 5xx status code to their own file`\
> /dev/null
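
To see what the cut and tr steps leave behind, here is a small sketch that runs them on a single line. The sample line is made up for illustration (it assumes the common/combined log format, where the status code is the ninth space-separated field), and the variable names sample, fields, status and bucket are mine, not part of the snippet above.

```shell
#!/usr/bin/env bash
# Hypothetical log line in Apache combined format, invented for this example.
sample='127.0.0.1 - - [17/Jul/2013:10:15:32 +0200] "GET /index.html HTTP/1.1" 503 1234 "-" "curl/7.30"'

# Same cut and tr steps as in the pipeline: keep fields 4, 5 and 9, then strip the brackets.
fields=$(printf '%s\n' "$sample" | cut -f 4,5,9 -d ' ' | tr -d '[]')
echo "$fields"

# The status code is the last space-separated field; its first digit gives the class.
status=${fields##* }
bucket="${status:0:1}xx"
echo "$bucket"
```

Each line in the split files has this shape: timestamp, timezone offset, status code. The grep patterns in the pipeline match on the status code at the end of the line, which is why the sample above would land in 5xx.log.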
