
Tutorial: Rapid Script Development with Bash, JC, and JQ

I thought it would be fun to show the power of using JSON for rapid development of Bash scripts. When you don’t need to manually parse unstructured command output and common string types, you are free to focus on the code logic, which can really reduce development and troubleshooting cycles.

To that end, I created a toy CLI application that scans the local subnet to see which hosts respond to ICMP requests. This application took less than a half-hour to write (plus some fine-tuning) and I think it demonstrates some cool concepts using the JSON output of several commands, including ifconfig, ping, arp, date, and wc. We use jc to convert the command output to JSON and jq to grab the values we want from that output.

We will also be utilizing the ip-address parser that comes with jc. It acts like a subnet calculator but provides all of the values in a nice JSON schema for easy access with jq.

Yes, this is not as efficient as pinging the broadcast address and checking the local ARP table, but this is arguably more fun!

Here is the Bash script of around 80 lines. Because there is no low-level command output parsing via grep, cut, sed, awk, etc., it is pretty simple to understand, but we’ll go through some of the interesting parts in more detail.

Here is some sample output when using the script:

% ./scansubnet.sh    
Please enter the interface name as command argument.

Options:
lo0
en0

% ./scansubnet.sh wrong-interface                     
ifconfig: interface wrong-interface does not exist

% ./scansubnet.sh lo0
Subnet is too large (16777214 IPs). Exiting.

% ./scansubnet.sh en0
My IP: 192.168.1.221/24
Sending ICMP requests to 254 IPs: 192.168.1.1 - 192.168.1.254
Start Time: 2022-08-29T07:05:40
   13.623 ms   192.168.1.249   f0:ef:86:f6:21:84   camera1.local
  350.634 ms   192.168.1.72    f0:18:98:3:f8:39    laptop1.local
   10.645 ms   192.168.1.243   fc:ae:34:a1:35:82   
  561.997 ms   192.168.1.188   18:3e:ef:d3:3f:82   laptop2.local
   19.775 ms   192.168.1.254   fc:ae:34:a1:3a:80   router.local
                          <snip>
   27.917 ms   192.168.1.197   cc:a7:c1:5e:c3:f1   camera2.local
   28.582 ms   192.168.1.235   56:c8:36:64:2a:8d   camera3.local
   38.199 ms   192.168.1.246   d8:30:62:2e:a6:cf   extender.local
   44.617 ms   192.168.1.242   50:14:79:1e:42:3e   vacuum.local
    5.350 ms   192.168.1.88    c8:d0:83:cd:f4:2d   tv.local
    0.087 ms   192.168.1.221   a4:83:e7:2d:62:4e   laptop3.local
Scanned 192.168.1.0/24 subnet in 27 seconds.
30 alive hosts found.
End Time: 2022-08-29T07:06:07

Note: For best results, use jc version 1.21.1 or higher in this script.
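
Tip: you can quickly check which version of jc is installed, since jc describes itself in JSON (the version value shown here is just illustrative):

% jc -a | jq -r '.version'
1.21.1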

Notice there are some basic checks and features:

  • If you don’t specify an interface it will suggest any interfaces on the system with a configured IPv4 address
  • If you select an interface attached to too large a subnet (say, the 127.0.0.1/8 address of a loopback interface) it will exit
  • It will show the subnet information, including the number of hosts it will scan and the range of IP addresses to be scanned
  • It will log the start and end time and calculate how long it took to complete the scan
  • It will run the pings in the background for parallel processing
  • It will report the round-trip time, IP, MAC address, and name (if known) of the hosts that respond

Let’s take a closer look!

Displaying valid interface options

We could make the user figure out the valid interface names that can be used as the only command argument. Instead, we can use the ifconfig output passed through jc and a simple jq query to print valid options (that is, interfaces with an IPv4 address configured):

if [[ $1 == "" ]]; then
    echo "Please enter the interface name as command argument."
    echo
    echo "Options:"
    # Only show interfaces with an assigned IP address
    jc ifconfig | jq -r '.[] | select(.ipv4_addr != null) | .name'
    exit 1
fi

Once jc converts the ifconfig output to JSON we pipe it to jq so it will filter only items with an ipv4_addr field that is not set to null. Pretty straightforward and easy to read.

Note: We could have gotten the same result by using the ip command with the JSON output option instead of parsing the ifconfig output to JSON via jc, but this allows the script to be more cross-platform with macOS, BSD, etc.

Grab the selected interface IP and subnet mask

Once the user selects a valid interface, let’s grab the IP and Subnet:

interfaceInfo=$(jc ifconfig "$1") || exit 1
ip=$(jq -r '.[0].ipv4_addr' <<<"$interfaceInfo")
mask=$(jq -r '.[0].ipv4_mask' <<<"$interfaceInfo")

Here you can see we parse the ifconfig output to JSON with jc in the first line. If the interface name from the user is invalid, ifconfig prints a helpful message and returns a non-zero exit code, which causes the script to exit via || exit 1. Then we do two variable assignments to pull the IP address and subnet mask via jq queries.

Grab detailed subnet information for the IP/Mask

Next we take the interface IP and mask values and use the IP Address string parser in jc to gather more detailed subnet information that we will use later in the script.

ipInfo=$(jc --ip-address <<<"$ip/$mask")
network=$(jq -r '.network' <<<"$ipInfo")
numHosts=$(jq -r '.hosts' <<<"$ipInfo")
cidrMask=$(jq -r '.cidr_netmask' <<<"$ipInfo")
firstHostIp=$(jq -r '.first_host' <<<"$ipInfo")
lastHostIp=$(jq -r '.last_host' <<<"$ipInfo")
firstHost=$(jq -r '.int.first_host' <<<"$ipInfo")
lastHost=$(jq -r '.int.last_host' <<<"$ipInfo")

The IP Address parser in jc is nice because it acts like a subnet calculator and gives us a lot of data, including the subnet, number of hosts in the subnet, first host, and last host in different formats (decimal, hex, binary, and standard format). This will help us build a simple for loop that will do most of the work. All we need to do is pick the fields we want with jq. Here is an example of all of the information available with the jc IP Address string parser (IPv6 is also supported):

% echo 192.168.1.10/255.255.255.0 | jc --ip-address -p
{
  "version": 4,
  "max_prefix_length": 32,
  "ip": "192.168.1.10",
  "ip_compressed": "192.168.1.10",
  "ip_exploded": "192.168.1.10",
  "scope_id": null,
  "ipv4_mapped": null,
  "six_to_four": null,
  "teredo_client": null,
  "teredo_server": null,
  "dns_ptr": "10.1.168.192.in-addr.arpa",
  "network": "192.168.1.0",
  "broadcast": "192.168.1.255",
  "hostmask": "0.0.0.255",
  "netmask": "255.255.255.0",
  "cidr_netmask": 24,
  "hosts": 254,
  "first_host": "192.168.1.1",
  "last_host": "192.168.1.254",
  "is_multicast": false,
  "is_private": true,
  "is_global": false,
  "is_link_local": false,
  "is_loopback": false,
  "is_reserved": false,
  "is_unspecified": false,
  "int": {
    "ip": 3232235786,
    "network": 3232235776,
    "broadcast": 3232236031,
    "first_host": 3232235777,
    "last_host": 3232236030
  },
  "hex": {
    "ip": "c0:a8:01:0a",
    "network": "c0:a8:01:00",
    "broadcast": "c0:a8:01:ff",
    "hostmask": "00:00:00:ff",
    "netmask": "ff:ff:ff:00",
    "first_host": "c0:a8:01:01",
    "last_host": "c0:a8:01:fe"
  },
  "bin": {
    "ip": "11000000101010000000000100001010",
    "network": "11000000101010000000000100000000",
    "broadcast": "11000000101010000000000111111111",
    "hostmask": "00000000000000000000000011111111",
    "netmask": "11111111111111111111111100000000",
    "first_host": "11000000101010000000000100000001",
    "last_host": "11000000101010000000000111111110"
  }
}

In addition to accepting a CIDR subnet mask, the jc IP Address string parser also accepts a dotted-quad subnet mask (that’s how ifconfig gives it to us) and provides us the CIDR notation for it in the cidr_netmask field. The IP Address string parser is fairly liberal in the IP formats it will accept.
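
For example, we can hand the parser a dotted-quad mask and pull out just the CIDR prefix (the values match the sample output above):

% echo 192.168.1.10/255.255.255.0 | jc --ip-address | jq '.cidr_netmask'
24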

We’ll see how having the first_host and last_host values in decimal makes for easy looping later.

Sanity check the subnet size

We can use the hosts value from the jc IP Address string parser to see if the subnet is a suitable size to scan. If the subnet supports more than 1022 hosts (that is, anything larger than a /22, which has 2^(32-22) - 2 = 1022 usable addresses), we don’t want to bother spinning up that many ping processes in the background for the scan. The following code does that sanity check for us:

if [[ $numHosts -gt 1022 ]]; then
    echo "Subnet is too large ($numHosts IPs). Exiting."
    exit 1
fi

Grab the start time in ISO and Unix Epoch format

Next we want to grab the start time. The date command parser in jc gives us the current time in ISO and Epoch formats that we can easily pull with jq. This allows us to display the time in a nice, standard human readable format and also have the date-time information in an easy-to-use format for calculating the duration later.

startTime=$(jc date)
startTimeIso=$(jq -r '.iso' <<<"$startTime")
startTimeEpoch=$(jq -r '.epoch' <<<"$startTime")

Here are all of the fields available when running the date command through jc:

% jc -p date
{
  "year": 2022,
  "month": "Aug",
  "month_num": 8,
  "day": 28,
  "weekday": "Sun",
  "weekday_num": 7,
  "hour": 1,
  "hour_24": 13,
  "minute": 39,
  "second": 20,
  "period": "PM",
  "timezone": "PDT",
  "utc_offset": null,
  "day_of_year": 240,
  "week_of_year": 34,
  "iso": "2022-08-28T13:39:20",
  "epoch": 1661719160,
  "epoch_utc": null,
  "timezone_aware": false
}

Show the user what is going to happen

Next we use a series of simple echo commands to provide the subnet and time information back to the user before the scan:

echo "My IP: $ip/$cidrMask"
echo "Sending ICMP requests to $numHosts IPs: $firstHostIp - $lastHostIp"
echo "Start Time: $startTimeIso"

The main loop

Now comes the fun part – here is the main loop where we ping every host in the subnet and record the round-trip time, IP, MAC address, and name of each host that responds:

for (( ipDecimal=firstHost; ipDecimal<=lastHost; ipDecimal++ )); do
    # Do each ping in the background for parallel processing
    {
        # grab the packets received and rtt values from the ping output
        thisIp=$(jc --ip-address <<<"$ipDecimal" | jq -r '.ip')
        pingResult=$(ping -c1 "$thisIp" | jc --ping)
        packetsReceived=$(jq -r '.packets_received' <<<"$pingResult")
        rtTime=$(jq -r '.round_trip_ms_max' <<<"$pingResult")

        if [[ $packetsReceived -gt 0 ]]; then
            # Grab the MAC address and name for each alive host from the arp command
            thisIpArpInfo=$(arp -a | jc --arp | jq --arg myip "$thisIp" '.[] | select(.address == $myip)')
            thisIpMac=$(jq -r '.hwaddress // empty' <<<"$thisIpArpInfo")
            thisIpName=$(jq -r '.name // empty' <<<"$thisIpArpInfo")

            printf "%9.3f ms   %-16s%-20s%s\n" "$rtTime" "$thisIp" "$thisIpMac" "$thisIpName" | tee -a "$tempFile"
        fi
    } &
done
wait

Let’s break this down a little bit:

for (( ipDecimal=firstHost; ipDecimal<=lastHost; ipDecimal++ )); do

We use a C-style for loop which allows us to use those decimal versions of the first and last host IP addresses. I told you those decimal values would come in handy!

        thisIp=$(jc --ip-address <<<"$ipDecimal" | jq -r '.ip')
        pingResult=$(ping -c1 "$thisIp" | jc --ping)
        packetsReceived=$(jq -r '.packets_received' <<<"$pingResult")
        rtTime=$(jq -r '.round_trip_ms_max' <<<"$pingResult")

The decimal IP format is nice to loop over, but unfortunately the ping and arp commands do not seem to accept IP addresses in decimal format (at least not on all platforms). Not to worry – we simply send the decimal IP address to the jc IP Address string parser and it will tell us what the IP address is in standard dotted-quad notation.
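
For example, feeding the parser the decimal first-host value from the sample output earlier gives us back the dotted-quad address:

% echo 3232235777 | jc --ip-address | jq -r '.ip'
192.168.1.1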

Then we give ping that IP address and parse its output with jc. We only care about the packets_received and round_trip_ms_max fields, so we assign them to Bash variables.

Next, let’s take a look at the if block:

        if [[ $packetsReceived -gt 0 ]]; then
            # Grab the MAC address and name for each alive host from the arp command
            thisIpArpInfo=$(arp -a | jc --arp | jq --arg myip "$thisIp" '.[] | select(.address == $myip)')
            thisIpMac=$(jq -r '.hwaddress // empty' <<<"$thisIpArpInfo")
            thisIpName=$(jq -r '.name // empty' <<<"$thisIpArpInfo")

            printf "%9.3f ms   %-16s%-20s%s\n" "$rtTime" "$thisIp" "$thisIpMac" "$thisIpName" | tee -a "$tempFile"
        fi

There’s a bit going on in this if block:

  • We only run the commands below if there was an ICMP reply in the ping output
  • Since we got an ICMP reply, we check the ARP table via the arp -a command and filter for the current IP address’ MAC address and name. Having jc parse the arp -a output into JSON allows us to use a simple jq query to accomplish this.
  • Notice the use of the --arg option in jq that allows us to use the $thisIp value in the query.
  • Notice the jq -r '.name // empty' section. Without // empty, jq -r would print the literal string null whenever the name field is null; // empty tells jq to output nothing instead, which leaves the Bash variable empty. (See the short demonstration after this list.)
  • We use the printf command with string format specifications to print our output in nice, even columns.
  • The tee command copies what is printed to the screen and appends it to a temporary file that we will use later.
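
Here is the // empty behavior in isolation, with some toy JSON for illustration:

% echo '{"name": null}' | jq -r '.name // empty'        # prints nothing
% echo '{"name": "router.local"}' | jq -r '.name // empty'
router.local

Finally, let’s look at the tail end of the loop:
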
    } &
done
wait

The & at the end of the Bash command grouping tells Bash to run all of the commands enclosed in the {} braces in the background, so we get parallel processing. The wait command tells Bash to pause until all of the background processes complete.
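
If this pattern is new to you, here is a minimal standalone sketch of the same idea, with a made-up host list:

# run each ping as a background job, then wait for all of them to finish
for host in 192.168.1.1 192.168.1.2 192.168.1.3; do
    { ping -c1 "$host" >/dev/null 2>&1 && echo "$host responded"; } &
done
wait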

Grab the end time

After all of the background ping and arp processes return, we can grab the end time by parsing the date command with jc and returning the iso and epoch values:

endTime=$(jc date)
endTimeIso=$(jq -r '.iso' <<<"$endTime")
endTimeEpoch=$(jq -r '.epoch' <<<"$endTime")
totalTime=$((endTimeEpoch-startTimeEpoch))

Then we subtract the epoch values to get the total run time.

Grab the number of alive hosts

We can run the temporary file through wc to get the number of lines. The wc parser in jc makes it easy to pull the number of lines with a quick jq query. Then we delete the temporary file.

totalAlive=$(jc wc "$tempFile" | jq '.[0].lines')
rm "$tempFile"

Print the summary message

Finally, we print a summary message with the total run-time, subnet information, number of alive hosts, and the human-readable end time:

echo "Scanned $network/$cidrMask subnet in $totalTime seconds."
echo "$totalAlive alive hosts found."
echo "End Time: $endTimeIso"

Conclusion

There you go – that was a pretty fun exercise demonstrating how you can rapidly develop a prototype in Bash using the output of existing commands on the system without needing to manually parse them. By using jc to convert the command output to JSON and jq to query it, the script becomes very easy to understand. It’s nearly self-documenting!

Let me know if you have built any cool scripts or programs with jc and jq!


JC Version 1.21.0 Released

I’m excited to announce the release of jc version 1.21.0, available on GitHub and PyPI. jc now supports over 100 standard and streaming parsers. Thank you to the Open Source community for making this possible!

jc can be installed via pip or through several official OS package repositories, including Debian, Ubuntu, Fedora, openSUSE, Arch Linux, NixOS Linux, Guix System Linux, FreeBSD, and macOS. For more information on how to get jc, see the project README.

To upgrade with pip:

$ pip3 install --upgrade jc

    What’s New

    • New --meta-out or -M option to add metadata to the JSON output, including a UTC timestamp, parser name, magic command, and magic command exit code
    • IP Address string parser
    • Syslog standard and streaming string parsers (RFC 3164 and RFC 5424)
    • CEF standard and streaming string parsers
    • PLIST file parser (XML and binary support)
    • mdadm command parser
    • Add -n support to the traceroute parser
    • Other minor parser fixes

    New Features

    The new --meta-out command option adds a _jc_meta key to the output objects that contains the parser name, a UTC timestamp, and the magic command and exit code information if the magic syntax is used.

    Standard parser output can either be an array of objects (list of dictionaries) or a single object (dictionary). If the output is an array of objects, then each object in the array will have the _jc_meta field added. If the output is a single object, then the _jc_meta field will be added to that single object. In the case of streaming parsers, discrete objects are emitted for each item. Each object will have a _jc_meta field added.

    Here is an example with the ping parser and the magic syntax.

    $ jc --meta-out -p ping -c3 192.168.1.252
    {
      "destination_ip": "192.168.1.252",
      "data_bytes": 56,
      "pattern": null,
      "destination": "192.168.1.252",
      "packets_transmitted": 3,
      "packets_received": 0,
      "packet_loss_percent": 100.0,
      "duplicates": 0,
      "responses": [
        {
          "type": "timeout",
          "icmp_seq": 0,
          "duplicate": false
        },
        {
          "type": "timeout",
          "icmp_seq": 1,
          "duplicate": false
        }
      ],
      "_jc_meta": {
        "parser": "ping",
        "timestamp": 1661128157.294033,
        "magic_command": [
          "ping",
          "-c3",
          "192.168.1.252"
        ],
        "magic_command_exit": 2
      }
    }

    New Parsers

    IP Address string parser

    Support for IPv4 and IPv6 CIDR strings. (Documentation)

    Standard and decimal IP notation is supported. The output includes subnet information in standard, decimal, hex, and binary notation.

    $ echo 192.168.2.10/24 | jc --ip-address -p
    {
      "version": 4,
      "max_prefix_length": 32,
      "ip": "192.168.2.10",
      "ip_compressed": "192.168.2.10",
      "ip_exploded": "192.168.2.10",
      "scope_id": null,
      "ipv4_mapped": null,
      "six_to_four": null,
      "teredo_client": null,
      "teredo_server": null,
      "dns_ptr": "10.2.168.192.in-addr.arpa",
      "network": "192.168.2.0",
      "broadcast": "192.168.2.255",
      "hostmask": "0.0.0.255",
      "netmask": "255.255.255.0",
      "cidr_netmask": 24,
      "hosts": 254,
      "first_host": "192.168.2.1",
      "last_host": "192.168.2.254",
      "is_multicast": false,
      "is_private": true,
      "is_global": false,
      "is_link_local": false,
      "is_loopback": false,
      "is_reserved": false,
      "is_unspecified": false,
      "int": {
        "ip": 3232236042,
        "network": 3232236032,
        "broadcast": 3232236287,
        "first_host": 3232236033,
        "last_host": 3232236286
      },
      "hex": {
        "ip": "c0:a8:02:0a",
        "network": "c0:a8:02:00",
        "broadcast": "c0:a8:02:ff",
        "hostmask": "00:00:00:ff",
        "netmask": "ff:ff:ff:00",
        "first_host": "c0:a8:02:01",
        "last_host": "c0:a8:02:fe"
      },
      "bin": {
        "ip": "11000000101010000000001000001010",
        "network": "11000000101010000000001000000000",
        "broadcast": "11000000101010000000001011111111",
        "hostmask": "00000000000000000000000011111111",
        "netmask": "11111111111111111111111100000000",
        "first_host": "11000000101010000000001000000001",
        "last_host": "11000000101010000000001011111110"
      }
    }
    
    $ echo 127:0:de::1%128/96 | jc --ip-address -p
    {
      "version": 6,
      "max_prefix_length": 128,
      "ip": "127:0:de::1",
      "ip_compressed": "127:0:de::1%128",
      "ip_exploded": "0127:0000:00de:0000:0000:0000:0000:0001",
      "scope_id": "128",
      "ipv4_mapped": null,
      "six_to_four": null,
      "teredo_client": null,
      "teredo_server": null,
      "dns_ptr": "1.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.....0.7.2.1.0.ip6.arpa",
      "network": "127:0:de::",
      "broadcast": "127:0:de::ffff:ffff",
      "hostmask": "::ffff:ffff",
      "netmask": "ffff:ffff:ffff:ffff:ffff:ffff::",
      "cidr_netmask": 96,
      "hosts": 4294967294,
      "first_host": "127:0:de::1",
      "last_host": "127:0:de::ffff:fffe",
      "is_multicast": false,
      "is_private": false,
      "is_global": true,
      "is_link_local": false,
      "is_loopback": false,
      "is_reserved": true,
      "is_unspecified": false,
      "int": {
        "ip": 1531727573536155682370944093904699393,
        "network": 1531727573536155682370944093904699392,
        "broadcast": 1531727573536155682370944098199666687,
        "first_host": 1531727573536155682370944093904699393,
        "last_host": 1531727573536155682370944098199666686
      },
      "hex": {
        "ip": "01:27:00:00:00:de:00:00:00:00:00:00:00:00:00:01",
        "network": "01:27:00:00:00:de:00:00:00:00:00:00:00:00:00:00",
        "broadcast": "01:27:00:00:00:de:00:00:00:00:00:00:ff:ff:ff:ff",
        "hostmask": "00:00:00:00:00:00:00:00:00:00:00:00:ff:ff:ff:ff",
        "netmask": "ff:ff:ff:ff:ff:ff:ff:ff:ff:ff:ff:ff:00:00:00:00",
        "first_host": "01:27:00:00:00:de:00:00:00:00:00:00:00:00:00:01",
        "last_host": "01:27:00:00:00:de:00:00:00:00:00:00:ff:ff:ff:fe"
      },
      "bin": {
        "ip": "000000010010011100000000000000000000000011011110000000...",
        "network": "0000000100100111000000000000000000000000110111100...",
        "broadcast": "00000001001001110000000000000000000000001101111...",
        "hostmask": "000000000000000000000000000000000000000000000000...",
        "netmask": "1111111111111111111111111111111111111111111111111...",
        "first_host": "0000000100100111000000000000000000000000110111...",
        "last_host": "00000001001001110000000000000000000000001101111..."
      }
    }

    Syslog string parser (RFC 5424)

    Support for RFC 5424 Syslog strings. Multiple syslog strings separated by newline characters are supported. (Documentation)

    $ echo "<165>1 2003-08-24T05:14:15.000003-07:00 192.0.2.1 myproc 8710 - - %% It's time to make the do-nuts." | jc --syslog -p
    [
      {
        "priority": 165,
        "version": 1,
        "timestamp": "2003-08-24T05:14:15.000003-07:00",
        "hostname": "192.0.2.1",
        "appname": "myproc",
        "proc_id": 8710,
        "msg_id": null,
        "structured_data": null,
        "message": "%% It's time to make the do-nuts.",
        "timestamp_epoch": 1061727255,
        "timestamp_epoch_utc": null
      }
    ]

    Syslog string streaming parser (RFC 5424)

    Support for RFC 5424 Syslog strings. Multiple syslog strings separated by newline characters are supported. This is a streaming parser and it outputs JSON Lines. (Documentation)

    $ cat syslog.txt | jc --syslog-s -p
    {"priority":165,"version":1,"timestamp":"2003-08-24T05:14:15.000003-...}
    {"priority":165,"version":1,"timestamp":"2003-08-24T05:14:16.000003-...}
    ...

    Syslog string parser (BSD-style RFC 3164)

    Support for RFC 3164 Syslog strings. Multiple syslog strings separated by newline characters are supported. (Documentation)

    $ echo "<34>Oct 11 22:14:15 mymachine su: 'su root' failed for lonvick on /dev/pts/8" | jc --syslog-bsd -p
    [
      {
        "priority": 34,
        "date": "Oct 11 22:14:15",
        "hostname": "mymachine",
        "tag": "su",
        "content": "'su root' failed for lonvick on /dev/pts/8"
      }
    ]

    Syslog string streaming parser (BSD-style RFC 3164)

    Support for RFC 3164 Syslog strings. Multiple syslog strings separated by newline characters are supported. This is a streaming parser and it outputs JSON Lines. (Documentation)

    $ cat syslog.txt | jc --syslog-bsd-s -p
    {"priority":34,"date":"Oct 11 22:14:15","hostname":"mymachine","t...}
    {"priority":34,"date":"Oct 11 22:14:16","hostname":"mymachine","t...}
    ...

    CEF string parser

    Support for standard CEF log lines as documented in the Microfocus Arcsight CEF specification. (Documentation)

    $ cat cef.log | jc --cef -p
    [
      {
        "deviceVendor": "Trend Micro",
        "deviceProduct": "Deep Security Agent",
        "deviceVersion": "<DSA version>",
        "deviceEventClassId": "4000000",
        "name": "Eicar_test_file",
        "agentSeverity": 6,
        "CEFVersion": 0,
        "dvchost": "hostname",
        "string": "hello \"world\"!",
        "start": "Nov 08 2020 12:30:00.111 UTC",
        "start_epoch": 1604867400,
        "start_epoch_utc": 1604838600,
        "Host_ID": 1,
        "Quarantine": 205,
        "myDate": "Nov 08 2022 12:30:00.111",
        "myDate_epoch": 1667939400,
        "myDate_epoch_utc": null,
        "myFloat": 3.14,
        "deviceEventClassIdNum": 4000000,
        "agentSeverityString": "Medium",
        "agentSeverityNum": 6
      }
    ]

    CEF string streaming parser

    Support for standard CEF log lines as documented in the Microfocus Arcsight CEF specification. This is a streaming parser and it outputs JSON Lines. (Documentation)

    $ cat cef.log | jc --cef-s
    {"deviceVendor":"Fortinet","deviceProduct":"FortiDeceptor","deviceV...}
    {"deviceVendor":"Trend Micro","deviceProduct":"Deep Security Agent"...}
    ...

    PLIST file parser

    Support for binary and XML PLIST files. (Documentation)

    $ cat info.plist | jc --plist -p
    {
      "NSAppleScriptEnabled": true,
      "LSMultipleInstancesProhibited": true,
      "CFBundleInfoDictionaryVersion": "6.0",
      "DTPlatformVersion": "GM",
      "CFBundleIconFile": "GarageBand.icns",
      "CFBundleName": "GarageBand",
      "DTSDKName": "macosx10.13internal",
      "NSSupportsAutomaticGraphicsSwitching": true,
      "RevisionDate": "2018-12-03_14:10:56",
      "UTImportedTypeDeclarations": [
        {
          "UTTypeConformsTo": [
            "public.data",
            "public.content"
      ...
    }

    mdadm command parser

    Linux support for mdadm command output. The --examine and --query options are supported. (Documentation)

    $ mdadm --query --detail /dev/md0 | jc --mdadm -p
    {
      "device": "/dev/md0",
      "version": "1.1",
      "creation_time": "Tue Apr 13 23:22:16 2010",
      "raid_level": "raid1",
      "array_size": "5860520828 (5.46 TiB 6.00 TB)",
      "used_dev_size": "5860520828 (5.46 TiB 6.00 TB)",
      "raid_devices": 2,
      "total_devices": 2,
      "persistence": "Superblock is persistent",
      "intent_bitmap": "Internal",
      "update_time": "Tue Jul 26 20:16:31 2022",
      "state": "clean",
      "active_devices": 2,
      "working_devices": 2,
      "failed_devices": 0,
      "spare_devices": 0,
      "consistency_policy": "bitmap",
      "name": "virttest:0",
      "uuid": "85c5b164:d58a5ada:14f5fe07:d642e843",
      "events": 2193679,
      "device_table": [
        {
          "number": 3,
          "major": 8,
          "minor": 17,
          "state": [
            "active",
            "sync"
          ],
          "device": "/dev/sdb1",
          "raid_device": 0
        },
        {
          "number": 2,
          "major": 8,
          "minor": 33,
          "state": [
            "active",
            "sync"
          ],
          "device": "/dev/sdc1",
          "raid_device": 1
        }
      ],
      "array_size_num": 5860520828,
      "used_dev_size_num": 5860520828,
      "name_val": "virttest:0",
      "uuid_val": "85c5b164:d58a5ada:14f5fe07:d642e843",
      "state_list": [
        "clean"
      ],
      "creation_time_epoch": 1271226136,
      "update_time_epoch": 1658891791
    }

    Happy parsing!


    Convert X.509 Certificates to JSON with JC

    There are some cool hacks out there that will help you extract X.509 certificate metadata to JSON values. Since jc converts so many other things to JSON, I figured it would make sense to add this functionality. I wanted to make sure jc could handle both binary and text-encoded certificates of most any type, well-known and user-defined extensions, and also ensure the output was convenient for use in scripts.

    At first, I considered parsing -text output from openssl. It would not have been too hard to do – except for finding a way to reliably parse unknown certificate extensions. Ultimately I wanted to not only support openssl output, but also native certificate file formats so you could pipe the certificate file directly to jc like: cat certificate.crt | jc --x509-cert.

    If I had gone the original route I would have needed two parsers: one openssl parser and another X.509 certificate parser – maybe even multiple parsers for different certificate formats.

    I started building an X.509 certificate file parser that supports DER and PEM-encoded certificates first. Serendipitously, I found that this method provides all of the desired functionality in a single parser! This method supports:

    • Most any binary certificate format (DER, PKCS #7, PKCS #12, etc.)
    • PEM-encoded certificates
    • openssl command output (and any other command that can output DER and PEM)
      • Allows conversion of password-protected certificate files to JSON
      • Allows conversion of most any certificate format to JSON
    • Well-known and user-defined X.509 certificate extensions
    • Certificate files with multiple certificates bundled
    • Convenience fields (e.g. dates in timestamp format as well as ISO format)

    Converting DER Certificate Files to JSON

    The most basic (but not necessarily the most popular) X.509 certificate file is simply a binary DER-format certificate. Certificate file extensions are arbitrary, so there is no guarantee you have a simple DER-encoded certificate just by looking at the extension. But typical file extensions for DER-encoded certificates are .der, .cer, .crt, etc.

    jc can natively convert DER-encoded certificate files to JSON:

    $ cat certificate.crt | jc --x509-cert

    Converting PEM Certificate Files to JSON

    PEM-encoded certificate files are essentially just base64-encoded DER certificates and often have a .pem file extension. But, again, certificate file extensions are arbitrary, so a valid PEM file could also have a .crt, .cer, or any other extension.

    PEM files can also contain more than one certificate. For instance, there might be a certificate chain with the web server certificate and one or more intermediate certificates encoded in the file. Also, a PEM file can include other objects like private keys. Don’t worry – jc can handle multiple certificates and will ignore anything but certificates in the JSON output.

    jc can natively convert PEM-encoded certificates to JSON:

    $ cat certificate.pem | jc --x509-cert

    Converting PKCS #7 Certificate Files to JSON

    PKCS #7 certificates will typically have a .p7b or .p7c file extension and can be either binary DER-encoded or text PEM-encoded.

    jc will not natively convert PKCS #7 certificate files to JSON, but don’t worry! You can easily convert the PKCS #7 file to vanilla X.509 DER or PEM with openssl so jc can convert it to JSON:

    $ openssl pkcs7 \
              -in certificate.p7b \
              -inform der \
              -print_certs | jc --x509-cert

    Note that the -inform argument is not needed if the PKCS #7 file is PEM encoded.
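
    In that case the command is simply the same as above, minus the -inform argument:

    $ openssl pkcs7 \
              -in certificate.p7b \
              -print_certs | jc --x509-cert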

    Converting PKCS #12 Certificate Files to JSON

    PKCS #12 files are a password-protected binary format that can contain certificates, private keys, and other objects. You will typically see a .pfx or .p12 extension on these files.

    jc will not natively convert binary PKCS #12 certificate files to JSON, but don’t worry! You can easily convert the PKCS #12 file to vanilla X.509 DER or PEM with openssl so jc can convert it to JSON:

    $ openssl pkcs12 \
              -info \
              -in certificate.pfx \
              -passin pass:abc123 \
              -passout pass: | jc --x509-cert

    Note that you need to specify the certificate file password with the -passin parameter. You can set any password with the -passout parameter so you won’t be prompted for one when the command is run. In this example we set it to blank.

    Using in a Script

    Let’s put all of the pieces together and show how you can use JSON output in a script.

    No matter the certificate type, the JSON output will be consistent. The schema can be found in the jc documentation for the X.509 certificate parser. Here is an example of a Certificate Authority certificate converted from a PKCS #12 file:

    [
      {
        "tbs_certificate": {
          "version": "v3",
          "serial_number": "e1:3f:bc:97:7c:10:1d:b8",
          "signature": {
            "algorithm": "sha1_rsa",
            "parameters": null
          },
          "issuer": {
            "country_name": "FR",
            "state_or_province_name": "Alsace",
            "locality_name": "Strasbourg",
            "organization_name": "www.freelan.org",
            "organizational_unit_name": "freelan",
            "common_name": "Freelan Sample Certificate Authority",
            "email_address": "contact@freelan.org"
          },
          "validity": {
            "not_before": 1335521864,
            "not_after": 1338113864,
            "not_before_iso": "2012-04-27T10:17:44+00:00",
            "not_after_iso": "2012-05-27T10:17:44+00:00"
          },
          "subject": {
            "country_name": "FR",
            "state_or_province_name": "Alsace",
            "locality_name": "Strasbourg",
            "organization_name": "www.freelan.org",
            "organizational_unit_name": "freelan",
            "common_name": "Freelan Sample Certificate Authority",
            "email_address": "contact@freelan.org"
          },
          "subject_public_key_info": {
            "algorithm": {
              "algorithm": "rsa",
              "parameters": null
            },
            "public_key": {
              "modulus": "e0:e9:fb:ca:10:70:af:8c:4e:e5:8f:65:5c:49:65:1e:f9:a5:a2:b8:cd:c5:27:82:ea:58:5d:64:86:58:55:cf:4d:5e:ef:b2:c1:64:ea:f2:27:78:f0:2b:4c:bf:93:...",
              "public_exponent": 65537
            }
          },
          "issuer_unique_id": null,
          "subject_unique_id": null,
          "extensions": [
            {
              "extn_id": "key_identifier",
              "critical": false,
              "extn_value": "23:6c:2d:3d:3e:29:5d:78:b8:6c:3e:aa:e2:bb:2e:1e:6c:87:f2:53"
            },
            {
              "extn_id": "authority_key_identifier",
              "critical": false,
              "extn_value": {
                "key_identifier": "23:6c:2d:3d:3e:29:5d:78:b8:6c:3e:aa:e2:bb:2e:1e:6c:87:f2:53",
                "authority_cert_issuer": null,
                "authority_cert_serial_number": null
              }
            },
            {
              "extn_id": "basic_constraints",
              "critical": false,
              "extn_value": {
                "ca": true,
                "path_len_constraint": null
              }
            }
          ]
        },
        "signature_algorithm": {
          "algorithm": "sha1_rsa",
          "parameters": null
        },
        "signature_value": "b0:44:9a:49:0a:0a:7b:4b:e9:3d:05:3e:97:de:40:5e:7e:89:c4:10:e6:2d:c9:65:c1:3e:9b:b2:1b:74:25:9b:5a:dd:85:ce:ba:0c:21:85:a2:b0:e6:4f:18:cc:98:..."
      }
    ]

    Note: jc does not verify the integrity of the certificate, which requires calculating the hash of the certificate body and comparing it to the hash in the certificate’s signature after it (the hash) is decrypted with the issuer certificate’s public key.

    Notice the first (and only) certificate in this JSON array has a tbs_certificate.validity object that contains not_before and not_after values in both epoch timestamp and ISO formats. This should make it easy for us to check whether the certificate is valid in a Bash script using a JSON parser like jq:

    #!/bin/bash
    
    # grab the validity information from the first certificate in the pkcs12 file
    cert_json=$(
        openssl pkcs12 \
            -info \
            -in certificate.pfx \
            -passin pass:abc123 \
            -passout pass: | jc --x509-cert
    )
    
    not_before=$(
        echo "$cert_json" | jq .[0].tbs_certificate.validity.not_before
    )
    
    not_after=$(
        echo "$cert_json" | jq .[0].tbs_certificate.validity.not_after
    )
    
    # compare the timestamps to the current time
    current_time=$(date '+%s')
    
    if [[ "$not_before" -lt "$current_time" ]] && [[ "$not_after" -gt "$current_time" ]]; then
        echo "Certificate is valid"
    else
        echo "Certificate is invalid"
    fi

    And here is the output for an expired certificate (the STDERR and STDOUT lines have been labeled):

    $ ./checkcert.sh
    MAC Iteration 2048                                                    # STDERR
    MAC verified OK                                                       # STDERR
    PKCS7 Encrypted data: pbeWithSHA1And40BitRC2-CBC, Iteration 2048      # STDERR
    Certificate bag                                                       # STDERR
    PKCS7 Data                                                            # STDERR
    Shrouded Keybag: pbeWithSHA1And3-KeyTripleDES-CBC, Iteration 2048     # STDERR
    Certificate is invalid                                                # STDOUT
    

    There are a lot more things to check than just the not_before and not_after fields for a true certificate validation, so this should be considered a toy example to get you started. I hope this new jc X.509 certificate parser helps you in your automation scripts!


    JC Version 1.20.0 Released

    I’m excited to announce the release of jc version 1.20.0, available on GitHub and PyPI. jc now supports over 100 standard and streaming parsers. Thank you to the Open Source community for making this possible!

    jc can be installed via pip or through several official OS package repositories, including Debian, Ubuntu, Fedora, openSUSE, Arch Linux, NixOS Linux, Guix System Linux, FreeBSD, and macOS. For more information on how to get jc, see the project README.

    To upgrade with pip:

    $ pip3 install --upgrade jc

      What’s New

      • Add YAML output option with the -y option
      • Add top -b standard and streaming parsers tested on linux
      • Add plugin_parser_count, standard_parser_count, and streaming_parser_count keys to jc -a output
      • Add is_compatible function to the utils module
      • Fix pip-show parser for packages with a multi-line license field
      • Fix ASCII Table parser for cases where centered headers cause mis-aligned fields

      New Parsers

      top -b command parser

      Support for the top -b command. (Documentation)

      $ top -b -n 3 | jc --top -p
      [
        {
          "time": "11:20:43",
          "uptime": 118,
          "users": 2,
          "load_1m": 0.0,
          "load_5m": 0.01,
          "load_15m": 0.05,
          "tasks_total": 108,
          "tasks_running": 2,
          "tasks_sleeping": 106,
          "tasks_stopped": 0,
          "tasks_zombie": 0,
          "cpu_user": 5.6,
          "cpu_sys": 11.1,
          "cpu_nice": 0.0,
          "cpu_idle": 83.3,
          "cpu_wait": 0.0,
          "cpu_hardware": 0.0,
          "cpu_software": 0.0,
          "cpu_steal": 0.0,
          "mem_total": 3.7,
          "mem_free": 3.3,
          "mem_used": 0.2,
          "mem_buff_cache": 0.2,
          "swap_total": 2.0,
          "swap_free": 2.0,
          "swap_used": 0.0,
          "mem_available": 3.3,
          "processes": [
            {
              "pid": 2225,
              "user": "kbrazil",
              "priority": 20,
              "nice": 0,
              "virtual_mem": 158.1,
              "resident_mem": 2.2,
              "shared_mem": 1.6,
              "status": "running",
              "percent_cpu": 12.5,
              "percent_mem": 0.1,
              "time_hundredths": "0:00.02",
              "command": "top",
              "parent_pid": 1884,
              "uid": 1000,
              "real_uid": 1000,
              "real_user": "kbrazil",
              "saved_uid": 1000,
              "saved_user": "kbrazil",
              "gid": 1000,
              "group": "kbrazil",
              "pgrp": 2225,
              "tty": "pts/0",
              "tty_process_gid": 2225,
              "session_id": 1884,
              "thread_count": 1,
              "last_used_processor": 0,
              "time": "0:00",
              "swap": 0.0,
              "code": 0.1,
              "data": 1.0,
              "major_page_fault_count": 0,
              "minor_page_fault_count": 736,
              "dirty_pages_count": 0,
              "sleeping_in_function": null,
              "flags": "..4.2...",
              "cgroups": "1:name=systemd:/user.slice/user-1000.+",
              "supplementary_gids": [
                10,
                1000
              ],
              "supplementary_groups": [
                "wheel",
                "kbrazil"
              ],
              "thread_gid": 2225,
              "environment_variables": [
                "XDG_SESSION_ID=2",
                "HOSTNAME=localhost"
              ],
              "major_page_fault_count_delta": 0,
              "minor_page_fault_count_delta": 4,
              "used": 2.2,
              "ipc_namespace_inode": 4026531839,
              "mount_namespace_inode": 4026531840,
              "net_namespace_inode": 4026531956,
              "pid_namespace_inode": 4026531836,
              "user_namespace_inode": 4026531837,
              "nts_namespace_inode": 4026531838
            },
            ...
          ]
        }
      ]

      top -b command streaming parser

      Support for the top -b command. This is a streaming parser and it outputs JSON Lines. (Documentation):

      $ top -b | jc --top-s
      {"time":"11:24:50","uptime":2,"users":2,"load_1m":0.23,"load_5m":...}
      ...

      v1.20.1 Updates

      • Add postconf -M parser tested on linux
      • Update asciitable and asciitable-m parsers to preserve case in key names when using the -r or raw=True options.
      • Add long options (e.g. --help, --about, --pretty, etc.)
      • Add shell completions for Bash and Zsh
      • Fix id parser for cases where the user or group name is not present

      postconf -M command parser

      Linux support for the postconf -M command. (Documentation):

      $ postconf -M | jc --postconf -p          # or jc -p postconf -M
      [
        {
          "service_name": "smtp",
          "service_type": "inet",
          "private": false,
          "unprivileged": null,
          "chroot": true,
          "wake_up_time": null,
          "process_limit": null,
          "command": "smtpd",
          "no_wake_up_before_first_use": null
        },
        {
          "service_name": "pickup",
          "service_type": "unix",
          "private": false,
          "unprivileged": null,
          "chroot": true,
          "wake_up_time": 60,
          "process_limit": 1,
          "command": "pickup",
          "no_wake_up_before_first_use": false
        }
      ]

      Long Options

      jc now supports long CLI options:

      Options:
          -a,  --about        about jc
          -C,  --force-color  force color output even when using pipes (overrides -m)
          -d,  --debug        debug (double for verbose debug)
          -h,  --help         help (--help --parser_name for parser documentation)
          -m,  --monochrome   monochrome output
          -p,  --pretty       pretty print output
          -q,  --quiet        suppress warnings (double to ignore streaming errors)
          -r,  --raw          raw output
          -u,  --unbuffer     unbuffer output
          -v,  --version      version info
          -y,  --yaml-out     YAML output
          -B,  --bash-comp    gen Bash completion: jc -B > /etc/bash_completion.d/jc
          -Z,  --zsh-comp     gen Zsh completion: jc -Z > "${fpath[1]}/_jc"
      

      Shell Completions

      Bash and Zsh completions are now available for jc! If your system is already set up for completions you can run the following to enable completions:

      Bash

      Linux
      $ jc -B > /etc/bash_completion.d/jc
      
      macOS
      $ jc -B > /usr/local/etc/bash_completion.d/jc

      Zsh

      Linux and macOS
      $ jc -Z > "${fpath[1]}/_jc"

      v1.20.2 Updates

      • Add gpg --with-colons command parser tested on linux
      • Add DER and PEM encoded X.509 Certificate parser
      • Add Bash and Zsh completion scripts to DEB and RPM packages

      gpg --with-colons command parser

      Linux support for the gpg --with-colons command. (Documentation):

      $ gpg --with-colons --show-keys file.gpg | jc --gpg -p
      [
        {
          "type": "pub",
          "validity": "f",
          "key_length": "1024",
          "pub_key_alg": "17",
          "key_id": "6C7EE1B8621CC013",
          "creation_date": "899817715",
          "expiration_date": "1055898235",
          "certsn_uidhash_trustinfo": null,
          "owner_trust": "m",
          "user_id": null,
          "signature_class": null,
          "key_capabilities": "scESC",
      ...

      X.509 DER/PEM Certificate Files

      Support for DER and PEM encoded certificate files (Documentation):

      $ cat alice.crt | jc --x509-cert -p
      [
        {
          "tbs_certificate": {
            "version": "v3",
            "serial_number": "01",
            "signature": {
              "algorithm": "sha1_rsa",
              "parameters": null
            },
            "issuer": {
              "country_name": "FR",
              "state_or_province_name": "Alsace",
              "locality_name": "Strasbourg",
              "organization_name": "www.freelan.org",
              "organizational_unit_name": "freelan",
              "common_name": "Freelan Sample Certificate Authority",
              "email_address": "contact@freelan.org"
            },
            "validity": {
              "not_before": 1335522678,
              "not_after": 1650882678,
              "not_before_iso": "2012-04-27T10:31:18+00:00",
              "not_after_iso": "2022-04-25T10:31:18+00:00"
            },
            "subject": {
              "country_name": "FR",
              "state_or_province_name": "Alsace",
              "organization_name": "www.freelan.org",
              "organizational_unit_name": "freelan",
              "common_name": "alice",
              "email_address": "contact@freelan.org"
            },
            "subject_public_key_info": {
              "algorithm": {
                "algorithm": "rsa",
                "parameters": null
              },
              "public_key": {
                "modulus": "dd:6d:bd:f8:80:fa:d7:de:1b:1f:a7:a3:2e:b2:02:e2:16:f6:52:0a:3c:bf:a6:42:f8:ca:dc:93:67:4d:60:c3:4f:8d:c3:8a:00:1b:f1:c4:4b:41:6a:69:d2:69:e5:3f:21:8e:c5:0b:f8:22:37:ad:b6:2c:4b:55:ff:7a:03:72:bb:9a:d3:ec:96:b9:56:9f:cb:19:99:c9:32:94:6f:8f:c6:52:06:9f:45:03:df:fd:e8:97:f6:ea:d6:ba:bb:48:2b:b5:e0:34:61:4d:52:36:0f:ab:87:52:25:03:cf:87:00:87:13:f2:ca:03:29:16:9d:90:57:46:b5:f4:0e:ae:17:c8:0a:4d:92:ed:08:a6:32:23:11:71:fe:f2:2c:44:d7:6c:07:f3:0b:7b:0c:4b:dd:3b:b4:f7:37:70:9f:51:b6:88:4e:5d:6a:05:7f:8d:9b:66:7a:ab:80:20:fe:ee:6b:97:c3:49:7d:78:3b:d5:99:97:03:75:ce:8f:bc:c5:be:9c:9a:a5:12:19:70:f9:a4:bd:96:27:ed:23:02:a7:c7:57:c9:71:cf:76:94:a2:21:62:f6:b8:1d:ca:88:ee:09:ad:46:2f:b7:61:b3:2c:15:13:86:9f:a5:35:26:5a:67:f4:37:c8:e6:80:01:49:0e:c7:ed:61:d3:cd:bc:e4:f8:be:3f:c9:4e:f8:7d:97:89:ce:12:bc:ca:b5:c6:d2:e0:d9:b3:68:3c:2e:4a:9d:b4:5f:b8:53:ee:50:3d:bf:dd:d4:a2:8a:b6:a0:27:ab:98:0c:b3:b2:58:90:e2:bc:a1:ad:ff:bd:8e:55:31:0f:00:bf:68:e9:3d:a9:19:9a:f0:6d:0b:a2:14:6a:c6:4c:c6:4e:bd:63:12:a5:0b:4d:97:eb:42:09:79:53:e2:65:aa:24:34:70:b8:c1:ab:23:80:e7:9c:6c:ed:dc:82:aa:37:04:b8:43:2a:3d:2a:a8:cc:20:fc:27:5d:90:26:58:f9:b7:14:e2:9e:e2:c1:70:73:97:e9:6b:02:8e:d3:52:59:7b:00:ec:61:30:f1:56:3f:9c:c1:7c:05:c5:b1:36:c8:18:85:cf:61:40:1f:07:e8:a7:06:87:df:9a:77:0b:a9:64:72:03:f6:93:fc:e0:02:59:c1:96:ec:c0:09:42:3e:30:a2:7f:1b:48:2f:fe:e0:21:8f:53:87:25:0d:cb:ea:49:f5:4a:9b:d0:e3:5f:ee:78:18:e5:ba:71:31:a9:04:98:0f:b1:ad:67:52:a0:f2:e3:9c:ab:6a:fe:58:84:84:dd:07:3d:32:94:05:16:45:15:96:59:a0:58:6c:18:0e:e3:77:66:c7:b3:f7:99",
                "public_exponent": 65537
              }
            },
            "issuer_unique_id": null,
            "subject_unique_id": null,
            "extensions": [
              {
                "extn_id": "basic_constraints",
                "critical": false,
                "extn_value": {
                  "ca": false,
                  "path_len_constraint": null
                }
              },
              {
                "extn_id": "2.16.840.1.113730.1.13",
                "critical": false,
                "extn_value": "16:1d:4f:70:65:6e:53:53:4c:20:47:65:6e:65:72:61:74:65:64:20:43:65:72:74:69:66:69:63:61:74:65"
              },
              {
                "extn_id": "key_identifier",
                "critical": false,
                "extn_value": "59:5f:c9:13:ba:1b:cc:b9:a8:41:4a:8a:49:79:6a:36:f6:7d:3e:d7"
              },
              {
                "extn_id": "authority_key_identifier",
                "critical": false,
                "extn_value": {
                  "key_identifier": "23:6c:2d:3d:3e:29:5d:78:b8:6c:3e:aa:e2:bb:2e:1e:6c:87:f2:53",
                  "authority_cert_issuer": null,
                  "authority_cert_serial_number": null
                }
              }
            ]
          },
          "signature_algorithm": {
            "algorithm": "sha1_rsa",
            "parameters": null
          },
          "signature_value": "13:e7:02:45:3e:a7:ab:bd:b8:da:e7:ef:74:88:ac:62:d5:dd:10:56:d5:46:07:ec:fa:6a:80:0c:b9:62:be:aa:08:b4:be:0b:eb:9a:ef:68:b7:69:6f:4d:20:92:9d:18:63:7a:23:f4:48:87:6a:14:c3:91:98:1b:4e:08:59:3f:91:80:e9:f4:cf:fd:d5:bf:af:4b:e4:bd:78:09:71:ac:d0:81:e5:53:9f:3e:ac:44:3e:9f:f0:bf:5a:c1:70:4e:06:04:ef:dc:e8:77:05:a2:7d:c5:fa:80:58:0a:c5:10:6d:90:ca:49:26:71:84:39:b7:9a:3e:e9:6f:ae:c5:35:b6:5b:24:8c:c9:ef:41:c3:b1:17:b6:3b:4e:28:89:3c:7e:87:a8:3a:a5:6d:dc:39:03:20:20:0b:c5:80:a3:79:13:1e:f6:ec:ae:36:df:40:74:34:87:46:93:3b:a3:e0:a4:8c:2f:43:4c:b2:54:80:71:76:78:d4:ea:12:28:d8:f2:e3:80:55:11:9b:f4:65:dc:53:0e:b4:4c:e0:4c:09:b4:dc:a0:80:5c:e6:b5:3b:95:d3:69:e4:52:3d:5b:61:86:02:e5:fd:0b:00:3a:fa:b3:45:cc:c9:a3:64:f2:dc:25:59:89:58:0d:9e:6e:28:3a:55:45:50:5f:88:67:2a:d2:e2:48:cc:8b:de:9a:1b:93:ae:87:e1:f2:90:50:40:d9:0f:44:31:53:46:ad:62:4e:8d:48:86:19:77:fc:59:75:91:79:35:59:1d:e3:4e:33:5b:e2:31:d7:ee:52:28:5f:0a:70:a7:be:bb:1c:03:ca:1a:18:d0:f5:c1:5b:9c:73:04:b6:4a:e8:46:52:58:76:d4:6a:e6:67:1c:0e:dc:13:d0:61:72:a0:92:cb:05:97:47:1c:c1:c9:cf:41:7d:1f:b1:4d:93:6b:53:41:03:21:2b:93:15:63:08:3e:2c:86:9e:7b:9f:3a:09:05:6a:7d:bb:1c:a7:b7:af:96:08:cb:5b:df:07:fb:9c:f2:95:11:c0:82:81:f6:1b:bf:5a:1e:58:cd:28:ca:7d:04:eb:aa:e9:29:c4:82:51:2c:89:61:95:b6:ed:a5:86:7c:7c:48:1d:ec:54:96:47:79:ea:fc:7f:f5:10:43:0a:9b:00:ef:8a:77:2e:f4:36:66:d2:6a:a6:95:b6:9f:23:3b:12:e2:89:d5:a4:c1:2c:91:4e:cb:94:e8:3f:22:0e:21:f9:b8:4a:81:5c:4c:63:ae:3d:05:b2:5c:5c:54:a7:55:8f:98:25:55:c4:a6:90:bc:19:29:b1:14:d4:e2:b0:95:e4:ff:89:71:61:be:8a:16:85"
        }
      ]

      v1.20.4 Updates

      • Add URL string parser
      • Add Email Address string parser
      • Add JWT string parser
      • Add ISO 8601 Datetime string parser
      • Add UNIX Epoch Timestamp string parser
      • Add M3U/M3U8 file parser
      • Add pager functionality to help (parser documentation only)
      • Minor parser performance optimizations

      jc version 1.20.4 includes a few new string parsers that can be very useful in scripts.

      The url string parser not only allows you to pull out the specific parts of the URL you are interested in (e.g. path, query, hostname, etc.) but it also provides encoded and decoded versions of all of those values.

      Similarly, the Email Address string parser allows you to quickly parse out the username and domain, even if Gmail “plus” addressing is used. The parser also allows you to separate out the username from the “plus” suffix.

      JWT strings can now be parsed into their constituent Header, Payload, and Signature parts. The Payload is presented as a standard object.

      And parsing time strings, including ISO 8601 Datetimes and Unix timestamps, just got easier. Both new parsers provide you detailed date information that you can use in your scripts. These parsers are a nice complement to the existing date command parser.

      Finally an M3U/M3U8 parser is included for media playlists. It includes the ability to parse extended information and, since these files are not usually well maintained, the parser fails gracefully for unparsable lines.

      Other minor improvements include more/less paging when accessing parser documentation at the command line via jc --help --parser_name.

      For more details on each of the new parsers, see below.

      URL string parser

      This parser outputs Normalized, Encoded, and Decoded versions of the URL and all of the URL parts. (Documentation)

      This allows you to pull specific information from the URL, including the scheme, netloc, user, password, hostname, port, path, path list, query, and fragment all three ways. For example, the following URL could be decoded:

      $ echo 'http://%D0%BE%D0%B1%D0%BD%D0%BE%D0%B2%D0%BB%D0%B5%D0%BD%D0%B8%D0%B5%D0%BF%D0%BE%D0%B3%D0%BE%D0%B4%D1%8B.%72%75:%38%30' | jc --url | jq .decoded.hostname
      "обновлениепогоды.ru"

      You can easily grab the path string and a path list:

      $ echo 'https://example.com/this/is/a/path' | jc --url | jq .path
      "/this/is/a/path"
      
      $ echo 'https://example.com/this/is/a/path' | jc --url | jq .path_list
      [
        "this",
        "is",
        "a",
        "path"
      ]

      Or even the query string and object:

      $ echo 'https://example.com?user=joe&selections=gardening&selections=plumbing' | jc --url | jq .query
      "user=joe&selections=gardening&selections=plumbing"
      
      $ echo 'https://example.com?user=joe&selections=gardening&selections=plumbing' | jc --url | jq .query_obj
      {
        "user": [
          "joe"
        ],
        "selections": [
          "gardening",
          "plumbing"
        ]
      }

      There are many other use cases that the url parser can help with. Here is a full example of the output:

      $ echo 'https://www.example.com:443/mypath?q1=foo&q2=bar#heading-1' | jc --url -p
      {
        "url": "https://www.example.com:443/mypath?q1=foo&q2=bar#heading-1",
        "scheme": "https",
        "netloc": "www.example.com:443",
        "path": "/mypath",
        "path_list": [
          "mypath"
        ],
        "query": "q1=foo&q2=bar",
        "query_obj": {
          "q1": [
            "foo"
          ],
          "q2": [
            "bar"
          ]
        },
        "fragment": "heading-1",
        "username": null,
        "password": null,
        "hostname": "www.example.com",
        "port": 443,
        "encoded": {
          "url": "https://www.example.com:443/mypath?q1=foo&q2=bar#heading-1",
          "scheme": "https",
          "netloc": "www.example.com:443",
          "path": "/mypath",
          "path_list": [
            "mypath"
          ],
          "query": "q1=foo&q2=bar",
          "fragment": "heading-1",
          "username": null,
          "password": null,
          "hostname": "www.example.com",
          "port": 443
        },
        "decoded": {
          "url": "https://www.example.com:443/mypath?q1=foo&q2=bar#heading-1",
          "scheme": "https",
          "netloc": "www.example.com:443",
          "path": "/mypath",
          "path_list": [
            "mypath"
          ],
          "query": "q1=foo&q2=bar",
          "fragment": "heading-1",
          "username": null,
          "password": null,
          "hostname": "www.example.com",
          "port": 443
        }
      }

      Email Address string parser

      The Email Address string parser allows you to easily pull the username and domain from an email address, even if it is using Gmail’s “plus” addressing. In those cases you can even pull the “plus” suffix. (Documentation)

      $ echo 'joe.user+spam@example.com' | jc --email-address -p
      {
        "username": "joe.user",
        "domain": "example.com",
        "local": "joe.user+spam",
        "local_plus_suffix": "spam"
      }

      JWT string parser

      jc can easily parse JWT strings into their constituent Header, Payload, and Signature parts. Note, the JWT parser does not check the integrity of the token. (Documentation)

      $ echo 'eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6IkpvaG4gRG9lIiwiaWF0IjoxNTE2MjM5MDIyfQ.SflKxwRJSMeKKF2QT4fwpMeJf36POk6yJV_adQssw5c' | jc --jwt -p
      {
        "header": {
          "alg": "HS256",
          "typ": "JWT"
        },
        "payload": {
          "sub": "1234567890",
          "name": "John Doe",
          "iat": 1516239022
        },
        "signature": "49:f9:4a:c7:04:49:48:c7:8a:28:5d:90:4f:87:f0:a4:c7:89:7f:7e:8f:3a:4e:b2:25:5f:da:75:0b:2c:c3:97"
      }

      ISO 8601 Datetime string parser

This parser explodes an ISO 8601 datetime string into all of the relevant date and time fields. It will also provide a Unix timestamp and a normalized version of the ISO string. (Documentation)

      $ echo "2022-07-20T14:52:45Z" | jc --iso-datetime -p
      {
        "year": 2022,
        "month": "Jul",
        "month_num": 7,
        "day": 20,
        "weekday": "Wed",
        "weekday_num": 3,
        "hour": 2,
        "hour_24": 14,
        "minute": 52,
        "second": 45,
        "microsecond": 0,
        "period": "PM",
        "utc_offset": "+0000",
        "day_of_year": 201,
        "week_of_year": 29,
        "iso": "2022-07-20T14:52:45+00:00",
        "timestamp": 1658328765
      }

      UNIX Epoch Timestamp string parser

In addition to the ISO 8601 Datetime string parser, the Timestamp parser takes in a 10+ digit epoch timestamp string and explodes it into all of the relevant date and time parts you might want to use, including a normalized ISO 8601 format string. Both naive and timezone-aware UTC versions of the output are provided. (Documentation)

      $ echo '1658599410' | jc --timestamp -p
      {
        "naive": {
          "year": 2022,
          "month": "Jul",
          "month_num": 7,
          "day": 23,
          "weekday": "Sat",
          "weekday_num": 6,
          "hour": 11,
          "hour_24": 11,
          "minute": 3,
          "second": 30,
          "period": "AM",
          "day_of_year": 204,
          "week_of_year": 29,
          "iso": "2022-07-23T11:03:30"
        },
        "utc": {
          "year": 2022,
          "month": "Jul",
          "month_num": 7,
          "day": 23,
          "weekday": "Sat",
          "weekday_num": 6,
          "hour": 6,
          "hour_24": 18,
          "minute": 3,
          "second": 30,
          "period": "PM",
          "utc_offset": "+0000",
          "day_of_year": 204,
          "week_of_year": 29,
          "iso": "2022-07-23T18:03:30+00:00"
        }
      }

      M3U/M3U8 file parser

      jc can now parse M3U files, including extended information. Unparsable lines are noted with a warning message unless the --quiet flag is enabled. (Documentation)

      $ cat playlist.m3u | jc --m3u -p
      [
        {
          "runtime": 105,
          "display": "Example artist - Example title",
          "path": "C:\Files\My Music\Example.mp3"
        },
        {
          "runtime": 321,
          "display": "Example Artist2 - Example title2",
          "path": "C:\Files\My Music\Favorites\Example2.ogg"
        }
      ]

      Happy parsing!

      Featured

      Working with JSON in Various Shells

      I recently went through the exercise of testing jc on several traditional and next-gen shells to document the integrations. jc is a utility that converts the output of many commands and file-types to JSON for easier parsing in scripts. I have typically highlighted the use of JSON with Bash in concert with jq, but this is 2022 and there are so many more shells to choose from!

      In this article I’d like to give a quick snapshot of what it’s like to work with JSON in various traditional and next generation shells. Traditional shells like Bash and Windows Command Prompt (cmd.exe) don’t have built-in JSON support and require 3rd party utilities. Newer shells like NGS, Nushell, Oil, Elvish, Murex, and PowerShell have JSON serialization/deserialization and filtering capabilities built-in for a cleaner experience.

      Bash is still the automation workhorse of the Unix ecosystem and it’s not going away any time soon, but it’s good to see what capabilities are out there in more modern shells. Perhaps this will inspire you to try them out for yourself!


      Bash

Bash is old. Bash is solid. Bash is ubiquitous. Bash isn’t going anywhere. I’ve done some crazy things with Bash in my career… Bash and I go way back. That being said, using JSON in Bash is not always very ergonomic. Tools like jq, jello, jp, etc. help bridge the gap between 1970s-2000s POSIX line-based text manipulation and the modern-day JSON API reality.

      Here’s a simple example of how to pull a value from JSON and assign it to a variable in Bash:

      $ myvar=$(dig www.google.com | jc --dig | jq -r '.[0].answer[0].data')
      $ echo $myvar
      64.233.185.104

If you would like to see more complex examples of assigning multiple JSON values to Bash arrays, see my other posts on using JSON in Bash scripts.
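In the meantime, here’s a minimal sketch of one approach: readarray with process substitution loads every answer record from the dig output into a Bash array in one shot.

$ readarray -t ips < <(dig www.google.com | jc --dig | jq -r '.[0].answer[].data')
$ echo "${ips[0]}"
64.233.185.104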


      Elvish

Elvish is a next-gen shell that uses structured data in pipelines. It has JSON deserialization built in, so you don’t need jq et al. to convert it into an Elvish data structure. You can explore structured data in a similar way to jq or Python.

      Here’s an example of loading a JSON object into a variable and displaying one of the JSON values using the from-json built-in:

      ~> var myvar = (dig www.google.com | jc --dig | from-json)
      ~> put $myvar[0]['answer'][0]['data']
      ▶ 64.233.185.104

      See the Elvish documentation for more details.


      Fish

      Fish is similar to Bash in that it does not have built-in support for JSON, but it’s a more modern take on the shell that provides nice autosuggestions, tab completion, syntax-highlighting, and a clean syntax that is optimized for interactive use.

      When working with JSON in Fish, you will typically use tools like jq, jello, jp, etc. to filter and query the data. Here are some examples showing how to assign filtered JSON data to a variable so it can be used elsewhere in the script:

      $ set myvar (dig www.google.com | jc --dig | jq -r '.[0].answer[0].data')
      $ echo $myvar
      64.233.185.104
      
      $ set myvar (jc dig www.google.com | jello -r '_[0].answer[0].data')
      $ echo $myvar
      64.233.185.104
      
      $ set myvar (jc dig www.google.com | jp -u '[0].answer[0].data')
      $ echo $myvar
      64.233.185.104

      Murex

The Murex next generation shell is designed for DevOps productivity and includes native JSON capabilities. There are a couple of ways to set JSON variables: you can use the cast json builtin to convert a string to a JSON variable, or you can define the JSON type when setting the variable (e.g. set json myvar).

There are also a couple of different ways to access nested attributes within the JSON: you can use Index syntax (single bracket []) or Element syntax (double bracket [[]]).

      Here’s an example of setting a JSON variable and accessing a nested value using the Element syntax:

      ~ » jc dig www.google.com -> set json myvar
      ~ » $myvar[[.0.answer.0.data]] -> set mydata
      ~ » out $mydata
      64.233.185.104

      Check out the documentation for more information.


      NGS

      Next Generation Shell (NGS) is a modern shell that aims to be DevOps-friendly. To that end, it is no surprise that it has great JSON support out of the box. If you have Python experience, you will find yourself at home with many of the concepts.

      Here is a quick example of how to pull a value from JSON into a variable and output a specific value to STDOUT:

      myvar = ``jc dig www.google.com``[0].answer[0].data
      echo(myvar)
      
      # returns 64.233.185.104

      The double-backtick syntax runs the command and parses the JSON output. Then you can use bracket and dot notation to access the key you would like.

      There are many other ways to filter the objects, including map(), filter(), reject(), the_one(), etc. No jq required!


      Nushell

Nushell describes itself this way on its website:

      “Nu pipelines use structured data so you can safely select, filter, and sort the same way every time. Stop parsing strings and start solving problems.”

      This is definitely a new take on the shell which works nicely with JSON data. In fact, Nushell has a from json builtin function that deserializes JSON into a native structured object. Here’s a quick example of how to assign a JSON object to a variable and filter it down to a desired value:

      > let myvar = (dig www.google.com | jc --dig | from json)
      > echo $myvar | get 0.answer.0.data
      64.233.185.104

      Check out the Nushell documentation for more filtering options.


      Oil

If you are at home in JavaScript or Python then you should check out Oil. Oil started out being compatible with Bash, but has since advanced into its own shell and scripting language that supports more robust structured objects.

      Oil comes with the json read builtin that deserializes JSON into a native Oil object. You can use standard bracket notation or a unique -> notation to access attributes within objects. Here’s an example:

      $ dig www.google.com | jc --dig | json read myvar
      $ var mydata = myvar[0]['answer'][0]['data']
      $ echo $mydata
      64.233.185.104

      For more details on working with JSON in Oil, see the documentation.


      PowerShell

      They say you either love or hate PowerShell. I have to admit, coming from a Bash background, I wasn’t too hot on PowerShell the first time I needed to create a script for it. It seemed needlessly verbose. And what were these objects? Why can’t I just pipe text between processes!?

But I have to say it has grown on me because of its concept of passing structured objects between processes via pipes. Well, I neither love nor hate PowerShell… I like the concept, but I’m still not a huge fan of some of the execution. It does have pretty good native JSON support, though.

      Here’s an example of loading JSON data from jc into an object using the ConvertFrom-Json utility and printing a specific property within the resulting object using bracket and dot notation:

      PS C:\> $myvar = dig www.google.com | jc --dig | ConvertFrom-Json
      PS C:\> Write-Output $myvar[0].answer[0].data
      64.233.185.104

      Here’s a good article with more detail on how to work with JSON in PowerShell.


      Windows Command Prompt (cmd.exe)

      Wow, this is a blast from the past! I don’t think I’ve written a batch file since the ’90s. Back then there was no such thing as JSON. I do remember doing some crazy login scripts with batch files back in the day, and I’m sure there are many (not mine) still in use today.

      At first I wasn’t sure if it was even practical to use JSON at the Windows Command Prompt, but I thought it would be fun to take on the challenge. Turns out, it wasn’t too terribly difficult, though I’m still not sure of the practicality.

      When at the Command Prompt, you can use tools like jq, jello, jp, etc. to filter and query the JSON:

      C:\> dig www.google.com | jc --dig | jq -r .[0].answer[0].data
      64.233.185.104
      
      C:\> jc dig www.google.com | jello -r _[0].answer[0].data
      64.233.185.104
      
      C:\> jc dig www.google.com | jp -u [0].answer[0].data
      64.233.185.104

      That’s fine and all, but can you actually load JSON values into variables? Yes you can – with the trusty FOR /F command!

      C:\> FOR /F "tokens=* USEBACKQ" %i IN (`dig www.google.com ^| jc --dig ^| jq -r .[0].answer[0].data`) DO SET myvar=%i
      C:\> ECHO %myvar%
      64.233.185.104

Well, that’s a mouthful. But it does work. Batch files require doubled %% prefixes on the FOR loop variables, so this is how you would do it in a batch file:

      FOR /F "tokens=* USEBACKQ" %%i IN (`dig www.google.com ^| jc --dig ^| jq -r .[0].answer[0].data`) DO SET myvar=%%i
      ECHO %myvar%
      
      :: returns 64.233.185.104

      I needed to make a visit to Stack Overflow to learn how to get this working. Was it worth it? I don’t know – maybe this will help some poor unfortunate soul someday searching “how to use json in batch file”. 🙂

      Conclusion

That was fun! I’ve always enjoyed the command line, and playing with different shells can spark inspiration for new ways of solving problems. There are lots of next-gen alternatives that are looking to bring the shell experience into the 21st century. Did I leave out your favorite new shell?

      Happy JSON parsing!

      Featured

      Easily Convert git log Output to JSON

There are lots of people interested in converting their git logs into beautiful JSON or JSON Lines for archiving and analytics. It seems like it should be easy, but it is more complicated than it needs to be.

      When I got a feature request for jc to support git log output, my first instinct was to look into the robust git --format options. At first glance it seemed like a simple format string like this should work:

      git log --format='{"hash": "%H", "author": "%an", "subject": "%s", "body": "%b", "date": %at}'

      The problem is that git does not do any escaping when using those format variables. This will generate invalid JSON if there are newline characters or other special characters like quotation marks inside the data.
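For example, a commit subject that contains quotation marks (the message below is hypothetical) lands in the output unescaped, producing invalid JSON:

$ git log -1 --format='{"subject": "%s"}'
{"subject": "fix the "broken" parser"}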

I found several other solutions to the problem using custom scripts, but unfortunately some require installing interpreters like Node.js, some require specific git log --format options, and some don’t fully support options like --stat or --shortstat. Some solutions don’t even fully solve the string escaping issue.

      A Better git log Parser

I decided jc would be a great git log parser. jc already supports around 100 other commands, so this is right in jc’s wheelhouse. I wanted to make the jc parser for git log as easy to use as the other parsers (e.g. jc git log), but also support more advanced git format and statistics options.

      In addition, I wanted to support both JSON and JSON Lines conversion. git logs can become huge over time, so being able to emit JSON Lines can reduce the memory overhead that would be incurred by generating a huge JSON array of logs.

      Finally, I wanted to add calculated timestamps (naive and time zone aware) to make the output more useful in scripts.

The new git log standard and streaming parsers are now bundled with jc. They work just like any other jc parser and support several git log --format options as well as --stat and --shortstat. No need to worry about escaping special characters or using a specific format string. It just works out of the box!

Here’s an example using the fuller format option along with full stats via --stat:

      $ git log --format=fuller --stat | jc --git-log -p
      [
        {
          "commit": "af2c06cd284352eb47c44f2387d4600b1b322cbd",
          "author": "Kelly Brazil",
          "author_email": "kellyjonbrazil@gmail.com",
          "date": "Sun May 15 22:28:12 2022 -0700",
          "commit_by": "Kelly Brazil",
          "commit_by_email": "kellyjonbrazil@gmail.com",
          "commit_by_date": "Sun May 15 22:28:12 2022 -0700",
          "stats": {
            "files_changed": 1,
            "insertions": 2,
            "deletions": 2,
            "files": [
              "docs/parsers/pip_show.md"
            ]
          },
          "message": "doc update",
          "epoch": 1652678892,
          "epoch_utc": null
        },
        {
          "commit": "67a4c6f797dfeaba2ba50222e879bf4fb58678f4",
          "author": "Kelly Brazil",
          "author_email": "kellyjonbrazil@gmail.com",
          "date": "Sun May 15 22:23:00 2022 -0700",
          "commit_by": "Kelly Brazil",
          "commit_by_email": "kellyjonbrazil@gmail.com",
          "commit_by_date": "Sun May 15 22:23:00 2022 -0700",
          "stats": {
            "files_changed": 2,
            "insertions": 4,
            "deletions": 4,
            "files": [
              "jc/parsers/pip_show.py",
              "tests/fixtures/generic/pip-show-multiline-license.json"
            ]
          },
          "message": "add initial \\n to first line of multiline fields",
          "epoch": 1652678580,
          "epoch_utc": null
        },
        ...
      ]

You could also use the magic syntax for the above example: jc -p git log --format=fuller --stat

      Or, to output JSON Lines, use the streaming parser:

      $ git log --format=fuller --stat | jc --git-log-s
      {"commit":"af2c06cd284352eb47c44f2387d4600b1b322cbd","author":"Kelly Brazil","author_email":"kellyjonbrazil@gmail.com","date":"Sun May 15 22:28:12 2022 -0700","commit_by":"Kelly Brazil","commit_by_email":"kellyjonbrazil@gmail.com","commit_by_date":"Sun May 15 22:28:12 2022 -0700","stats":{"files_changed":1,"insertions":2,"deletions":2,"files":["docs/parsers/pip_show.md"]},"message":"doc update","epoch":1652678892,"epoch_utc":null}
      {"commit":"67a4c6f797dfeaba2ba50222e879bf4fb58678f4","author":"Kelly Brazil","author_email":"kellyjonbrazil@gmail.com","date":"Sun May 15 22:23:00 2022 -0700","commit_by":"Kelly Brazil","commit_by_email":"kellyjonbrazil@gmail.com","commit_by_date":"Sun May 15 22:23:00 2022 -0700","stats":{"files_changed":2,"insertions":4,"deletions":4,"files":["jc/parsers/pip_show.py","tests/fixtures/generic/pip-show-multiline-license.json"]},"message":"add initial \\n to first line of multiline fields","epoch":1652678580,"epoch_utc":null}
      ...

Of course, other format options, like oneline, short, medium, and full, are supported, as well as --shortstat. Check out the docs for all of the supported options. (standard and streaming)

In the end, I believe it would be better for there to be a JSON output option built into git, but until then, there is jc.

      Happy parsing!

      Featured

      JC Version 1.19.0 Released

I’m excited to announce the release of jc version 1.19.0, available on GitHub and PyPI. jc now supports 100 standard and streaming parsers. Thank you to the Open Source community for making this possible!

      jc can be installed via pip or through several official OS package repositories, including Debian, Ubuntu, Fedora, openSUSE, Arch Linux, NixOS Linux, Guix System Linux, FreeBSD, and macOS. For more information on how to get jc, see the project README.

      To upgrade with pip:

      $ pip3 install --upgrade jc

        What’s New

• Add git log streaming parser that outputs JSON Lines (or a lazy Iterable when used as a Python library). This is great for converting very large git logs to JSON so the entire log does not need to be loaded into RAM (see the sketch after this list).
• Add chage --list command parser tested on Linux
• Fix git log standard parser for corner cases where a commit hash is the only value on a line within a commit message
• Fix df command parser for rare instances when a newline is found at the end of the output
• Allow jc to pip install on unsupported Python version 3.6 since this version is still widely in use. Note that jc is only tested on officially supported Python versions.
• Fix asciitable-m parser to skip some rows that contain detected column separator characters in cell data. A warning message will be printed to STDERR unless -q or quiet=True is used.
• New zip package for Windows. Simply unzip the files anywhere in the execution PATH.
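Since the streaming parser returns a lazy Iterable when used as a Python library, very large logs can be processed commit-by-commit without holding the whole log in memory. Here’s a minimal sketch; the subprocess plumbing is just one way to feed lines to the parser:

>>> import jc
>>> import subprocess
>>> # run `git log` and hand its stdout (an iterable of lines) to the streaming parser
>>> proc = subprocess.Popen(['git', 'log'], stdout=subprocess.PIPE, text=True)
>>> for commit in jc.parse('git_log_s', proc.stdout):
...     print(commit['commit'])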

        New Parsers

        git log command streaming parser

        Support for the git log command. This is a streaming parser and it outputs JSON Lines. (Documentation):

        $ git log | jc --git-log-s
        {"commit":"a730ae18c8e81c5261db132df73cd74f272a0a26","author":"Kelly...}
        {"commit":"930bf439c06c48a952baec05a9896c8d92b7693e","author":"Kelly...}

        chage --list command parser

        Linux support for the chage --list command. (Documentation)

        $ chage --list joeuser | jc --chage -p
        {
          "password_last_changed": "never",
          "password_expires": "never",
          "password_inactive": "never",
          "account_expires": "never",
          "min_days_between_password_change": 0,
          "max_days_between_password_change": 99999,
          "warning_days_before_password_expires": 7
        }

        Happy parsing!

        Featured

        A New Way to Parse Plain Text Tables

Every so often there are questions on sysadmin forums about how to parse and filter data from plain text tables. For example:

        +----+-----------------------+--------------------------------+---------+
        | id | name                  | url                            | version |
        +----+-----------------------+--------------------------------+---------+
        | 25 | example.com           | http://www.example.com/        | 3.8     |
        | 34 | anotherexample.com    | https://anotherexample.com/    | 3.2     |
        | 62 | yetanotherexample.com | https://yetanotherexample.com/ | 3.9     |
        +----+-----------------------+--------------------------------+---------+

Traditionally you would use tools like grep, sed, and/or awk to grab the data you want from a table like this. Now there is a new way, with jc! As of version 1.18.6, jc can convert single-line and multi-line ASCII and Unicode tables to JSON with the asciitable and asciitable-m parsers. This allows you to use JSON filters like jq or jello to filter the data and use it in your Bash scripts or other applications.

        Here’s how to use the new parsers:

        $ echo '
        > +----+-----------------------+--------------------------------+---------+
        > | id | name                  | url                            | version |
        > +----+-----------------------+--------------------------------+---------+
        > | 25 | example.com           | http://www.example.com/        | 3.8     |
        > | 34 | anotherexample.com    | https://anotherexample.com/    | 3.2     |
        > | 62 | yetanotherexample.com | https://yetanotherexample.com/ | 3.9     |
        > +----+-----------------------+--------------------------------+---------+
        > ' | jc --asciitable -p
        [
          {
            "id": "25",
            "name": "example.com",
            "url": "http://www.example.com/",
            "version": "3.8"
          },
          {
            "id": "34",
            "name": "anotherexample.com",
            "url": "https://anotherexample.com/",
            "version": "3.2"
          },
          {
            "id": "62",
            "name": "yetanotherexample.com",
            "url": "https://yetanotherexample.com/",
            "version": "3.9"
          }
        ]
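Once the table is JSON, filtering becomes a one-liner. Here’s a quick sketch (assuming the table above is saved as table.txt) that pulls the names of the sites running a version higher than 3.5:

$ cat table.txt | jc --asciitable | jq -r '.[] | select((.version | tonumber) > 3.5) | .name'
example.com
yetanotherexample.com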

        If there are multi-line rows, then be sure to use the asciitable-m parser:

        $ echo '
        > ╒══════════╤═════════╤════════╕
        > │ foo      │ bar     │ baz    │
        > │          │         │ buz    │
        > ╞══════════╪═════════╪════════╡
        > │ good day │ 12345   │        │
        > │ mate     │         │        │
        > ├──────────┼─────────┼────────┤
        > │ hi there │ abc def │ 3.14   │
        > │          │         │        │
        > ╘══════════╧═════════╧════════╛' | jc --asciitable-m -p
        [
          {
            "foo": "good day\nmate",
            "bar": "12345",
            "baz_buz": null
          },
          {
            "foo": "hi there",
            "bar": "abc def",
            "baz_buz": "3.14"
          }
        ]

        Many different table styles are supported, as long as there is a header row at the top of the table.

Of course, you can also use the parsers via the jc Python library:

        >>> import jc
        >>> table = '''
        ... Protocol  Address     Age (min)  Hardware Addr   Type   Interface
        ... Internet  10.12.13.1        98   0950.5785.5cd1  ARPA   FastEthernet2.13
        ... Internet  10.12.13.3       131   0150.7685.14d5  ARPA   GigabitEthernet2.13
        ... Internet  10.12.13.4       198   0950.5C8A.5c41  ARPA   GigabitEthernet2.17
        ... '''
        >>> jc.parse('asciitable', table)
        [{'protocol': 'Internet', 'address': '10.12.13.1', 'age_min': '98', 'hardware_addr': '0950.5785.5cd1', 'type': 'ARPA', 'interface': 'FastEthernet2.13'}, {'protocol': 'Internet', 'address': '10.12.13.3', 'age_min': '131', 'hardware_addr': '0150.7685.14d5', 'type': 'ARPA', 'interface': 'GigabitEthernet2.13'}, {'protocol': 'Internet', 'address': '10.12.13.4', 'age_min': '198', 'hardware_addr': '0950.5C8A.5c41', 'type': 'ARPA', 'interface': 'GigabitEthernet2.17'}]
        

        This can be used to parse the output of some commands that output plaintext tables. For example, the virsh command:

        # virsh list --all
         Id   Name          State
        ------------------------------
         3    rh8-vm01      running
         -    crc           shut off
         -    rh8-tower01   shut off
        #
# virsh list --all | jc --asciitable -p
        [
          {
            "id": "3",
            "name": "rh8-vm01",
            "state": "running"
          },
          {
            "id": "-",
            "name": "crc",
            "state": "shut off"
          },
          {
            "id": "-",
            "name": "rh8-tower01",
            "state": "shut off"
          }
        ]

        Here’s how you can do the above in an Ansible playbook using the jc community.general plugin:

        - name: Get virsh state
          hosts: ubuntu
          tasks:
          - shell: virsh list --all
            register: result
          - set_fact:
              virsh_data: "{{ result.stdout | community.general.jc('asciitable') }}"
          - debug:
              msg: "The virsh state is: {{ virsh_data[0].state }}"

        For more information on jc, check out my post on Bringing the UNIX Philosophy to the 21st Century. See these posts for tips on how to use JSON in your Bash scripts.

        Happy parsing!

        Featured

        JC Version 1.18.1 Released

I’m excited to announce the release of jc version 1.18.1, available on GitHub and PyPI. This release includes several enhancements when using jc as a Python library. Enhancements include some higher-level APIs and improved documentation to simplify the use of jc in Python programs and scripts. Error message improvements have been made on the CLI as well.

        jc can be installed via pip or through several official OS package repositories, including Debian, Ubuntu, Fedora, openSUSE, Arch Linux, NixOS Linux, Guix System Linux, FreeBSD, and macOS. For more information on how to get jc, see the project README.

        To upgrade with pip:

        $ pip3 install --upgrade jc

          New Features

          • New high-level parse API that works for both builtin and custom plugin parsers
          >>> import jc
          >>> jc.parse('date', 'Thu Jan 27 11:40:00 PST 2022')
          {'year': 2022, 'month': 'Jan', 'month_num': 1, 'day': 27, 'weekday': 'Thu', 'weekday_num': 4, 'hour': 11, 'hour_24': 11, 'minute': 40, 'second': 0, 'period': 'AM', 'timezone': 'PST', 'utc_offset': None, 'day_of_year': 27, 'week_of_year': 4, 'iso': '2022-01-27T11:40:00', 'epoch': 1643312400, 'epoch_utc': None, 'timezone_aware': False}
          • Several other high-level functions in jc.lib that allow you to gather detailed parser information:
            • parser_mod_list() -> list
            • plugin_parser_mod_list() -> list
            • get_help(parser_module_name: str) -> None
          • Enhanced CLI error messages for certain OS errors that can happen when using the “magic syntax” (file permission errors, etc.)

          v1.18.2 Updates

          • Enhanced documentation for public functions, including type annotations
          • Additional high-level convenience functions:
            • parser_info(parser_module_name: str) -> dict
            • all_parser_info() -> list[dict]
          • Enhanced CLI error message to suggest setting locale to C when parsing errors occur
          • Bug fix for plugin parsers with underscore(s) in the name

          v1.18.3 Updates

• Add rsync command and log file parser tested on Linux and macOS
• Add rsync command and log file streaming parser tested on Linux and macOS
• Add xrandr command parser tested on Linux
          • Enhance timestamp performance with caching and format hints
          • Refactor ignore_exceptions functionality in streaming parsers
          • Fix man page in packages

          rsync command parser

          Linux and macOS support for the rsync command. (Documentation):

          $ rsync -i -a source/ dest | jc --rsync -p          # or  jc -p rsync -i -a source/ dest
          [
            {
              "summary": {
                "sent": 1708,
                "received": 8209,
                "bytes_sec": 19834.0,
                "total_size": 235,
                "speedup": 0.02
              },
              "files": [
                {
                  "filename": "./",
                  "metadata": ".d..t......",
                  "update_type": "not updated",
                  "file_type": "directory",
                  "checksum_or_value_different": false,
                  "size_different": false,
                  "modification_time_different": true,
                  "permissions_different": false,
                  "owner_different": false,
                  "group_different": false,
                  "acl_different": false,
                  "extended_attribute_different": false
                },
                ...
              ]
            }
          ]

          rsync command streaming parser

          Linux support for the rsync command. This is a streaming parser and it outputs JSON Lines. (Documentation):

          $ rsync -i -a source/ dest | jc --rsync-s
          {"type":"file","filename":"./","metadata":".d..t......","update_...}
          ...

          xrandr command parser

Linux support for the xrandr command. (Documentation):

          $ xrandr | jc --xrandr -p          # or  jc -p xrandr
          {
            "screens": [
              {
                "screen_number": 0,
                "minimum_width": 8,
                "minimum_height": 8,
                "current_width": 1920,
                "current_height": 1080,
                "maximum_width": 32767,
                "maximum_height": 32767,
                "associated_device": {
                  "associated_modes": [
                    {
                      "resolution_width": 1920,
                      "resolution_height": 1080,
                      "is_high_resolution": false,
                      "frequencies": [
                        {
                          "frequency": 60.03,
                          "is_current": true,
                          "is_preferred": true
                        },
                        {
                          "frequency": 59.93,
                          "is_current": false,
                          "is_preferred": false
                        }
                      ]
                    },
                    {
                      "resolution_width": 1680,
                      "resolution_height": 1050,
                      "is_high_resolution": false,
                      "frequencies": [
                        {
                          "frequency": 59.88,
                          "is_current": false,
                          "is_preferred": false
                        }
                      ]
                    }
                  ],
                  "is_connected": true,
                  "is_primary": true,
                  "device_name": "eDP1",
                  "resolution_width": 1920,
                  "resolution_height": 1080,
                  "offset_width": 0,
                  "offset_height": 0,
                  "dimension_width": 310,
                  "dimension_height": 170
                }
              }
            ],
            "unassociated_devices": []
          }

          v1.18.4 Updates

• Add nmcli command parser tested on Linux
• Enhance parse error messages at the CLI
• Add standard and streaming parser list functions to the public API
• Enhance Python developer documentation formatting

          nmcli command parser

          Linux support for the nmcli command. (Documentation):

          $ nmcli connection show ens33 | jc --nmcli -p          # or  jc -p nmcli connection show ens33
          [
            {
              "connection_id": "ens33",
              "connection_uuid": "d92ece08-9e02-47d5-b2d2-92c80e155744",
              "connection_stable_id": null,
              "connection_type": "802-3-ethernet",
              "connection_interface_name": "ens33",
              "connection_autoconnect": "yes",
              ...
              "ip4_address_1": "192.168.71.180/24",
              "ip4_gateway": "192.168.71.2",
              "ip4_route_1": {
                "dst": "0.0.0.0/0",
                "nh": "192.168.71.2",
                "mt": 100
              },
              "ip4_route_2": {
                "dst": "192.168.71.0/24",
                "nh": "0.0.0.0",
                "mt": 100
              },
              "ip4_dns_1": "192.168.71.2",
              "ip4_domain_1": "localdomain",
              "dhcp4_option_1": {
                "name": "broadcast_address",
                "value": "192.168.71.255"
              },
              ...
              "ip6_address_1": "fe80::c1cb:715d:bc3e:b8a0/64",
              "ip6_gateway": null,
              "ip6_route_1": {
                "dst": "fe80::/64",
                "nh": "::",
                "mt": 100
              }
            }
          ]

          v1.18.5 Updates

          • Fix date parser to ensure AM/PM period string is always uppercase. Fixes broken tests in some locales

          v1.18.6 Updates

• Add pidstat command parser tested on Linux
• Add pidstat command streaming parser tested on Linux
• Add mpstat command parser tested on Linux
• Add mpstat command streaming parser tested on Linux
          • Add single-line ASCII and Unicode table parser
          • Add multi-line ASCII and Unicode table parser
          • Add a documentation option to parser_info() and all_parser_info()

          pidstat command parser

          Linux support for the pidstat command. (Documentation):

          $ pidstat -hl | jc --pidstat -p          # or  jc -p pidstat -hl
          [
            {
              "time": 1646859134,
              "uid": 0,
              "pid": 1,
              "percent_usr": 0.0,
              "percent_system": 0.03,
              "percent_guest": 0.0,
              "percent_cpu": 0.03,
              "cpu": 0,
              "command": "/usr/lib/systemd/systemd --switched-root --system..."
            },
            {
              "time": 1646859134,
              "uid": 0,
              "pid": 6,
              "percent_usr": 0.0,
              "percent_system": 0.0,
              "percent_guest": 0.0,
              "percent_cpu": 0.0,
              "cpu": 0,
              "command": "ksoftirqd/0"
            },
            {
              "time": 1646859134,
              "uid": 0,
              "pid": 2263,
              "percent_usr": 0.0,
              "percent_system": 0.0,
              "percent_guest": 0.0,
              "percent_cpu": 0.0,
              "cpu": 0,
              "command": "kworker/0:0"
            }
          ]

          pidstat command streaming parser

          Linux support for the pidstat command. This is a streaming parser and it outputs JSON Lines. (Documentation):

          $ pidstat -hl | jc --pidstat-s
          {"time":1646859134,"uid":0,"pid":1,"percent_usr":0.0,"percent_syste...}
          {"time":1646859134,"uid":0,"pid":6,"percent_usr":0.0,"percent_syste...}
          {"time":1646859134,"uid":0,"pid":9,"percent_usr":0.0,"percent_syste...}
          ...

          asciitable ASCII and Unicode table parser

          Supports parsing various styles of plain text tables. (Documentation):

          $ echo '
          >     ╒══════════╤═════════╤════════╕
          >     │ foo      │ bar     │ baz    │
          >     ╞══════════╪═════════╪════════╡
          >     │ good day │         │ 12345  │
          >     ├──────────┼─────────┼────────┤
          >     │ hi there │ abc def │ 3.14   │
          >     ╘══════════╧═════════╧════════╛' | jc --asciitable -p
          [
            {
              "foo": "good day",
              "bar": null,
              "baz": "12345"
            },
            {
              "foo": "hi there",
              "bar": "abc def",
              "baz": "3.14"
            }
          ]

          asciitable-m multi-line ASCII and Unicode table parser

          Supports parsing various styles of plain text tables with multi-line rows. (Documentation):

          $ echo '
          > +----------+---------+--------+
          > | foo      | bar     | baz    |
          > |          |         | buz    |
          > +==========+=========+========+
          > | good day | 12345   |        |
          > | mate     |         |        |
          > +----------+---------+--------+
          > | hi there | abc def | 3.14   |
          > |          |         |        |
          > +==========+=========+========+' | jc --asciitable-m -p
          [
            {
              "foo": "good day\nmate",
              "bar": "12345",
              "baz_buz": null
            },
            {
              "foo": "hi there",
              "bar": "abc def",
              "baz_buz": "3.14"
            }
          ]

          v1.18.7 Updates

• Add git-log command parser tested on Linux and macOS
• Add update-alternatives --query command parser tested on Linux
• Add update-alternatives --get-selections command parser tested on Linux
• Fix key/value and INI parsers to allow duplicate keys
• Fix YAML file parser for files including timestamp objects
• Update xrandr parser: add a rotation field
• Fix failing tests by moving template files
• Add Python interpreter version and path to -v and -a output

          git-log command parser

          Linux support for the git log command. (Documentation):

$ git log --stat | jc --git-log -p          # or:  jc -p git log --stat
          [
            {
              "commit": "728d882ed007b3c8b785018874a0eb06e1143b66",
              "author": "Kelly Brazil",
              "author_email": "kellyjonbrazil@gmail.com",
              "date": "Wed Apr 20 09:50:19 2022 -0400",
              "stats": {
                "files_changed": 2,
                "insertions": 90,
                "deletions": 12,
                "files": [
                  "docs/parsers/git_log.md",
                  "jc/parsers/git_log.py"
                ]
              },
              "message": "add timestamp docs and examples",
              "epoch": 1650462619,
              "epoch_utc": null
            },
            {
              "commit": "b53e42aca623181aa9bc72194e6eeef1e9a3a237",
              "author": "Kelly Brazil",
              "author_email": "kellyjonbrazil@gmail.com",
              "date": "Wed Apr 20 09:44:42 2022 -0400",
              "stats": {
                "files_changed": 5,
                "insertions": 29,
                "deletions": 6,
                "files": [
                  "docs/parsers/git_log.md",
                  "docs/utils.md",
                  "jc/parsers/git_log.py",
                  "jc/utils.py",
                  "man/jc.1"
                ]
              },
              "message": "add calculated timestamp",
              "epoch": 1650462282,
              "epoch_utc": null
            }
          ]

          update-alternatives --query command parser

          Linux support for the update-alternatives --query command. (Documentation):

          $ update-alternatives --query editor | jc --update-alt-q -p          # or:  jc -p update-alternatives --query editor
          {
            "name": "editor",
            "link": "/usr/bin/editor",
            "slaves": [
              {
                "name": "editor.1.gz",
                "path": "/usr/share/man/man1/editor.1.gz"
              },
              {
                "name": "editor.da.1.gz",
                "path": "/usr/share/man/da/man1/editor.1.gz"
              }
            ],
            "status": "auto",
            "best": "/bin/nano",
            "value": "/bin/nano",
            "alternatives": [
              {
                "name": "/bin/ed",
                "priority": -100,
                "slaves": [
                  {
                    "name": "editor.1.gz",
                    "path": "/usr/share/man/man1/ed.1.gz"
                  }
                ]
              },
              {
                "name": "/bin/nano",
                "priority": 40,
                "slaves": [
                  {
                    "name": "editor.1.gz",
                    "path": "/usr/share/man/man1/nano.1.gz"
                  }
                ]
              }
            ]
          }

          update-alternatives --get-selections command parser

          Linux support for the update-alternatives --get-selections command. (Documentation):

          $ update-alternatives --get-selections | jc --update-alt-gs -p          # or:  jc -p update-alternatives --get-selections
          [
            {
              "name": "arptables",
              "status": "auto",
              "current": "/usr/sbin/arptables-nft"
            },
            {
              "name": "awk",
              "status": "auto",
              "current": "/usr/bin/gawk"
            }
          ]

          v1.18.8 Updates

          • Fix update-alternatives --query parser for cases where slaves are not present
          • Fix UnicodeEncodeError on some systems where LANG=C is set and Unicode characters are in the output
          • Update history command parser: do not drop non-ASCII characters if the system is configured for UTF-8 encoding
          • Enhance “magic syntax” to always use UTF-8 encoding

Featured

          Tips on Adding JSON Output to Your CLI App

A couple of years ago I wrote a somewhat controversial article on the topic of Bringing the Unix Philosophy to the 21st Century by adding a JSON output option to CLI tools. This allows easier parsing in scripts by using JSON parsing tools like jq, jello, jp, etc. without arcane awk, sed, cut, tr, rev, etc. incantations.

          It was controversial because there seem to be a lot of folks who don’t think writing bespoke parsers for each task is a big deal. Others think JSON is evil. There are strong feelings as can be seen in response to the article in the comments and also on Hacker News and Reddit.

          I’ll let the next generation of DevOps practitioners and developers come to their own conclusions on the basis of our arguments, but the tide is already turning. Something that was just wishful thinking a couple years ago is now becoming a reality! Now, more and more command line applications are offering JSON output as an option. And with jc, JSON output can even be coaxed out of older command line applications.

          Structured Output Support is Increasing

Many new command line applications now offer structured output as an option, and even some older ones are adding it. I find that, more and more often, when a parser is requested for jc there is already a JSON or XML output option in the application’s man page. Some examples include nvidia-smi, ffprobe, the docker CLI, and tree. Even ip now supports JSON output with ip route, which wasn’t supported when I originally wrote about it in the article.
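For example, modern versions of iproute2 accept a -j flag, so the routing table can go straight to jq (the device name below is just illustrative):

$ ip -j route | jq -r '.[0].dev'
ens33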

          I recently developed standard and streaming parsers for the iostat command and found that versions 11 and above now have a JSON output option. Way to go!

          But when looking at the JSON options for some of these commands, I found some things that could be improved.

          JSON Output Do’s and Don’ts

          While developing over 80 parsers for the jc project, I stumbled upon some best practices. My first goal was to make getting the data easy when using jq, as that was the only CLI JSON processing tool I really used at the time. With that initial goal, and input from scores of users, this is how I try to make the highest quality JSON output:

          Note: Many of these are completely subjective and are just my humble opinion. I’m willing to keep an open mind about these choices.

          • Do Make a Schema
          • Do Flatten the Structure
          • Do Output JSON Lines for Streaming Output
          • Do Use Predictable Key Names
          • Do Pretty Print with Two Spaces or Don’t Format at All
          • Don’t Use Special Characters in Key Names
          • Don’t Allow Duplicate Keys
          • Don’t Use Very Large Numbers
          • BONUS

          Let’s take a look at these in more detail.

          Do

          Here are some good practices when generating JSON output:

          Make a Schema

When possible, which is almost always the case, I document a schema for the JSON output. This lets the user know where they can always find an attribute and which type to expect (string, integer, float, boolean, null, object, or array). It also allows you to test the output to make sure it conforms to the schema and there are no bugs.

          A schema doesn’t have to be complicated. It can be specified in the documentation, including the man page. I use this simple structure for jc documentation:

          [
            {
              "foo":      string,
              "bar":      float,   # null if blank
              "baz": [
                          integer
              ]
            }
          ]

          Flatten the Structure

The best case is to output an object or an array of objects (most common) with no further nesting. Sometimes you can prefix an attribute name if nesting is not absolutely necessary. The idea is to make it as easy as possible for the user to grab the value so they don’t need to traverse the data structure to get what they want.

          Sometimes this:

          [
            {
              "cpu": {
                "speed": 5,
                "temp": 33.2
              },
              "ram": {
                "speed": 11,
                "mb": 1024
              }
            }
          ]

          Can be turned into this:

          [
            {
              "cpu_speed": 5,
              "cpu_temp": 33.2,
              "ram_speed": 11,
              "ram_mb": 1024
            }
          ]

          This way I can easily filter the data in jq or other tools without having to traverse levels. Of course, this is not always possible or desirable, but keeping a flat structure should be considered for user convenience.
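To see the difference, here are the equivalent jq lookups against the two structures above (assuming they are saved as nested.json and flat.json):

$ jq '.[0].cpu.speed' nested.json
5

$ jq '.[0].cpu_speed' flat.json
5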

This approach is also great for output that contains a long list of items. I’ll pick on iostat a bit here to make a point, but don’t take this the wrong way: I’m thrilled that the author of iostat has included a JSON output option and in no way want to discount the work put into that.

          The iostat JSON output option deeply nests the output like so:

          {
            "sysstat": {
              "hosts": [
                {
                  "nodename": "ubuntu",
                  "sysname": "Linux",
                  "release": "5.8.0-53-generic",
                  "machine": "x86_64",
                  "number-of-cpus": 2,
                  "date": "12/03/2021",
                  "statistics": [
                    {
                      "avg-cpu": {
                        "user": 0.6,
                        "nice": 0.01,
                        "system": 1.68,
                        "iowait": 0.14,
                        "steal": 0,
                        "idle": 97.58
                      },
                      "disk": [
                        {
                          "disk_device": "dm-0",
                          "tps": 29.07,
                          "kB_read/s": 502.25,
                          "kB_wrtn/s": 54.94,
                          "kB_dscd/s": 0,
                          "kB_read": 251601,
                          "kB_wrtn": 27524,
                          "kB_dscd": 0
                        },
          ...

This makes sense and is very logical when you look at the output as an entire JSON document. But commands like iostat, vmstat, ping, ls, etc. can produce huge (even unlimited) amounts of output, so it can make more sense to build the JSON into a structure that is more easily consumed in a pipeline by tools like jq.

With this structure, the whole document needs to be loaded before the JSON is considered valid and searchable, even though iostat output can go on indefinitely depending on how the command is run.

          I took a different approach with the jc iostat parser by using a flat array of objects and simply using a type attribute to denote which type of object it is. This allows very easy filtering in jq or other tools and also allows consistency with the streaming parser, which I’ll get to in another section.

          Here’s the jc version:

          [
            {
              "percent_user": 0.31,
              "percent_nice": 0.23,
              "percent_system": 0.48,
              "percent_iowait": 0.04,
              "percent_steal": 0.0,
              "percent_idle": 98.95,
              "type": "cpu"
            },
            {
              "device": "dm-0",
              "tps": 8.16,
              "kb_read_s": 137.26,
              "kb_wrtn_s": 129.0,
              "kb_dscd_s": 0.0,
              "kb_read": 395021,
              "kb_wrtn": 371240,
              "kb_dscd": 0,
              "type": "device"
            },
            {
              "device": "loop0",
              "tps": 0.01,
              "kb_read_s": 0.12,
              "kb_wrtn_s": 0.0,
              "kb_dscd_s": 0.0,
              "kb_read": 344,
              "kb_wrtn": 0,
              "kb_dscd": 0,
              "type": "device"
            },
          ...
          ]

You’ll notice that jc doesn’t bother with metadata about the source that generated the output, or even the host statistics. Including the source just makes the object nesting deeper without adding value, and the header information is available from other tools like uname and date. (I could add these fields in a future parser version as an object with its own type if users want that data.)

          Getting to the data in this structure is pretty easy: just loop over the array, filter by type (if needed), and pull attributes from the top-level of each object.
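For example, here’s a sketch that selects only the device objects and grabs their throughput stats with jq:

$ iostat | jc --iostat | jq '.[] | select(.type == "device") | {device, tps}'
{
  "device": "dm-0",
  "tps": 8.16
}
{
  "device": "loop0",
  "tps": 0.01
}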

          Output JSON Lines for Streaming Output

          There’s another advantage to the array of flat objects structure discussed above, and that’s for programs like iostat and others that can stream output forever until the user hits <ctrl-c>. In this case, it would be difficult to pipe the output to a JSON filter, like jq, since the output would not be valid JSON until the program ends.

          For these cases, outputting JSON Lines (aka NDJSON) is a good choice.

          Unfortunately, this is what the iostat output looks like when running it indefinitely:

          $ iostat 1 -o JSON
          {"sysstat": {
            "hosts": [
              {
                "nodename": "ubuntu",
                "sysname": "Linux",
                "release": "5.8.0-53-generic",
                "machine": "x86_64",
                "number-of-cpus": 2,
                "date": "12/03/2021",
                "statistics": [
                  {
                    "avg-cpu":  {"user": 1.23, "nice": 0.86, "system": 1.23, "iowait": 0.06, "steal": 0.00, "idle": 96.62},
                    "disk": [
                      {"disk_device": "dm-0", "tps": 30.16, "kB_read/s": 138.78, "kB_wrtn/s": 476.19, "kB_dscd/s": 0.00, "kB_read": 654975, "kB_wrtn": 2247452, "kB_dscd": 0},
                      {"disk_device": "sr0", "tps": 0.13, "kB_read/s": 4.89, "kB_wrtn/s": 0.00, "kB_dscd/s": 0.00, "kB_read": 23067, "kB_wrtn": 0, "kB_dscd": 0}
                    ]
                  },
                  {
                    "avg-cpu":  {"user": 0.00, "nice": 0.00, "system": 0.00, "iowait": 0.00, "steal": 0.00, "idle": 100.00},
                    "disk": [
                      {"disk_device": "dm-0", "tps": 0.00, "kB_read/s": 0.00, "kB_wrtn/s": 0.00, "kB_dscd/s": 0.00, "kB_read": 0, "kB_wrtn": 0, "kB_dscd": 0},
                      {"disk_device": "sr0", "tps": 0.00, "kB_read/s": 0.00, "kB_wrtn/s": 0.00, "kB_dscd/s": 0.00, "kB_read": 0, "kB_wrtn": 0, "kB_dscd": 0}
                    ]
                  },
                  {
                    "avg-cpu":  {"user": 0.00, "nice": 0.00, "system": 0.50, "iowait": 0.00, "steal": 0.00, "idle": 99.50},
                    "disk": [
                      {"disk_device": "dm-0", "tps": 5.00, "kB_read/s": 0.00, "kB_wrtn/s": 20.00, "kB_dscd/s": 0.00, "kB_read": 0, "kB_wrtn": 20, "kB_dscd": 0},
                      {"disk_device": "sr0", "tps": 0.00, "kB_read/s": 0.00, "kB_wrtn/s": 0.00, "kB_dscd/s": 0.00, "kB_read": 0, "kB_wrtn": 0, "kB_dscd": 0}
                    ]
                  }
          ...

          This is not easily parsable downstream when used in a pipeline:

          $ iostat 1 -o JSON | jq
          ^C     # hangs forever until <ctrl-c> is entered and no JSON is filtered

The author of iostat did do a cool thing, though: the output is correctly wrapped with the closing brackets when the <ctrl-c> sequence is caught. So it does eventually become a valid JSON document, but I’m not sure all developers will have the forethought to do this. And it still does not solve the pipelining problem.

Instead, the streaming iostat parser in jc outputs JSON Lines with the same schema as the standard parser. Basically, the only difference is that there are no beginning and ending array brackets and each object is compact-printed on its own line. This allows JSON processors like jq to work on each line immediately as it is emitted:

          $ iostat 1 | jc --iostat-s -u | jq -c
          {"percent_user":1.11,"percent_nice":0.78,"percent_system":1.12,"percent_iowait":0.05,"percent_steal":0.0,"percent_idle":96.94,"type":"cpu"}
          {"device":"dm-0","tps":27.4,"kb_read_s":125.07,"kb_wrtn_s":430.11,"kb_dscd_s":0.0,"kb_read":654987,"kb_wrtn":2252376,"kb_dscd":0,"type":"device"}
          {"device":"loop0","tps":0.02,"kb_read_s":0.16,"kb_wrtn_s":0.0,"kb_dscd_s":0.0,"kb_read":862,"kb_wrtn":0,"kb_dscd":0,"type":"device"}
          {"percent_user":2.53,"percent_nice":0.0,"percent_system":1.52,"percent_iowait":0.0,"percent_steal":0.0,"percent_idle":95.96,"type":"cpu"}
          {"device":"dm-0","tps":19.0,"kb_read_s":0.0,"kb_wrtn_s":76.0,"kb_dscd_s":0.0,"kb_read":0,"kb_wrtn":76,"kb_dscd":0,"type":"device"}
          {"device":"loop0","tps":0.0,"kb_read_s":0.0,"kb_wrtn_s":0.0,"kb_dscd_s":0.0,"kb_read":0,"kb_wrtn":0,"kb_dscd":0,"type":"device"}
          {"percent_user":1.01,"percent_nice":0.0,"percent_system":0.0,"percent_iowait":0.0,"percent_steal":0.0,"percent_idle":98.99,"type":"cpu"}
          {"device":"dm-0","tps":0.0,"kb_read_s":0.0,"kb_wrtn_s":0.0,"kb_dscd_s":0.0,"kb_read":0,"kb_wrtn":0,"kb_dscd":0,"type":"device"}
          {"device":"loop0","tps":0.0,"kb_read_s":0.0,"kb_wrtn_s":0.0,"kb_dscd_s":0.0,"kb_read":0,"kb_wrtn":0,"kb_dscd":0,"type":"device"}
          ...

          Tip: If you include a JSON Lines output option, you might also want to include an ‘unbuffer’ option.

When printing directly to the terminal, output is typically line-buffered, but when piping to other programs it is block-buffered, usually with a buffer of around 4KB. If the emitted JSON is small, it will look like the terminal is hung. This is why jc offers the -u, or ‘unbuffer’, option like many other filtering tools do.

Note that there may be a performance impact to disabling the buffer, so it should only be disabled while troubleshooting the pipeline in the terminal.
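If you are writing your own CLI app in Python, honoring an unbuffer option can be as simple as flushing after each emitted line. A minimal sketch (the emit helper and unbuffer flag are hypothetical names):

import json
import sys

def emit(obj, unbuffer=False):
    # compact JSON Lines output; flush per line when the user requests unbuffered output
    sys.stdout.write(json.dumps(obj, separators=(',', ':')) + '\n')
    if unbuffer:
        sys.stdout.flush()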

          Use Predictable Key Names

This one basically comes down to “don’t dynamically generate key names”. If key names aren’t static and predictable, it is difficult to document a good schema and difficult for users to find the data.

          Instead of doing something like this:

          {
            "Interface 1": [
              "192.168.1.1",
              "172.16.1.1"
            ],
            "Wifi Interface 1": [
              "10.1.1.1"
            ]
          }

          Do this:

          [
            {
              "interface": "Interface 1",
              "ip_addresses": [
                "192.168.1.1",
                "172.16.1.1"
              ]
            },
            {
              "interface": "Wifi Interface 1",
              "ip_addresses": [
                "10.1.1.1"
              ]
            }
          ]

          This is a self-documenting structure and the user can simply iterate over all of the objects to get the interface names and IP addresses they want. They can still look up a specific interface by name, but it’s not as straightforward, and dynamic keys don’t allow you to have a nicely documented Schema.
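
          For example, assuming the structure above is saved as interfaces.json (a hypothetical filename), a one-line jq query pulls the addresses for a given interface with no bracket gymnastics:

          $ jq -r '.[] | select(.interface == "Wifi Interface 1") | .ip_addresses[]' interfaces.json
          10.1.1.1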

          Pretty Print with Two Spaces or Don’t Format at All

          Higher-level languages like Python make it very easy to format JSON output, so I typically see the problem of ugly JSON formatting in programs written in C:

          (Screenshot: iostat JSON output formatting is not optimized for terminal line wrapping.)

          What is going on here? Actually, I can see what the developer was doing: it looks quite nice outside of the terminal when pasted into a text editor, but inside the terminal the line wrapping makes it nearly unreadable.

          I like the look of two-space indentation with JSON – maybe because that’s the way jq formats it and I’m just used to it.

          There’s really no need to format JSON output at all. If it makes your code simpler, just generate the JSON with no newlines or spaces. This is more compact and the user can just as easily pipe the output through jq or other tools to format it.
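
          For example, even fully compacted JSON is only one jq invocation away from a readable view:

          $ echo '{"interface":"eth0","ip_addresses":["192.168.1.1"]}' | jq
          {
            "interface": "eth0",
            "ip_addresses": [
              "192.168.1.1"
            ]
          }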

          If you do choose to format the JSON, then take a cue from jq and use two spaces of indent and don’t coalesce brackets. Like so:

          $ iostat -o JSON | jq
          {
            "sysstat": {
              "hosts": [
                {
                  "nodename": "ubuntu",
                  "sysname": "Linux",
                  "release": "5.8.0-53-generic",
                  "machine": "x86_64",
                  "number-of-cpus": 2,
                  "date": "12/03/2021",
                  "statistics": [
                    {
                      "avg-cpu": {
                        "user": 0.6,
                        "nice": 0.01,
                        "system": 1.68,
                        "iowait": 0.14,
                        "steal": 0,
                        "idle": 97.58
                      },
                      "disk": [
                        {
                          "disk_device": "dm-0",
                          "tps": 29.07,
                          "kB_read/s": 502.25,
                          "kB_wrtn/s": 54.94,
                          "kB_dscd/s": 0,
                          "kB_read": 251601,
                          "kB_wrtn": 27524,
                          "kB_dscd": 0
                        },
                        <SNIP>
                        {
                          "disk_device": "sr0",
                          "tps": 0.19,
                          "kB_read/s": 6.27,
                          "kB_wrtn/s": 0,
                          "kB_dscd/s": 0,
                          "kB_read": 3139,
                          "kB_wrtn": 0,
                          "kB_dscd": 0
                        }
                      ]
                    }
                  ]
                }
              ]
            }
          }

          Beggars can’t be choosers, so I’ll take ugly JSON over no JSON any day. But again, compact JSON with no spaces and newlines is perfectly fine. Anyone working with JSON knows to use jq or other tools to make it easy to read in the terminal.

          Don’t

          Try to avoid these JSON smells:

          Don’t Use Special Characters in Key Names

          There’s nothing more annoying than having to encapsulate an attribute name in brackets because it has special characters or spaces in it.

          $ echo '{"Foo/ foo": "bar"}' | jq '.Foo/ foo'
          jq: error: foo/0 is not defined at <top-level>, line 1:
          .Foo/ foo      
          jq: 1 compile error
          
          $ echo '{"Foo/ foo": "bar"}' | jq '.["Foo/ foo"]'
          "bar"

          Don’t make your users do that! This can also be a consequence of dynamically generating your keys, as discussed in a section above. Instead, keep key names lower-case and convert special characters and spaces to underscores (‘_’) so keys contain only alphanumeric characters and underscores.

          Underscores are better than dashes because they allow you to select the entire key with a double-click in most IDEs and text editors. Dashes will typically only select a section of the key name.
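
          And if you’re stuck consuming JSON that already has unfriendly key names, you can normalize them downstream. Here is a rough jq sketch that lower-cases keys and converts runs of other characters to underscores:

          $ echo '{"Foo/ foo": "bar"}' | jq 'with_entries(.key |= (ascii_downcase | gsub("[^a-z0-9]+"; "_")))'
          {
            "foo_foo": "bar"
          }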

          Don’t Allow Duplicate Keys

          If you are dynamically generating key names, it may be possible for duplicates to appear in an object. If there is a possibility of this, wrap those items in individual objects. The behavior of duplicate keys is undefined in JSON and may differ depending on the client.
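
          For example, jq happens to keep the last occurrence and silently drops the rest; other clients may keep the first, or error out:

          $ echo '{"port": 80, "port": 443}' | jq
          {
            "port": 443
          }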

          Don’t Use Extremely Large Numbers

          JSON has nice typing, but unfortunately the numeric data type is underspecified in the standard and may behave differently in different clients. Many parsers convert numbers to IEEE-754 double-precision floats, which cannot exactly represent every large integer. This can bite you if you output a long UUID as a number: the UUID may not turn out to be the same on all clients! If you have a very large number, it’s probably best to just wrap it in a string so it doesn’t get mangled downstream.
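
          Here’s a quick demonstration. The mangled output below is what jq 1.6 produces, since it converts numbers to doubles; other versions and clients may behave differently, but the string-wrapped value survives everywhere:

          $ echo '{"uuid": 12345678901234567890}' | jq
          {
            "uuid": 12345678901234567000
          }

          $ echo '{"uuid": "12345678901234567890"}' | jq
          {
            "uuid": "12345678901234567890"
          }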

          Don’t Use XML

          Just joking! Any standard structured output is better than plain text in many cases, and sometimes (but not often) XML is a better choice than JSON. I prefer JSON for its readability, its support ecosystem, and its support for maps, arrays, and a small set of useful types. After developing JSON schemas for over 80 CLI parsers I’ve found that there’s not much JSON can’t do for this type of output.

          In Conclusion

          Always think of the end-user and how they will interact with the data. By following these steps, you can keep users from having to jump through extra hoops to get to the data they want:

          • Make a Schema
          • Flatten the Structure
          • Output JSON Lines for Streaming Output
          • Use Predictable Key Names
          • Pretty Print with Two Spaces or Don’t Format at All
          • Don’t Use Special Characters in Key Names
          • Don’t Allow Duplicate Keys
          • Don’t Use Extremely Large Numbers

          This is clearly not an exhaustive list. Did I miss any of your pet peeves? Let me know in the comments!

          Featured

          JC Version 1.17.0 Released

          See below for v1.17.x updates

          I’m excited to announce the release of jc version 1.17.0 available on github and pypi. This release includes streaming parser support, including three new streaming parsers (ls-s, ping-s, and vmstat-s) and one new standard parser (vmstat), bringing the total number of parsers to 78.

          The streaming parsers output JSON Lines (aka NDJSON), which can be ingested by streaming processors like jq, elastic, Splunk, etc. These parsers use significantly less memory while converting large amounts of output (e.g. ls -lR /), and in some cases can be faster than standard parsers. Just like standard parsers, streaming parsers can be used both at the CLI and as Python libraries. When used as Python libraries, parse() is a generator function and returns an iterator which can be used in a loop for lazy processing of the stream.
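
          For example, at the CLI you can stream a huge recursive listing through the ls-s parser and filter it as it is produced, without holding the whole listing in memory. This is just a sketch using the ls-s schema shown later in this post, with jq reading each JSON line as it is emitted:

          $ ls -lR / | jc --ls-s | jq -r 'select(.size > 100000) | .filename'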

          The -u CLI option has been added to unbuffer the output. This is useful when piping jc output to another process like jq and the input stream is very slow (e.g. ping output). With the unbuffer option enabled you will be able to see the JSON output immediately when using streaming parsers in this scenario instead of waiting for the buffer to be filled.

          Streaming parsers also have an ignore_exceptions option (-qq on the CLI) to allow uninterrupted processing in case any unexpected parsing errors occur. This can be used for long-lived streams so the pipe will not be broken if there is a hiccup in the stream. When this option is used, a _jc_meta object with a success attribute is added to each emitted JSON object. This allows the downstream application to decide whether to ignore the unparsable lines or further process those lines.

          jc can be installed via pip or through several official OS package repositories, including Debian, Ubuntu, Fedora, openSUSE, Arch Linux, NixOS Linux, Guix System Linux, FreeBSD, and macOS. For more information on how to get jc, click here.

          To upgrade with pip:

          $ pip3 install --upgrade jc

            New Features

            • Warning and Error messages now wrap to the terminal width.
            • Support for streaming parsers for much lower memory consumption when converting large amounts of command output.
            • -u CLI option unbuffers jc output. This is useful when converting slow output like ping through the ping-s streaming parser.
            • -qq CLI option makes jc “extra quiet” for streaming parsers. This equates to the ignore_exceptions argument in the streaming parser’s parse() function when using jc as a Python library.

              When using this option, the streaming parser will not stop when parsing errors are encountered. Instead, a _jc_meta object included in the JSON output will have success set to false and the error and line attributes will be set to the error message and original line contents, respectively. Here are examples of the additional _jc_meta object:

            Successfully parsed line with -qq option:

            {
              "foo": "data1",
              "bar": "data2",
              "baz": "data3",
              "_jc_meta": {
                "success": true
              }
            }

            Unsuccessfully parsed line with -qq option:

            {
              "_jc_meta": {
                "success": false,
                "error": "error message",
                "line": "original line data"
              }
            }

            New Parsers

            jc now supports 78 parsers. New parsers include vmstat and three streaming parsers: ls-s, ping-s, and vmstat-s.

            Streaming parsers are considered Beta quality. Even though the streaming parsers have gone through extensive testing, I would like to get more feedback from users before considering them 1.0. Please try them out and provide any feedback as a github issue.

            Also, feel free to open a github issue if you have recommendations for other streaming parsers. Currently I’m thinking about adding streaming parsers for CSV, YAML, and XML documents in a future release.

            Documentation and schemas for all parsers can be found here.

            vmstat command parser

            Linux support for the vmstat command. (Documentation):

            $ vmstat | jc --vmstat -p          # or jc -p vmstat
            [
              {
                "runnable_procs": 2,
                "uninterruptible_sleeping_procs": 0,
                "virtual_mem_used": 0,
                "free_mem": 2794468,
                "buffer_mem": 2108,
                "cache_mem": 741208,
                "inactive_mem": null,
                "active_mem": null,
                "swap_in": 0,
                "swap_out": 0,
                "blocks_in": 1,
                "blocks_out": 3,
                "interrupts": 29,
                "context_switches": 57,
                "user_time": 0,
                "system_time": 0,
                "idle_time": 99,
                "io_wait_time": 0,
                "stolen_time": 0,
                "timestamp": null,
                "timezone": null
              }
            ]

            vmstat-s streaming command parser

            Linux support for the vmstat command. This is a streaming parser and it outputs JSON Lines. (Documentation):

            $ vmstat | jc --vmstat-s
            {"runnable_procs":2,"uninterruptible_sleeping_procs":...timestamp":null,"timezone":null}

            ls-s streaming command parser

            Linux, macOS, and BSD support for the ls command. This is a streaming parser and it outputs JSON Lines. (Documentation):

            $ ls -l /usr/bin | jc --ls-s
            {"filename":"2to3-","flags":"-rwxr-xr-x","links":4,"owner":"root","group":"wheel","size":925,"date":"Feb 22 2019"}
            {"filename":"2to3-2.7","link_to":"../../System/Library/Frameworks/Python.framework/Versions/2.7/bin/2to3-2.7","flags":"lrwxr-xr-x","links":1,"owner":"root","group":"wheel","size":74,"date":"May 4 2019"}
            {"filename":"AssetCacheLocatorUtil","flags":"-rwxr-xr-x","links":1,"owner":"root","group":"wheel","size":55152,"date":"May 3 2019"}
            ...

            ping-s streaming command parser

            Linux, macOS, and BSD support for the ping and ping6 commands. This is a streaming parser and it outputs JSON Lines. (Documentation):

            $ ping 1.1.1.1 | jc --ping-s
            {"type":"reply","destination_ip":"1.1.1.1","sent_bytes":56,"pattern":null,"response_bytes":64,"response_ip":"1.1.1.1","icmp_seq":0,"ttl":56,"time_ms":23.703}
            {"type":"reply","destination_ip":"1.1.1.1","sent_bytes":56,"pattern":null,"response_bytes":64,"response_ip":"1.1.1.1","icmp_seq":1,"ttl":56,"time_ms":22.862}
            {"type":"reply","destination_ip":"1.1.1.1","sent_bytes":56,"pattern":null,"response_bytes":64,"response_ip":"1.1.1.1","icmp_seq":2,"ttl":56,"time_ms":22.82}
            ...

            Updated Parsers

            • No updated parsers in this release

            Schema Changes

            • No schema changes in this release

            Happy parsing!

            For more information on the motivations for creating jc, see my blog post.

            v1.17.1 Updates

            • Fix file parser for gzip files
            • Fix uname parser for cases where the ‘processor’ and/or ‘hardware_platform’ fields are missing on linux
            • Fix uname parser on FreeBSD
            • Add lsusb parser tested on linux
            • Add CSV file streaming parser
            • Add testing for Python 3.10.0

            lsusb command parser

            Linux support for the lsusb command. (Documentation):

            $ lsusb -v | jc --lsusb -p          # or: jc -p lsusb -v
            [
              {
                "bus": "002",
                "device": "001",
                "id": "1d6b:0001",
                "description": "Linux Foundation 1.1 root hub",
                "device_descriptor": {
                  "bLength": {
                    "value": "18"
                  },
                  "bDescriptorType": {
                    "value": "1"
                  },
                  "bcdUSB": {
                    "value": "1.10"
                  },
                  ...
                  "bNumConfigurations": {
                    "value": "1"
                  },
                  "configuration_descriptor": {
                    "bLength": {
                      "value": "9"
                    },
                    ...
                    "iConfiguration": {
                      "value": "0"
                    },
                    "bmAttributes": {
                      "value": "0xe0",
                      "attributes": [
                        "Self Powered",
                        "Remote Wakeup"
                      ]
                    },
                    "MaxPower": {
                      "description": "0mA"
                    },
                    "interface_descriptors": [
                      {
                        "bLength": {
                          "value": "9"
                        },
                        ...
                        "bInterfaceProtocol": {
                          "value": "0",
                          "description": "Full speed (or root) hub"
                        },
                        "iInterface": {
                          "value": "0"
                        },
                        "endpoint_descriptors": [
                          {
                            "bLength": {
                              "value": "7"
                            },
                            ...
                            "bmAttributes": {
                              "value": "3",
                              "attributes": [
                                "Transfer Type  Interrupt",
                                "Synch Type  None",
                                "Usage Type  Data"
                              ]
                            },
                            "wMaxPacketSize": {
                              "value": "0x0002",
                              "description": "1x 2 bytes"
                            },
                            "bInterval": {
                              "value": "255"
                            }
                          }
                        ]
                      }
                    ]
                  }
                },
                "hub_descriptor": {
                  "bLength": {
                    "value": "9"
                  },
                  ...
                  "wHubCharacteristic": {
                    "value": "0x000a",
                    "attributes": [
                      "No power switching (usb 1.0)",
                      "Per-port overcurrent protection"
                    ]
                  },
                  ...
                  "hub_port_status": {
                    "Port 1": {
                      "value": "0000.0103",
                      "attributes": [
                        "power",
                        "enable",
                        "connect"
                      ]
                    },
                    "Port 2": {
                      "value": "0000.0103",
                      "attributes": [
                        "power",
                        "enable",
                        "connect"
                      ]
                    }
                  }
                },
                "device_status": {
                  "value": "0x0001",
                  "description": "Self Powered"
                }
              }
            ]

            csv-s streaming command parser

            Support for CSV files. This is a streaming parser and it outputs JSON Lines. (Documentation):

            $ cat homes.csv
            "Sell", "List", "Living", "Rooms", "Beds", "Baths", "Age", "Acres", "Taxes"
            142, 160, 28, 10, 5, 3,  60, 0.28,  3167
            175, 180, 18,  8, 4, 1,  12, 0.43,  4033
            129, 132, 13,  6, 3, 1,  41, 0.33,  1471
            ...
            
            $ cat homes.csv | jc --csv-s
            {"Sell":"142","List":"160","Living":"28","Rooms":"10","Beds":"5","Baths":"3","Age":"60","Acres":"0.28","Taxes":"3167"}
            {"Sell":"175","List":"180","Living":"18","Rooms":"8","Beds":"4","Baths":"1","Age":"12","Acres":"0.43","Taxes":"4033"}
            {"Sell":"129","List":"132","Living":"13","Rooms":"6","Beds":"3","Baths":"1","Age":"41","Acres":"0.33","Taxes":"1471"}
            ...

            v1.17.2 Updates

            • Fix ping parser to add Alpine linux support
            • Fix netstat parser for older versions of netstat on linux
            • Fix df parser for cases where the ‘filesystem’ field overflows the column length

            v1.17.3 Updates

            • Update parsers to exit with error if non-string input is detected (raise TypeError)
            • Update streaming parsers to exit with error if non-iterable input is detected (raise TypeError)
            • Simplify quiet-checking in parsers
            • Add iostat parser tested on linux
            • Add iostat streaming parser tested on linux

            iostat command parser

            Linux support for the iostat command. (Documentation):

            $ iostat | jc --iostat          # or: jc -p iostat
            [
              {
                  "percent_user": 0.15,
                  "percent_nice": 0.0,
                  "percent_system": 0.18,
                  "percent_iowait": 0.0,
                  "percent_steal": 0.0,
                  "percent_idle": 99.67,
                  "type": "cpu"
              },
              {
                  "device": "sda",
                  "tps": 0.29,
                  "kb_read_s": 7.22,
                  "kb_wrtn_s": 1.25,
                  "kb_read": 194341,
                  "kb_wrtn": 33590,
                  "type": "device"
              },
              {
                  "device": "dm-0",
                  "tps": 0.29,
                  "kb_read_s": 5.99,
                  "kb_wrtn_s": 1.17,
                  "kb_read": 161361,
                  "kb_wrtn": 31522,
                  "type": "device"
              },
              {
                  "device": "dm-1",
                  "tps": 0.0,
                  "kb_read_s": 0.08,
                  "kb_wrtn_s": 0.0,
                  "kb_read": 2204,
                  "kb_wrtn": 0,
                  "type": "device"
              }
            ]

            iostat-s streaming command parser

            Linux support for the iostat command. This is a streaming parser and it outputs JSON Lines. (Documentation):

            $ iostat | jc --iostat-s
            {"percent_user":0.14,"percent_nice":0.0,"percent_system":0.16,"percent_iowait":0.0,"percent_steal":0.0,"percent_idle":99.7,"type":"cpu"}
            {"device":"sda","tps":0.24,"kb_read_s":5.28,"kb_wrtn_s":1.1,"kb_read":203305,"kb_wrtn":42368,"type":"device"}
            ...

            v1.17.4 Updates

            • Add support for the NO_COLOR environment variable to set mono (http://no-color.org/)
            • Add -C option to force color output even when using pipes (overrides -m and NO_COLOR)
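
            For example (a quick sketch; the id parser is just an arbitrary choice here):

            $ NO_COLOR=1 jc -p id          # monochrome output
            $ jc -C -p id | less -R        # keep colors when piping through a pager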

            v1.17.5 Updates

            • Add zipinfo parser tested on linux and macOS

            zipinfo command parser

            Linux and macOS support for the zipinfo command. (Documentation):

            $ zipinfo file.zip | jc --zipinfo -p
            [
              {
                "archive": "file.zip",
                "size": 4116,
                "size_unit": "bytes",
                "number_entries": 1,
                "number_files": 1,
                "bytes_uncompressed": 11837,
                "bytes_compressed": 3966,
                "percent_compressed": 66.5,
                "files": [
                  {
                    "flags": "-rw-r--r--",
                    "zipversion": "2.1",
                    "zipunder": "unx",
                    "filesize": 11837,
                    "type": "bX",
                    "method": "defN",
                    "date": "21-Dec-08",
                    "time": "20:50",
                    "filename": "compressed_file"
                  }
                ]
              }
            ]

            v1.17.6 Updates

            • Add jar-manifest file parser for MANIFEST.MF files.
            • Fix CSV parsers for some files that include double-quotes

            jar-manifest file parser

            Support for Java JAR Manifest files. (Documentation):

            $ cat MANIFEST.MF | jc --jar-manifest -p
            [
              {
                "Import_Package": "com.conversantmedia.util.concurrent;resolution:=optional,com.fasterxml.jackson.annotation;version=\"[2.12,3)\";resolution:=optional,com.fasterxml.jackson.core;version=\"[2.12,3)\";resolution:=optional,com.fasterxml.jackson.core.type;version=\"[2.12,3)\";resolution:=optional,com.fasterxml.jackson.cor...",
                "Export_Package": "org.apache.logging.log4j.core;uses:=\"org.apache.logging.log4j,org.apache.logging.log4j.core.config,org.apache.logging.log4j.core.impl,org.apache.logging.log4j.core.layout,org.apache.logging.log4j.core.time,org.apache.logging.log4j.message,org.apache.logging.log4j.spi,org.apache.logging.log4j.status...",
                "Manifest_Version": "1.0",
                "Bundle_License": "https://www.apache.org/licenses/LICENSE-2.0.txt",
                "Bundle_SymbolicName": "org.apache.logging.log4j.core",
                "Built_By": "matt",
                "Bnd_LastModified": "1639373735804",
                "Implementation_Vendor_Id": "org.apache.logging.log4j",
                "Specification_Title": "Apache Log4j Core",
                "Log4jReleaseManager": "Matt Sicker",
                ...
              }
            ]

            v1.17.7 Updates

            • Add stat-s streaming parser for the stat command.

            stat-s streaming command parser

            Linux, macOS, and FreeBSD support for the stat command. This is a streaming parser and it outputs JSON Lines. (Documentation):

            $ stat | jc --stat-s
            {"file":"(stdin)","unix_device":1027739696,"inode":1155,"flags":"crw--w----","links":1,"user":"kbrazil","group":"tty","rdev":268435456,"size":0,"access_time":"Jan  4 15:27:44 2022","modify_time":"Jan  4 15:27:44 2022","change_time":"Jan  4 15:27:44 2022","birth_time":"Dec 31 16:00:00 1969","block_size":131072,"blocks":0,"unix_flags":"0","access_time_epoch":1641338864,"access_time_epoch_utc":null,"modify_time_epoch":1641338864,"modify_time_epoch_utc":null,"change_time_epoch":1641338864,"change_time_epoch_utc":null,"birth_time_epoch":null,"birth_time_epoch_utc":null}

            Featured

            Practical JSON at the Command Line (using Jello)

            This is a new version of my existing article: Practical JSON at the Command Line. In this version I have substituted jello where jq was used in the previous article.

            I’m a big fan of using JSON at the command line instead of filtering and piping unstructured text between processes. My article on Bringing the Unix Philosophy to the 21st Century explains many of the benefits of using JSON instead of plain text. I also created jc, which converts the output of dozens of commands and file-types to JSON, opening up many new possibilities for automation at the command line.

            There are many blog posts on how to use tools like jq to filter JSON at the command line. But I would like to write about how you can actually use that JSON to make your life easier in Bash using jello, a JSON filtering tool I wrote that uses pure Python syntax.

            How do you get that beautifully filtered JSON data into a usable form, such as a list or array, in Bash? What are some best practices when working with JSON data in Bash? Let’s start simple and work our way up.

            In this article we will be processing the output of rpm -qia so we can get a nice list of RPM package metadata objects to play around with. We’ll use jc to convert the rpm command output to JSON so we can process it in jello and then use it in our script.

            We’ll look at three scenarios:

            • Assigning a Bash variable from a single JSON attribute
            • Assigning a simple list Bash variable from a JSON array
            • Assigning a Bash array from a JSON array of objects

            Assigning a Variable from a Single Attribute

            The simplest scenario is to pull a single value from the JSON data we are interested in. If we run rpm -qia | jc --rpm-qi we will get a JSON array of rpm metadata objects to work with. I’ll use the -p option in jc to pretty-print the JSON:

            $ rpm -qia | jc --rpm-qi -p
            [
              {
                "name": "make",
                "epoch": 1,
                "version": "3.82",
                "release": "24.el7",
                "architecture": "x86_64",
                "install_date": "Wed 16 Oct 2019 09:21:42 AM PDT",
                "group": "Development/Tools",
                "size": 1160660,
                "license": "GPLv2+",
                "signature": "RSA/SHA256, Thu 22 Aug 2019 02:34:59 PM PDT, Key ID 24c6a8a7f4a80eb5",
                "source_rpm": "make-3.82-24.el7.src.rpm",
                "build_date": "Thu 08 Aug 2019 05:47:25 PM PDT",
                "build_host": "x86-01.bsys.centos.org",
                "relocations": "(not relocatable)",
                "packager": "CentOS BuildSystem <http://bugs.centos.org>",
                "vendor": "CentOS",
                "url": "http://www.gnu.org/software/make/",
                "summary": "A GNU tool which simplifies the build process for users",
                "description": "A GNU tool for controlling the generation of executables and other non-source files of a program from the program's source files. Make allows users to build and install packages without any significant knowledge about the details of the build process. The details about how the program should be built are provided for make in the program's makefile.",
                "build_epoch": 1565311645,
                "build_epoch_utc": null
              },
              {
                "name": "kbd-legacy",
                "version": "1.15.5",
                "release": "15.el7",
                "architecture": "noarch",
                "install_date": "Thu 15 Aug 2019 10:53:08 AM PDT",
                "group": "System Environment/Base",
                "size": 503608,
                "license": "GPLv2+",
                "signature": "RSA/SHA256, Mon 12 Nov 2018 07:17:49 AM PST, Key ID 24c6a8a7f4a80eb5",
                "source_rpm": "kbd-1.15.5-15.el7.src.rpm",
                "build_date": "Tue 30 Oct 2018 03:40:00 PM PDT",
                "build_host": "x86-01.bsys.centos.org",
                "relocations": "(not relocatable)",
                "packager": "CentOS BuildSystem <http://bugs.centos.org>",
                "vendor": "CentOS",
                "url": "http://ftp.altlinux.org/pub/people/legion/kbd",
                "summary": "Legacy data for kbd package",
                "description": "The kbd-legacy package contains original keymaps for kbd package. Please note that kbd-legacy is not helpful without kbd.",
                "build_epoch": 1540939200,
                "build_epoch_utc": null
              },
              ...
            ]

            Ok, that is a long JSON array of objects. Let’s narrow it down to only packages that use the MIT license with jello:

            $ rpm -qia | jc --rpm-qi | jello '[p for p in _ if p.license == "MIT"]'
            [
              {
                "name": "ncurses-base",
                "version": "5.9",
                "release": "14.20130511.el7_4",
                "architecture": "noarch",
                "install_date": "Thu 15 Aug 2019 10:53:08 AM PDT",
                "group": "System Environment/Base",
                "size": 223432,
                "license": "MIT",
                "signature": "RSA/SHA256, Thu 07 Sep 2017 05:43:15 AM PDT, Key ID 24c6a8a7f4a80eb5",
                "source_rpm": "ncurses-5.9-14.20130511.el7_4.src.rpm",
                "build_date": "Wed 06 Sep 2017 03:08:29 PM PDT",
                "build_host": "c1bm.rdu2.centos.org",
                "relocations": "(not relocatable)",
                "packager": "CentOS BuildSystem <http://bugs.centos.org>",
                "vendor": "CentOS",
                "url": "http://invisible-island.net/ncurses/ncurses.html",
                "summary": "Descriptions of common terminals",
                "description": "This package contains descriptions of common terminals. Other terminal descriptions are included in the ncurses-term package.",
                "build_epoch": 1504735709,
                "build_epoch_utc": null
              },
              {
                "name": "ncurses-libs",
                "version": "5.9",
                "release": "14.20130511.el7_4",
                "architecture": "x86_64",
                "install_date": "Thu 15 Aug 2019 10:53:16 AM PDT",
                "group": "System Environment/Libraries",
                "size": 1028216,
                "license": "MIT",
                "signature": "RSA/SHA256, Thu 07 Sep 2017 05:43:31 AM PDT, Key ID 24c6a8a7f4a80eb5",
                "source_rpm": "ncurses-5.9-14.20130511.el7_4.src.rpm",
                "build_date": "Wed 06 Sep 2017 03:08:29 PM PDT",
                "build_host": "c1bm.rdu2.centos.org",
                "relocations": "(not relocatable)",
                "packager": "CentOS BuildSystem <http://bugs.centos.org>",
                "vendor": "CentOS",
                "url": "http://invisible-island.net/ncurses/ncurses.html",
                "summary": "Ncurses libraries",
                "description": "The curses library routines are a terminal-independent method of updating character screens with reasonable optimization.  The ncurses (new curses) library is a freely distributable replacement for the discontinued 4.4 BSD classic curses library. This package contains the ncurses libraries.",
                "build_epoch": 1504735709,
                "build_epoch_utc": null
              },
            ...
            ]

            Tip: You can use jellex to help you rapidly create your jello queries

            Now the list is much smaller. In this form, this is not exactly usable in a Bash script. We’ll need to get this data into a format that Bash can use.

            In this first, simple example, we just want a single attribute from a single object. So let’s do that by filtering on the newest build_epoch date and selecting the name field:

            $ rpm -qia | jc --rpm-qi | jello -r 'sorted([p for p in _ if p.license == "MIT"], key=lambda x: x.build_epoch)[-1]["name"]'
            jc

            Well, isn’t that convenient? jc was the last package built on the system. Notice that we use the -r option in jello to strip the quotation marks from the string result. Since that jello query spit out a single word, it’s pretty straightforward to assign it to a Bash variable:

            $ package_name=$(rpm -qia | jc --rpm-qi | jello -r 'sorted([p for p in _ if p.license == "MIT"], key=lambda x: x.build_epoch)[-1]["name"]')
            $ echo $package_name
            jc
            

            This is a good start if we just need a single attribute, but many times in our scripts we have multiple items we need to deal with. Assigning a single Bash variable to a JSON attribute can get tedious and slow if we need to iterate over a large dataset.

            Now, let’s look at assigning more than one item to a Bash variable to use it as a list in a for loop.

            Assigning a List from a JSON Array

            In our next example, we’ll get a list of MIT-licensed packages from our rpm -qia query and do something with the output. In this case, we’ll just create a text file for each package, using the name attribute as the filename; the contents will have some text, including the package name. First, let’s see the output of the jello filter:

            $ rpm -qia | jc --rpm-qi | jello -lr '[p.name for p in _ if p.license == "MIT"]'
            curl
            dbus-python
            expat
            jansson
            ...

            And now, let’s use that filter in a script by assigning it to a Bash variable that will act as a word list:

            #!/bin/bash
            
            packages=$(rpm -qia | jc --rpm-qi | jello -lr '[p.name for p in _ if p.license == "MIT"]')
            
            for package in $packages; do
                echo "Package name is ${package}" > "${package}".txt
            done
            

            After running this script, we get a list of files named after the package names. Inside of the files is a bit of text:

            $ ls
            create_files.sh  jc.txt                libcom_err.txt   libpciaccess.txt    libyaml.txt       popt.txt
            curl.txt         json-c.txt            libcurl.txt      libss.txt           lua.txt           python-iniparse.txt
            dbus-python.txt  krb5-devel.txt        libdrm.txt       libverto-devel.txt  ncurses-base.txt  python-pytoml.txt
            expat.txt        krb5-libs.txt         libfastjson.txt  libverto.txt        ncurses-libs.txt  PyYAML.txt
            jansson.txt      libcom_err-devel.txt  libkadm5.txt     libxml2.txt         ncurses.txt       rubygem-psych.txt
            $ cat jc.txt 
            Package name is jc
            

            That was easy enough, but remember this only works when each item is a single word (Bash splits the unquoted variable on whitespace) and you just want to iterate over the same JSON attribute over and over again in a Bash for loop.
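
            To see why, here’s a quick illustration of how Bash word-splits an unquoted variable; any value containing spaces gets broken apart, and embedded quotes are not re-parsed:

            $ items='one two "three four"'
            $ for item in $items; do echo "[${item}]"; done
            [one]
            [two]
            ["three]
            [four"]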

            What if I want to include other metadata, like the description, in the text file? One way would be to create another list Bash variable from another jello query and then iterate over the list again. Or, inside the for loop, we could do another rpm -qi query and grab the attribute we want just-in-time:

            #!/bin/bash
            
            packages=$(rpm -qia | jc --rpm-qi | jello -lr '[p.name for p in _ if p.license == "MIT"]')
            
            for package in $packages; do
                description=$(rpm -qi "${package}" | jc --rpm-qi | jello -r _[0].description)
                echo "Package name is ${package}" > "${package}".txt
                echo "The description is:  ${description}" >> "${package}".txt
            done
            

            This works:

            $ ./create_files.sh 
            $ ls
            create_files.sh  jc.txt                libcom_err.txt   libpciaccess.txt    libyaml.txt       popt.txt
            curl.txt         json-c.txt            libcurl.txt      libss.txt           lua.txt           python-iniparse.txt
            dbus-python.txt  krb5-devel.txt        libdrm.txt       libverto-devel.txt  ncurses-base.txt  python-pytoml.txt
            expat.txt        krb5-libs.txt         libfastjson.txt  libverto.txt        ncurses-libs.txt  PyYAML.txt
            jansson.txt      libcom_err-devel.txt  libkadm5.txt     libxml2.txt         ncurses.txt       rubygem-psych.txt
            $ cat jc.txt 
            Package name is jc
            The description is:  This tool serializes the output of popular gnu linux command line tools and file types to structured JSON output

            But it is a little inefficient since we need to run the rpm -qi [package] query many times during the script. A better method would be to do the rpm -qia query one time, which will give us all of the package data at once and then just select the attributes we want in our script. We’ll do that next!

            Assigning a Bash Array from a JSON Array of Objects

            In other programming languages, like Python, it is pretty straightforward to load a JSON string of any depth and complexity and use it as a dictionary or list. Unfortunately, Bash does not have the same native capability, but we can do some useful things by assigning JSON objects to a Bash array.

            At first glance, this seems like it should be pretty easy with a single variable assignment statement, but in fact, we’ll need to use a while loop and read lines from jello so Bash can ingest the JSON Lines data into the Bash array. This way we can easily iterate through the data, similar to how we would in Python.

            In this example, we’ll take the filtered JSON output of the rpm -qia command, iterate over all of the objects (each object is a package) and pull the attributes we want to use in a for loop. This should be a more efficient version of the last script we created since we are only running the rpm -qia command once. First, let’s just iterate and print the raw Bash array elements so we can see what it looks like:

            #!/bin/bash
            
            # pull the rpm package objects into a bash array from jello
            packages=()
            while read -r value; do
                packages+=("$value")
            done < <(rpm -qia | jc --rpm-qi | jello -l '[p for p in _ if p.license == "MIT"]')
            
            # iterate over the bash array
            for package in "${packages[@]}"; do
                echo "${package}"
                echo
            done

            There are a few interesting things going on in this script:

            • A Bash array variable named packages is created with packages=()
            • A while loop reads in all of the JSON objects created by jello into the packages Bash array.
              • Note: mapfile -t packages < <( ... ) can be substituted for the while loop when using Bash 4.0 and higher. (See the one-liner after this list.)
            • The jello command uses the -l option, which prints each JSON object on a single line (a.k.a. JSON Lines). This is the magic that allows each object to be read in as a Bash array element.
            • Then we use a standard for loop to iterate over each package object, which contains all of the attributes we want to extract into variables.
            • Finally, we do something with those variables.
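
            For reference, here is the mapfile one-liner mentioned in the note above; it’s a drop-in replacement for the while loop when Bash 4.0+ is available:

            mapfile -t packages < <(rpm -qia | jc --rpm-qi | jello -l '[p for p in _ if p.license == "MIT"]')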

            When we run this script, we see the following output:

            $ ./print_array.sh 
            {"name":"ncurses-base","version":"5.9","release":"14.20130511.el7_4","architecture":"noarch","install_date":"Thu 15 Aug 2019 10:53:08 AM PDT","group":"System Environment/Base","size":223432,"license":"MIT","signature":"RSA/SHA256, Thu 07 Sep 2017 05:43:15 AM PDT, Key ID 24c6a8a7f4a80eb5","source_rpm":"ncurses-5.9-14.20130511.el7_4.src.rpm","build_date":"Wed 06 Sep 2017 03:08:29 PM PDT","build_host":"c1bm.rdu2.centos.org","relocations":"(not relocatable)","packager":"CentOS BuildSystem <http://bugs.centos.org>","vendor":"CentOS","url":"http://invisible-island.net/ncurses/ncurses.html","summary":"Descriptions of common terminals","description":"This package contains descriptions of common terminals. Other terminal descriptions are included in the ncurses-term package.","build_epoch":1504735709,"build_epoch_utc":null}
            
            {"name":"ncurses-libs","version":"5.9","release":"14.20130511.el7_4","architecture":"x86_64","install_date":"Thu 15 Aug 2019 10:53:16 AM PDT","group":"System Environment/Libraries","size":1028216,"license":"MIT","signature":"RSA/SHA256, Thu 07 Sep 2017 05:43:31 AM PDT, Key ID 24c6a8a7f4a80eb5","source_rpm":"ncurses-5.9-14.20130511.el7_4.src.rpm","build_date":"Wed 06 Sep 2017 03:08:29 PM PDT","build_host":"c1bm.rdu2.centos.org","relocations":"(not relocatable)","packager":"CentOS BuildSystem <http://bugs.centos.org>","vendor":"CentOS","url":"http://invisible-island.net/ncurses/ncurses.html","summary":"Ncurses libraries","description":"The curses library routines are a terminal-independent method of updating character screens with reasonable optimization.  The ncurses (new curses) library is a freely distributable replacement for the discontinued 4.4 BSD classic curses library. This package contains the ncurses libraries.","build_epoch":1504735709,"build_epoch_utc":null}
            ...

            Very cool! Now we can use jello to pull any attribute we want into a variable within the for loop:

            #!/bin/bash
            
            # pull the rpm package objects into a bash array from jello
            packages=()
            while read -r value; do
                packages+=("$value")
            done < <(rpm -qia | jc --rpm-qi | jello -l '[p for p in _ if p.license == "MIT"]')
            
            # iterate over the bash array
            for package in "${packages[@]}"; do
                name=$(jello -r '_.name' <<< "${package}")
                description=$(jello -r '_.description' <<< "${package}")
                version=$(jello -r '_.version' <<< "${package}")
                
                echo "Package name is ${name}" > "${name}".txt
                echo "The description is:  ${description}" >> "${name}".txt
                echo "The version is:  ${version}" >> "${name}".txt
            done

            And here’s what it does:

            $ ./create_files.sh 
            $ ls
            create_files.sh  jc.txt                libcom_err.txt   libpciaccess.txt    libyaml.txt       popt.txt
            curl.txt         json-c.txt            libcurl.txt      libss.txt           lua.txt           python-iniparse.txt
            dbus-python.txt  krb5-devel.txt        libdrm.txt       libverto-devel.txt  ncurses-base.txt  python-pytoml.txt
            expat.txt        krb5-libs.txt         libfastjson.txt  libverto.txt        ncurses-libs.txt  PyYAML.txt
            jansson.txt      libcom_err-devel.txt  libkadm5.txt     libxml2.txt         ncurses.txt       rubygem-psych.txt
            $ cat jc.txt 
            Package name is jc
            The description is:  This tool serializes the output of popular gnu linux command line tools and file types to structured JSON output
            The version is:  1.15.0
            

            As you can see, this is more efficient and allows you to pull in any attribute you would like from each Bash array element. Each element is acting like a JSON object that jello can query.

            Conclusion

            We went through a few scenarios of how to assign JSON data to Bash variables and arrays with jc and jello. Using JSON instead of plain text allows you to be more expressive in your queries. Also, JSON has the advantage of allowing new fields to be added at any time without breaking your existing query.

            JSON can be used by simply assigning a string word to a Bash variable, a string list of words to a variable and looping over the list, or by assigning entire JSON objects to Bash array elements, which can be further queried by jello within a loop. These are powerful ways JSON data can help you write better scripts.

            If you like jello, you should check out Jello Explorer (jellex). Jello Explorer is an interactive TUI JSON filter built on jello that can help you create queries more quickly and easily.

            Featured

            Practical JSON at the Command Line

            Prefer Python syntax over jq? Please see a new version of this article that uses jello instead.

            I’m a big fan of using JSON at the command line instead of filtering and piping unstructured text between processes. My article on Bringing the Unix Philosophy to the 21st Century explains many of the benefits of using JSON instead of plain text. I also created jc, which converts the output of dozens of commands and file-types to JSON, opening up many new possibilities for automation at the command line.

            There are many blog posts on how to use tools like jq to filter JSON at the command line. But I would like to write about how you can actually use that JSON to make your life easier in Bash.

            How do you get that beautifully filtered JSON data into a usable form, such as a list or array, in Bash? What are some best practices when working with JSON data in Bash? Let’s start simple and work our way up.

            In this article we will be processing the output of rpm -qia so we can get a nice list of RPM package metadata objects to play around with. We’ll use jc to convert the rpm command output to JSON so we can process it in jq and then use it in our script.

            We’ll look at three scenarios:

            • Assigning a Bash variable from a single JSON attribute
            • Assigning a simple list Bash variable from a JSON array
            • Assigning a Bash array from a JSON array of objects

            Assigning a Variable from a Single Attribute

            The simplest scenario is to pull a single value from the JSON data we are interested in. If we run rpm -qia | jc --rpm-qi we will get a JSON array of rpm metadata objects to work with. I’ll use the -p option in jc to pretty-print the JSON:

            $ rpm -qia | jc --rpm-qi -p
            [
              {
                "name": "make",
                "epoch": 1,
                "version": "3.82",
                "release": "24.el7",
                "architecture": "x86_64",
                "install_date": "Wed 16 Oct 2019 09:21:42 AM PDT",
                "group": "Development/Tools",
                "size": 1160660,
                "license": "GPLv2+",
                "signature": "RSA/SHA256, Thu 22 Aug 2019 02:34:59 PM PDT, Key ID 24c6a8a7f4a80eb5",
                "source_rpm": "make-3.82-24.el7.src.rpm",
                "build_date": "Thu 08 Aug 2019 05:47:25 PM PDT",
                "build_host": "x86-01.bsys.centos.org",
                "relocations": "(not relocatable)",
                "packager": "CentOS BuildSystem <http://bugs.centos.org>",
                "vendor": "CentOS",
                "url": "http://www.gnu.org/software/make/",
                "summary": "A GNU tool which simplifies the build process for users",
                "description": "A GNU tool for controlling the generation of executables and other non-source files of a program from the program's source files. Make allows users to build and install packages without any significant knowledge about the details of the build process. The details about how the program should be built are provided for make in the program's makefile.",
                "build_epoch": 1565311645,
                "build_epoch_utc": null
              },
              {
                "name": "kbd-legacy",
                "version": "1.15.5",
                "release": "15.el7",
                "architecture": "noarch",
                "install_date": "Thu 15 Aug 2019 10:53:08 AM PDT",
                "group": "System Environment/Base",
                "size": 503608,
                "license": "GPLv2+",
                "signature": "RSA/SHA256, Mon 12 Nov 2018 07:17:49 AM PST, Key ID 24c6a8a7f4a80eb5",
                "source_rpm": "kbd-1.15.5-15.el7.src.rpm",
                "build_date": "Tue 30 Oct 2018 03:40:00 PM PDT",
                "build_host": "x86-01.bsys.centos.org",
                "relocations": "(not relocatable)",
                "packager": "CentOS BuildSystem <http://bugs.centos.org>",
                "vendor": "CentOS",
                "url": "http://ftp.altlinux.org/pub/people/legion/kbd",
                "summary": "Legacy data for kbd package",
                "description": "The kbd-legacy package contains original keymaps for kbd package. Please note that kbd-legacy is not helpful without kbd.",
                "build_epoch": 1540939200,
                "build_epoch_utc": null
              },
              ...
            ]

            Ok, that is a long JSON array of objects. Let’s narrow it down to only packages that use the MIT license with jq:

            $ rpm -qia | jc --rpm-qi | jq '.[] | select(.license == "MIT")'
            {
              "name": "ncurses-base",
              "version": "5.9",
              "release": "14.20130511.el7_4",
              "architecture": "noarch",
              "install_date": "Thu 15 Aug 2019 10:53:08 AM PDT",
              "group": "System Environment/Base",
              "size": 223432,
              "license": "MIT",
              "signature": "RSA/SHA256, Thu 07 Sep 2017 05:43:15 AM PDT, Key ID 24c6a8a7f4a80eb5",
              "source_rpm": "ncurses-5.9-14.20130511.el7_4.src.rpm",
              "build_date": "Wed 06 Sep 2017 03:08:29 PM PDT",
              "build_host": "c1bm.rdu2.centos.org",
              "relocations": "(not relocatable)",
              "packager": "CentOS BuildSystem <http://bugs.centos.org>",
              "vendor": "CentOS",
              "url": "http://invisible-island.net/ncurses/ncurses.html",
              "summary": "Descriptions of common terminals",
              "description": "This package contains descriptions of common terminals. Other terminal descriptions are included in the ncurses-term package.",
              "build_epoch": 1504735709,
              "build_epoch_utc": null
            }
            {
              "name": "ncurses-libs",
              "version": "5.9",
              "release": "14.20130511.el7_4",
              "architecture": "x86_64",
              "install_date": "Thu 15 Aug 2019 10:53:16 AM PDT",
              "group": "System Environment/Libraries",
              "size": 1028216,
              "license": "MIT",
              "signature": "RSA/SHA256, Thu 07 Sep 2017 05:43:31 AM PDT, Key ID 24c6a8a7f4a80eb5",
              "source_rpm": "ncurses-5.9-14.20130511.el7_4.src.rpm",
              "build_date": "Wed 06 Sep 2017 03:08:29 PM PDT",
              "build_host": "c1bm.rdu2.centos.org",
              "relocations": "(not relocatable)",
              "packager": "CentOS BuildSystem <http://bugs.centos.org>",
              "vendor": "CentOS",
              "url": "http://invisible-island.net/ncurses/ncurses.html",
              "summary": "Ncurses libraries",
              "description": "The curses library routines are a terminal-independent method of updating character screens with reasonable optimization.  The ncurses (new curses) library is a freely distributable replacement for the discontinued 4.4 BSD classic curses library. This package contains the ncurses libraries.",
              "build_epoch": 1504735709,
              "build_epoch_utc": null
            }
            ...

            Now the list is much smaller. Also, notice that jq unpacked the JSON objects from the array for us. (There is no longer a set of square brackets around the output.) In this form, this is not exactly usable in a Bash script. In fact, this is no longer even a single valid JSON object, but a series of smaller JSON objects. We’ll need to get this data into a format that Bash can use.

            In this first, simple example, we just want a single attribute from a single object. So let’s do that by filtering on the newest build_epoch date and selecting the name field:

            $ rpm -qia | jc --rpm-qi | jq 'sort_by(.build_epoch)[] | select(.license == "MIT")' | jq -sr '.[-1].name'
            jc

            The particulars of the jq query itself are outside the scope of this article. For more information on how to properly structure a jq query, see here, here, and here.

            Not a fan of jq syntax? Already know how to work with JSON in Python? Try out jello, which works just like jq, but uses Python syntax!

            Well, isn’t that convenient? jc was the last package built on the system. Notice that we use the -r option in jq to strip the quotation marks from the string result. Since that jq query spit out a single word, it’s pretty straightforward to assign it to a Bash variable:

            $ package_name=$(rpm -qia | jc --rpm-qi | jq 'sort_by(.build_epoch)[] | select(.license == "MIT")' | jq -sr '.[-1].name')
            $ echo $package_name
            jc
            

            This is a good start if we just need a single attribute, but many times in our scripts we have multiple items we need to deal with. Assigning a single Bash variable to a JSON attribute can get tedious and slow if we need to iterate over a large dataset.

            Now, let’s look at assigning more than one item to a Bash variable to use it as a list in a for loop.

            Assigning a List from a JSON Array

            In our next example, we’ll get a list of MIT-licensed packages from our rpm -qia query and do something with the output. In this case, we’ll just create a text file for each package, using the name attribute as the filename; the contents will have some text, including the package name. First, let’s see the output of the jq filter:

            $ rpm -qia | jc --rpm-qi | jq -r '.[] | select(.license == "MIT") | .name'
            curl
            dbus-python
            expat
            jansson
            ...

            And now, let’s use that filter in a script by assigning it to a Bash variable that will act as a word list:

            #!/bin/bash
            
            packages=$(rpm -qia | jc --rpm-qi | jq -r '.[] | select(.license == "MIT") | .name')
            
            for package in $packages; do
                echo "Package name is ${package}" > "${package}".txt
            done
            

            After running this script, we get a list of files named after the package names. Inside of the files is a bit of text:

            $ ls
            create_files.sh  jc.txt                libcom_err.txt   libpciaccess.txt    libyaml.txt       popt.txt
            curl.txt         json-c.txt            libcurl.txt      libss.txt           lua.txt           python-iniparse.txt
            dbus-python.txt  krb5-devel.txt        libdrm.txt       libverto-devel.txt  ncurses-base.txt  python-pytoml.txt
            expat.txt        krb5-libs.txt         libfastjson.txt  libverto.txt        ncurses-libs.txt  PyYAML.txt
            jansson.txt      libcom_err-devel.txt  libkadm5.txt     libxml2.txt         ncurses.txt       rubygem-psych.txt
            $ cat jc.txt 
            Package name is jc
            

            That was easy enough, but remember this only works when each item is a single word and you only need one JSON attribute per iteration of a Bash for loop. Values that contain whitespace will be split into separate words, as the sketch below illustrates.
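
            Here is a hypothetical example of the pitfall, using the multi-word summary attribute instead of name:

            #!/bin/bash

            # each summary contains spaces, so the unquoted expansion splits it into words
            summaries=$(rpm -qia | jc --rpm-qi | jq -r '.[] | select(.license == "MIT") | .summary')

            for summary in $summaries; do
                echo "${summary}"    # prints one word per line, not one summary per line
            done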

            What if I want to include other metadata, like the description, in the text file? One way would be to create another Bash list variable from a second jq query and then iterate over that list as well. Or, inside the for loop, we could run another rpm -qi query and grab the attribute we want just-in-time:

            #!/bin/bash
            
            packages=$(rpm -qia | jc --rpm-qi | jq -r '.[] | select(.license == "MIT") | .name')
            
            for package in $packages; do
                description=$(rpm -qi "${package}" | jc --rpm-qi | jq -r '.[0].description')
                echo "Package name is ${package}" > "${package}".txt
                echo "The description is:  ${description}" >> "${package}".txt
            done
            

            This works:

            $ ./create_files.sh 
            $ ls
            create_files.sh  jc.txt                libcom_err.txt   libpciaccess.txt    libyaml.txt       popt.txt
            curl.txt         json-c.txt            libcurl.txt      libss.txt           lua.txt           python-iniparse.txt
            dbus-python.txt  krb5-devel.txt        libdrm.txt       libverto-devel.txt  ncurses-base.txt  python-pytoml.txt
            expat.txt        krb5-libs.txt         libfastjson.txt  libverto.txt        ncurses-libs.txt  PyYAML.txt
            jansson.txt      libcom_err-devel.txt  libkadm5.txt     libxml2.txt         ncurses.txt       rubygem-psych.txt
            $ cat jc.txt 
            Package name is jc
            The description is:  This tool serializes the output of popular gnu linux command line tools and file types to structured JSON output

            But it is a little inefficient since we need to run the rpm -qi [package] query many times during the script. A better method would be to run the rpm -qia query one time, which gives us all of the package data at once, and then just select the attributes we want in our script. We’ll do that next!

            Assigning a Bash Array from a JSON Array of Objects

            In other programming languages, like Python, it is pretty straightforward to load a JSON string of any depth and complexity and use it as a dictionary or list. Unfortunately, Bash has no such native capability, but we can do some useful things by assigning JSON objects to a Bash array.

            At first glance, this seems like it should be easy with a single variable assignment statement, but in fact we’ll need a while loop with the read builtin so Bash can ingest the JSON Lines data from jq into the Bash array. This way we can easily iterate through the data much as we would in Python.

            In this example, we’ll take the filtered JSON output of the rpm -qia command, iterate over all of the objects (each object is a package), and pull out the attributes we want to use in a for loop. This should be more efficient than the last script we created since we are only running the rpm -qia command once. First, let’s just iterate and print the raw Bash array elements so we can see what the data looks like:

            #!/bin/bash
            
            # pull the rpm package objects into a bash array from jq
            packages=()
            while read -r value; do
                packages+=("$value")
            done < <(rpm -qia | jc --rpm-qi | jq -c '.[] | select(.license == "MIT")')
            
            # iterate over the bash array
            for package in "${packages[@]}"; do
                echo "${package}"
                echo
            done

            There are a few interesting things going on in this script:

            • A Bash array variable named packages is created with packages=()
            • A while loop reads in all of the JSON objects created by jq into the packages Bash array.
              • Note: mapfile -t packages < <( ... ) can be substituted for the while loop when using Bash 4.0 and higher. (See the sketch after this list.)
            • The jq command uses the -c option which prints each JSON object on a single line. This is the magic that allows the object to be read in as a Bash array element.
            • Then we use a standard for loop to iterate over each package object, which contains all of the attributes we want to extract into variables.
            • Finally, we do something with those variables.
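
            Here is a minimal sketch of that mapfile variant (Bash 4.0+):

            #!/bin/bash

            # mapfile (a.k.a. readarray) reads each line from jq into a Bash array element
            mapfile -t packages < <(rpm -qia | jc --rpm-qi | jq -c '.[] | select(.license == "MIT")')

            echo "Read ${#packages[@]} package objects"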

            When we run this script, we see the following output:

            $ ./print_array.sh 
            {"name":"ncurses-base","version":"5.9","release":"14.20130511.el7_4","architecture":"noarch","install_date":"Thu 15 Aug 2019 10:53:08 AM PDT","group":"System Environment/Base","size":223432,"license":"MIT","signature":"RSA/SHA256, Thu 07 Sep 2017 05:43:15 AM PDT, Key ID 24c6a8a7f4a80eb5","source_rpm":"ncurses-5.9-14.20130511.el7_4.src.rpm","build_date":"Wed 06 Sep 2017 03:08:29 PM PDT","build_host":"c1bm.rdu2.centos.org","relocations":"(not relocatable)","packager":"CentOS BuildSystem <http://bugs.centos.org>","vendor":"CentOS","url":"http://invisible-island.net/ncurses/ncurses.html","summary":"Descriptions of common terminals","description":"This package contains descriptions of common terminals. Other terminal descriptions are included in the ncurses-term package.","build_epoch":1504735709,"build_epoch_utc":null}
            
            {"name":"ncurses-libs","version":"5.9","release":"14.20130511.el7_4","architecture":"x86_64","install_date":"Thu 15 Aug 2019 10:53:16 AM PDT","group":"System Environment/Libraries","size":1028216,"license":"MIT","signature":"RSA/SHA256, Thu 07 Sep 2017 05:43:31 AM PDT, Key ID 24c6a8a7f4a80eb5","source_rpm":"ncurses-5.9-14.20130511.el7_4.src.rpm","build_date":"Wed 06 Sep 2017 03:08:29 PM PDT","build_host":"c1bm.rdu2.centos.org","relocations":"(not relocatable)","packager":"CentOS BuildSystem <http://bugs.centos.org>","vendor":"CentOS","url":"http://invisible-island.net/ncurses/ncurses.html","summary":"Ncurses libraries","description":"The curses library routines are a terminal-independent method of updating character screens with reasonable optimization.  The ncurses (new curses) library is a freely distributable replacement for the discontinued 4.4 BSD classic curses library. This package contains the ncurses libraries.","build_epoch":1504735709,"build_epoch_utc":null}
            ...

            Very cool! Now we can use jq to pull any attribute we want into a variable within the for loop:

            #!/bin/bash
            
            # pull the rpm package objects into a bash array from jq
            packages=()
            while read -r value; do
                packages+=("$value")
            done < <(rpm -qia | jc --rpm-qi | jq -c '.[] | select(.license == "MIT")')
            
            # iterate over the bash array
            for package in "${packages[@]}"; do
                name=$(jq -r '.name' <<< "${package}")
                description=$(jq -r '.description' <<< "${package}")
                version=$(jq -r '.version' <<< "${package}")
                
                echo "Package name is ${name}" > "${name}".txt
                echo "The description is:  ${description}" >> "${name}".txt
                echo "The version is:  ${version}" >> "${name}".txt
            done

            And here’s what it does:

            $ ./create_files.sh 
            $ ls
            create_files.sh  jc.txt                libcom_err.txt   libpciaccess.txt    libyaml.txt       popt.txt
            curl.txt         json-c.txt            libcurl.txt      libss.txt           lua.txt           python-iniparse.txt
            dbus-python.txt  krb5-devel.txt        libdrm.txt       libverto-devel.txt  ncurses-base.txt  python-pytoml.txt
            expat.txt        krb5-libs.txt         libfastjson.txt  libverto.txt        ncurses-libs.txt  PyYAML.txt
            jansson.txt      libcom_err-devel.txt  libkadm5.txt     libxml2.txt         ncurses.txt       rubygem-psych.txt
            $ cat jc.txt 
            Package name is jc
            The description is:  This tool serializes the output of popular gnu linux command line tools and file types to structured JSON output
            The version is:  1.15.0
            

            As you can see, this is more efficient and allows you to pull in any attribute you would like from each Bash array element. Each element acts like a JSON object that jq can query. (If the three jq calls per iteration bother you, see the single-call sketch below.)
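
            As a final tweak (a sketch, not part of the script above), several fields can be extracted in a single jq invocation with the @tsv filter and split apart with read:

            for package in "${packages[@]}"; do
                IFS=$'\t' read -r name version <<< "$(jq -r '[.name, .version] | @tsv' <<< "${package}")"
                echo "${name} is at version ${version}"
            done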

            Conclusion

            We went through a few scenarios of how to assign JSON data to Bash variables and arrays with jc and jq. Using JSON instead of plain text allows you to be more expressive in your queries. Also, JSON has the advantage of allowing new fields to be added at any time without breaking your existing query.

            We saw how JSON can be used by assigning a single string to a Bash variable, by assigning a list of words to a variable and looping over the list, and by assigning entire JSON objects to Bash array elements, which can be further queried by jq within a loop. These are powerful ways JSON data can help you write better scripts.

            Featured

            JC Version 1.15.0 Released

            Try the jc web demo!

            jc is now available as an MSI install package for Windows.

            I’m excited to announce the release of jc version 1.15.0 available on github and pypi. This is a significant release that includes dozens of new features and parsers.

            jc now supports over 70 commands and file-types, including the new acpi, upower, /usr/bin/time, dpkg -l, rpm -qi, finger, and dir command parsers. Several existing parsers have been updated to include calculated time fields for convenience. These include date, uptime, stat, timedatectl, who, dig, and ls.

            The CLI experience has been enhanced with new -h help and -v version options. External library dependencies are now optional, so jc will work just fine without them, albeit with limited functionality. JSON output is now more compact, so less data is piped between programs, and unencoded unicode characters are now supported in JSON strings.

            jc can be installed via pip or through several official OS package repositories, including Debian, Ubuntu, Fedora, openSUSE, Arch Linux, NixOS Linux, Guix System Linux, FreeBSD, and macOS. For more information on how to get jc, click here.

            To upgrade with pip:

            $ pip3 install --upgrade jc

              New Features

              • -h option displays help and the parser list. jc no longer displays the help text on error. Now that there are so many parsers, the -h option prints to STDOUT so the output can be piped to more or less for paging. (See the examples after this list.)
              • -v option displays the version, github site, and copyright information
              • New calculated epoch timestamp fields have been added to several parsers, including date, stat, timedatectl, who, dig, and ls. These fields are also available in many of the new parsers, including upower, rpm -qi, and dir.

                All timestamps are naive (i.e. based on the timezone of the machine the parser is running on) unless the UTC timezone can be detected within the text of the command output. If the UTC timezone is detected, a timezone-aware timestamp is created. All aware timestamps have the suffix ‘_utc‘. No other timezones are supported for aware timestamps.
              • Several calculated time fields have been added to the uptime parser.
              • All external library dependencies, including pygments, ruamel.yaml, and xmltodict are now optional. If a dependency is missing, jc will still run, but will have limited functionality. For example, if the pygments library is not installed, then all JSON output will be monochrome. If the ruamel.yaml or xmltodict libraries are not installed, then the --yaml or --xml parsers, respectively, will not run.
              • JSON output is more compact, with all spaces between delimiters removed, unless the -p option is used to pretty-print the JSON output. This reduces the amount of data that needs to be piped between programs and can save some disk space if JSON output is being stored to disk.
              • Unencoded unicode characters are now printed in JSON strings. These types of characters include the Copyright ‘©’ symbol and many others.
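
              A couple of quick examples of the CLI behavior described above:

              $ jc -h | less          # page through the help text and full parser list
              $ jc -p date            # pretty-print the output; omit -p for compact, single-line JSON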

              New Parsers

              jc now supports 70 parsers. New parsers include acpi, upower, /usr/bin/time, dpkg -l, rpm -qi, finger, and dir. The dir parser is the first Windows command parser to be included in jc!

              Documentation and schemas for all parsers can be found here.

              acpi command parser

              Linux support for the acpi command. (Documentation):

              $ acpi -V | jc --acpi -p          # or:  jc -p acpi -V
              [
                {
                  "type": "Battery",
                  "id": 0,
                  "state": "Charging",
                  "charge_percent": 71,
                  "until_charged": "00:29:20",
                  "design_capacity_mah": 2110,
                  "last_full_capacity": 2271,
                  "last_full_capacity_percent": 100,
                  "until_charged_hours": 0,
                  "until_charged_minutes": 29,
                  "until_charged_seconds": 20,
                  "until_charged_total_seconds": 1760
                },
                {
                  "type": "Adapter",
                  "id": 0,
                  "on-line": true
                },
                {
                  "type": "Thermal",
                  "id": 0,
                  "mode": "ok",
                  "temperature": 46.0,
                  "temperature_unit": "C",
                  "trip_points": [
                    {
                      "id": 0,
                      "switches_to_mode": "critical",
                      "temperature": 127.0,
                      "temperature_unit": "C"
                    },
                    {
                      "id": 1,
                      "switches_to_mode": "hot",
                      "temperature": 127.0,
                      "temperature_unit": "C"
                    }
                  ]
                },
                {
                  "type": "Cooling",
                  "id": 0,
                  "messages": [
                    "Processor 0 of 10"
                  ]
                },
                {
                  "type": "Cooling",
                  "id": 1,
                  "messages": [
                    "Processor 0 of 10"
                  ]
                },
                {
                  "type": "Cooling",
                  "id": 2,
                  "messages": [
                    "x86_pkg_temp no state information available"
                  ]
                },
                {
                  "type": "Cooling",
                  "id": 3,
                  "messages": [
                    "Processor 0 of 10"
                  ]
                },
                {
                  "type": "Cooling",
                  "id": 4,
                  "messages": [
                    "intel_powerclamp no state information available"
                  ]
                },
                {
                  "type": "Cooling",
                  "id": 5,
                  "messages": [
                    "Processor 0 of 10"
                  ]
                }
              ]

              upower command parser

              Linux support for the upower command. (Documentation):

              $ upower -i /org/freedesktop/UPower/devices/battery | jc --upower -p          # or jc -p upower -i /org/freedesktop/UPower/devices/battery
              [
                {
                  "native_path": "/sys/devices/LNXSYSTM:00/device:00/PNP0C0A:00/power_supply/BAT0",
                  "vendor": "NOTEBOOK",
                  "model": "BAT",
                  "serial": "0001",
                  "power_supply": true,
                  "updated": "Thu 11 Mar 2021 06:28:08 PM UTC",
                  "has_history": true,
                  "has_statistics": true,
                  "detail": {
                    "type": "battery",
                    "present": true,
                    "rechargeable": true,
                    "state": "charging",
                    "energy": 22.3998,
                    "energy_empty": 0.0,
                    "energy_full": 52.6473,
                    "energy_full_design": 62.16,
                    "energy_rate": 31.6905,
                    "voltage": 12.191,
                    "time_to_full": 57.3,
                    "percentage": 42.5469,
                    "capacity": 84.6964,
                    "technology": "lithium-ion",
                    "energy_unit": "Wh",
                    "energy_empty_unit": "Wh",
                    "energy_full_unit": "Wh",
                    "energy_full_design_unit": "Wh",
                    "energy_rate_unit": "W",
                    "voltage_unit": "V",
                    "time_to_full_unit": "minutes"
                  },
                  "history_charge": [
                    {
                      "time": 1328809335,
                      "percent_charged": 42.547,
                      "status": "charging"
                    },
                    {
                      "time": 1328809305,
                      "percent_charged": 42.02,
                      "status": "charging"
                    }
                  ],
                  "history_rate": [
                    {
                      "time": 1328809335,
                      "percent_charged": 31.691,
                      "status": "charging"
                    }
                  ],
                  "updated_seconds_ago": 441975,
                  "updated_epoch": 1615516088,
                  "updated_epoch_utc": 1615487288
                }
              ]

              /usr/bin/time command parser

              Linux, macOS, and BSD support for the /usr/bin/time command. (Documentation):

              $ /usr/bin/time --verbose -o timefile.out sleep 2.5; cat timefile.out | jc --time -p
              {
                "command_being_timed": "sleep 2.5",
                "user_time": 0.0,
                "system_time": 0.0,
                "cpu_percent": 0,
                "elapsed_time": "0:02.50",
                "average_shared_text_size": 0,
                "average_unshared_data_size": 0,
                "average_stack_size": 0,
                "average_total_size": 0,
                "maximum_resident_set_size": 2084,
                "average_resident_set_size": 0,
                "major_pagefaults": 0,
                "minor_pagefaults": 72,
                "voluntary_context_switches": 2,
                "involuntary_context_switches": 1,
                "swaps": 0,
                "block_input_operations": 0,
                "block_output_operations": 0,
                "messages_sent": 0,
                "messages_received": 0,
                "signals_delivered": 0,
                "page_size": 4096,
                "exit_status": 0,
                "elapsed_time_hours": 0,
                "elapsed_time_minutes": 0,
                "elapsed_time_seconds": 2,
                "elapsed_time_centiseconds": 50,
                "elapsed_time_total_seconds": 2.5
              }

              dpkg -l command parser

              Linux support for the dpkg -l command. (Documentation):

              $ dpkg -l | jc --dpkg-l -p          # or:  jc -p dpkg -l
              [
                {
                  "codes": "ii",
                  "name": "accountsservice",
                  "version": "0.6.45-1ubuntu1.3",
                  "architecture": "amd64",
                  "description": "query and manipulate user account information",
                  "desired": "install",
                  "status": "installed"
                },
                {
                  "codes": "rc",
                  "name": "acl",
                  "version": "2.2.52-3build1",
                  "architecture": "amd64",
                  "description": "Access control list utilities",
                  "desired": "remove",
                  "status": "config-files"
                },
                {
                  "codes": "uWR",
                  "name": "acpi",
                  "version": "1.7-1.1",
                  "architecture": "amd64",
                  "description": "displays information on ACPI devices",
                  "desired": "unknown",
                  "status": "trigger await",
                  "error": "reinstall required"
                },
                {
                  "codes": "rh",
                  "name": "acpid",
                  "version": "1:2.0.28-1ubuntu1",
                  "architecture": "amd64",
                  "description": "Advanced Configuration and Power Interface event daemon",
                  "desired": "remove",
                  "status": "half installed"
                },
                {
                  "codes": "pn",
                  "name": "adduser",
                  "version": "3.116ubuntu1",
                  "architecture": "all",
                  "description": "add and remove users and groups",
                  "desired": "purge",
                  "status": "not installed"
                }
              ]

              rpm -qi command parser

              Linux support for the rpm -qi command. (Documentation):

              $ rpm -qia | jc --rpm-qi -p          # or:  jc -p rpm -qia
              [
                {
                  "name": "make",
                  "epoch": 1,
                  "version": "3.82",
                  "release": "24.el7",
                  "architecture": "x86_64",
                  "install_date": "Wed 16 Oct 2019 09:21:42 AM PDT",
                  "group": "Development/Tools",
                  "size": 1160660,
                  "license": "GPLv2+",
                  "signature": "RSA/SHA256, Thu 22 Aug 2019 02:34:59 PM PDT, Key ID 24c6a8a7f4a80eb5",
                  "source_rpm": "make-3.82-24.el7.src.rpm",
                  "build_date": "Thu 08 Aug 2019 05:47:25 PM PDT",
                  "build_host": "x86-01.bsys.centos.org",
                  "relocations": "(not relocatable)",
                  "packager": "CentOS BuildSystem <http://bugs.centos.org>",
                  "vendor": "CentOS",
                  "url": "http://www.gnu.org/software/make/",
                  "summary": "A GNU tool which simplifies the build process for users",
                  "description": "A GNU tool for controlling the generation of executables and other non-source...",
                  "build_epoch": 1565311645,
                  "build_epoch_utc": null
                },
                {
                  "name": "kbd-legacy",
                  "version": "1.15.5",
                  "release": "15.el7",
                  "architecture": "noarch",
                  "install_date": "Thu 15 Aug 2019 10:53:08 AM PDT",
                  "group": "System Environment/Base",
                  "size": 503608,
                  "license": "GPLv2+",
                  "signature": "RSA/SHA256, Mon 12 Nov 2018 07:17:49 AM PST, Key ID 24c6a8a7f4a80eb5",
                  "source_rpm": "kbd-1.15.5-15.el7.src.rpm",
                  "build_date": "Tue 30 Oct 2018 03:40:00 PM PDT",
                  "build_host": "x86-01.bsys.centos.org",
                  "relocations": "(not relocatable)",
                  "packager": "CentOS BuildSystem <http://bugs.centos.org>",
                  "vendor": "CentOS",
                  "url": "http://ftp.altlinux.org/pub/people/legion/kbd",
                  "summary": "Legacy data for kbd package",
                  "description": "The kbd-legacy package contains original keymaps for kbd package. Please note...",
                  "build_epoch": 1540939200,
                  "build_epoch_utc": null
                }
              ]

              finger command parser

              Linux, macOS, and BSD support for the finger command. (Documentation):

              $ finger | jc --finger -p          # or:  jc -p finger
              [
                {
                  "login": "jdoe",
                  "name": "John Doe",
                  "tty": "tty1",
                  "idle": "14d",
                  "login_time": "Mar 22 21:14",
                  "tty_writeable": false,
                  "idle_minutes": 0,
                  "idle_hours": 0,
                  "idle_days": 14,
                  "total_idle_minutes": 20160
                },
                {
                  "login": "jdoe",
                  "name": "John Doe",
                  "tty": "pts/0",
                  "idle": null,
                  "login_time": "Apr  5 15:33",
                  "details": "(192.168.1.22)",
                  "tty_writeable": true,
                  "idle_minutes": 0,
                  "idle_hours": 0,
                  "idle_days": 0,
                  "total_idle_minutes": 0
                }
              ]

              dir command parser

              Windows support for the dir command – written by Rasheed Elsaleh. (Documentation):

              C:> dir | jc --dir -p          # or:  jc -p dir
              [
                {
                  "date": "03/24/2021",
                  "time": "03:15 PM",
                  "dir": true,
                  "size": null,
                  "filename": ".",
                  "parent": "C:\\Program Files\\Internet Explorer",
                  "epoch": 1616624100
                },
                {
                  "date": "03/24/2021",
                  "time": "03:15 PM",
                  "dir": true,
                  "size": null,
                  "filename": "..",
                  "parent": "C:\\Program Files\\Internet Explorer",
                  "epoch": 1616624100
                },
                {
                  "date": "12/07/2019",
                  "time": "02:49 AM",
                  "dir": true,
                  "size": null,
                  "filename": "en-US",
                  "parent": "C:\\Program Files\\Internet Explorer",
                  "epoch": 1575715740
                },
                {
                  "date": "12/07/2019",
                  "time": "02:09 AM",
                  "dir": false,
                  "size": 54784,
                  "filename": "ExtExport.exe",
                  "parent": "C:\\Program Files\\Internet Explorer",
                  "epoch": 1575713340
                }
              ]

              Updated Parsers

              • Several parsers have been updated to include calculated epoch timestamp fields, including: date, stat, timedatectl, who, dig, and ls. See the Schema Changes section for more details.
              • The uptime parser has been enhanced with additional calculated time fields. See the Schema Changes section for more details.

              Schema Changes

              date command parser

              The date command parser has been completely rewritten and enhanced with several new fields, including: epoch, epoch_utc, hour_24, utc_offset, day_of_year, week_of_year, iso, and timezone_aware. The weekday_num field has also been updated to conform to ISO 8601 compliant numbering. (Documentation)

              $ date | jc --date -p          # or:  jc -p date
              {
                "year": 2021,
                "month": "Mar",
                "month_num": 3,
                "day": 25,
                "weekday": "Thu",
                "weekday_num": 4,
                "hour": 2,
                "hour_24": 2,
                "minute": 2,
                "second": 26,
                "period": "AM",
                "timezone": "UTC",
                "utc_offset": "+0000",
                "day_of_year": 84,
                "week_of_year": 12,
                "iso": "2021-03-25T02:02:26+00:00",
                "epoch": 1616662946,
                "epoch_utc": 1616637746,
                "timezone_aware": true
              }

              stat command parser

              The stat parser has been updated to add the following fields: access_time_epoch, access_time_epoch_utc, modify_time_epoch, modify_time_epoch_utc, change_time_epoch, change_time_epoch_utc, birth_time_epoch, birth_time_epoch_utc. (Documentation)

              $ stat /bin/* | jc --stat -p          # or:  jc -p stat /bin/*
              [
                {
                  "file": "/bin/bash",
                  "size": 1113504,
                  "blocks": 2176,
                  "io_blocks": 4096,
                  "type": "regular file",
                  "device": "802h/2050d",
                  "inode": 131099,
                  "links": 1,
                  "access": "0755",
                  "flags": "-rwxr-xr-x",
                  "uid": 0,
                  "user": "root",
                  "gid": 0,
                  "group": "root",
                  "access_time": "2019-11-14 08:18:03.509681766 +0000",
                  "modify_time": "2019-06-06 22:28:15.000000000 +0000",
                  "change_time": "2019-08-12 17:21:29.521945390 +0000",
                  "birth_time": null,
                  "access_time_epoch": 1573748283,
                  "access_time_epoch_utc": 1573719483,
                  "modify_time_epoch": 1559885295,
                  "modify_time_epoch_utc": 1559860095,
                  "change_time_epoch": 1565655689,
                  "change_time_epoch_utc": 1565630489,
                  "birth_time_epoch": null,
                  "birth_time_epoch_utc": null
                },
                {
                  "file": "/bin/btrfs",
                  "size": 716464,
                  "blocks": 1400,
                  "io_blocks": 4096,
                  "type": "regular file",
                  "device": "802h/2050d",
                  "inode": 131100,
                  "links": 1,
                  "access": "0755",
                  "flags": "-rwxr-xr-x",
                  "uid": 0,
                  "user": "root",
                  "gid": 0,
                  "group": "root",
                  "access_time": "2019-11-14 08:18:28.990834276 +0000",
                  "modify_time": "2018-03-12 23:04:27.000000000 +0000",
                  "change_time": "2019-08-12 17:21:29.545944399 +0000",
                  "birth_time": null,
                  "access_time_epoch": 1573748308,
                  "access_time_epoch_utc": 1573719508,
                  "modify_time_epoch": 1520921067,
                  "modify_time_epoch_utc": 1520895867,
                  "change_time_epoch": 1565655689,
                  "change_time_epoch_utc": 1565630489,
                  "birth_time_epoch": null,
                  "birth_time_epoch_utc": null
                }
              ]

              timedatectl command parser

              The epoch_utc field has been added to the timedatectl command parser. (Documentation)

              $ timedatectl | jc --timedatectl -p          # or:  jc -p timedatectl
              {
                "local_time": "Tue 2020-03-10 17:53:21 PDT",
                "universal_time": "Wed 2020-03-11 00:53:21 UTC",
                "rtc_time": "Wed 2020-03-11 00:53:21",
                "time_zone": "America/Los_Angeles (PDT, -0700)",
                "ntp_enabled": true,
                "ntp_synchronized": true,
                "rtc_in_local_tz": false,
                "dst_active": true,
                "epoch_utc": 1583888001
              }

              who command parser

              The epoch field has been added to the who command parser. (Documentation)

              $ who | jc --who -p          # or:  jc -p who
              [
                {
                  "user": "joeuser",
                  "tty": "ttyS0",
                  "time": "2020-03-02 02:52",
                  "epoch": 1583146320
                },
                {
                  "user": "joeuser",
                  "tty": "pts/0",
                  "time": "2020-03-02 05:15",
                  "from": "192.168.71.1",
                  "epoch": 1583154900
                }
              ]

              dig command parser

              The when_epoch and when_epoch_utc fields have been added to the dig command parser. (Documentation)

              $ dig cnn.com www.cnn.com @205.251.194.64 | jc --dig -p          # or:  jc -p dig cnn.com www.cnn.com @205.251.194.64
              [
                {
                  "id": 52172,
                  "opcode": "QUERY",
                  "status": "NOERROR",
                  "flags": [
                    "qr",
                    "rd",
                    "ra"
                  ],
                  "query_num": 1,
                  "answer_num": 4,
                  "authority_num": 0,
                  "additional_num": 1,
                  "question": {
                    "name": "cnn.com.",
                    "class": "IN",
                    "type": "A"
                  },
                  "answer": [
                    {
                      "name": "cnn.com.",
                      "class": "IN",
                      "type": "A",
                      "ttl": 27,
                      "data": "151.101.65.67"
                    },
                    {
                      "name": "cnn.com.",
                      "class": "IN",
                      "type": "A",
                      "ttl": 27,
                      "data": "151.101.129.67"
                    },
                    {
                      "name": "cnn.com.",
                      "class": "IN",
                      "type": "A",
                      "ttl": 27,
                      "data": "151.101.1.67"
                    },
                    {
                      "name": "cnn.com.",
                      "class": "IN",
                      "type": "A",
                      "ttl": 27,
                      "data": "151.101.193.67"
                    }
                  ],
                  "query_time": 38,
                  "server": "2600",
                  "when": "Tue Mar 30 20:07:59 PDT 2021",
                  "rcvd": 100,
                  "when_epoch": 1617160079,
                  "when_epoch_utc": null
                },
                {
                  "id": 36292,
                  "opcode": "QUERY",
                  "status": "NOERROR",
                  "flags": [
                    "qr",
                    "aa",
                    "rd"
                  ],
                  "query_num": 1,
                  "answer_num": 1,
                  "authority_num": 4,
                  "additional_num": 1,
                  "question": {
                    "name": "www.cnn.com.",
                    "class": "IN",
                    "type": "A"
                  },
                  "answer": [
                    {
                      "name": "www.cnn.com.",
                      "class": "IN",
                      "type": "CNAME",
                      "ttl": 300,
                      "data": "turner-tls.map.fastly.net."
                    }
                  ],
                  "authority": [
                    {
                      "name": "cnn.com.",
                      "class": "IN",
                      "type": "NS",
                      "ttl": 3600,
                      "data": "ns-1086.awsdns-07.org."
                    },
                    {
                      "name": "cnn.com.",
                      "class": "IN",
                      "type": "NS",
                      "ttl": 3600,
                      "data": "ns-1630.awsdns-11.co.uk."
                    },
                    {
                      "name": "cnn.com.",
                      "class": "IN",
                      "type": "NS",
                      "ttl": 3600,
                      "data": "ns-47.awsdns-05.com."
                    },
                    {
                      "name": "cnn.com.",
                      "class": "IN",
                      "type": "NS",
                      "ttl": 3600,
                      "data": "ns-576.awsdns-08.net."
                    }
                  ],
                  "query_time": 27,
                  "server": "205.251.194.64#53(205.251.194.64)",
                  "when": "Tue Mar 30 20:07:59 PDT 2021",
                  "rcvd": 212,
                  "when_epoch": 1617160079,
                  "when_epoch_utc": null
                }
              ]

              ls command parser

              The epoch and epoch_utc fields have been added to the ls command parser. Note that these fields are only available if the --full-time or -l --time-style=full-iso options are used when running ls. (Documentation)

              $ ls --full-time /usr/bin | jc --ls -p          # or:  jc -p ls --full-time /usr/bin
              [
                {
                  "filename": "acpi",
                  "flags": "-rwxr-xr-x",
                  "links": 1,
                  "owner": "root",
                  "group": "root",
                  "size": 23656,
                  "date": "2018-01-14 19:20:21.000000000 -0800",
                  "epoch": 1515986421,
                  "epoch_utc": null
                },
                {
                  "filename": "acpi_listen",
                  "flags": "-rwxr-xr-x",
                  "links": 1,
                  "owner": "root",
                  "group": "root",
                  "size": 14608,
                  "date": "2017-04-27 21:28:10.000000000 -0700",
                  "epoch": 1493353690,
                  "epoch_utc": null
                }
              ]

              uptime command parser

              Several calculated time fields have been added to the uptime command parser, including: uptime_days, uptime_hours, uptime_minutes, uptime_total_seconds, time_hour, time_minute, and time_second. (Documentation)

              $ uptime | jc --uptime -p          # or:  jc -p uptime
              {
                "time": "11:35",
                "uptime": "3 days, 4:03",
                "users": 5,
                "load_1m": 1.88,
                "load_5m": 2.0,
                "load_15m": 1.94,
                "time_hour": 11,
                "time_minute": 35,
                "time_second": null,
                "uptime_days": 3,
                "uptime_hours": 4,
                "uptime_minutes": 3,
                "uptime_total_seconds": 273780
              }

              Full Parser List

              • acpi
              • airport -I
              • airport -s
              • arp
              • blkid
              • cksum
              • crontab
              • crontab (with user info)
              • csv
              • date
              • df
              • dig
              • dir
              • dmidecode
              • dpkg -l
              • du
              • env
              • file
              • finger
              • free
              • fstab
              • group
              • gshadow
              • hash
              • hashsum (various hash sum programs: md5, md5sum, shasum, etc.)
              • hciconfig
              • history
              • hosts
              • id
              • ifconfig
              • ini
              • iptables
              • iw_scan
              • jobs
              • kv
              • last
              • ls
              • lsblk
              • lsmod
              • lsof
              • mount
              • netstat
              • ntpq
              • passwd
              • ping
              • pip list
              • pip show
              • ps
              • route
              • rpm -qi
              • shadow
              • ss
              • stat
              • sysctl
              • systemctl
              • systemctl list-jobs
              • systemctl list-sockets
              • systemctl list-unit-files
              • time (/usr/bin/time)
              • timedatectl
              • tracepath
              • traceroute
              • uname -a
              • upower
              • uptime
              • w
              • wc
              • who
              • xml
              • yaml

              Version 1.15.1 Updates

              • New feature to show parser documentation interactively with -h --parser_name. For example: $ jc -h --arp
              • Man page added to pypi package for easier packaging in homebrew
              • Update rpm-qi parser to add two calculated timestamp fields: install_date_epoch and install_date_epoch_utc
              • Clean up documentation and autogenerate the Parser Information section from metadata

              Schema Changes

              The rpm-qi parser has been updated to add two calculated timestamp fields: install_date_epoch (naive) and install_date_epoch_utc (timezone-aware).

              $ rpm -qia | jc --rpm-qi -p
                  [
                    {
                      "name": "make",
                      "epoch": 1,
                      "version": "3.82",
                      "release": "24.el7",
                      "architecture": "x86_64",
                      "install_date": "Wed 16 Oct 2019 09:21:42 AM PDT",
                      "group": "Development/Tools",
                      "size": 1160660,
                      "license": "GPLv2+",
                      "signature": "RSA/SHA256, Thu 22 Aug 2019 02:34:59 PM PDT, Key ID 24c6a8a7f4a80eb5",
                      "source_rpm": "make-3.82-24.el7.src.rpm",
                      "build_date": "Thu 08 Aug 2019 05:47:25 PM PDT",
                      "build_host": "x86-01.bsys.centos.org",
                      "relocations": "(not relocatable)",
                      "packager": "CentOS BuildSystem <http://bugs.centos.org>",
                      "vendor": "CentOS",
                      "url": "http://www.gnu.org/software/make/",
                      "summary": "A GNU tool which simplifies the build process for users",
                      "description": "A GNU tool for controlling the generation of executables and other...",
                      "build_epoch": 1565311645,
                      "build_epoch_utc": null,
                      "install_date_epoch": 1571242902,
                      "install_date_epoch_utc": null
                    }
                  ]

              Version 1.15.2 Updates

              • Add systeminfo parser tested on Windows
              • Update dig parser to fix an issue with IPv6 addresses in the server field
              • Update dig parser to fix an issue when axfr entries contain a semicolon
              • Update dig parser to add support for “Additional Section” and “Opt Pseudosection”
              • Update dig parser to add query_size field
              • Use dig parser as the main example in readme, documentation, and man page
              • Standardize int, float, and boolean conversion rules with functions in jc.utils

              New Parsers

              systeminfo command parser (Windows)

              Windows support for the systeminfo command – written by Jon Smith. (Documentation):

              $ systeminfo | jc --systeminfo -p
                  {
                    "host_name": "TESTLAPTOP",
                    "os_name": "Microsoft Windows 10 Enterprise",
                    "os_version": "10.0.17134 N/A Build 17134",
                    "os_manufacturer": "Microsoft Corporation",
                    "os_configuration": "Member Workstation",
                    "os_build_type": "Multiprocessor Free",
                    "registered_owner": "Test, Inc.",
                    "registered_organization": "Test, Inc.",
                    "product_id": "11111-11111-11111-AA111",
                    "original_install_date": "3/26/2019, 3:51:30 PM",
                    "system_boot_time": "3/30/2021, 6:13:59 AM",
                    "system_manufacturer": "Dell Inc.",
                    "system_model": "Precision 5530",
                    "system_type": "x64-based PC",
                    "processors": [
                      "Intel64 Family 6 Model 158 Stepping 10 GenuineIntel ~2592 Mhz"
                    ],
                    "bios_version": "Dell Inc. 1.16.2, 4/21/2020",
                    "windows_directory": "C:\\WINDOWS",
                    "system_directory": "C:\\WINDOWS\\system32",
                    "boot_device": "\\Device\\HarddiskVolume2",
                    "system_locale": "en-us;English (United States)",
                    "input_locale": "en-us;English (United States)",
                    "time_zone": "(UTC+00:00) UTC",
                    "total_physical_memory_mb": 32503,
                    "available_physical_memory_mb": 19743,
                    "virtual_memory_max_size_mb": 37367,
                    "virtual_memory_available_mb": 22266,
                    "virtual_memory_in_use_mb": 15101,
                    "page_file_locations": "C:\\pagefile.sys",
                    "domain": "test.com",
                    "logon_server": "\\\\TESTDC01",
                    "hotfixs": [
                      "KB2693643",
                      "KB4601054"
                    ],
                    "network_cards": [
                      {
                        "name": "Intel(R) Wireless-AC 9260 160MHz",
                        "connection_name": "Wi-Fi",
                        "status": null,
                        "dhcp_enabled": true,
                        "dhcp_server": "192.168.2.1",
                        "ip_addresses": [
                          "192.168.2.219"
                        ]
                      }
                    ],
                    "hyperv_requirements": {
                      "vm_monitor_mode_extensions": true,
                      "virtualization_enabled_in_firmware": true,
                      "second_level_address_translation": false,
                      "data_execution_prevention_available": true
                    },
                    "original_install_date_epoch": 1553640690,
                    "original_install_date_epoch_utc": 1553615490,
                    "system_boot_time_epoch": 1617110039,
                    "system_boot_time_epoch_utc": 1617084839
                  }

              Schema Changes

              dig Command Parser

              Support for the Opt Pseudosection and Additional Section has been added. The query_size field has also been added.

              $ dig example.com | jc --dig -p
                  [
                    {
                      "id": 2951,
                      "opcode": "QUERY",
                      "status": "NOERROR",
                      "flags": [
                        "qr",
                        "rd",
                        "ra"
                      ],
                      "query_num": 1,
                      "answer_num": 1,
                      "authority_num": 0,
                      "additional_num": 3,
                      "opt_pseudosection": {
                        "edns": {
                          "version": 0,
                          "flags": [],
                          "udp": 4096
                        }
                      },
                      "question": {
                        "name": "example.com.",
                        "class": "IN",
                        "type": "A"
                      },
                      "answer": [
                        {
                          "name": "example.com.",
                          "class": "IN",
                          "type": "A",
                          "ttl": 39302,
                          "data": "93.184.216.34"
                        }
                      ],
                      "additional": [
                        {
                          "name": "pdns196.ultradns.com.",
                          "class": "IN",
                          "type": "A",
                          "ttl": 172800,
                          "data": "156.154.64.196"
                        },
                        {
                          "name": "pdns196.ultradns.com.",
                          "class": "IN",
                          "type": "AAAA",
                          "ttl": 172800,
                          "data": "2001:502:f3ff::e8"
                        }
                      ],
                      "query_size": 57,
                      "query_time": 49,
                      "server": "2600:1700:bab0:d40::1#53(2600:1700:bab0:d40::1)",
                      "when": "Fri Apr 16 16:05:10 PDT 2021",
                      "rcvd": 56,
                      "when_epoch": 1618614310,
                      "when_epoch_utc": null
                    }
                  ]
              
              

              Version 1.15.3 Updates

              • Add ufw status command parser tested on Linux
              • Add ufw-appinfo command parser tested on Linux
              • Fix deb package name to conform to standard
              • Add Caveats section to readme and manpage

              New Parsers

              ufw command parser

              Linux support for the ufw status command. (Documentation):

              # ufw status verbose | jc --ufw -p          # or:  jc -p ufw status verbose
              {
                "status": "active",
                "logging": "on",
                "logging_level": "low",
                "default": "deny (incoming), allow (outgoing), disabled (routed)",
                "new_profiles": "skip",
                "rules": [
                  {
                    "action": "ALLOW",
                    "action_direction": "IN",
                    "index": null,
                    "network_protocol": "ipv4",
                    "to_interface": "any",
                    "to_transport": "any",
                    "to_service": null,
                    "to_ports": [
                      22
                    ],
                    "to_ip": "0.0.0.0",
                    "to_ip_prefix": 0,
                    "comment": null,
                    "from_ip": "0.0.0.0",
                    "from_ip_prefix": 0,
                    "from_interface": "any",
                    "from_transport": "any",
                    "from_port_ranges": [
                      {
                        "start": 0,
                        "end": 65535
                      }
                    ],
                    "from_service": null
                  },
                  {
                    "action": "ALLOW",
                    "action_direction": "IN",
                    "index": null,
                    "network_protocol": "ipv4",
                    "to_interface": "any",
                    "to_transport": "tcp",
                    "to_service": null,
                    "to_ports": [
                      80,
                      443
                    ],
                    "to_ip": "0.0.0.0",
                    "to_ip_prefix": 0,
                    "comment": null,
                    "from_ip": "0.0.0.0",
                    "from_ip_prefix": 0,
                    "from_interface": "any",
                    "from_transport": "any",
                    "from_port_ranges": [
                      {
                        "start": 0,
                        "end": 65535
                      }
                    ],
                    "from_service": null
                  }
                ]
              }

              ufw-appinfo command parser

              Linux support for the ufw app info [application] and ufw app info all commands. (Documentation):

              # ufw app info MSN | jc --ufw-appinfo -p          # or:  jc -p ufw app info MSN
              [
                {
                  "profile": "MSN",
                  "title": "MSN Chat",
                  "description": "MSN chat protocol (with file transfer and voice)",
                  "tcp_list": [
                    1863,
                    6901
                  ],
                  "udp_list": [
                    1863,
                    6901
                  ],
                  "tcp_ranges": [
                    {
                      "start": 6891,
                      "end": 6900
                    }
                  ],
                  "normalized_tcp_list": [
                    1863,
                    6901
                  ],
                  "normalized_tcp_ranges": [
                    {
                      "start": 6891,
                      "end": 6900
                    }
                  ],
                  "normalized_udp_list": [
                    1863,
                    6901
                  ]
                }
              ]

              Version 1.15.4 Updates

              • Update ping parser to support error responses in OSX and BSD
              • Update ping parser to be more resilient against parsing errors for unknown error types
              • Update dig parser to support +noall +answer use case
              • Update dig parser compatibility to all platforms
              • Fix colors in Windows terminals (cmd.exe and PowerShell)
              • Fix epoch calculations when UTC is referenced as “Coordinated Universal Time”
              • Add Windows time format for systeminfo output
              • Add exceptions module to standardize parser exceptions
              • jc no longer swallows exit codes when using the “Magic” syntax. See the Exit Codes section of the README and man page for details

              Version 1.15.5 Updates

              • Fix issue where help and about information would not display if a 3rd party parser library was missing. (e.g. xmltodict)
              • Add more error message detail when encountering ParseError and LibraryNotFound exceptions

              Happy parsing!

              For more information on the motivations for creating jc, see my blog post.

              Featured

              JC Version 1.14.0 Released

              Try the jc web demo!

              Happy New Year! I’m happy to announce the release of jc version 1.14.0 available on github and pypi.

              jc now supports over 60 commands and file-types, including the new hash, hashsum (md5, md5sum, shasum, sha1sum, sha224sum, sha256sum, sha384sum, sha512sum), cksum, and wc command parsers. The ls parser has been enhanced to work with vdir output and the env parser has been enhanced to work with printenv output. jc is now fully tested on Python 3.9.
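
              For example, the enhanced parsers can now be fed those alternate commands directly (a quick sketch):

              $ printenv | jc --env -p          # the env parser now understands printenv output
              $ vdir | jc --ls -p               # the ls parser now understands vdir output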

              jc can be installed via pip or through several official OS package repositories, including Debian, Ubuntu, Fedora, openSUSE, Arch Linux, NixOS Linux, Guix System Linux, FreeBSD, and macOS. For more information on how to get jc, click here.

              To upgrade with pip:

              $ pip3 install --upgrade jc

              New Features

              • jc is now available on the official Debian and Ubuntu repository (apt-get install jc)
              • Tested on Python 3.9

              New Parsers

              jc now supports 61 parsers. New parsers include kv, date, hash, hashsum, cksum, and wc.

              Documentation and schemas for all parsers can be found here.

              kv key/value pair parser (added in v1.13.2)

              Parses key/value pair files. Files can include comments prepended with # or ; and keys and values can be delimited by = or : with or without spaces. Quotation marks are stripped from quoted values, though they can be kept with the -r (raw output) jc argument.

              These types of files can be found in many places, including configuration files in /etc. (e.g. /etc/sysconfig/network-scripts).

              $ cat keyvalue.txt
              # this file contains key/value pairs
              name = John Doe
              address=555 California Drive
              age: 34
              ; comments can include # or ;
              # delimiter can be = or :
              # quoted values have quotation marks stripped by default
              # but can be preserved with the -r argument
              occupation:"Engineer"
              
              $ cat keyvalue.txt | jc --kv -p
              {
                "name": "John Doe",
                "address": "555 California Drive",
                "age": "34",
                "occupation": "Engineer"
              }

              date command parser (added in v1.13.2)

              Linux, macOS, and FreeBSD support for the date command:

              $ date | jc --date -p          # or:  jc -p date
              {
                "year": 2020,
                "month_num": 7,
                "day": 31,
                "hour": 16,
                "minute": 48,
                "second": 11,
                "month": "Jul",
                "weekday": "Fri",
                "weekday_num": 6,
                "timezone": "PDT"
              }

              hash command parser

Linux, macOS, and FreeBSD support for the hash Bash shell builtin:

              $ hash | jc --hash -p
              [
                {
                  "hits": 2,
                  "command": "/bin/cat"
                },
                {
                  "hits": 1,
                  "command": "/bin/ls"
                }
              ]

              hashsum command parser

              Linux, macOS, and FreeBSD support for various MD5 and SHA hash commands, including md5, md5sum, shasum, sha1sum, sha224sum, sha256sum, sha384sum, sha512sum:

              $ md5sum * | jc --hashsum -p          # or jc -p md5sum *
              [
                {
                  "filename": "devtoolset-3-gcc-4.9.2-6.el7.x86_64.rpm",
                  "hash": "65fc958c1add637ec23c4b137aecf3d3"   
                },
                {
                  "filename": "digout",
                  "hash": "5b9312ee5aff080927753c63a347707d"
                },
                {
                  "filename": "dmidecode.out",
                  "hash": "716fd11c2ac00db109281f7110b8fb9d"
                },
                {
                  "filename": "file with spaces in the name",
                  "hash": "d41d8cd98f00b204e9800998ecf8427e"
                },
                {
                  "filename": "id-centos.out",
                  "hash": "4295be239a14ad77ef3253103de976d2"
                },
                {
                  "filename": "ifcfg.json",
                  "hash": "01fda0d9ba9a75618b072e64ff512b43"
                }
              ]

              cksum command parser

              Linux, macOS, and FreeBSD support for the cksum and sum commands:

              $ cksum * | jc --cksum -p          # or jc -p cksum *
              [
                {
                  "filename": "__init__.py",
                  "checksum": 4294967295,
                  "blocks": 0
                },
                {
                  "filename": "airport.py",
                  "checksum": 2208551092,
                  "blocks": 3745
                },
                {
                  "filename": "airport_s.py",
                  "checksum": 1113817598,
                  "blocks": 4572
                }
              ]

              wc command parser

              Linux, macOS, and FreeBSD support for the wc command:

              $ wc * | jc --wc -p          # or jc -p wc *
              [
                {
                  "filename": "airport-I.json",
                  "lines": 1,
                  "words": 30,
                  "characters": 307
                },
                {
                  "filename": "airport-I.out",
                  "lines": 15,
                  "words": 33,
                  "characters": 348
                },
                {
                  "filename": "airport-s.json",
                  "lines": 1,
                  "words": 202,
                  "characters": 2152
                }
              ]

              Updated Parsers

              The env parser has been enhanced to work with printenv command output using the “magic” syntax. (e.g. jc printenv)

              The ls parser has been enhanced to work with vdir command output using the “magic” syntax. (e.g. jc vdir)

              Schema Changes

              There are no schema changes in this release.

              Full Parser List

              • airport -I
              • airport -s
              • arp
              • blkid
              • cksum
              • crontab
              • crontab-u
              • CSV
              • date
              • df
              • dig
              • dmidecode
              • du
              • env
              • file
              • free
              • fstab
              • /etc/group
              • /etc/gshadow
              • hash
              • hashsum
              • history
              • /etc/hosts
              • id
              • ifconfig
              • INI
              • iptables
              • jobs
              • kv
              • last and lastb
              • ls
              • lsblk
              • lsmod
              • lsof
              • mount
              • netstat
              • ntpq
              • /etc/passwd
              • ping
              • pip list
              • pip show
              • ps
              • route
              • /etc/shadow
              • ss
              • stat
              • sysctl
              • systemctl
              • systemctl list-jobs
              • systemctl list-sockets
              • systemctl list-unit-files
              • timedatectl
              • tracepath
              • traceroute
              • uname -a
              • uptime
              • w
              • wc
              • who
              • XML
              • YAML

              For more information on the motivations for creating jc, see my blog post.

              Happy parsing!

              v1.14.1 Release Changes

              • Add iw-scan parser tested on linux (beta)
              • Update date parser for Ubuntu 20.04 support
              • Update last parser for last -F support
              • Update last parser to add convenience fields and augment data for easier parsing
              • Update man page
              • Minor documentation updates

Schema Changes

              date command parser

A new period field has been added to the schema to represent AM or PM, which may appear depending on the locale configuration of the host. If the locale does not print AM or PM, then the value will be null.

              {
                "year":         integer,
                "month_num":    integer,
                "day":          integer,
                "hour":         integer,
                "minute":       integer,
                "second":       integer,
                "period":       string,
                "month":        string,
                "weekday":      string,
                "weekday_num":  integer,
                "timezone":     string
               }
              

              last command parser

The duration field calculation has changed to be more easily parsed and will display as total HOURS:MINUTES. Also, a few convenience calculated fields have been added and will display when the last -F option is used: login_epoch, logout_epoch, and duration_seconds.

              [
                {
                  "user":             string,
                  "tty":              string,
                  "hostname":         string,
                  "login":            string,
                  "logout":           string,
                  "duration":         string,
                  "login_epoch":      integer,   # available with last -F option
                  "logout_epoch":     integer,   # available with last -F option
                  "duration_seconds": integer    # available with last -F option
                }
              ]

Featured

              Parsing Command Output in Nornir with JC

              In my last couple of posts we learned how to parse linux command output in Ansible and Saltstack using jc. In this post we’ll do something similar with Nornir.

              Nornir is a popular automation framework that allows you to use native python to control hosts and network devices. Many times it would be nice to be able to parse the output of remotely-run commands and use that information elsewhere in your scripts. jc allows you to do this automatically – no regex/looping/slicing/etc. required to get to the data you want!

              Since jc is both a command line tool and a python library, it is easy to use inside a Nornir script to automate the boring work of command output parsing.

              For more information on the motivations for creating jc, see my blog post.

              Installation

To use jc in a Nornir script, simply install it and import the library.

              Installing jc:

              $ pip3 install jc

              Import the jc library:

              import jc

Now we are ready to use jc in our Nornir script!

              Syntax

              To use the jc parser, call the parse function with the parser name and command output arguments. For example, to automatically parse a uname -a output string:

              uname_obj = jc.parse('uname', uname_command_output_string)

              Now you can use whatever uname field you would like in the rest of your code:

              print(uname_obj['node_name'])

              A Simple Example

Below we have a small Nornir script using Netmiko to call a few commands (uname, date, ifconfig, and uptime) on a linux host. I used the nornir-netmiko package to simplify the connection to the linux host:

              from nornir import InitNornir
              from nornir_netmiko.tasks import netmiko_send_command
              import jc
              
              nr = InitNornir(config_file='config.yaml')
              
              def run_commands(task, command_list):
                  for cmd in command_list:
                      task.run(
                          task=netmiko_send_command,
                          command_string=cmd,
                          name=cmd
                      )
              
              commands = ['uname -a', 'date', 'ifconfig', 'uptime']
              
              result = nr.run(
                  task=run_commands,
                  command_list=commands
              )
              
              uname_result_string = result['host1'][1].result
              uname_result_obj = jc.parse('uname', uname_result_string)
              hostname = uname_result_obj['node_name']
              kernel_version = uname_result_obj['kernel_version']
              
              date_result_string = result['host1'][2].result
              date_result_obj = jc.parse('date', date_result_string)
              timezone = date_result_obj['timezone']
              
              ifconfig_result_string = result['host1'][3].result
              ifconfig_result_obj = jc.parse('ifconfig', ifconfig_result_string)
              ipv4_addr = ifconfig_result_obj[1]['ipv4_addr']
              
              uptime_result_string = result['host1'][4].result
              uptime_result_obj = jc.parse('uptime', uptime_result_string)
              uptime = uptime_result_obj['uptime']
              
              print(f'hostname: {hostname}')
              print(f'kernel version: {kernel_version}')
              print(f'timezone: {timezone}')
              print(f'ip address: {ipv4_addr}')
              print(f'uptime: {uptime}')

              Script output:

              $ python3 nornir_with_jc.py 
              hostname: my-ubuntu
              kernel version: #113-Ubuntu SMP Thu Jul 9 23:41:39 UTC 2020
              timezone: UTC
              ip address: 192.168.1.239
              uptime: 47 min
              

              Here you can see we have run a few tasks and assigned the results to some variables. Let’s go over the uname -a output:

              uname_result_string = result['host1'][1].result

              Above, we are grabbing the string result output attribute from the uname -a command (the first command in the commands list) and are assigning it to uname_result_string. There are cleaner ways of getting the result info from Nornir, but this way we can see the structure of the result object.
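
As an aside, since each command task was registered with name=cmd, one cleaner pattern is to key the results by command name instead of by position. A small sketch building on the script above:

# skip index 0 (the parent run_commands result) and map each task
# name to its raw output string
outputs = {r.name: r.result for r in result['host1'][1:]}
uname_result_string = outputs['uname -a']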

              uname_result_obj = jc.parse('uname', uname_result_string)

              Next, we have run uname_result_string through the jc uname parser and assigned the resulting dictionary object to the uname_result_obj variable.

              hostname = uname_result_obj['node_name']
              kernel_version = uname_result_obj['kernel_version']

Then we created a couple of variables that we can use in our script, hostname and kernel_version, so we can grab just the object attributes we are interested in. jc returns standard dictionary objects, so they are easy to use.

              print(f'hostname: {hostname}')
              print(f'kernel version: {kernel_version}')

              Finally, we use our variables in a print function, but we could have used these objects anywhere else in the script.
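
And because jc hands back plain dictionaries and lists, standard-library tools work on them as-is. For example, a trivial sketch to eyeball every field of the parsed uname object from the script above:

import json

# pretty-print the full parsed object to see all available fields
print(json.dumps(uname_result_obj, indent=2))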

              Nice! Instead of parsing the STDOUT text manually, we used jc to automatically parse the command output, providing us a convenient object to use elsewhere in our script. No more need to regex or loop and slice your way through the output to get what you are looking for!

              For a complete list of jc parsers available and their associated schemas, see the parser documentation.

              Happy parsing!

              Featured

              Parsing Command Output in Saltstack with JC

              In my last blog post I demonstrated how we can easily parse remote command output in Ansible. Since then it was requested that I demonstrate something similar using Saltstack.

Saltstack (or Salt, as it is known) is a little different from Ansible in that it primarily uses a pub/sub architecture vs. SSH and requires an agent, or a Minion, to be installed on the remote hosts you are managing.

              It turns out it is fairly straightforward to add jc functionality to Saltstack via a custom Output Module and/or Serializer Module. We’ll go over both methods, plus a bonus method in this post.

              For more information on the motivations for creating jc, see my blog post.

              Output Module

              With a Salt Output Module you can restructure the output of the command results that are written to STDOUT on the Master. The default output is typically YAML, but you can change it to JSON or other formats with builtin output modules.

              Here is the default YAML output:

              # salt '*' cmd.run 'uptime'
              minion1:
                   16:31:16 up 2 days,  3:04,  1 user,  load average: 0.03, 0.03, 0.00
              minion2:
                   16:31:16 up 2 days,  3:04,  1 user,  load average: 0.00, 0.00, 0.00

And here is the output using the builtin JSON outputter:

              # salt '*' cmd.run 'uptime' --out=json
              {
                  "minion2": " 16:33:02 up 2 days,  3:06,  1 user,  load average: 0.00, 0.00, 0.00"
              }
              {
                  "minion1": " 16:33:02 up 2 days,  3:06,  1 user,  load average: 0.00, 0.02, 0.00"
              }

              But we can do better with jc by turning the uptime output into a JSON object:

              # JC_PARSER=uptime salt '*' cmd.run 'uptime' --out=jc --out-indent=2
              {
                "minion1": {
                  "time": "16:36:04",
                  "uptime": "2 days, 3:09",
                  "users": 1,
                  "load_1m": 0.07,
                  "load_5m": 0.02,
                  "load_15m": 0.0
                }
              }
              {
                "minion2": {
                  "time": "16:36:04",
                  "uptime": "2 days, 3:09",
                  "users": 1,
                  "load_1m": 0.0,
                  "load_5m": 0.0,
                  "load_15m": 0.0
                }
              }

              Now we can pipe this output to jq, jello, or any other JSON filter to more easily consume this data.
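
For example, here is a hedged Python sketch (call it users.py; jq or jello can do the same in a one-liner) that consumes that stream of per-minion objects and pulls out the users field from the uptime schema above:

import json
import sys

# the jc outputter prints one JSON object per batch of minion returns,
# so decode the concatenated objects from stdin one at a time
decoder = json.JSONDecoder()
buf = sys.stdin.read()
pos = 0
while pos < len(buf):
    if buf[pos].isspace():
        pos += 1
        continue
    obj, pos = decoder.raw_decode(buf, pos)
    for minion, uptime in obj.items():
        print(f'{minion}: {uptime["users"]} user(s) logged in')

Piping the salt command above into it (… --out=jc | python3 users.py) would print one line per minion.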

              We’ll go over the Output Module installation and usage later in this post.

              Serializer Module

With a Salt Serializer Module you can restructure the output of the command results during runtime on each Minion so they can be used as objects/variables within a Salt state. For example, if we only cared about the number of users currently logged into each minion and wanted to set that number as a variable for use elsewhere, we could do that with a jc Serializer Module.

              Here is a simple, contrived example Salt state file to show how it works:

              {% set uptime_out = salt.cmd.shell('uptime') %}
              {% set uptime_jc = salt.slsutil.deserialize('jc', uptime_out, parser='uptime') %}
              
              run_uptime:
                cmd.run:
                  - name: >
                      echo 'The number of users logged in is {{ uptime_jc.users }}'

              And here is the output after applying this state file:

              # salt '*' state.apply uptime-users
              minion1:
              ----------
                        ID: run_uptime
                  Function: cmd.run
                      Name: echo 'The number of users logged in is 1'
              
                    Result: True
                   Comment: Command "echo 'The number of users logged in is 1'
                            " run
                   Started: 17:01:43.992058
                  Duration: 6.107 ms
                   Changes:   
                            ----------
                            pid:
                                23208
                            retcode:
                                0
                            stderr:
                            stdout:
                                The number of users logged in is 1
              
              Summary for minion1
              ------------
              Succeeded: 1 (changed=1)
              Failed:    0
              ------------
              Total states run:     1
              Total run time:   6.107 ms
              minion2:
              ----------
                        ID: run_uptime
                  Function: cmd.run
                      Name: echo 'The number of users logged in is 2'
              
                    Result: True
                   Comment: Command "echo 'The number of users logged in is 2'
                            " run
                   Started: 17:01:44.005482
                  Duration: 6.55 ms
                   Changes:   
                            ----------
                            pid:
                                23371
                            retcode:
                                0
                            stderr:
                            stdout:
                                The number of users logged in is 2
              
              Summary for minion2
              ------------
              Succeeded: 1 (changed=1)
              Failed:    0
              ------------
              Total states run:     1
              Total run time:   6.550 ms

              Since jc deserialized the command output into an object, we can simply reference the object attributes in our Salt states. We’ll go over installation and usage of the jc Serializer Module later in this post.

              Installation and Usage

              To use the jc Output Module, you will need to install jc on the Master. To use the jc Serializer Module, you will need to install jc on the Minions. Depending on your use case you may decide to install one or the other or both modules.

              Installing jc

              You can install jc on the Master and Minions with the following command. Of course, this can also be automated via Salt!

              $ pip3 install jc

              Installing the Output Module

              To install the Output Module on the Master, you need to place the Python module in a directory where the Master is configured to look for it.

              First, edit the /etc/salt/master configuration file to configure a custom Module directory. In this example we will use /srv/modules by adding this line to the configuration file:

              module_dirs: ["/srv/modules"]

              Next we need to create the /srv/modules/output directory, if it doesn’t already exist:

              # mkdir -p /srv/modules/output

Next, copy the Python module into the directory. I have uploaded the code to GitHub as a Gist:

              # curl https://gist.githubusercontent.com/kellyjonbrazil/24e10f0c3e438ea22fc1e2bfaee22efc/raw/263e4eaf8e51f974b34d44e0483540b163667bdf/jc.py -o /srv/modules/output/jc.py

              Finally, restart the Salt Master:

              # systemctl restart salt-master

              Using the Output Module

              To use the jc Output Module, you need to call it with the --out=jc option of the salt command.

              Additionally, you need to tell the jc Output Module which parser to use. To do this, you can set the JC_PARSER environment variable inline with the command:

              # JC_PARSER=date salt '*' cmd.run 'date' --out=jc
              {"minion2": {"year": 2020, "month_num": 9, "day": 15, "hour": 18, "minute": 27, "second": 11, "month": "Sep", "weekday": "Tue", "weekday_num": 3, "timezone": "UTC"}}
              {"minion1": {"year": 2020, "month_num": 9, "day": 15, "hour": 18, "minute": 27, "second": 11, "month": "Sep", "weekday": "Tue", "weekday_num": 3, "timezone": "UTC"}}

              For a list of jc parsers, see the parser documentation.

              Additionally, you can add the --out-indent option to pretty-print the output:

              # JC_PARSER=date salt '*' cmd.run 'date' --out=jc --out-indent=2
              {
                "minion2": {
                  "year": 2020,
                  "month_num": 9,
                  "day": 15,
                  "hour": 18,
                  "minute": 29,
                  "second": 8,
                  "month": "Sep",
                  "weekday": "Tue",
                  "weekday_num": 3,
                  "timezone": "UTC"
                }
              }
              {
                "minion1": {
                  "year": 2020,
                  "month_num": 9,
                  "day": 15,
                  "hour": 18,
                  "minute": 29,
                  "second": 8,
                  "month": "Sep",
                  "weekday": "Tue",
                  "weekday_num": 3,
                  "timezone": "UTC"
                }
              }

              Installing the Serializer Module

              To install the Serializer Module on the Minions, you can copy the Python module to the _serializers folder within your Salt fileserver directory on the Master (typically /srv/salt) and sync to the Minions.

              First, create the /srv/salt/_serializers directory if it doesn’t already exist:

              # mkdir -p /srv/salt/_serializers

Next, copy the Python module into the _serializers directory on the Master. I have uploaded the code to GitHub as a Gist:

              # curl https://gist.githubusercontent.com/kellyjonbrazil/7d67cfa003735bf80ef43fe5652950dd/raw/1541a7d327aed0366ccfea91bd0533032111d11c/jc.py -o /srv/salt/_serializers/jc.py

              Finally, sync the jc Serializer Module to the Minions:

              # salt '*' saltutil.sync_all

              Using the Serializer Module

              To use the jc Serializer Module, invoke it with the salt.slsutil.deserialize() function within a Salt state file. The function requires three arguments to deserialize with jc:

              • Argument 1: 'jc'
                • This should always be the literal string 'jc' to call the jc Serializer Module
              • Argument 2: String data to be parsed
                • This is the STDOUT string output of the command you want to deserialize
              • Argument 3: parser='<parser>'
                • <parser> is the jc parser you want to use to parse the command output. For example, to use the ifconfig parser, Argument 3 would look like this: parser='ifconfig'. For a list of jc parsers, see the parser documentation.

              For example, via Jinja2 template:

              {% set date = salt.slsutil.deserialize('jc', date_stdout, parser='date') %}

              Then you can reference any attribute of the date object (Python dictionary) in any other part of the Salt state file. Here is a full example:

              {% set date_stdout = salt.cmd.shell('date') %}
              {% set date = salt.slsutil.deserialize('jc', date_stdout, parser='date') %}
              
              run_date:
                cmd.run:
                  - name: >
                      echo 'The timezone is {{ date.timezone }}'

              One More Thing

It is also possible to deserialize command output into objects using jc without using the jc Serializer Module. If jc is installed on the Minion, then you can pipe the command output to jc as you would normally do on the command line, then use the built-in JSON Serializer Module to deserialize the jc JSON output into Python objects:

              {% set date = salt.slsutil.deserialize('json', salt.cmd.shell('date | jc --date')) %}
              
              run_date:
                cmd.run:
                  - name: >
                      echo 'The timezone is {{ date.timezone }}'

              Happy parsing!

              Featured

              Parsing Command Output in Ansible with JC

              Ansible is a popular automation framework that allows you to configure any number of remote hosts in a declarative and idempotent way. A common use-case is to run a shell command on the remote host, return the STDOUT output, loop through it and parse it.

              Starting in Ansible 2.9 with the community.general collection, it is possible to use jc as a filter to automatically parse the command output for you so you can easily use the output as an object. The official filter documentation can be found here. Even more detailed documentation can be found here.

              For more information on the motivations for creating jc, see my blog post.

              Installation

              To use the jc filter plugin, you just need to install jc and the community.general collection on the Ansible controller. Ansible version 2.9 or higher is required to install the community.general collection.

              Installing jc:

              $ pip3 install jc

              Installing the community.general Ansible collection:

              $ ansible-galaxy collection install community.general

              Now we are ready to use the jc filter plugin!

              Syntax

              To use the jc filter plugin you just need to pipe the command output to the plugin and specify the parser as an argument. For example, this is how you would parse the output of ps on the remote host:

                tasks:
                - shell: ps aux
                  register: result
                - set_fact:
                    myvar: "{{ result.stdout | community.general.jc('ps') }}"
              

              Note: Use underscores instead of dashes (if any) in the parser name. e.g. git-log becomes git_log

              This will generate a myvar object that includes the exact same information you would have received by running jc ps aux on the remote host. Now you can use object notation to pull out the information you are interested in.

              A Simple Example

              Let’s put it all together with a very simple example. In this example we will run the date command on the remote host and print the timezone as a debug message:

              - name: Get Timezone
                hosts: ubuntu
                tasks:
                - shell: date
                  register: result
                - set_fact:
                    myvar: "{{ result.stdout | community.general.jc('date') }}"
                - debug:
                    msg: "The timezone is: {{ myvar.timezone }}"
              

              Instead of parsing the STDOUT text manually, we used the timezone attribute of the myvar object that jc gave us. Let’s see this in action:

              $ ansible-playbook get-timezone.yml 
              
              PLAY [Get Timezone] *****************************************************************************
              
              TASK [Gathering Facts] **************************************************************************
              ok: [192.168.1.239]
              
              TASK [shell] ************************************************************************************
              changed: [192.168.1.239]
              
              TASK [set_fact] *********************************************************************************
              ok: [192.168.1.239]
              
              TASK [debug] ************************************************************************************
              ok: [192.168.1.239] => {
                  "msg": "The timezone is: UTC"
              }
              
              PLAY RECAP **************************************************************************************
              192.168.1.239              : ok=4    changed=1    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0   
              

              Simple – no more need to grep/awk/sed your way through the output to get what you are looking for!

              For a complete list of jc parsers available and their associated schemas, see the parser documentation.

              Happy parsing!

              Featured

              JC Version 1.13.1 Released

              Try the jc web demo!

I’m happy to announce the release of jc version 1.13.1, available on GitHub and PyPI.

              jc now supports over 55 commands and file-types, including the new ping, sysctl, traceroute, and tracepath command parsers. The INI file parser has been enhanced to support simple key/value text files and the route command parser now supports IPv6 tables.

              Custom local parser plugins are now supported. This allows overriding existing parsers and rapid development of new parsers.
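
As a rough sketch of what a local parser plugin looks like (the directory and fields below are assumptions based on the packaged parsers and the documentation – check the docs for the correct plugin location on your platform):

# saved as myparser.py in the jcparsers plugin folder under jc's
# app data directory (e.g. ~/.local/share/jc/jcparsers on linux)

class info():
    version = '1.0'
    description = 'my custom parser'
    author = 'me'
    author_email = 'me@example.com'
    compatible = ['linux', 'darwin', 'freebsd']

def parse(data, raw=False, quiet=False):
    """Turn each non-blank line of command output into a dictionary."""
    return [{'line': line} for line in data.splitlines() if line.strip()]

Once the file is in place, it can be called like any packaged parser (e.g. some-command | jc --myparser), and a plugin with the same name as a packaged parser overrides it.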

              Other updates include verbose debugging, more consistent handling of empty data, and many parser fixes for FreeBSD.

              jc can be installed via pip or through several new official OS package repositories, including Fedora, openSUSE, Arch Linux, NixOS Linux, Guix System Linux, FreeBSD, and macOS. For more information on how to get jc, click here.

              To upgrade with pip:

              $ pip3 install --upgrade jc

              New Features

              • jc is now available on the official Fedora repository (dnf install jc)
              • jc is now available on the official Arch Linux repository (pacman -S jc)
              • jc is now available on the official NixOS repository (nix-env -iA nixpkgs.jc)
              • jc is now available on the official Guix System Linux repository (guix install jc)
              • jc is now available on the official FreeBSD ports repository (portsnap fetch update && cd /usr/ports/textproc/py-jc && make install clean)
              • jc is in process (Intent To Package) for Debian packaging.
              • Local custom parser plugins allow you to override packaged parsers or rapidly create your own.
              • Verbose debugging is now supported with the -dd command argument.
              • All parsers now correctly return empty objects when sent empty data.
              • Older versions of the pygments library (>=2.3.0) are now supported (for Debian packaging)

              New Parsers

              jc now supports 55 parsers. New parsers include ping, sysctl, tracepath, and traceroute.

              Documentation and schemas for all parsers can be found here.

              ping command parser

              Linux, macOS, and FreeBSD support for the ping command:

              $ ping 8.8.8.8 -c 3 | jc --ping -p          # or:  jc -p ping 8.8.8.8 -c 3
              {
                "destination_ip": "8.8.8.8",
                "data_bytes": 56,
                "pattern": null,
                "destination": "8.8.8.8",
                "packets_transmitted": 3,
                "packets_received": 3,
                "packet_loss_percent": 0.0,
                "duplicates": 0,
                "time_ms": 2005.0,
                "round_trip_ms_min": 23.835,
                "round_trip_ms_avg": 30.46,
                "round_trip_ms_max": 34.838,
                "round_trip_ms_stddev": 4.766,
                "responses": [
                  {
                    "type": "reply",
                    "timestamp": null,
                    "bytes": 64,
                    "response_ip": "8.8.8.8",
                    "icmp_seq": 1,
                    "ttl": 118,
                    "time_ms": 23.8,
                    "duplicate": false
                  },
                  {
                    "type": "reply",
                    "timestamp": null,
                    "bytes": 64,
                    "response_ip": "8.8.8.8",
                    "icmp_seq": 2,
                    "ttl": 118,
                    "time_ms": 34.8,
                    "duplicate": false
                  },
                  {
                    "type": "reply",
                    "timestamp": null,
                    "bytes": 64,
                    "response_ip": "8.8.8.8",
                    "icmp_seq": 3,
                    "ttl": 118,
                    "time_ms": 32.7,
                    "duplicate": false
                  }
                ]
              }

              sysctl command parser

              Linux, macOS, and FreeBSD support for the sysctl -a command:

              $ sysctl -a | jc --sysctl -p          # or:  jc -p sysctl -a
              {
                "user.cs_path": "/usr/bin:/bin:/usr/sbin:/sbin",
                "user.bc_base_max": 99,
                "user.bc_dim_max": 2048,
                "user.bc_scale_max": 99,
                "user.bc_string_max": 1000,
                "user.coll_weights_max": 2,
                "user.expr_nest_max": 32
                ...
              }

              tracepath command parser

              Linux support for the tracepath command:

              $ tracepath6 3ffe:2400:0:109::2 | jc --tracepath -p
              {
                "pmtu": 1480,
                "forward_hops": 2,
                "return_hops": 2,
                "hops": [
                  {
                    "ttl": 1,
                    "guess": true,
                    "host": "[LOCALHOST]",
                    "reply_ms": null,
                    "pmtu": 1500,
                    "asymmetric_difference": null,
                    "reached": false
                  },
                  {
                    "ttl": 1,
                    "guess": false,
                    "host": "dust.inr.ac.ru",
                    "reply_ms": 0.411,
                    "pmtu": null,
                    "asymmetric_difference": null,
                    "reached": false
                  },
                  {
                    "ttl": 2,
                    "guess": false,
                    "host": "dust.inr.ac.ru",
                    "reply_ms": 0.39,
                    "pmtu": 1480,
                    "asymmetric_difference": 1,
                    "reached": false
                  },
                  {
                    "ttl": 2,
                    "guess": false,
                    "host": "3ffe:2400:0:109::2",
                    "reply_ms": 463.514,
                    "pmtu": null,
                    "asymmetric_difference": null,
                    "reached": true
                  }
                ]
              }

              traceroute command parser

              Linux, macOS, and FreeBSD support for the traceroute command:

              $ traceroute -m 3 8.8.8.8 | jc --traceroute -p          # or:  jc -p traceroute -m 3 8.8.8.8
              {
                "destination_ip": "8.8.8.8",
                "destination_name": "8.8.8.8",
                "hops": [
                  {
                    "hop": 1,
                    "probes": [
                      {
                        "annotation": null,
                        "asn": null,
                        "ip": "192.168.1.254",
                        "name": "dsldevice.local.net",
                        "rtt": 6.616
                      },
                      {
                        "annotation": null,
                        "asn": null,
                        "ip": "192.168.1.254",
                        "name": "dsldevice.local.net",
                        "rtt": 6.413
                      },
                      {
                        "annotation": null,
                        "asn": null,
                        "ip": "192.168.1.254",
                        "name": "dsldevice.local.net",
                        "rtt": 6.308
                      }
                    ]
                  },
                  {
                    "hop": 2,
                    "probes": [
                      {
                        "annotation": null,
                        "asn": null,
                        "ip": "76.220.24.1",
                        "name": "76-220-24-1.lightspeed.sntcca.sbcglobal.net",
                        "rtt": 29.367
                      },
                      {
                        "annotation": null,
                        "asn": null,
                        "ip": "76.220.24.1",
                        "name": "76-220-24-1.lightspeed.sntcca.sbcglobal.net",
                        "rtt": 40.197
                      },
                      {
                        "annotation": null,
                        "asn": null,
                        "ip": "76.220.24.1",
                        "name": "76-220-24-1.lightspeed.sntcca.sbcglobal.net",
                        "rtt": 29.162
                      }
                    ]
                  },
                  {
                    "hop": 3,
                    "probes": [
                      {
                        "annotation": null,
                        "asn": null,
                        "ip": null,
                        "name": null,
                        "rtt": null
                      }
                    ]
                  }
                ]
              }

              Updated Parsers

              There have been many parser updates since v1.11.0. The INI file parser has been enhanced to support files and output that contains simple key/value pairs. The route command parser has been enhanced to add support for IPv6 routing tables. The uname parser provides more intuitive debug messages and an issue in the iptables command parser was fixed, allowing it to convert the last row of a table. Many other parser enhancements including the consistent handling of blank input, FreeBSD support, and minor field additions and fixes are included.

              Key/Value Pair Files with the INI File Parser

              The INI file parser has been enhanced to now support files containing simple key/value pairs. Files can include comments prepended with # or ; and keys and values can be delimited by = or : with or without spaces. Quotation marks are stripped from quoted values, though they can be kept with the -r (raw output) jc argument.

These types of files can be found in many places, including configuration files in /etc (e.g. /etc/sysconfig/network-scripts).

              $ cat keyvalue.txt
              # this file contains key/value pairs
              name = John Doe
              address=555 California Drive
              age: 34
              ; comments can include # or ;
              # delimiter can be = or :
              # quoted values have quotation marks stripped by default
              # but can be preserved with the -r argument
              occupation:"Engineer"
              
              $ cat keyvalue.txt | jc --ini -p
              {
                "name": "John Doe",
                "address": "555 California Drive",
                "age": "34",
                "occupation": "Engineer"
              }

              route Command Parser

              The route command parser has been enhanced to support IPv6 tables.

              $ route -6 | jc --route -p          # or: jc -p route -6
              [
                {
                  "destination": "[::]/96",
                  "next_hop": "[::]",
                  "flags": "!n",
                  "metric": 1024,
                  "ref": 0,
                  "use": 0,
                  "iface": "lo",
                  "flags_pretty": [
                    "REJECT"
                  ]
                },
                {
                  "destination": "0.0.0.0/96",
                  "next_hop": "[::]",
                  "flags": "!n",
                  "metric": 1024,
                  "ref": 0,
                  "use": 0,
                  "iface": "lo",
                  "flags_pretty": [
                    "REJECT"
                  ]
                },
                {
                  "destination": "2002:a00::/24",
                  "next_hop": "[::]",
                  "flags": "!n",
                  "metric": 1024,
                  "ref": 0,
                  "use": 0,
                  "iface": "lo",
                  "flags_pretty": [
                    "REJECT"
                  ]
                },
                ...
              ]

              Schema Changes

              There are no schema changes in this release.

              Full Parser List

              • airport -I
              • airport -s
              • arp
              • blkid
              • crontab
              • crontab-u
              • CSV
              • df
              • dig
              • dmidecode
              • du
              • env
              • file
              • free
              • fstab
              • /etc/group
              • /etc/gshadow
              • history
              • /etc/hosts
              • id
              • ifconfig
              • INI
              • iptables
              • jobs
              • last and lastb
              • ls
              • lsblk
              • lsmod
              • lsof
              • mount
              • netstat
              • ntpq
              • /etc/passwd
              • ping
              • pip list
              • pip show
              • ps
              • route
              • /etc/shadow
              • ss
              • stat
              • sysctl
              • systemctl
              • systemctl list-jobs
              • systemctl list-sockets
              • systemctl list-unit-files
              • timedatectl
              • tracepath
              • traceroute
              • uname -a
              • uptime
              • w
              • who
              • XML
              • YAML

              For more information on the motivations for creating jc, see my blog post.

              Happy parsing!

              Featured

              More Comprehensive Tracebacks in Python

              Python has great exception-handling with nice traceback messages that can help debug issues with your code. Here’s an example of a typical traceback message:

              Traceback (most recent call last):
                File "/Users/kbrazil/Library/Python/3.7/bin/jc", line 11, in <module>
                  load_entry_point('jc', 'console_scripts', 'jc')()
                File "/Users/kbrazil/git/jc/jc/cli.py", line 396, in main
                  result = parser.parse(data, raw=raw, quiet=quiet)
                File "/Users/kbrazil/git/jc/jc/parsers/uname.py", line 108, in parse
                  raw_output['kernel_release'] = parsed_line.pop(0)
              IndexError: pop from empty list
              

I usually read these from the bottom up to zero in on the issue. Here I can see that my program is trying to pop the first item off a list called parsed_line, but the list is empty and Python doesn’t know what to do, so it quits with an IndexError exception.

The traceback conveniently includes the line number and a snippet of the offending code. This is usually correct, but the line numbering can be off depending on the type of error or exception. This might be enough information for me to dig into the code and figure out why parsed_line is empty. But what about a more complex example?

              Traceback (most recent call last):
                File "/Users/kbrazil/Library/Python/3.7/bin/jc", line 11, in <module>
                  load_entry_point('jc', 'console_scripts', 'jc')()
                File "/Users/kbrazil/git/jc/jc/cli.py", line 396, in main
                  result = parser.parse(data, raw=raw, quiet=quiet)
                File "/Users/kbrazil/git/jc/jc/parsers/arp.py", line 226, in parse
                  'hwtype': line[4].lstrip('[').rstrip(']'),
              IndexError: list index out of range

              In this traceback I can see that the program is trying to pull the fifth item from the line list but Python can’t grab it – probably because the list doesn’t have that many items. This traceback doesn’t show me the state of the variables, so I can’t tell what input the function took or what the line variable looks like when causing this issue.

              What I’d really like is to see more context (the code lines before and after the error) along with the variable state when the error occurred. Many times this is done with a debugger or with print() statements. But there is another way!

              cgitb (deprecated)

Back in 1995 when CGI scripts were all the rage, Python added the cgi library along with its helper module, cgitb. This module would print more verbose traceback messages to the browser to help with troubleshooting. Conveniently, its traceback messages would include surrounding code context and variable state! cgitb is poorly named, since it can act as a drop-in replacement for standard tracebacks in any type of program. The name might be why it never really gained traction. Unfortunately, cgitb is set to be deprecated in Python 3.10, but let’s see how it works and then check out how to replace it.
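
Enabling it is a one-liner. Here is a minimal sketch, assuming we want plain-text output (rather than cgitb’s default HTML) and the 11 lines of context used in the example below:

import cgitb

# replace the default traceback handler with cgitb's verbose version
cgitb.enable(format='text', context=11)

With that in place, the arp parser failure from the second traceback above produces a much richer report: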

              IndexError
              Python 3.7.6: /usr/local/opt/python/bin/python3.7
              Mon Jul  6 12:09:08 2020
              
              A problem occurred in a Python script.  Here is the sequence of
              function calls leading up to the error, in the order they occurred.
              
               /Users/kbrazil/Library/Python/3.7/bin/jc in <module>()
                  2 # EASY-INSTALL-ENTRY-SCRIPT: 'jc','console_scripts','jc'
                  3 __requires__ = 'jc'
                  4 import re
                  5 import sys
                  6 from pkg_resources import load_entry_point
                  7 
                  8 if __name__ == '__main__':
                  9     sys.argv[0] = re.sub(r'(-script\.pyw?|\.exe)?$', '', sys.argv[0])
                 10     sys.exit(
                 11         load_entry_point('jc', 'console_scripts', 'jc')()
                 12     )
              load_entry_point = <function load_entry_point>
              
               /Users/kbrazil/git/jc/jc/cli.py in main()
                391 
                392         if parser_name in parsers:
                393             # load parser module just in time so we don't need to load all modules
                394             parser = parser_module(arg)
                395             try:
                396                 result = parser.parse(data, raw=raw, quiet=quiet)
                397                 found = True
                398                 break
                399 
                400             except Exception:
                401                 if debug:
              result undefined
              parser = <module 'jc.parsers.arp' from '/Users/kbrazil/git/jc/jc/parsers/arp.py'>
              parser.parse = <function parse>
              data = "#!/usr/bin/env python3\n\nimport jc.parsers.ls\nimp...print(tabulate.tabulate(parsed, headers='keys'))\n"
              raw = False
              quiet = False
              
               /Users/kbrazil/git/jc/jc/parsers/arp.py in parse(data="#!/usr/bin/env python3\n\nimport jc.parsers.ls\nimp...print(tabulate.tabulate(parsed, headers='keys'))\n", raw=False, quiet=False)
                221             for line in cleandata:
                222                 line = line.split()
                223                 output_line = {
                224                     'name': line[0],
                225                     'address': line[1].lstrip('(').rstrip(')'),
                226                     'hwtype': line[4].lstrip('[').rstrip(']'),
                227                     'hwaddress': line[3],
                228                     'iface': line[6],
                229                 }
                230                 raw_output.append(output_line)
                231 
              line = ['#!/usr/bin/env', 'python3']
              ].lstrip undefined
              IndexError: list index out of range
                  __cause__ = None
                  __class__ = <class 'IndexError'>
                  __context__ = None
                  __delattr__ = <method-wrapper '__delattr__' of IndexError object>
                  __dict__ = {}
                  __dir__ = <built-in method __dir__ of IndexError object>
                  __doc__ = 'Sequence index out of range.'
                  __eq__ = <method-wrapper '__eq__' of IndexError object>
                  __format__ = <built-in method __format__ of IndexError object>
                  __ge__ = <method-wrapper '__ge__' of IndexError object>
                  __getattribute__ = <method-wrapper '__getattribute__' of IndexError object>
                  __gt__ = <method-wrapper '__gt__' of IndexError object>
                  __hash__ = <method-wrapper '__hash__' of IndexError object>
                  __init__ = <method-wrapper '__init__' of IndexError object>
                  __init_subclass__ = <built-in method __init_subclass__ of type object>
                  __le__ = <method-wrapper '__le__' of IndexError object>
                  __lt__ = <method-wrapper '__lt__' of IndexError object>
                  __ne__ = <method-wrapper '__ne__' of IndexError object>
                  __new__ = <built-in method __new__ of type object>
                  __reduce__ = <built-in method __reduce__ of IndexError object>
                  __reduce_ex__ = <built-in method __reduce_ex__ of IndexError object>
                  __repr__ = <method-wrapper '__repr__' of IndexError object>
                  __setattr__ = <method-wrapper '__setattr__' of IndexError object>
                  __setstate__ = <built-in method __setstate__ of IndexError object>
                  __sizeof__ = <built-in method __sizeof__ of IndexError object>
                  __str__ = <method-wrapper '__str__' of IndexError object>
                  __subclasshook__ = <built-in method __subclasshook__ of type object>
                  __suppress_context__ = False
                  __traceback__ = <traceback object>
                  args = ('list index out of range',)
                  with_traceback = <built-in method with_traceback of IndexError object>
              
              The above is a description of an error in a Python program.  Here is
              the original traceback:
              
              Traceback (most recent call last):
                File "/Users/kbrazil/Library/Python/3.7/bin/jc", line 11, in <module>
                  load_entry_point('jc', 'console_scripts', 'jc')()
                File "/Users/kbrazil/git/jc/jc/cli.py", line 396, in main
                  result = parser.parse(data, raw=raw, quiet=quiet)
                File "/Users/kbrazil/git/jc/jc/parsers/arp.py", line 226, in parse
                  'hwtype': line[4].lstrip('[').rstrip(']'),
              IndexError: list index out of range

              This verbose traceback gives me just what I’m looking for! Though the default is 5, I told cgitb to print out 11 lines of context. Now I can see the two variables I’m particularly interested in to troubleshoot this issue: data and line.

              data = "#!/usr/bin/env python3\n\nimport jc.parsers.ls\nimp...print(tabulate.tabulate(parsed, headers='keys'))\n"

              (Notice how it snips the value if it’s too long. Pretty cool!)

              line = ['#!/usr/bin/env', 'python3']

Now I can easily see that the data that was input into the function does not look like the type of data expected at all (it is expecting text output from the arp command, but instead it was fed another Python script file). I can also see that the line list only has two items.

I included cgitb in jc to provide a verbose debug command option (-dd) to help speed up troubleshooting of parsing issues – typically during development of a new parser or to quickly identify an issue a user is having over email. It seemed perfect for my needs, and aside from the weird name it worked well.

              Then I noticed that cgitb was to be deprecated along with the cgi module with no replacement.

              tracebackplus

              I decided to vendorize the builtin cgitb library so it wouldn’t be orphaned in later versions of Python. After looking at the code I found it would be pretty easy to simplify the module by taking out all of the HTML rendering cruft. And why not rename it to something more descriptive while we’re at it? After not too much thought, I settled on tracebackplus.

              Like cgitb, tracebackplus doesn’t require any external libraries and can easily replace standard tracebacks with the following code:

              import tracebackplus
              tracebackplus.enable(context=11)

              Here is the code for tracebackplus along with the permissive MIT license. Feel free to use this code in your projects.

              Here’s an example of how it is being used in jc to provide different levels of debugging using the -d (standard traceback) or -dd (tracebackplus) command line arguments:

              try:
                  result = parser.parse(data, raw=raw, quiet=quiet)
                  found = True
                  break
              
              except Exception:
                  if debug:
                      if verbose_debug:
                          import jc.tracebackplus
                          jc.tracebackplus.enable(context=11)
              
                      raise
              
                  else:
                      import jc.utils
                      jc.utils.error_message(
                          f'{parser_name} parser could not parse the input data. Did you use the correct parser?\n'
                          '                 For details use the -d or -dd option.')
                      sys.exit(1)

              Happy debugging!

              Featured

              JC Version 1.11.1 Released

              Try the jc web demo!

I’m happy to announce the release of jc version 1.11.1, available on GitHub and PyPI.

jc now supports over 50 commands and file-types and can now be installed via Homebrew (macOS) and zypper (openSUSE). In addition, jc can be installed via DEB and RPM packages or run as a single binary on linux or macOS. You can set your own custom colors for jc to display, and more command parsers are supported on macOS. See below for more information on the new features.

              To upgrade, run:

              $ pip3 install --upgrade jc

              RPM/DEB packages and Binaries can also be found here.

              OS package repositories (e.g. brew, zypper, etc.) will be updated with the latest version of jc on their own future release schedules.

              New Features

• jc now supports custom colors. You can customize the colors by setting the JC_COLORS environment variable (a quick sketch follows this list)
              • jc is now available on macOS via Homebrew (brew install jc)
• jc is now available on openSUSE via zypper
              • DEB, RPM, and Binary packages are now available for linux and macOS
              • Several back-end updates to support packaging on standard linux distribution package repositories in the future (e.g. Fedora)
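
Here is a quick sketch of the custom colors feature. The JC_COLORS variable takes four comma-separated values that set the colors for key names, keywords, numbers, and strings, in that order (check the jc documentation for the full list of supported color names):

$ JC_COLORS=blue,brightblack,magenta,green jc -p ifconfig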

              New Parsers

              jc now supports 51 parsers. The dmidecode command is now supported for linux platforms.

              Documentation and schemas for all parsers can be found here.

              dmidecode command parser

              Linux support for the dmidecode command:

              # jc -p dmidecode
              [
                {
                  "handle": "0x0000",
                  "type": 0,
                  "bytes": 24,
                  "description": "BIOS Information",
                  "values": {
                    "vendor": "Phoenix Technologies LTD",
                    "version": "6.00",
                    "release_date": "04/13/2018",
                    "address": "0xEA490",
                    "runtime_size": "88944 bytes",
                    "rom_size": "64 kB",
                    "characteristics": [
                      "ISA is supported",
                      "PCI is supported",
                      "PC Card (PCMCIA) is supported",
                      "PNP is supported",
                      "APM is supported",
                      "BIOS is upgradeable",
                      "BIOS shadowing is allowed",
                      "ESCD support is available",
                      "Boot from CD is supported",
                      "Selectable boot is supported",
                      "EDD is supported",
                      "Print screen service is supported (int 5h)",
                      "8042 keyboard services are supported (int 9h)",
                      "Serial services are supported (int 14h)",
                      "Printer services are supported (int 17h)",
                      "CGA/mono video services are supported (int 10h)",
                      "ACPI is supported",
                      "Smart battery is supported",
                      "BIOS boot specification is supported",
                      "Function key-initiated network boot is supported",
                      "Targeted content distribution is supported"
                    ],
                    "bios_revision": "4.6",
                    "firmware_revision": "0.0"
                  }
                },
                ...
              ]

              Updated Parsers

              The netstat command is now supported on macOS:

              $ jc -p netstat
              [
                {
                  "proto": "tcp4",
                  "recv_q": 0,
                  "send_q": 0,
                  "local_address": "mylaptop.local",
                  "foreign_address": "173.199.15.254",
                  "state": "SYN_SENT   ",
                  "kind": "network",
                  "local_port": "57561",
                  "foreign_port": "https",
                  "transport_protocol": "tcp",
                  "network_protocol": "ipv4",
                  "local_port_num": 57561
                },
                {
                  "proto": "tcp4",
                  "recv_q": 0,
                  "send_q": 0,
                  "local_address": "mylaptop.local",
                  "foreign_address": "192.0.71.3",
                  "state": "ESTABLISHED",
                  "kind": "network",
                  "local_port": "57525",
                  "foreign_port": "https",
                  "transport_protocol": "tcp",
                  "network_protocol": "ipv4",
                  "local_port_num": 57525
                },
                ...
              ]

              The netstat parser has been enhanced to support the -r (routes) and -i (interfaces) options on both linux and macOS.

              $ jc -p netstat -r
              [
                {
                  "destination": "default",
                  "gateway": "router.local",
                  "route_flags": "UGSc",
                  "route_refs": 102,
                  "use": 24,
                  "iface": "en0",
                  "kind": "route"
                },
                {
                  "destination": "127",
                  "gateway": "localhost",
                  "route_flags": "UCS",
                  "route_refs": 0,
                  "use": 0,
                  "iface": "lo0",
                  "kind": "route"
                },
                ...
              ]
              $ jc -p netstat -i
              [
                {
                  "iface": "lo0",
                  "mtu": 16384,
                  "network": "<Link#1>",
                  "address": null,
                  "ipkts": 1777797,
                  "ierrs": 0,
                  "opkts": 1777797,
                  "oerrs": 0,
                  "coll": 0,
                  "kind": "interface"
                },
                {
                  "iface": "lo0",
                  "mtu": 16384,
                  "network": "127",
                  "address": "localhost",
                  "ipkts": 1777797,
                  "ierrs": null,
                  "opkts": 1777797,
                  "oerrs": null,
                  "coll": null,
                  "kind": "interface"
                },
                {
                  "iface": "lo0",
                  "mtu": 16384,
                  "network": "localhost",
                  "address": "::1",
                  "ipkts": 1777797,
                  "ierrs": null,
                  "opkts": 1777797,
                  "oerrs": null,
                  "coll": null,
                  "kind": "interface"
                },
                ...
              ]

              The stat command is now supported on macOS.

              $ jc -p stat jc*
              [
                {
                  "file": "jc-1.11.1-linux.sha256",
                  "device": "16778221",
                  "inode": 82163627,
                  "flags": "-rw-r--r--",
                  "links": 1,
                  "user": "joeuser",
                  "group": "staff",
                  "rdev": 0,
                  "size": 69,
                  "access_time": "May 26 08:27:44 2020",
                  "modify_time": "May 24 18:47:25 2020",
                  "change_time": "May 24 18:51:21 2020",
                  "birth_time": "May 24 18:47:25 2020",
                  "block_size": 4096,
                  "blocks": 8,
                  "osx_flags": "0"
                },
                {
                  "file": "jc-1.11.1-linux.tar.gz",
                  "device": "16778221",
                  "inode": 82163628,
                  "flags": "-rw-r--r--",
                  "links": 1,
                  "user": "joeuser",
                  "group": "staff",
                  "rdev": 0,
                  "size": 20226936,
                  "access_time": "May 26 08:27:44 2020",
                  "modify_time": "May 24 18:47:25 2020",
                  "change_time": "May 24 18:47:25 2020",
                  "birth_time": "May 24 18:47:25 2020",
                  "block_size": 4096,
                  "blocks": 39512,
                  "osx_flags": "0"
                },
                ...
              ]

              Schema Changes

              There are no schema changes in this release.

              Full Parser List

              • airport -I
              • airport -s
              • arp
              • blkid
              • crontab
              • crontab-u
              • CSV
              • df
              • dig
              • dmidecode
              • du
              • env
              • file
              • free
              • fstab
              • /etc/group
              • /etc/gshadow
              • history
              • /etc/hosts
              • id
              • ifconfig
              • INI
              • iptables
              • jobs
              • last and lastb
              • ls
              • lsblk
              • lsmod
              • lsof
              • mount
              • netstat
              • ntpq
              • /etc/passwd
              • pip list
              • pip show
              • ps
              • route
              • /etc/shadow
              • ss
              • stat
              • systemctl
              • systemctl list-jobs
              • systemctl list-sockets
              • systemctl list-unit-files
              • timedatectl
              • uname -a
              • uptime
              • w
              • who
              • XML
              • YAML

              For more information on the motivations for creating jc, see my blog post.

              Happy parsing!

              Featured

              JC Version 1.10.2 Released

              Try the jc web demo!

              I’m happy to announce the release of jc version 1.10.2 available on github and pypi. See below for more information on the new features.

              To upgrade, run:

              $ pip3 install --upgrade jc

              New Features

              jc now supports color output by default when printing to the terminal. Color is automatically disabled when piping to another program. The -m (monochrome) option can be used to disable color output to the terminal.
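
For example, here is a quick illustration of that behavior (df is just an arbitrary command):

$ jc -p df             # colorized output when printing to the terminal
$ jc -p df | cat       # color is automatically disabled when piped
$ jc -p -m df          # -m forces monochrome output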

              New Parsers

              No new parsers in this release.

              Updated Parsers

              • file command parser: minor fix for some edge cases
              • arp command parser: fix macOS detection for some edge cases
              • dig command parser: add axfr support

              Schema Changes

              The dig command parser now supports the axfr option. The schema has been updated to add this section:

              $ jc -p dig @81.4.108.41 axfr zonetransfer.me
              [
                {
                  "axfr": [
                    {
                      "name": "zonetransfer.me.",
                      "ttl": 7200,
                      "class": "IN",
                      "type": "SOA",
                      "data": "nsztm1.digi.ninja. robin.digi.ninja. 2019100801 172800 900 1209600 3600"
                    },
                    {
                      "name": "zonetransfer.me.",
                      "ttl": 300,
                      "class": "IN",
                      "type": "HINFO",
                      "data": "\"Casio fx-700G\" \"Windows XP\""
                    },
                    {
                      "name": "zonetransfer.me.",
                      "ttl": 301,
                      "class": "IN",
                      "type": "TXT",
                      "data": "\"google-site-verification=tyP28J7JAUHA9fw2sHXMgcCC0I6XBmmoVi04VlMewxA\""
                    },
                    ...
                  ],
                  "query_time": 805,
                  "server": "81.4.108.41#53(81.4.108.41)",
                  "when": "Thu Apr 09 08:05:31 PDT 2020",
                  "size": "50 records (messages 1, bytes 1994)"
                }
              ]

              Full Parser List

              • airport -I
              • airport -s
              • arp
              • blkid
              • crontab
              • crontab-u
              • CSV
              • df
              • dig
              • du
              • env
              • file
              • free
              • fstab
              • /etc/group
              • /etc/gshadow
              • history
              • /etc/hosts
              • id
              • ifconfig
              • INI
              • iptables
              • jobs
              • last and lastb
              • ls
              • lsblk
              • lsmod
              • lsof
              • mount
              • netstat
              • ntpq
              • /etc/passwd
              • pip list
              • pip show
              • ps
              • route
              • /etc/shadow
              • ss
              • stat
              • systemctl
              • systemctl list-jobs
              • systemctl list-sockets
              • systemctl list-unit-files
              • timedatectl
              • uname -a
              • uptime
              • w
              • who
              • XML
              • YAML

              For more information on the motivations for creating jc, see my blog post.

              Happy parsing!

              Featured

              Jello: The JQ Alternative for Pythonistas

              Built on jello:

              Jello Explorer (jellex): TUI interactive JSON filter using Python syntax

              jello web demo

I’m a big fan of using structured data at the command line. So much so that I’ve written a couple of utilities to promote JSON in the CLI.

              Typically I use jq to filter and process the JSON output into submission until I get what I want. But if you’re anything like me, you spend a lot of time googling how to do what you want in jq because the syntax can get a little out of hand. In fact, I keep notes with example jq queries I’ve used before in case I need those techniques again.

              jq is great for simple things, but sometimes when I want to iterate through a deeply nested structure with arrays of objects I find python’s list and dictionary syntax easier to comprehend.

              Hello jello

              That’s why I created jello. jello works similarly to jq but uses the python interpreter, so you can iterate with loops, comprehensions, variables, expressions, etc. just like you would in a full-fledged python script.

              The nice thing about jello is that it removes a lot of the boilerplate code you would need to ingest and output the JSON or JSON Lines data so you can focus on the logic.

              Let’s take the following output from jc -ap:

              $ jc -ap
              {
                "name": "jc",
                "version": "1.9.2",
                "description": "jc cli output JSON conversion tool",
                "author": "Kelly Brazil",
                "author_email": "kellyjonbrazil@gmail.com",
                "parser_count": 50,
                "parsers": [
                  {
                    "name": "airport",
                    "argument": "--airport",
                    "version": "1.0",
                    "description": "airport -I command parser",
                    "author": "Kelly Brazil",
                    "author_email": "kellyjonbrazil@gmail.com",
                    "compatible": [
                      "darwin"
                    ],
                    "magic_commands": [
                      "airport -I"
                    ]
                  },
                  {
                    "name": "airport_s",
                    "argument": "--airport-s",
                    "version": "1.0",
                    "description": "airport -s command parser",
                    "author": "Kelly Brazil",
                    "author_email": "kellyjonbrazil@gmail.com",
                    "compatible": [
                      "darwin"
                    ],
                    "magic_commands": [
                      "airport -s"
                    ]
                  },
                  ...
              ]

              Let’s say I want a list of the parser names that are compatible with macOS. Here is a jq query that will get down to that level:

              $ jc -a | jq '[.parsers[] | select(.compatible[] | contains("darwin")) | .name]' 
              [
                "airport",
                "airport_s",
                "arp",
                "crontab",
                "crontab_u",
                "csv",
                ...
              ]

              This is not too terribly bad, but you need to be careful about bracket and parenthesis placements. Here’s the same query in jello:

              $ jc -a | jello '[parser.name for parser in _.parsers if "darwin" in parser.compatible]'
              [
                "airport",
                "airport_s",
                "arp",
                "crontab",
                "crontab_u",
                "csv",
                ...
              ]

As you can see, jello gives you the JSON or JSON Lines input as a dictionary or list of dictionaries assigned to ‘_’. Then you process it as you’d like using standard python syntax, with the convenience of dot notation. jello automatically takes care of slurping the input and printing valid JSON or JSON Lines depending on the value of the last expression.
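
Here is a toy example of the dot notation (any JSON input works the same way):

$ echo '{"foo": {"bar": [1, 2, 3]}}' | jello '_.foo.bar[1]'
2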

The example above is not quite as terse as using jq, but it’s more readable to someone who is familiar with python list comprehensions. As with any programming language, there are multiple ways to skin a cat. We can also do a similar query with a for loop:

              $ jc -a | jello '\
              result = []
              for parser in _.parsers:
                if "darwin" in parser.compatible:
                  result.append(parser.name)
              result'
              [
                "airport",
                "airport_s",
                "arp",
                "crontab",
                "crontab_u",
                "csv",
                ...
              ]
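
For contrast, here is roughly the same query in plain Python without jello. Note the stdin and JSON boilerplate that jello handles for you (a sketch; the field names come from the jc -a output above):

$ jc -a | python3 -c '
import sys, json
data = json.load(sys.stdin)
result = [p["name"] for p in data["parsers"] if "darwin" in p["compatible"]]
print(json.dumps(result, indent=2))'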

              Advanced JSON Processing

              These are very simple examples and jq syntax might be ok here (though I prefer python syntax). But what if we try to do something more complex? Let’s take one of the advanced examples from the excellent jq tutorial by Matthew Lincoln.

Under Grouping and Counting, Matthew describes an advanced jq filter against a sample Twitter dataset in JSON Lines format. There he sets out the following task:

              “We can now create a table of users. Let’s create a table with columns for the user id, user name, followers count, and a column of their tweet ids separated by a semicolon.”

              https://programminghistorian.org/en/lessons/json-and-jq

              Here is the final jq query:

              $ cat twitterdata.jlines | jq -s 'group_by(.user) | 
                                               .[] | 
                                               {
                                                 user_id: .[0].user.id, 
                                                 user_name: .[0].user.screen_name, 
                                                 user_followers: .[0].user.followers_count, 
                                                 tweet_ids: [.[].id | tostring] | join(";")
                                               }'
              ...
              {
                "user_id": 47073035,
                "user_name": "msoltanm",
                "user_followers": 63,
                "tweet_ids": "619172275741298700"
              }
              {
                "user_id": 2569107372,
                "user_name": "SlavinOleg",
                "user_followers": 35,
                "tweet_ids": "501064198973960200;501064202794971140;501064214467731460;501064215759568900;501064220121632800"
              }
              {
                "user_id": 2369225023,
                "user_name": "SkogCarla",
                "user_followers": 10816,
                "tweet_ids": "501064217667960800"
              }
              {
                "user_id": 2477475030,
                "user_name": "bennharr",
                "user_followers": 151,
                "tweet_ids": "501064201503113200"
              }
              {
                "user_id": 42226593,
                "user_name": "shirleycolleen",
                "user_followers": 2114,
                "tweet_ids": "619172281294655500;619172179960328200"
              }
              ...

              This is a fantastic query! It’s actually deceptively simple looking – it takes quite a few paragraphs for Matthew to describe how it works and there are some tricky brackets, braces, and parentheses in there that need to be set just right. Let’s see how we could tackle this task with jello using standard python syntax:

              $ cat twitterdata.jlines | jello -l '\
              user_ids = set()
              for tweet in _:
                  user_ids.add(tweet.user.id)
              result = []
              for user in user_ids:
                  user_profile = {}
                  tweet_ids = []
                  for tweet in _:
                      if tweet.user.id == user:
                          user_profile.update({
                              "user_id": user,
                              "user_name": tweet.user.screen_name,
                              "user_followers": tweet.user.followers_count})
                          tweet_ids.append(str(tweet.id))
                  user_profile["tweet_ids"] = ";".join(tweet_ids)
                  result.append(user_profile)
              result'
              ...
              {"user_id": 2696111005, "user_name": "EGEVER142", "user_followers": 1433, "tweet_ids": "619172303654518784"}
              {"user_id": 42226593, "user_name": "shirleycolleen", "user_followers": 2114, "tweet_ids": "619172281294655488;619172179960328192"}
              {"user_id": 106948003, "user_name": "MrKneeGrow", "user_followers": 172, "tweet_ids": "501064228627705857"}
              {"user_id": 18270633, "user_name": "ahhthatswhy", "user_followers": 559, "tweet_ids": "501064204661850113"}
              {"user_id": 14331818, "user_name": "edsu", "user_followers": 4220, "tweet_ids": "615973042443956225;618602288781860864"}
              {"user_id": 2569107372, "user_name": "SlavinOleg", "user_followers": 35, "tweet_ids": "501064198973960192;501064202794971136;501064214467731457;501064215759568897;501064220121632768"}
              {"user_id": 22668719, "user_name": "nodehyena", "user_followers": 294, "tweet_ids": "501064222772445187"}
              ...

So that’s 17 lines of python… again, not as terse as jq, but for pythonistas it’s probably a lot easier to understand what is going on. This is a pretty simple and naive implementation – there are probably better approaches that are shorter, simpler, and faster (one is sketched below), but the point is that I can come back six months from now and still understand it if I need to debug or tweak it.
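
For example, a single-pass version that groups tweets into a dictionary keyed on user id avoids rescanning the whole dataset for every user. This is just a sketch under the same assumptions as the script above and has not been run against the original dataset:

$ cat twitterdata.jlines | jello -l '\
users = {}
for tweet in _:
    entry = users.setdefault(tweet.user.id, {
        "user_id": tweet.user.id,
        "user_name": tweet.user.screen_name,
        "user_followers": tweet.user.followers_count,
        "tweet_ids": []})
    entry["tweet_ids"].append(str(tweet.id))
[dict(u, tweet_ids=";".join(u["tweet_ids"])) for u in users.values()]'

Either version produces the same user table.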

              Just for fun, let’s pipe this result through jtbl to see what it looks like:

                 user_id  user_name          user_followers  tweet_ids
              ----------  ---------------  ----------------  ----------------------------------------------------------------------------------------------
              ...
              2481812382  SadieODoyle                    42  501064200035516416
              2696111005  EGEVER142                    1433  619172303654518784
                42226593  shirleycolleen               2114  619172281294655488;619172179960328192
               106948003  MrKneeGrow                    172  501064228627705857
                18270633  ahhthatswhy                   559  501064204661850113
                14331818  edsu                         4220  615973042443956225;618602288781860864
              2569107372  SlavinOleg                     35  501064198973960192;501064202794971136;501064214467731457;501064215759568897;501064220121632768
                22668719  nodehyena                     294  501064222772445187
                23598003  victoriasview                1163  501064228288364546
               851336634  20mUsa                      15643  50106414
              ...

              Very cool! Find more examples at https://github.com/kellyjonbrazil/jello. I hope you find jello useful in your command line pipelines.

              Try Jello Explorer and the jello web demo!

              Featured

              JC Version 1.9.0 Released

              Try the jc web demo!

              I’m happy to announce the release of jc version 1.9.0 available on github and pypi. See below for more information on the new features and parsers.

              To upgrade, run:

              $ pip3 install --upgrade jc

              jc In The News!

              The Linux Unplugged podcast gave a shoutout to jc on their February 18, 2020 episode for their App Pick segment. The discussion starts at 45:47. Go check out the podcast!

              New Parsers

              jc now includes 50 parsers! New parsers (tested on linux and OSX) include airport -I, airport -s, file, ntpq -p, and timedatectl commands.

              Documentation and schemas for all parsers can be found here.

              airport -I command parser

              OSX support for the airport -I command:

              $ airport -I | jc --airport -p          # or:  jc -p airport -I
              {
                "agrctlrssi": -66,
                "agrextrssi": 0,
                "agrctlnoise": -90,
                "agrextnoise": 0,
                "state": "running",
                "op_mode": "station",
                "lasttxrate": 195,
                "maxrate": 867,
                "lastassocstatus": 0,
                "802_11_auth": "open",
                "link_auth": "wpa2-psk",
                "bssid": "3c:37:86:15:ad:f9",
                "ssid": "SnazzleDazzle",
                "mcs": 0,
                "channel": "48,80"
              }

              airport -s command parser

OSX support for the airport -s command:

$ airport -s | jc --airport-s -p          # or:  jc -p airport -s
              [
                {
                  "ssid": "DIRECT-4A-HP OfficeJet 3830",
                  "bssid": "00:67:eb:2a:a7:3b",
                  "rssi": -90,
                  "channel": "6",
                  "ht": true,
                  "cc": "--",
                  "security": [
                    "WPA2(PSK/AES/AES)"
                  ]
                },
                {
                  "ssid": "Latitude38",
                  "bssid": "c0:ff:d5:d2:7a:f3",
                  "rssi": -85,
                  "channel": "11",
                  "ht": true,
                  "cc": "US",
                  "security": [
                    "WPA2(PSK/AES/AES)"
                  ]
                },
                {
                  "ssid": "xfinitywifi",
                  "bssid": "6e:e3:0e:b8:45:99",
                  "rssi": -83,
                  "channel": "11",
                  "ht": true,
                  "cc": "US",
                  "security": [
                    "NONE"
                  ]
                },
                ...
              ]

              file command parser

              Linux and OSX support for the file command:

$ file * | jc --file -p          # or:  jc -p file *
              [
                {
                  "filename": "Applications",
                  "type": "directory"
                },
                {
                  "filename": "another file with spaces",
                  "type": "empty"
                },
                {
                  "filename": "argstest.py",
                  "type": "Python script text executable, ASCII text"
                },
                {
                  "filename": "blkid-p.out",
                  "type": "ASCII text"
                },
                {
                  "filename": "blkid-pi.out",
                  "type": "ASCII text, with very long lines"
                },
                {
                  "filename": "cd_catalog.xml",
                  "type": "XML 1.0 document text, ASCII text, with CRLF line terminators"
                },
                {
                  "filename": "centosserial.sh",
                  "type": "Bourne-Again shell script text executable, UTF-8 Unicode text"
                },
                ...
              ]

              ntpq command parser

Linux support for the ntpq -p command:

              $ ntpq -p | jc --ntpq -p          # or:  jc -p ntpq -p
              [
                {
                  "remote": "44.190.6.254",
                  "refid": "127.67.113.92",
                  "st": 2,
                  "t": "u",
                  "when": 1,
                  "poll": 64,
                  "reach": 1,
                  "delay": 23.399,
                  "offset": -2.805,
                  "jitter": 2.131,
                  "state": null
                },
                {
                  "remote": "mirror1.sjc02.s",
                  "refid": "216.218.254.202",
                  "st": 2,
                  "t": "u",
                  "when": 2,
                  "poll": 64,
                  "reach": 1,
                  "delay": 29.325,
                  "offset": 1.044,
                  "jitter": 4.069,
                  "state": null
                }
              ]

              timedatectl command parser

              Linux support for the timedatectl command:

              $ timedatectl | jc --timedatectl -p          # or:  jc -p timedatectl
              {
                "local_time": "Tue 2020-03-10 17:53:21 PDT",
                "universal_time": "Wed 2020-03-11 00:53:21 UTC",
                "rtc_time": "Wed 2020-03-11 00:53:21",
                "time_zone": "America/Los_Angeles (PDT, -0700)",
                "ntp_enabled": true,
                "ntp_synchronized": true,
                "rtc_in_local_tz": false,
                "dst_active": true
              }

              Updated Parsers

              No updated parsers in this release.

              Schema Changes

              There are no schema changes in this release.

              Full Parser List

              • airport -I
              • airport -s
              • arp
              • blkid
              • crontab
              • crontab-u
              • CSV
              • df
              • dig
              • du
              • env
              • file
              • free
              • fstab
              • /etc/group
              • /etc/gshadow
              • history
              • /etc/hosts
              • id
              • ifconfig
              • INI
              • iptables
              • jobs
              • last and lastb
              • ls
              • lsblk
              • lsmod
              • lsof
              • mount
              • netstat
              • ntpq
              • /etc/passwd
              • pip list
              • pip show
              • ps
              • route
              • /etc/shadow
              • ss
              • stat
              • systemctl
              • systemctl list-jobs
              • systemctl list-sockets
              • systemctl list-unit-files
              • timedatectl
              • uname -a
              • uptime
              • w
              • who
              • XML
              • YAML

              For more information on the motivations for creating jc, see my blog post.

              Happy parsing!

              Featured

              JSON Tables in the Terminal

              The other day I was looking around for a simple command-line tool to print JSON and JSON Lines data to a table in the terminal. I found a few programs that can do it with some massaging of the data, like visidata, jt, and json-table, but these really didn’t meet my requirements.

              I wanted to pipe JSON or JSON Lines data into a program and get a nicely formatted table with correct headers without any additional configuration or arguments. I also wanted it to automatically fit the terminal width and wrap or truncate the columns to fit the data with no complicated configuration. Basically, I just wanted it to “do the right thing” so I can view JSON data in a tabular format without any fuss.

              I ended up creating a little command-line utility called jtbl that does exactly that:

              $ cat cities.json | jtbl 
                LatD    LatM    LatS  NS      LonD    LonM    LonS  EW    City               State
              ------  ------  ------  ----  ------  ------  ------  ----  -----------------  -------
                  41       5      59  N         80      39       0  W     Youngstown         OH
                  42      52      48  N         97      23      23  W     Yankton            SD
                  46      35      59  N        120      30      36  W     Yakima             WA
                  42      16      12  N         71      48       0  W     Worcester          MA
                  43      37      48  N         89      46      11  W     Wisconsin Dells    WI
                  36       5      59  N         80      15       0  W     Winston-Salem      NC
                  49      52      48  N         97       9       0  W     Winnipeg           MB

jtbl is simple and elegant. It just takes in piped JSON or JSON Lines data and prints a table. There’s only one option, which truncates columns instead of wrapping them when the terminal is too narrow to display the complete table.

              $ jtbl -h
              jtbl:   Converts JSON and JSON Lines to a table
              
              Usage:  <JSON Data> | jtbl [OPTIONS]
              
                      -t  truncate data instead of wrapping if too long for the terminal width
                      -v  version info
                      -h  help

              Here’s an example using a relatively slim terminal width of 75:

              $ jc dig www.cnn.com | jq '.[].answer' | jtbl 
              ╒═════════════════╤═════════╤════════╤═══════╤═════════════════╕
              │ name            │ class   │ type   │   ttl │ data            │
              ╞═════════════════╪═════════╪════════╪═══════╪═════════════════╡
              │ www.cnn.com.    │ IN      │ CNAME  │   201 │ turner-tls.map. │
              │                 │         │        │       │ fastly.net.     │
              ├─────────────────┼─────────┼────────┼───────┼─────────────────┤
              │ turner-tls.map. │ IN      │ A      │    22 │ 151.101.189.67  │
              │ fastly.net.     │         │        │       │                 │
              ╘═════════════════╧═════════╧════════╧═══════╧═════════════════╛

              or with truncation enabled:

              $ jc dig www.cnn.com | jq '.[].answer' | jtbl -t 
              name                  class    type      ttl  data
              --------------------  -------  ------  -----  --------------------
              www.cnn.com.          IN       CNAME     219  turner-tls.map.fastl
              turner-tls.map.fastl  IN       A          10  151.101.189.67

              Here’s an example using it to print the result of an XML API query response, converted to JSON with jc, and filtered with jq:

              $ curl -X GET --basic -u "testuser:testpassword" https://reststop.randomhouse.com/resources/works/19306 | jc --xml | jq '.work' | jtbl
              ╒═════════════╤══════════╤══════════╤════════════╤══════════════╤════════════╤═════════════╤══════════════╤══════════════╤════════════╕
              │ authorweb   │ titles   │   workid │ @uri       │ onsaledate   │ series     │ titleAuth   │ titleSubti   │ titleshort   │ titleweb   │
              │             │          │          │            │              │            │             │ tleAuth      │              │            │
              ╞═════════════╪══════════╪══════════╪════════════╪══════════════╪════════════╪═════════════╪══════════════╪══════════════╪════════════╡
              │ BROWN, DAN  │          │    19306 │ https://re │ 2003-09-02   │ Robert Lan │ Angels & D  │ Angels & D   │ ANGELS & D   │ Angels & D │
              │             │          │          │ ststop.ran │ T00:00:00-   │ gdon       │ emons : Da  │ emons :  :   │ EMON(LPTP)   │ emons      │
              │             │          │          │ domhouse.c │ 04:00        │            │ n Brown     │  Dan Brown   │ (REI)(MTI)   │            │
              │             │          │          │ om/resourc │              │            │             │              │              │            │
              │             │          │          │ es/works/1 │              │            │             │              │              │            │
              │             │          │          │ 9306       │              │            │             │              │              │            │
              ╘═════════════╧══════════╧══════════╧════════════╧══════════════╧════════════╧═════════════╧══════════════╧══════════════╧════════════╛

              Again, with truncation enabled:

              $ curl -X GET --basic -u "testuser:testpassword" https://reststop.randomhouse.com/resources/works/19306 | jc --xml | jq '.work' | jtbl -t
              authorweb    titles      workid  @uri        onsaledate    series      titleAuth    titleSubti    titleshort    titleweb
              -----------  --------  --------  ----------  ------------  ----------  -----------  ------------  ------------  ----------
              BROWN, DAN                19306  https://re  2003-09-02    ROBERT LAN  Angels & D   Angels & D    ANGELS & D    Angels & D

I found that being able to quickly see JSON data in a tabular, horizontal format can sometimes help me visualize ‘where I am’ in the data more easily than scanning long vertical lists of JSON.

              I hope you enjoy it!

              Featured

              JC Version 1.8.0 Released

              Try the jc web demo!

              I’m excited to announce the release of jc version 1.8.0 available on github and pypi. See below for more information on the new features and parsers.

              To upgrade, run:

              $ pip3 install --upgrade jc

              New Parsers

              jc now includes 45 parsers! New parsers (tested on linux and OSX) include blkid, last, lastb, who, /etc/passwd files, /etc/shadow files, /etc/group files, /etc/gshadow files, and CSV files.

              Documentation and schemas for all parsers can be found here.

              blkid command parser

              Linux support for the blkid command:

              $ blkid | jc --blkid -p          # or:  jc -p blkid
              [
                {
                  "device": "/dev/sda1",
                  "uuid": "05d927ab-5875-49e4-ada1-7f46cb32c932",
                  "type": "xfs"
                },
                {
                  "device": "/dev/sda2",
                  "uuid": "3klkIj-w1kk-DkJi-0XBJ-y3i7-i2Ac-vHqWBM",
                  "type": "LVM2_member"
                },
                {
                  "device": "/dev/mapper/centos-root",
                  "uuid": "07d718ff-950c-4e5b-98f0-42a1147c77d9",
                  "type": "xfs"
                },
                {
                  "device": "/dev/mapper/centos-swap",
                  "uuid": "615eb89a-bcbf-46fd-80e3-c483ff5c931f",
                  "type": "swap"
                }
              ]
              
              $ sudo blkid -o udev -ip /dev/sda2 | jc --blkid -p          # or:  sudo jc -p blkid -o udev -ip /dev/sda2
              [
                {
                  "id_fs_uuid": "3klkIj-w1kk-DkJi-0XBJ-y3i7-i2Ac-vHqWBM",
                  "id_fs_uuid_enc": "3klkIj-w1kk-DkJi-0XBJ-y3i7-i2Ac-vHqWBM",
                  "id_fs_version": "LVM2\x20001",
                  "id_fs_type": "LVM2_member",
                  "id_fs_usage": "raid",
                  "id_iolimit_minimum_io_size": 512,
                  "id_iolimit_physical_sector_size": 512,
                  "id_iolimit_logical_sector_size": 512,
                  "id_part_entry_scheme": "dos",
                  "id_part_entry_type": "0x8e",
                  "id_part_entry_number": 2,
                  "id_part_entry_offset": 2099200,
                  "id_part_entry_size": 39843840,
                  "id_part_entry_disk": "8:0"
                }
              ]

              last and lastb command parsers

              Linux and OSX support for the last command. Linux support for the lastb command.

              $ last | jc --last -p          # or:  jc -p last
              [
                {
                  "user": "joeuser",
                  "tty": "ttys002",
                  "hostname": null,
                  "login": "Thu Feb 27 14:31",
                  "logout": "still logged in"
                },
                {
                  "user": "joeuser",
                  "tty": "ttys003",
                  "hostname": null,
                  "login": "Thu Feb 27 10:38",
                  "logout": "10:38",
                  "duration": "00:00"
                },
                {
                  "user": "joeuser",
                  "tty": "ttys003",
                  "hostname": null,
                  "login": "Thu Feb 27 10:18",
                  "logout": "10:18",
                  "duration": "00:00"
                },
                ...
              ]
              
              $ sudo lastb | jc --last -p          # or:  sudo jc -p lastb
              [
                {
                  "user": "joeuser",
                  "tty": "ssh:notty",
                  "hostname": "127.0.0.1",
                  "login": "Tue Mar 3 00:48",
                  "logout": "00:48",
                  "duration": "00:00"
                },
                {
                  "user": "joeuser",
                  "tty": "ssh:notty",
                  "hostname": "127.0.0.1",
                  "login": "Tue Mar 3 00:48",
                  "logout": "00:48",
                  "duration": "00:00"
                },
                {
                  "user": "jouser",
                  "tty": "ssh:notty",
                  "hostname": "127.0.0.1",
                  "login": "Tue Mar 3 00:48",
                  "logout": "00:48",
                  "duration": "00:00"
                }
              ]

              who command parser

              Linux and OSX support for the who command:

              $ who | jc --who -p          # or:  jc -p who
              [
                {
                  "user": "joeuser",
                  "tty": "ttyS0",
                  "time": "2020-03-02 02:52"
                },
                {
                  "user": "joeuser",
                  "tty": "pts/0",
                  "time": "2020-03-02 05:15",
                  "from": "192.168.71.1"
                }
              ]
              
              $ who -a | jc --who -p          # or:  jc -p who -a
              [
                {
                  "event": "reboot",
                  "time": "Feb 7 23:31",
                  "pid": 1
                },
                {
                  "user": "joeuser",
                  "writeable_tty": "-",
                  "tty": "console",
                  "time": "Feb 7 23:32",
                  "idle": "old",
                  "pid": 105
                },
                {
                  "user": "joeuser",
                  "writeable_tty": "+",
                  "tty": "ttys000",
                  "time": "Feb 13 16:44",
                  "idle": ".",
                  "pid": 51217,
                  "comment": "term=0 exit=0"
                },
                {
                  "user": "joeuser",
                  "writeable_tty": "?",
                  "tty": "ttys003",
                  "time": "Feb 28 08:59",
                  "idle": "01:36",
                  "pid": 41402
                },
                {
                  "user": "joeuser",
                  "writeable_tty": "+",
                  "tty": "ttys004",
                  "time": "Mar 1 16:35",
                  "idle": ".",
                  "pid": 15679,
                  "from": "192.168.1.5"
                }
              ]

              CSV File Parser

Convert generic CSV files to JSON. The parser will attempt to automatically detect the delimiter character; if it cannot, it will fall back to the comma (‘,’). The file must contain a header row as the first line:

              $ cat homes.csv 
              "Sell", "List", "Living", "Rooms", "Beds", "Baths", "Age", "Acres", "Taxes"
              142, 160, 28, 10, 5, 3,  60, 0.28,  3167
              175, 180, 18,  8, 4, 1,  12, 0.43,  4033
              129, 132, 13,  6, 3, 1,  41, 0.33,  1471
              ...
              
              $ cat homes.csv | jc --csv -p
              [
                {
                  "Sell": "142",
                  "List": "160",
                  "Living": "28",
                  "Rooms": "10",
                  "Beds": "5",
                  "Baths": "3",
                  "Age": "60",
                  "Acres": "0.28",
                  "Taxes": "3167"
                },
                {
                  "Sell": "175",
                  "List": "180",
                  "Living": "18",
                  "Rooms": "8",
                  "Beds": "4",
                  "Baths": "1",
                  "Age": "12",
                  "Acres": "0.43",
                  "Taxes": "4033"
                },
                {
                  "Sell": "129",
                  "List": "132",
                  "Living": "13",
                  "Rooms": "6",
                  "Beds": "3",
                  "Baths": "1",
                  "Age": "41",
                  "Acres": "0.33",
                  "Taxes": "1471"
                },
                ...
              ]
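
For the curious, delimiter auto-detection like this can be done with the csv.Sniffer class in Python’s standard library. Here is a sketch of the general technique (not necessarily jc’s exact implementation):

import csv

# sniff the dialect from a sample of the file, then parse the whole file with it
with open('homes.csv', newline='') as f:
    sample = f.read(1024)
    try:
        dialect = csv.Sniffer().sniff(sample)
    except csv.Error:
        dialect = 'excel'    # the default comma-delimited dialect
    f.seek(0)
    rows = list(csv.DictReader(f, dialect=dialect))

print(rows[0])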

              /etc/passwd, /etc/shadow, /etc/group, and /etc/gshadow file parsers

              Convert /etc/passwd, /etc/shadow, /etc/group, and /etc/gshadow files to JSON format:

              $ cat /etc/passwd | jc --passwd -p
              [
                {
                  "username": "nobody",
                  "password": "*",
                  "uid": -2,
                  "gid": -2,
                  "comment": "Unprivileged User",
                  "home": "/var/empty",
                  "shell": "/usr/bin/false"
                },
                {
                  "username": "root",
                  "password": "*",
                  "uid": 0,
                  "gid": 0,
                  "comment": "System Administrator",
                  "home": "/var/root",
                  "shell": "/bin/sh"
                },
                {
                  "username": "daemon",
                  "password": "*",
                  "uid": 1,
                  "gid": 1,
                  "comment": "System Services",
                  "home": "/var/root",
                  "shell": "/usr/bin/false"
                },
                ...
              ]
              
              $ sudo cat /etc/shadow | jc --shadow -p
              [
                {
                  "username": "root",
                  "password": "*",
                  "last_changed": 18113,
                  "minimum": 0,
                  "maximum": 99999,
                  "warn": 7,
                  "inactive": null,
                  "expire": null
                },
                {
                  "username": "daemon",
                  "password": "*",
                  "last_changed": 18113,
                  "minimum": 0,
                  "maximum": 99999,
                  "warn": 7,
                  "inactive": null,
                  "expire": null
                },
                {
                  "username": "bin",
                  "password": "*",
                  "last_changed": 18113,
                  "minimum": 0,
                  "maximum": 99999,
                  "warn": 7,
                  "inactive": null,
                  "expire": null
                },
                ...
              ]
              
              $ cat /etc/group | jc --group -p
              [
                {
                  "group_name": "nobody",
                  "password": "*",
                  "gid": -2,
                  "members": []
                },
                {
                  "group_name": "nogroup",
                  "password": "*",
                  "gid": -1,
                  "members": []
                },
                {
                  "group_name": "wheel",
                  "password": "*",
                  "gid": 0,
                  "members": [
                    "root"
                  ]
                },
                {
                  "group_name": "certusers",
                  "password": "*",
                  "gid": 29,
                  "members": [
                    "root",
                    "_jabber",
                    "_postfix",
                    "_cyrus",
                    "_calendar",
                    "_dovecot"
                  ]
                },
                ...
              ]
              
              $ cat /etc/gshadow | jc --gshadow -p
              [
                {
                  "group_name": "root",
                  "password": "*",
                  "administrators": [],
                  "members": []
                },
                {
                  "group_name": "adm",
                  "password": "*",
                  "administrators": [],
                  "members": [
                    "syslog",
                    "joeuser"
                  ]
                },
                ...
              ]

              Updated Parsers

              • The ls parser now supports filenames that contain newline characters when using ls -l or ls -b. A warning message will be sent to stderr if newlines are detected and ls -l or ls -b are not used:
              $ ls | jc --ls
              
              jc:  Warning - Newline characters detected. Filenames probably corrupted. Use ls -l or -b instead.
              
              [{"filename": "this file has"}, {"filename": "a newline inside"}, {"filename": "this file has"}, {"filename": "four contiguous newlines inside"}, ...]
              • The ls parser now supports multiple directory listings, globbing, and recursive listings.
              $ ls -R | jc --ls
              [{"filename": "centos-7.7"}, {"filename": "create_fixtures.sh"}, {"filename": "generic"}, {"filename": "osx-10.11.6"}, {"filename": "osx-10.14.6"}, ...]

              Alternative “Magic” Syntax

              jc now accepts a simplified syntax for most command parsers. Instead of piping the data into jc you can now also prepend “jc” to the command you would like to convert. Note that command aliases are not supported:

              $ jc dig www.example.com
              [{"id": 31113, "opcode": "QUERY", "status": "NOERROR", "flags": ["qr", "rd", "ra"], "query_num": 1, "answer_num": 1, "authority_num": 0, "additional_num": 1, "question": {"name": "www.example.com.", "class": "IN", "type": "A"}, "answer": [{"name": "www.example.com.", "class": "IN", "type": "A", "ttl": 35366, "data": "93.184.216.34"}], "query_time": 37, "server": "2600", "when": "Mon Mar 02 16:13:31 PST 2020", "rcvd": 60}]

              You can also insert jc options before the command:

              $ jc -pqd dig www.example.com
              [
                {
                  "id": 7495,
                  "opcode": "QUERY",
                  "status": "NOERROR",
                  "flags": [
                    "qr",
                    "rd",
                    "ra"
                  ],
                  "query_num": 1,
                  "answer_num": 1,
                  "authority_num": 0,
                  "additional_num": 1,
                  "question": {
                    "name": "www.example.com.",
                    "class": "IN",
                    "type": "A"
                  },
                  "answer": [
                    {
                      "name": "www.example.com.",
                      "class": "IN",
                      "type": "A",
                      "ttl": 36160,
                      "data": "93.184.216.34"
                    }
                  ],
                  "query_time": 40,
                  "server": "2600",
                  "when": "Mon Mar 02 16:15:21 PST 2020",
                  "rcvd": 60
                }
              ]

              Schema Changes

              There are no schema changes in this release.

              Full Parser List

              • arp
              • blkid
              • crontab
              • crontab-u
              • CSV
              • df
              • dig
              • du
              • env
              • free
              • fstab
              • /etc/group
              • /etc/gshadow
              • history
              • /etc/hosts
              • id
              • ifconfig
              • INI
              • iptables
              • jobs
              • last and lastb
              • ls
              • lsblk
              • lsmod
              • lsof
              • mount
              • netstat
              • /etc/passwd
              • pip list
              • pip show
              • ps
              • route
              • /etc/shadow
              • ss
              • stat
              • systemctl
              • systemctl list-jobs
              • systemctl list-sockets
              • systemctl list-unit-files
              • uname -a
              • uptime
              • w
              • who
              • XML
              • YAML

              For more information on the motivations for creating jc, see my blog post.

              Happy parsing!

              Featured

              Applying Orchestration and Choreography to Cybersecurity Automation

Imagine a world where the components of your security stack seamlessly integrate with each other, have access to the latest threat intelligence from internal and external sources, and automatically mitigate the most severe incidents. Suspicious files found in emails get sent to the closest sandbox for detonation; the hash and other IOCs are sent to endpoints, NGFWs, proxies, etc. to inoculate the organization; and all of the relevant information is sent to the SOC as an incident ticket.

Many organizations can at least do the above with a Security Orchestration Automation and Response (SOAR) platform implementation. Several vendors offer this type of Orchestration platform, including Splunk (Phantom), Palo Alto Networks (Demisto), Fortinet (Cybersponse), and IBM (Resilient). These platforms have become mainstream within the past few years, and with more and more cybersecurity professionals learning the python programming language, it has become easier to implement and customize them. In fact, no programming experience is needed at all for many use cases, since playbooks can be created and maintained with a graphical builder.

              Cybersponse Graphical Playbook Editor

I’m a big fan of using Orchestration to automate workflows with playbooks – in fact, I’ve written integrations for Phantom, Demisto, and FortiSOAR. But there is another automation paradigm that doesn’t get talked about as much in the cybersecurity realm: Choreography.

              Orchestration

So we already have an idea of what Orchestration is: a central repository of vendor integrations and associated actions that can be connected together in clever and novel ways to create playbooks. Playbooks are like scripts that run in response to incoming events or on a schedule, or can even be run manually, to automate repetitive tasks. This automation removes the human-error factor and can reduce the workload of the Security team.

              Centralized Automation with Orchestration

              The key piece about Orchestration is that it is centralized. There is typically a central server that has all of the vendor integration information and playbooks. Alarms, logs, alerts, etc. get sent to this server so it can act as the conductor and tell each security device in the stack what to do and when to do it.

              This approach has pros and cons:

              Pros:

              • Very flexible – you can make a playbook do almost anything you can think of
              • Can version control the playbooks in a central repository like git
              • Large libraries of vendor apps
• Typically have good user communities

              Cons:

              • Can be brittle if APIs change, unsupported vendors are introduced, or if there are connectivity issues to the central Orchestrator
              • Vendor lock-in to a SOAR platform / not open source
              • Can require python programming experience to onboard an unsupported security service or to create a complex playbook

              Let’s compare this to Choreography – the other, lesser-known automation paradigm available to us.

              Choreography

              Choreography? Where did that come from? Well, the concepts of Orchestration and Choreography come from the world of Service Oriented Architecture (SOA). SOA had some good ideas, but it didn’t really take off until it recently morphed and rebranded as Microservice Architecture. (Yes, this is an over-simplification for the scope of this post.)

              We almost take microservice architectures for granted now. Cloud application delivery and containerization of services are not as bleeding-edge as they were just a couple of years ago. We intuitively understand that microservices act independently yet are connected to other microservices to make up an application. The way these microservices are connected can be described as Orchestration or Choreography.

              Now we are just extending the metaphor and considering each piece of our security stack as a ‘microservice’. For example, your NGFW, sandbox, email security gateway, NAC, Intel feed, etc. are all cybersecurity microservices that need to be configured to talk to one another to enable your cybersecurity ‘application’.

              Distributed Automation with Choreography

              In the case of Choreography, each of these security ‘microservices’ (or security appliances) knows what they are supposed to do by subscribing to one or more channels on a message bus. This bus allows the service to receive alerts and IOC information in near-real-time and then publish their results on one or more channels on that same bus. It’s almost like layer 7 multicast for you router geeks out there.

              In this paradigm, there is no need for a central repository of rules or playbooks for many standard use-cases because the ‘fabric’ gets smarter as more and more different types of security services join. Unlike an orchestra, which follows the lead of the conductor, each service works independently based on its own configuration. Each service knows its own dance moves and works harmoniously in relation to the other services.
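
              To make the publish/subscribe mechanics more concrete, here is a minimal sketch of what a choreographed ‘sandbox’ service could look like, using the mosquitto MQTT command-line clients as a stand-in for the message bus. The bus hostname, channel names, and the analyze_sample step are all hypothetical – real implementations have their own client libraries and topic ontologies:

              #!/bin/bash
              # Hypothetical choreographed service: subscribe to a channel of
              # suspicious file hashes, analyze each one, and publish the verdict
              # back to the bus for other services to consume.
              mosquitto_sub -h bus.example.local -t security/suspicious-files |
              while read -r file_hash; do
                  verdict="$(analyze_sample "$file_hash")"   # hypothetical analysis step
                  mosquitto_pub -h bus.example.local -t security/ioc-results -m "$verdict"
              done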

              The Message Bus

              How does this work in the real world?

              There are a couple of examples of the Choreography approach being used in the cybersecurity realm. A proprietary implementation by Fortinet (disclaimer: I am a Fortinet employee) is called the “Security Fabric”.

              Fortinet Security Fabric

              The Fortinet Security Fabric

              Fortinet’s Security Fabric is a proprietary implementation that behaves like a message bus, learning about new Fortinet and Fabric-Ready ecosystem partner appliances and services as soon as they connect to the fabric. These services are configured to connect to the Security Fabric and take appropriate action when a security incident is identified.

              For example, after installing a FortiSandbox appliance and adding it to the Security Fabric, other Fortinet or “Fabric-Ready” partner appliances, such as the NGFW and Secure Email Gateway can send suspicious files they detect to the Security Fabric where the sandbox service is listening. The FortiSandbox, in turn, can publish the IOC results of the scans it performs to the Security Fabric so other Fortinet or Fabric-Ready partner appliances (e.g. NGFW, FortiGuard, FortiEDR) can ingest them and take appropriate action.

              This is very powerful. As more services are connected to the Security Fabric, it gets smarter, more capable, and scales – automatically.

              OpenDXL

              Another open-source, multi-vendor example of a message bus being used for cybersecurity Choreography is OpenDXL. OpenDXL was originally developed by McAfee as a security-specific message bus, but it was open-sourced under the Organization for the Advancement of Structured Information Standards (OASIS) Open Cybersecurity Alliance (OCA) project. (Disclaimer: Fortinet is a sponsor of OCA.) The project uses the message bus concept to integrate multiple cybersecurity services, drawing on well-known formats like STIX2 for its ontology.

              OpenDXL Architecture

              Some of the pros and cons of the Choreography approach:

              Pros:

              • The ‘fabric’ automatically gets smarter and more capable as more security services are connected
              • No need for dozens of boilerplate playbooks
              • Open-source and proprietary options available
              • No reliance on a central conductor – less brittle to Orchestrator outages or misconfigurations
              • Integrations “just work” together if they are part of the ecosystem

              Cons:

              • Less granular control over automation workflows
              • Open-source options are still maturing
              • Typically, no central repository for service configurations

              Which Way is the Best?

              We know that automation will improve our security operations, but which approach is best? Since Orchestration and Choreography have their own pros and cons that don’t overlap much, it probably makes sense to use both.

              Choreography can reduce the amount of boilerplate playbooks you need to bootstrap your automation initiative, while Orchestration can be used to automate higher-level business or incident response workflows.

              By applying the application architecture concepts of SOA and microservices to cybersecurity we can take security automation to the next level.

              Featured

              JC Version 1.7.1 Released

              Try the jc web demo!

              I’m happy to announce that jc version 1.7.1 has been released and is available on github and pypi. In addition to the new and updated parsers and features outlined below, some back-end code cleanup to improve performance along with minor bug fixes were completed.

              To upgrade, run:

              $ pip3 install --upgrade jc

              New Parsers

              jc now includes 37 parsers! New parsers (tested on Linux and OSX) include id, crontab-u, INI, XML, and YAML:

              id parser

              Linux and OSX support for the id command:

              $ id | jc --id -p
              {
                "uid": {
                  "id": 1000,
                  "name": "joeuser"
                },
                "gid": {
                  "id": 1000,
                  "name": "joeuser"
                },
                "groups": [
                  {
                    "id": 1000,
                    "name": "joeuser"
                  },
                  {
                    "id": 10,
                    "name": "wheel"
                  }
                ],
                "context": {
                  "user": "unconfined_u",
                  "role": "unconfined_r",
                  "type": "unconfined_t",
                  "level": "s0-s0:c0.c1023"
                }
              }
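
              Since the output is JSON, grabbing a value becomes a simple jq query. For example, something like this should list just the group names from the schema above:

              $ id | jc --id | jq -r '.groups[].name'
              joeuser
              wheel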

              crontab-u parser

              Some crontab files contain the user field. In this case, use the new crontab-u parser:

              $ cat /etc/crontab | jc --crontab-u -p
              {
                "variables": [
                  {
                    "name": "MAILTO",
                    "value": "root"
                  },
                  {
                    "name": "PATH",
                    "value": "/sbin:/bin:/usr/sbin:/usr/bin"
                  },
                  {
                    "name": "SHELL",
                    "value": "/bin/bash"
                  }
                ],
                "schedule": [
                  {
                    "minute": [
                      "5"
                    ],
                    "hour": [
                      "10-11",
                      "22"
                    ],
                    "day_of_month": [
                      "*"
                    ],
                    "month": [
                      "*"
                    ],
                    "day_of_week": [
                      "*"
                    ],
                    "user": "root",
                    "command": "/var/www/devdaily.com/bin/mk-new-links.php"
                  },
                  {
                    "minute": [
                      "30"
                    ],
                    "hour": [
                      "4/2"
                    ],
                    "day_of_month": [
                      "*"
                    ],
                    "month": [
                      "*"
                    ],
                    "day_of_week": [
                      "*"
                    ],
                    "user": "root",
                    "command": "/var/www/devdaily.com/bin/create-all-backups.sh"
                  },
                  {
                    "occurrence": "yearly",
                    "user": "root",
                    "command": "/home/maverick/bin/annual-maintenance"
                  },
                  {
                    "occurrence": "reboot",
                    "user": "root",
                    "command": "/home/cleanup"
                  },
                  {
                    "occurrence": "monthly",
                    "user": "root",
                    "command": "/home/maverick/bin/tape-backup"
                  }
                ]
              }
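
              With the schedule as structured data, a jq query sketched against the schema above can answer questions that would otherwise require fragile text munging – for example, listing all commands that run as root:

              $ cat /etc/crontab | jc --crontab-u | jq -r '.schedule[] | select(.user == "root") | .command'
              /var/www/devdaily.com/bin/mk-new-links.php
              /var/www/devdaily.com/bin/create-all-backups.sh
              /home/maverick/bin/annual-maintenance
              /home/cleanup
              /home/maverick/bin/tape-backup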

              INI file parser

              Convert generic INI files to JSON:

              $ cat example.ini
              [DEFAULT]
              ServerAliveInterval = 45
              Compression = yes
              CompressionLevel = 9
              ForwardX11 = yes
              
              [bitbucket.org]
              User = hg
              
              [topsecret.server.com]
              Port = 50022
              ForwardX11 = no
              
              $ cat example.ini | jc --ini -p
              {
                "bitbucket.org": {
                  "serveraliveinterval": "45",
                  "compression": "yes",
                  "compressionlevel": "9",
                  "forwardx11": "yes",
                  "user": "hg"
                },
                "topsecret.server.com": {
                  "serveraliveinterval": "45",
                  "compression": "yes",
                  "compressionlevel": "9",
                  "forwardx11": "no",
                  "port": "50022"
                }
              }
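
              Note that the keys are normalized to lowercase, as seen above, so a jq query against this output would look something like:

              $ cat example.ini | jc --ini | jq -r '."bitbucket.org".user'
              hg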

              XML file parser

              Convert generic XML files to JSON:

              $ cat cd_catalog.xml 
              <?xml version="1.0" encoding="UTF-8"?>
              <CATALOG>
                <CD>
                  <TITLE>Empire Burlesque</TITLE>
                  <ARTIST>Bob Dylan</ARTIST>
                  <COUNTRY>USA</COUNTRY>
                  <COMPANY>Columbia</COMPANY>
                  <PRICE>10.90</PRICE>
                  <YEAR>1985</YEAR>
                </CD>
                <CD>
                  <TITLE>Hide your heart</TITLE>
                  <ARTIST>Bonnie Tyler</ARTIST>
                  <COUNTRY>UK</COUNTRY>
                  <COMPANY>CBS Records</COMPANY>
                  <PRICE>9.90</PRICE>
                  <YEAR>1988</YEAR>
                </CD>
                ...
              
              $ cat cd_catalog.xml | jc --xml -p
              {
                "CATALOG": {
                  "CD": [
                    {
                      "TITLE": "Empire Burlesque",
                      "ARTIST": "Bob Dylan",
                      "COUNTRY": "USA",
                      "COMPANY": "Columbia",
                      "PRICE": "10.90",
                      "YEAR": "1985"
                    },
                    {
                      "TITLE": "Hide your heart",
                      "ARTIST": "Bonnie Tyler",
                      "COUNTRY": "UK",
                      "COMPANY": "CBS Records",
                      "PRICE": "9.90",
                      "YEAR": "1988"
                    },
                ...
              }
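
              And again, jq makes quick work of the converted document. For example, to pull all of the album titles from the output above:

              $ cat cd_catalog.xml | jc --xml | jq -r '.CATALOG.CD[].TITLE'
              Empire Burlesque
              Hide your heart
              ...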

              YAML file parser

              Convert YAML files to JSON – even files that contain multiple YAML documents:

              $ cat istio-mtls-permissive.yaml 
              apiVersion: "authentication.istio.io/v1alpha1"
              kind: "Policy"
              metadata:
                name: "default"
                namespace: "default"
              spec:
                peers:
                - mtls: {}
              ---
              apiVersion: "networking.istio.io/v1alpha3"
              kind: "DestinationRule"
              metadata:
                name: "default"
                namespace: "default"
              spec:
                host: "*.default.svc.cluster.local"
                trafficPolicy:
                  tls:
                    mode: ISTIO_MUTUAL
              
              $ cat istio-mtls-permissive.yaml | jc --yaml -p
              [
                {
                  "apiVersion": "authentication.istio.io/v1alpha1",
                  "kind": "Policy",
                  "metadata": {
                    "name": "default",
                    "namespace": "default"
                  },
                  "spec": {
                    "peers": [
                      {
                        "mtls": {}
                      }
                    ]
                  }
                },
                {
                  "apiVersion": "networking.istio.io/v1alpha3",
                  "kind": "DestinationRule",
                  "metadata": {
                    "name": "default",
                    "namespace": "default"
                  },
                  "spec": {
                    "host": "*.default.svc.cluster.local",
                    "trafficPolicy": {
                      "tls": {
                        "mode": "ISTIO_MUTUAL"
                      }
                    }
                  }
                }
              ]
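
              Since multi-document YAML files are converted to a JSON array, you can iterate over the documents with jq. For example, to list the kind of each document from the output above:

              $ cat istio-mtls-permissive.yaml | jc --yaml | jq -r '.[].kind'
              Policy
              DestinationRule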

              Updated Parsers

              • history parser now outputs line fields as integers
              • crontab parser bug fix for an issue that sometimes lost a row of data
              • Updated the compatibility information for du and history parsers

              __version__ Attribute Added

              Python programmers can now access the __version__ attribute on all parsers when using them as modules:

              >>> import jc.parsers.arp
              >>> print(jc.parsers.arp.__version__)
              1.1

              Added Exit Codes

              jc now exits with an error code (1) if it did not complete successfully, which makes failures easy to detect in scripts.
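
              As a quick sketch (the exact failure mode depends on the parser and input), a script can now bail out when parsing fails:

              if ! parsed=$(arp -a | jc --arp); then
                  echo "jc failed to parse the arp output" >&2
                  exit 1
              fi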

              Schema Changes

              The history parser now outputs line fields as integers:

              $ history | jc --history -p
              [
                {
                  "line": 118,
                  "command": "sleep 100"
                },
                ...
              ]
               

              Full Parser List

              • arp
              • crontab
              • crontab-u
              • df
              • dig
              • du
              • env
              • free
              • fstab
              • history
              • hosts
              • id
              • ifconfig
              • INI
              • iptables
              • jobs
              • ls
              • lsblk
              • lsmod
              • lsof
              • mount
              • netstat
              • pip list
              • pip show
              • ps
              • route
              • ss
              • stat
              • systemctl
              • systemctl list-jobs
              • systemctl list-sockets
              • systemctl list-unit-files
              • uname -a
              • uptime
              • w
              • XML
              • YAML

              For more information on the motivations for creating jc, see my blog post.

              Happy parsing!

              Featured

              Microservice Security Design Patterns for Kubernetes (Part 5)

              The Service Mesh Sidecar-on-Sidecar Pattern

              In Part 4 of my series on Microservice Security Patterns for Kubernetes we dove into the Security Sidecar Pattern and configured a working application with micro-segmentation enforcement and deep inspection for application-layer protection. The Security Sidecar Pattern is nice and clean, but what if you are running a Service Mesh like Istio with Envoy?

              For a great overview of the state of the art in Service Mesh, see this article by Guillaume Dury. He provides a nice comparison between modern Service Mesh options.

              In this post we will take the Security Sidecar Pattern from Part 4 and apply it in an Istio Service Mesh using Envoy sidecars. This is essentially a Sidecar-on-Sidecar Pattern that allows us not only to use the native encryption and segmentation capabilities of the Service Mesh, but also to layer on L7 application security against OWASP Top 10 style attacks on the microservices.

              How does the Service Mesh Sidecar-on-Sidecar Pattern work?

              It’s Sidecars All The Way Down

              As we discussed in Part 4, you can have multiple containers in a Pod. We used the modsecurity container as a sidecar to intercept HTTP requests and inspect them before forwarding them on to the microsimserver container in the same pod. But with an Istio Service Mesh, there will also be an Envoy container injected into the Pod and it will do the egress and ingress traffic interception. Can we have two sidecars in a Pod?

              The answer is yes. When Envoy is added via the sidecar injection functionality, it configures itself based on the existing Pod spec in the Deployment manifest. This means that we can use a manifest nearly identical to what we used in Part 4, and Envoy will correctly configure itself to send intercepted traffic on to the modsecurity container, which will then send the traffic to the microsimserver container.

              In this post we will be demonstrating this in action. There are surprisingly few changes that need to be made to the Security Sidecar Pattern deployment file to make this work. Also, we’ll be able to easily see how this works using the Kiali dashboard which provides visualization for the Istio Service Mesh.

              The Sidecar-on-Sidecar Pattern

              We’ll be using this deployment manifest that is nearly identical to the Security Sidecar Pattern manifest from Part 4. Here is what the design looks like:

              First we’ll enable service-to-service encryption, then strict mutual TLS (mTLS) with RBAC to provide micro-segmentation. Finally, we’ll configure the Istio Ingress Gateway so we can access the app from the public internet.

              But first, let’s just deploy the modified Sidecar Pattern manifest with a vanilla Istio configuration.

              Spinning up the Cluster in GKE

              We’ll spin up a Kubernetes cluster in GKE similar to how we did previously in Part 2, except this time we’ll use 4 nodes of the n1-standard-2 machine type instead of 3. Since we’ll be using Istio to control service-to-service traffic (East/West flows), we no longer need to check the Enable Network Policy box. Instead, we will need to check the Enable Istio (beta) box under Additional Features.

              We’ll start with setting Enable mTLS (beta) to Permissive. We will change this later via configuration files as we try out some scenarios.

              I’m not going to give a complete tutorial on setting up Istio on GKE, but I basically used the instructions documented in the following links to enable Prometheus and Grafana. I used the same approach to enable the Kiali dashboard to visualize the Service Mesh. We’ll be using the Kiali service graphs to verify the status of the application.

              Once you have Kiali enabled, you can configure port forwarding on the Service so you can browse to the dashboard using your laptop.

              Click the https://ssh.cloud.google.com/devshell/proxy?port=8080 link and then append /kiali at the end of the translated link in your browser. You should see a login screen. Use the default credentials or the ones you specified with a Kubernetes secret during setup. You should see a blank service graph:

              Make sure to check the Security checkbox under the Display menu:

              Finally, we want to enable automatic sidecar injection for the Envoy proxy by running this command within Cloud Shell:

              $ kubectl label namespace default istio-injection=enabled
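
              You can verify that the label took effect with kubectl’s -L (label columns) option; the default namespace should show istio-injection=enabled:

              $ kubectl get namespace -L istio-injection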

              Alright! Now let’s deploy the app.

              Deploying the Sidecar-on-Sidecar Manifest

              There are only a few minor differences between the sidecar.yaml manifest used in Part 4 and the istio-sidecar.yaml that we will be using for the following examples. Let’s take a look:

              Service Accounts

              apiVersion: v1
              kind: ServiceAccount
              metadata:
                name: www
              ---
              apiVersion: v1
              kind: ServiceAccount
              metadata:
                name: db
              ---
              apiVersion: v1
              kind: ServiceAccount
              metadata:
                name: auth

              First, we have added these ServiceAccount objects. This is what Istio uses to differentiate services within the mesh and affects how the certificates used in mTLS are generated. You’ll see how we bind these ServiceAccount objects to the Pods next.

              Deployments

              We’ll just take a look at the www Deployment since the same changes are required for all of the Deployments.

              apiVersion: apps/v1
              kind: Deployment
              metadata:
                name: www
              spec:
                replicas: 3
                selector:
                  matchLabels:
                    app: www
                template:
                  metadata:
                    labels:
                      app: www
                      version: v1.0       # add version
                  spec:
                    serviceAccountName: www      # add serviceAccountName
                    containers:
                    - name: modsecurity
                      image: owasp/modsecurity-crs:v3.2-modsec2-apache
                      ports:
                      - containerPort: 80
                      env:
                      - name: SETPROXY
                        value: "True"
                      - name: PROXYLOCATION
                        value: "http://127.0.0.1:8080/"
                    - name: microsimserver
                      image: kellybrazil/microsimserver
                      ports:
                      - containerPort: 8080       # add microsimserver port
                      env:
                      - name: STATS_PORT
                        value: "5000"
                    - name: microsimclient
                      image: kellybrazil/microsimclient
                      env:
                      - name: STATS_PORT
                        value: "5001"
                      - name: REQUEST_URLS
                        value: "http://auth.default.svc.cluster.local:8080/,http://db.default.svc.cluster.local:8080/"
                      - name: SEND_SQLI
                        value: "True"

              The only differences from the original sidecar.yaml are:

              • We have added a version label. Istio requires this label to be included.
              • We have associated the Pods with the appropriate serviceAccountName. This will be important for micro-segmentation later on.
              • We have added the containerPort configuration for the microsimserver containers. This is important so the Envoy proxy sidecar can configure itself properly.

              Services

              Now let’s see the minor changes to the Services. Since they are all very similar, we will just take a look at the www Service:

              apiVersion: v1
              kind: Service
              metadata:
                labels:
                  app: www
                name: www
              spec:
                # externalTrafficPolicy: Local      # remove externalTrafficPolicy
                ports:
                - port: 8080
                  targetPort: 80
                  name: http         # add port name
                selector:
                  app: www
                sessionAffinity: None
                # type: LoadBalancer          # remove LoadBalancer type

              We have removed a couple of items from the www service: externalTrafficPolicy and type. This is because the www service is no longer directly exposed to the public internet. We’ll expose it later using an Istio Ingress Gateway.

              Also, we have added the port name field. This is required so Istio can correctly configure Envoy to listen for the correct protocol and produce the correct telemetry for the inter-service traffic.

              Deploy the App

              Now let’s deploy the application using kubectl. Copy/paste the manifest to a file called istio-sidecar.yaml within Cloud Shell using vi. Then run:

              $ kubectl apply -f istio-sidecar.yaml
              serviceaccount/www created
              serviceaccount/db created
              serviceaccount/auth created
              deployment.apps/www created
              deployment.apps/auth created
              deployment.apps/db created
              service/www created
              service/auth created
              service/db created
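
              Before checking the Kiali dashboard, a quick sanity check (the Pod name below is a placeholder – use one from kubectl get pods) can confirm the Envoy sidecar was injected. Each www Pod should report four containers: modsecurity, microsimserver, microsimclient, and the injected istio-proxy:

              $ kubectl get pod <www-pod-name> -o jsonpath='{.spec.containers[*].name}'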

              After a couple of minutes you should see this within the Kiali dashboard:

              Excellent! You’ll notice the services will alternate between green and orange. This is because the www service is sending SQLi attacks to the db and auth services every so often, and those requests are being blocked by the modsecurity WAF container, which returns HTTP 403 errors.

              Voila! We have application layer security in Istio!

              But you may have noticed that there is no encryption between services enabled yet. Also, all services can talk to each other, so we don’t have proper micro-segmentation. We can illustrate that with a curl from auth to db:

              $ kubectl exec auth-cf6f45fb-9k678 -c microsimserver curl http://db:8080
              <snip>
              sufH1FhoMgvXvbPOkE3O0H3MwNAN
              Tue Jan 28 01:16:48 2020   hostname: db-55747d84d8-jlz7z   ip: 10.8.0.13   remote: 127.0.0.1   hostheader: 127.0.0.1:8080   path: /

              Let’s fix these issues.

              Encrypting the East/West Traffic

              It is fairly easy to encrypt East/West traffic using Istio. First we’ll demonstrate permissive mTLS and then we’ll advance to strict mTLS with RBAC to enforce micro-segmentation.

              Here’s what the manifest for this configuration looks like:

              apiVersion: "authentication.istio.io/v1alpha1"
              kind: "Policy"
              metadata:
                name: "default"
                namespace: "default"
              spec:
                peers:
                - mtls: {}
              ---
              apiVersion: "networking.istio.io/v1alpha3"
              kind: "DestinationRule"
              metadata:
                name: "default"
                namespace: "default"
              spec:
                host: "*.default.svc.cluster.local"
                trafficPolicy:
                  tls:
                    mode: ISTIO_MUTUAL

              The Policy manifest specifies that all Pods in the default namespace will only accept encrypted requests using TLS. The DestinationRule manifest specifies how the client-side outbound connections are handled. Here we see that connections to any services in the default namespace will use TLS (*.default.svc.cluster.local). This effectively disables plaintext traffic between services in the namespace.

              Copy/paste the manifest text to a file called istio-mtls-permissive.yaml. Then apply it with kubectl:

              $ kubectl apply -f istio-mtls-permissive.yaml
              policy.authentication.istio.io/default created
              destinationrule.networking.istio.io/default created
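
              If you would like to verify the mTLS status from the command line as well, Istio 1.1-era releases shipped an istioctl subcommand for this (it was removed in later versions, so treat the exact syntax as version-specific):

              $ istioctl authn tls-check <pod-name>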

              After 30 seconds or so you should start to see the padlocks between the services in the Kiali Dashboard indicating that the communications are encrypted. (Ensure you checked the Security checkbox under the Display drop-down)

              Nice! We have successfully encrypted traffic between our services.

              Enforcing micro-segmentation

              Even though the communications between services are now encrypted, we still don’t have effective micro-segmentation between Pods running the Envoy sidecar. We can test this again with a curl from an auth pod to a db pod:

              $ kubectl exec auth-cf6f45fb-9k678 -c microsimserver curl http://db:8080
              <snip>
              2S76Q83lFt3eplRkAHoHkqUl1PhX
              Tue Jan 28 03:47:03 2020   hostname: db-55747d84d8-9bhwx   ip: 10.8.1.5   remote: 127.0.0.1   hostheader: 127.0.0.1:8080   path: /

              And here is the connection displayed in Kiali:

              So the good news is that the connection is encrypted. The bad news is that auth shouldn’t be able to communicate with db. Let’s implement micro-segmentation.

              The first step is to enforce strict mTLS and enable Role Based Access Control (RBAC) for the default namespace. First copy/paste the manifest to a file called istio-mtls-strict.yaml with vi. Let’s take a look at the configuration:

              apiVersion: "authentication.istio.io/v1alpha1"
              kind: "Policy"
              metadata:
                name: "default"
                namespace: "default"
              spec:
                peers:
                - mtls:
                    mode: STRICT
              ---
              apiVersion: "networking.istio.io/v1alpha3"
              kind: "DestinationRule"
              metadata:
                name: "default"
                namespace: "default"
              spec:
                host: "*.default.svc.cluster.local"
                trafficPolicy:
                  tls:
                    mode: ISTIO_MUTUAL
              ---
              apiVersion: "rbac.istio.io/v1alpha1"
              kind: ClusterRbacConfig
              metadata:
                name: default
              spec:
                mode: 'ON_WITH_INCLUSION'
                inclusion:
                  namespaces: ["default"]

              The important bits here are:

              • Line 9: mode: STRICT in the Policy, which disallows any plaintext communications
              • Line 27: mode: 'ON_WITH_INCLUSION', which requires RBAC policies to be satisfied before allowing connections between services for the namespaces defined in line 29
              • Line 29: namespaces: ["default"], which are the namespaces that have the RBAC policies applied

              Let’s apply this by deleting the old config and applying the new one:

              $ kubectl delete -f istio-mtls-permissive.yaml
              policy.authentication.istio.io "default" deleted
              destinationrule.networking.istio.io "default" deleted
              
              $ kubectl apply -f istio-mtls-strict.yaml
              policy.authentication.istio.io/default created
              destinationrule.networking.istio.io/default created
              clusterrbacconfig.rbac.istio.io/default created

              Hmm… the entire application is broken now. No worries – this is expected! We did this to illustrate that policies need to be explicitly defined to allow any service-to-service (East/West) communications.

              Let’s add one service at a time to see these policies in action. Copy/paste this manifest to a file called istio-rbac-policy-test.yaml with vi:

              apiVersion: "rbac.istio.io/v1alpha1"
              kind: ServiceRole
              metadata:
                name: www-access-role
                namespace: default
              spec:
                rules:
                - services: ["db.default.svc.cluster.local"]
                  methods: ["GET", "POST"]
                  paths: ["*"]
              ---
              apiVersion: "rbac.istio.io/v1alpha1"
              kind: ServiceRoleBinding
              metadata:
                name: www-to-db
                namespace: default
              spec:
                subjects:
                - user: "cluster.local/ns/default/sa/www"
                roleRef:
                  kind: ServiceRole
                  name: "www-access-role"

              Remember those ServiceAccounts we created in the beginning? Now we are tying them to an RBAC policy. In this case we are allowing GET and POST requests to db.default.svc.cluster.local from Pods that present client certificates identifying themselves as www.

              The user field takes an entry in the form of cluster.local/ns/<namespace>/sa/<serviceAccountName>. In this case cluster.local/ns/default/sa/www refers to the www ServiceAccount we created earlier.

              Let’s apply this:

              $ kubectl apply -f istio-rbac-policy-test.yaml
              servicerole.rbac.istio.io/www-access-role created
              servicerolebinding.rbac.istio.io/www-to-db created

              It worked! www can now talk to db. Now we can fix auth by updating the policy to look like this:

              spec:
                rules:
                - services: ["db.default.svc.cluster.local", "auth.default.svc.cluster.local"]

              Let’s do that, plus allow the Istio Ingress Gateway service istio-ingressgateway-service-account to access www. This will allow public access to the service when we configure the Ingress Gateway later. Copy/paste this manifest to a file called istio-rbac-policy-final.yaml and apply it:

              $ kubectl delete -f istio-rbac-policy-test.yaml
              servicerole.rbac.istio.io "www-access-role" deleted
              servicerolebinding.rbac.istio.io "www-to-db" deleted
              
              $ kubectl apply -f istio-rbac-policy-final.yaml
              servicerole.rbac.istio.io/www-access-role created
              servicerolebinding.rbac.istio.io/www-to-db created
              servicerole.rbac.istio.io/pub-access-role created
              servicerolebinding.rbac.istio.io/pub-to-www created

              Very good! We’re back up and running. Let’s verify that micro-segmentation is in place and that requests cannot get through even by using IP addresses instead of Service names. We’ll try connecting from an auth Pod to a db Pod:

              $ kubectl exec auth-cf6f45fb-9k678 -c microsimserver curl http://db:8080
              RBAC: access denied
              
              $ kubectl exec auth-cf6f45fb-9k678 -c microsimserver curl 10.4.3.10:8080
              upstream connect error or disconnect/reset before headers. reset reason: connection termination

              Success!

              Exposing the App to the Internet

              Now that we have secured the app internally, we can expose it to the internet. If you try to visit the site now it will fail since the Istio Ingress has not been configured to forward traffic to the www service.

              In Cloud Shell, copy/paste this manifest to a file called istio-ingress.yaml with vi:

              apiVersion: networking.istio.io/v1alpha3
              kind: Gateway
              metadata:
                name: www-gateway
              spec:
                selector:
                  app: istio-ingressgateway
                  istio: ingressgateway
                  release: istio
                servers:
                - port:
                    number: 80
                    name: http2
                    protocol: HTTP2
                  hosts:
                  - "*"
              ---
              apiVersion: networking.istio.io/v1alpha3
              kind: VirtualService
              metadata:
                name: www-vservice
              spec:
                hosts:
                - "*"
                gateways:
                - www-gateway
                http:
                - match:
                  - uri:
                      prefix: "/"
                  route:
                  - destination:
                      port:
                        number: 8080
                      host: www.default.svc.cluster.local

              Here we’re telling the Istio Ingress Gateway to listen on port 80 using the HTTP2 protocol, and then we attach our www service to that gateway. We allowed the Ingress Gateway to communicate with the www service earlier via RBAC policy, so we should be good to apply this:

              $ kubectl apply -f istio-ingress.yaml
              gateway.networking.istio.io/www-gateway created
              virtualservice.networking.istio.io/www-vservice created

              Now we should be able to reach the application from the internet:

              $ kubectl get services -n istio-system
              NAME                     TYPE           CLUSTER-IP     EXTERNAL-IP     PORT(S)                                                                                                                                      AGE
              grafana                  ClusterIP      10.70.12.231   <none>          3000/TCP                                                                                                                                     83m
              istio-citadel            ClusterIP      10.70.2.197    <none>          8060/TCP,15014/TCP                                                                                                                           87m
              istio-galley             ClusterIP      10.70.11.184   <none>          443/TCP,15014/TCP,9901/TCP                                                                                                                   87m
              istio-ingressgateway     LoadBalancer   10.70.10.196   34.68.212.250   15020:30100/TCP,80:31596/TCP,443:32314/TCP,31400:31500/TCP,15029:32208/TCP,15030:31368/TCP,15031:31242/TCP,15032:31373/TCP,15443:30451/TCP   87m
              istio-pilot              ClusterIP      10.70.3.210    <none>          15010/TCP,15011/TCP,8080/TCP,15014/TCP                                                                                                       87m
              istio-policy             ClusterIP      10.70.4.74     <none>          9091/TCP,15004/TCP,15014/TCP                                                                                                                 87m
              istio-sidecar-injector   ClusterIP      10.70.3.147    <none>          443/TCP                                                                                                                                      87m
              istio-telemetry          ClusterIP      10.70.10.55    <none>          9091/TCP,15004/TCP,15014/TCP,42422/TCP                                                                                                       87m
              kiali                    ClusterIP      10.70.15.2     <none>          20001/TCP                                                                                                                                    86m
              prometheus               ClusterIP      10.70.7.187    <none>          9090/TCP                                                                                                                                     84m
              promsd                   ClusterIP      10.70.8.70     <none>          9090/TCP     
              
              $ curl 34.68.212.250
              <snip>
              ja1IO2Hm2GJAqKBPao2YyccDAVrd
              Wed Jan 29 01:24:46 2020   hostname: www-74f9dc9df8-j54k4   ip: 10.4.3.9   remote: 127.0.0.1   hostheader: 127.0.0.1:8080   path: /

              Excellent! Our simple App is secured internally and exposed to the Internet.

              Conclusion

              I really enjoyed this challenge and I see great potential in using a Service Mesh along with a security sidecar proxy like modsecurity. Though, I have to say that things are changing quickly, including the best practices and configuration syntax.

              For example, in this proof of concept I used the default version of Istio that was installed on my GKE cluster (1.1.16), which already seems old since version 1.4 has deprecated the RBAC configuration I used in favor of a new style called AuthorizationPolicy. Unfortunately, this option was not available in my version of Istio, but it does look more straightforward than RBAC.

              There is a great deal more complexity in a Service Mesh deployment and troubleshooting connectivity issues can be difficult.

              One thing that would probably need to be addressed in a production environment is the Envoy proxy sidecar configuration. In my simple scenario I was getting very strange connectivity results until I exposed port 8080 on the microsimserver container in the Deployment. Without that configuration (which worked fine without Istio), Envoy didn’t properly capture all of the ports, so it was possible to bypass Envoy altogether, which meant broken micro-segmentation and a WAF bypass when connecting directly to the Pod IP address.

              There is a traffic management configuration called sidecar which allows you to fine-tune how the Envoy sidecar configures itself. Fortunately, I ended up not needing to do this in this example, though I did go through some iterations of experimenting with it to get micro-segmentation working without exposing port 8080 on the Pod.

              So in the end, the Service Mesh Sidecar-on-Sidecar Pattern may work for you, but you might end up tearing out a fair bit of your hair getting it to work in your environment.

              I’m looking forward to doing a proof of concept of the Service Mesh Security Plugin Pattern in the future, which will require compiling a custom version of Envoy that automatically filters traffic through modsecurity. I may let the versions of Istio and Envoy mature a bit before attempting that, though.

              What do you think about the Sidecar-on-Sidecar Pattern?

              Featured

              Explaining Kubernetes to a Five Year Old

              A friend of mine pointed me to a Twitter thread on how to explain Kubernetes to a five year old. Since I have a two-year-old, this immediately popped into my head.

              I’ve seen the Lonely Goatherd scene from The Sound of Music many a time – my daughter absolutely loves it. And it seems to be a fairly good explanation for Kubernetes. Hear me out:

              Stage = Kubernetes Cluster

              The stage is the Kubernetes cluster where the application is deployed. This includes the Nodes, environment, config maps, secrets, etc.

              Puppets = Containers/Pods/Microservices

              The puppets are the actual microservices made up of Pods and Containers.

              Julie Andrews = DevOps

              Julie Andrews (Maria) is the poor DevOps soul who is staving off disaster with kubectl, helm charts, APIs, etc.

              Kids = Kubernetes Scheduler

              The Kids are (mostly) doing what Julie (DevOps) is telling them to do. They are adding and removing the puppets (containers) as she has directed.

              Audience = End Users

              The Audience is the end users of the application… but let’s not kid ourselves – this app is not in production, so the audience is really QA. 🙂

              Featured

              Silly Terminal Plotting with jc, jq, and jp

              I ran across a cool little utility called jp that takes JSON input and plots bar, line, histogram, and scatterplot graphs right in the terminal. I’m always looking for ways to hone my jq skills so I found some time to play around with it.

              I figured it would be fun to plot some system stats like CPU and memory utilization in the terminal, so I started with some simple plots piping jc and jp together. For example, here’s a bar graph of the output of df:

              df | jc --df | jp -type bar -canvas full-escape -x ..filesystem -y ..used

              Not super useful, but we’re just having fun here! How about graphing the relative sizes of files in a directory using ls?

              ls -l Documents/lab\ license/ | jc --ls | jp -type bar -canvas full-escape -x ..filename -y ..size

              Not bad! Let’s get a little fancier by filtering results through jq. We’ll plot the output of ps to see the CPU utilization of processes with more than .5% CPU utilization:

              ps axu | jc --ps | jq '[.[] | select (.cpu_percent > 0.5)]' | jp -type bar -canvas full-escape -x ..pid -y ..cpu_percent

              That’s a nice static bar chart of the most active PIDs on the system. But we can do better. Let’s make the graph dynamic by enclosing the above in a while true loop:

              while true; do ps axu | jc --ps | jq '[.[] | select (.cpu_percent > 0.5)]' | jp -type bar -canvas full-escape -x ..pid -y ..cpu_percent; sleep 3; done

              Fancy! Of course we could have plotted mem_percent instead to plot memory utilization by PID. By the way, I made the animated GIF above using ttyrec and ttygif.

              Ok, one last dynamic graph. This time, let’s track system load over time using the output of uptime. To pull this off we’ll need to keep a history of load values over time, so we’ll move from a one-liner to a small bash script:

              #!/bin/bash
              
              rm -f /tmp/load.json    # -f avoids an error on the first run when the file doesn't exist
              SECONDS=0               # reset the bash builtin that counts elapsed seconds
              
              while true; do
              
                  # append one {seconds, load} JSON object per iteration
                  uptime | jc --uptime | jq --arg sec "$SECONDS" '{"seconds": $sec | tonumber, "load": .load_1m}' >> /tmp/load.json
                  # slurp the JSON lines into an array and plot the full history
                  cat /tmp/load.json | jq -s . | jp -canvas full-escape -x ..seconds -y ..load
                  sleep 2
              
              done

              Fun! We got to do a couple of neat things with jq here.

              We pulled in the uptime output converted to JSON with jc and rebuilt the JSON to use only the load_1m value and the SECONDS shell variable. We used tonumber to convert the SECONDS value into a number that could be plotted by jp. We redirected the output to a temporary file called /tmp/load.json so jp can read it later and build out the line graph.

              I know, I know – I’m piping cat output into jq but I just wanted to make the script readable. The interesting thing here is that we are using the -s or “slurp” option of jq, which essentially reformats the JSON lines output in /tmp/load.json into a proper JSON array so jp can consume it.

              By the way, the graphs animate a little nicer in real life since you don’t get the artificial delay between frames you see in the animated GIF.

              I thought that was pretty fun and I got to try a couple different things in jq I haven’t tried before. Happy JSON plotting!

              Featured

              Microservice Security Design Patterns for Kubernetes (Part 4)

              The Security Sidecar Pattern

              In Part 3 of my series on Microservice Security Patterns for Kubernetes we dove into the Security Service Layer Pattern and configured a working application with micro-segmentation enforcement and deep inspection for application-layer protection. We were able to secure the application with that configuration, but, as we saw, the micro-segmentation configuration can get a bit unwieldy when you have more than a couple services.

              In this post we’ll configure a Security Sidecar Pattern which will provide the same level of security but with a simpler configuration. I really like the Security Sidecar Pattern because it tightly couples the application security layer with the application without requiring any changes to the application.

              This also means you can scale the application and your security together, so you don’t have to worry about scaling the security layer separately as your application needs grow. The only downside to this is that the application security layer (we’ll be using the Modsecurity WAF) may be overprovisioned and could waste cluster resources if not kept in check.

              Let’s find out how the Security Sidecar Pattern works.

              Sidecar where art thou?

              One of the really cool things about Kubernetes is that the smallest workload unit is a Pod, and a Pod can be made up of multiple containers. Even better, these containers share the loopback network interface (127.0.0.1). This means you can communicate between containers using normal network protocols without needing to expose these ports to the rest of the cluster.

              In practice, what this means is that you can deploy a reverse proxy, such as the one we have been using in Part 3, but instead of setting the origin server to the Kubernetes cluster DNS name of the service, you can just use localhost or 127.0.0.1. Pretty neat!
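
              You can see the shared loopback in action once the Pods are up. For example (the Pod name is a placeholder, and this assumes curl is available in the modsecurity image), a request to 127.0.0.1:8080 from inside the modsecurity container reaches the microsimserver container in the same Pod:

              $ kubectl exec <www-pod-name> -c modsecurity curl http://127.0.0.1:8080/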

              Sidecar Injection

              Another cool thing about Pods is that there are multiple ways to define the containers that make up the Pod. In the most basic scenario (and the one we will be deploying in this post) you can simply define the application and the WAF container manually in the Deployment YAML.

              But there are fancier ways to automatically inject a sidecar container, like the WAF, by using Mutating Webhooks. Some examples of how this can be done can be found here and here. The nice thing about automatic sidecar injection is that the developers or DevOps team can define their Deployment YAML per usual and the sidecar will be injected without them needing to change their process. Automatic application layer protection!

              One more thing about automatic sidecar injection – this is how the Envoy dataplane proxy sidecar is typically injected in an Istio Service Mesh deployment. Istio has its own sidecar injection service, but you can also manually configure the Envoy sidecar if you would like.

              The Security Sidecar Pattern

              Let’s dive in and see how to configure the Security Sidecar Pattern. We will be using the same application that we set up in Part 2, so go ahead and take a look there to refresh your memory on how things are set up. Here is the diagram:

              Figure 1: Insecure Application

              As demonstrated before, all microsim services can communicate with each other and there is no deep inspection implemented to block application layer attacks like SQLi. In this post, we will be implementing this sidecar.yaml deployment that adds modsecurity reverse proxy WAF containers with the Core Rule Set as sidecars in front of the microsim services. modsecurity will perform deep inspection on the JSON/HTTP traffic and block application layer attacks.

              Then we will add on a Kubernetes Network Policy to enforce segmentation between the services.

              Security Sidecar Pattern Deployment Spec

              We’ll immediately notice how much smaller and simpler the Security Sidecar Pattern configuration is compared to the Security Service Layer Pattern. We went from 238 lines of configuration down to 142!

              Instead of creating separate security deployments and services to secure the application like we did in the Security Service Layer Pattern, we will simply add the WAF container to the same Pod as the application. We will need to make sure the WAF and the application listen on different TCP ports, since they share the loopback interface and two services cannot bind to the same port on it.

              In this case, the WAF becomes the front end: it listens on behalf of the application and forwards the clean, inspected traffic to the application via the loopback interface. We will only need to expose the WAF listening port to the cluster. Since we don’t want to allow bypassing the WAF, we no longer expose the application port directly.

              Note: Container TCP and UDP ports are still accessible via IP within the Kubernetes cluster even if they are not explicitly configured in the deployment YAML via containerPort configuration. To completely lock down direct access to the application TCP port so the WAF cannot be bypassed we will need to configure Network Policy.

              Figure 2: Security Sidecar Pattern

              Let’s take a closer look at the spec.

              www Deployment

              apiVersion: apps/v1
              kind: Deployment
              metadata:
                name: www
              spec:
                replicas: 3
                selector:
                  matchLabels:
                    app: www
                template:
                  metadata:
                    labels:
                      app: www
                  spec:
                    containers:
                    - name: modsecurity
                      image: owasp/modsecurity-crs:v3.2-modsec2-apache
                      ports:
                      - containerPort: 80
                      env:
                      - name: SETPROXY
                        value: "True"
                      - name: PROXYLOCATION
                        value: "http://127.0.0.1:8080/"
                    - name: microsimserver
                      image: kellybrazil/microsimserver
                      env:
                      - name: STATS_PORT
                        value: "5000"
                    - name: microsimclient
                      image: kellybrazil/microsimclient
                      env:
                      - name: STATS_PORT
                        value: "5001"
                      - name: REQUEST_URLS
                        value: "http://auth.default.svc.cluster.local:8080/,http://db.default.svc.cluster.local:8080/"
                      - name: SEND_SQLI
                        value: "True"

              Here we have three replicas of the www Pod. Each Pod includes the official OWASP modsecurity container available on Docker Hub, configured as a reverse proxy WAF listening on TCP port 80. The microsimserver application container listening on TCP port 8080 remains unchanged. Note that it is important that the services listen on different ports since they share the same loopback interface in the Pod.

              All requests that go to the WAF containers will be inspected and proxied to the microsimserver application container within the same Pod at http://127.0.0.1:8080/.

              These WAF containers are effectively impersonating the original service so the user or application does not need to modify its configuration. One nice thing about this design is that it allows you to scale the security layer along with the application, so as you scale up the application, security scales along with it automatically.

              The microsimclient container configuration remains unchanged from the original, which is nice. This shows that you can implement the Security Sidecar Pattern with little to no application logic changes if you are careful about how you set up the ports.

              Now, let’s take a look at the www Service that points to this deployment.

              www Service

              apiVersion: v1
              kind: Service
              metadata:
                labels:
                  app: www
                name: www
              spec:
                externalTrafficPolicy: Local
                ports:
                - port: 8080
                  targetPort: 80
                selector:
                  app: www
                sessionAffinity: None
                type: LoadBalancer

              Here we are just forwarding TCP port 8080 application traffic to TCP port 80 on the www Pods since that is the port the modsecurity reverse proxy containers listen on. Since this is an externally facing service we are using type: LoadBalancer and externalTrafficPolicy: Local just like the original Service did.

              Next we’ll take a look at the internal microservices. Since the auth and db deployments and services are configured identically we’ll just go over the db configuration.

              db Deployment

              apiVersion: apps/v1
              kind: Deployment
              metadata:
                name: db
              spec:
                replicas: 3
                selector:
                  matchLabels:
                    app: db
                template:
                  metadata:
                    labels:
                      app: db
                  spec:
                    containers:
                    - name: modsecurity
                      image: owasp/modsecurity-crs:v3.2-modsec2-apache
                      ports:
                      - containerPort: 80
                      env:
                      - name: SETPROXY
                        value: "True"
                      - name: PROXYLOCATION
                        value: "http://127.0.0.1:8080/"
                    - name: microsimserver
                      image: kellybrazil/microsimserver
                      env:
                      - name: STATS_PORT
                        value: "5000"

              Again, we have just added the modsecurity WAF container to the Pod listening on TCP Port 80. Since this is different from the listening port of the microsimserver container, we are good to go without any changes to the app. Just like in the www Deployment, we have configured the modsecurity reverse proxy to send inspected traffic locally within the Pod to http://127.0.0.1:8080/.

              Note that even though we aren’t explicitly configuring the microsimserver TCP port 8080 via containerPort in the Deployment spec, this port is still technically available on the cluster via direct IP access. To fully lock down connectivity, we will be using Network Policy later on.
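
              You can see this for yourself with something like the following (the Pod name and IP are placeholders; grab real values with kubectl get pods -o wide):

              # Find a db Pod IP, then curl it directly from another Pod,
              # bypassing the db Service (and therefore the WAF):
              $ kubectl get pods -o wide -l app=db
              $ kubectl exec <a-www-pod> -c microsimclient -- curl -s http://<db-pod-ip>:8080/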

              db Service

              apiVersion: v1
              kind: Service
              metadata:
                labels:
                  app: db
                name: db
              spec:
                ports:
                - port: 8080
                  targetPort: 80
                selector:
                  app: db
                sessionAffinity: None

              Nothing fancy here – just listening on TCP port 8080 and forwarding to port 80, which is what the modsecurity WAF containers listen on. This is an internal service so no need for type: LoadBalancer or externalTrafficPolicy: Local.

              Now that we understand how the Deployment and Service specs work, let’s apply them on our Kubernetes cluster.

              See Part 2 for more information on setting up the cluster.

              Applying the Deployments and Services

              First, let’s delete the original insecure deployment in Cloud Shell if it is still running:

              $ kubectl delete -f simple.yaml

              Your Pods, Deployments, and Services should be empty before you proceed:

              $ kubectl get pods
              No resources found.
              $ kubectl get deploy
              No resources found.
              $ kubectl get services
              NAME         TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
              kubernetes   ClusterIP   10.12.0.1    <none>        443/TCP   3m46s

              Next, copy/paste the deployment text into a file called sidecar.yaml using vi. Then apply the deployment with kubectl:

              $ kubectl create -f sidecar.yaml
              deployment.apps/www created
              deployment.apps/auth created
              deployment.apps/db created
              service/www created
              service/auth created
              service/db created

              Testing the Deployment

              Once the www service has an external IP, you can send an HTTP GET or POST request to it from Cloud Shell or your laptop:

              $ kubectl get services
              NAME         TYPE           CLUSTER-IP    EXTERNAL-IP     PORT(S)          AGE
              auth         ClusterIP      10.12.7.96    <none>          8080/TCP         90m
              db           ClusterIP      10.12.8.118   <none>          8080/TCP         90m
              kubernetes   ClusterIP      10.12.0.1     <none>          443/TCP          93m
              www          LoadBalancer   10.12.14.67   35.238.35.208   8080:32032/TCP   90m
              $ curl 35.238.35.208:8080
              ...vME2NtSGaTBnt2zsprKdes5KKXCCAG9pk0yUr4K
              Thu Jan  9 22:09:27 2020   hostname: www-5bfc744996-tdzsk   ip: 10.8.2.3   remote: 127.0.0.1   hostheader: 127.0.0.1:8080   path: /

              The originating IP address is now the IP address of the local WAF in the Pod that handled the request (always 127.0.0.1, since it is a sidecar). Since the WAF is deployed as a reverse proxy, the only way to get the originating IP information will be via HTTP headers, such as X-Forwarded-For (XFF). Also, the host header has now changed, so keep this in mind if the application is expecting certain values in the headers.

              We can do a quick check to see if the modsecurity WAF is inspecting traffic by sending an HTTP POST request with no data or size information to the external IP. This will be seen as an anomalous request and blocked:

              $ curl -X POST 35.238.35.208:8080
              <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
              <html><head>
              <title>403 Forbidden</title>
              </head><body>
              <h1>Forbidden</h1>
              <p>You don't have permission to access /
              on this server.<br />
              </p>
              </body></html>

              Excellent! Now let’s take a look at the microsim stats to see if the WAF layers are blocking the East/West SQLi attacks. Let’s open two tabs in Cloud Shell: one for shell access to a www microsimclient container and another for shell access to a db microsimserver container.

              In the first tab, use kubectl to find the name of one of the www pods and shell into the microsimclient container running in it:

              $ kubectl get pods
              NAME                    READY   STATUS    RESTARTS   AGE
              auth-7559599f89-d8tnw   2/2     Running   0          102m
              auth-7559599f89-k8qht   2/2     Running   0          102m
              auth-7559599f89-wfbp4   2/2     Running   0          102m
              db-59f8d84df-4kbvg      2/2     Running   0          102m
              db-59f8d84df-5csh8      2/2     Running   0          102m
              db-59f8d84df-ncksp      2/2     Running   0          102m
              www-5bfc744996-6jbr7    3/3     Running   0          102m
              www-5bfc744996-bgh9h    3/3     Running   0          102m
              www-5bfc744996-tdzsk    3/3     Running   0          102m
              $ kubectl exec www-5bfc744996-6jbr7 -c microsimclient -it sh
              /app #

              Then curl to the microsimclient stats server on localhost:5001:

              /app # curl localhost:5001
              {
                "time": "Thu Jan  9 22:23:25 2020",
                "runtime": 6349,
                "hostname": "www-5bfc744996-6jbr7",
                "ip": "10.8.0.4",
                "stats": {
                  "Requests": 6320,
                  "Sent Bytes": 6547520,
                  "Received Bytes": 112275897,
                  "Internet Requests": 0,
                  "Attacks": 64,
                  "SQLi": 64,
                  "XSS": 0,
                  "Directory Traversal": 0,
                  "DGA": 0,
                  "Malware": 0,
                  "Error": 0
                },
                "config": {
                  "STATS_PORT": 5001,
                  "STATSD_HOST": null,
                  "STATSD_PORT": 8125,
                  "REQUEST_URLS": "http://auth.default.svc.cluster.local:8080/,http://db.default.svc.cluster.local:8080/",
                  "REQUEST_INTERNET": false,
                  "REQUEST_MALWARE": false,
                  "SEND_SQLI": true,
                  "SEND_DIR_TRAVERSAL": false,
                  "SEND_XSS": false,
                  "SEND_DGA": false,
                  "REQUEST_WAIT_SECONDS": 1.0,
                  "REQUEST_BYTES": 1024,
                  "STOP_SECONDS": 0,
                  "STOP_PADDING": false,
                  "TOTAL_STOP_SECONDS": 0,
                  "REQUEST_PROBABILITY": 1.0,
                  "EGRESS_PROBABILITY": 0.1,
                  "ATTACK_PROBABILITY": 0.01
                }
              }

              Here we see 64 SQLi attacks have been sent to the auth and db services in the last 6349 seconds.
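
              Since the stats are just JSON, jq makes quick work of them. As a side note, something like this from Cloud Shell (which should have jq available) gives the average attack rate, reusing the Pod name we found above:

              # Average SQLi attacks per second since the container started:
              $ kubectl exec www-5bfc744996-6jbr7 -c microsimclient -- curl -s localhost:5001 \
                  | jq '.stats.Attacks / .runtime'
              # => roughly 0.01 (64 attacks / 6349 seconds)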

              Now, let’s see if the attacks are getting through like they did in the insecure deployment. In the other tab, find the name of one of the db pods and shell into the microsimserver container running in it:

              $ kubectl exec db-59f8d84df-4kbvg -c microsimserver -it sh
              /app #
              /app # curl localhost:5000
              {
                "time": "Thu Jan  9 22:39:30 2020",
                "runtime": 7316,
                "hostname": "db-59f8d84df-4kbvg",
                "ip": "10.8.0.5",
                "stats": {
                  "Requests": 3659,
                  "Sent Bytes": 60563768,
                  "Received Bytes": 3790724,
                  "Attacks": 0,
                  "SQLi": 0,
                  "XSS": 0,
                  "Directory Traversal": 0
                },
                "config": {
                  "LISTEN_PORT": 8080,
                  "STATS_PORT": 5000,
                  "STATSD_HOST": null,
                  "STATSD_PORT": 8125,
                  "RESPOND_BYTES": 16384,
                  "STOP_SECONDS": 0,
                  "STOP_PADDING": false,
                  "TOTAL_STOP_SECONDS": 0
                }
              }

              In the insecure deployment we saw the SQLi value incrementing. Now that the modsecurity WAF is inspecting the East/West traffic, the SQLi attacks are no longer getting through, though we still see normal Requests, Sent Bytes, and Received Bytes incrementing.

              modsecurity Logs

              Now, let’s check the modsecurity logs to see how the East/West application attacks are being identified. To see the modsecurity audit log we’ll need to shell into one of the WAF containers and look at the /var/log/modsec_audit.log file:

              $ kubectl exec db-59f8d84df-4kbvg -c modsecurity -it sh
              # grep -C 60 sql /var/log/modsec_audit.log
              <snip>
              --a05a312e-A--
              [09/Jan/2020:23:41:46 +0000] Xhe6OmUpgBRl4hgX8QIcmAAAAIE 10.8.0.4 50990 10.8.0.5 80
              --a05a312e-B--
              GET /?username=joe%40example.com&password=%3BUNION+SELECT+1%2C+version%28%29+limit+1%2C1-- HTTP/1.1
              Host: db.default.svc.cluster.local:8080
              User-Agent: python-requests/2.22.0
              Accept-Encoding: gzip, deflate
              Accept: */*
              Connection: keep-alive
              
              --a05a312e-F--
              HTTP/1.1 403 Forbidden
              Content-Length: 209
              Keep-Alive: timeout=5, max=100
              Connection: Keep-Alive
              Content-Type: text/html; charset=iso-8859-1
              
              --a05a312e-E--
              <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
              <html><head>
              <title>403 Forbidden</title>
              </head><body>
              <h1>Forbidden</h1>
              <p>You don't have permission to access /
              on this server.<br />
              </p>
              </body></html>
              
              --a05a312e-H--
              Message: Warning. Pattern match "(?i:(?:[\"'`](?:;?\\s*?(?:having|select|union)\\b\\s*?[^\\s]|\\s*?!\\s*?[\"'`\\w])|(?:c(?:onnection_id|urrent_user)|database)\\s*?\\([^\\)]*?|u(?:nion(?:[\\w(\\s]*?select| select @)|ser\\s*?\\([^\\)]*?)|s(?:chema\\s*?\\([^\\)]*?|elect.*?\\w?user\\()|in ..." at ARGS:password. [file "/etc/modsecurity.d/owasp-crs/rules/REQUEST-942-APPLICATION-ATTACK-SQLI.conf"] [line "190"] [id "942190"] [msg "Detects MSSQL code execution and information gathering attempts"] [data "Matched Data: UNION SELECT found within ARGS:password: ;UNION SELECT 1, version() limit 1,1--"] [severity "CRITICAL"] [ver "OWASP_CRS/3.2.0"] [tag "application-multi"] [tag "language-multi"] [tag "platform-multi"] [tag "attack-sqli"] [tag "OWASP_CRS"] [tag "OWASP_CRS/WEB_ATTACK/SQL_INJECTION"] [tag "WASCTC/WASC-19"] [tag "OWASP_TOP_10/A1"] [tag "OWASP_AppSensor/CIE1"] [tag "PCI/6.5.2"]
              Message: Warning. Pattern match "(?i:(?:^[\\W\\d]+\\s*?(?:alter\\s*(?:a(?:(?:pplication\\s*rol|ggregat)e|s(?:ymmetric\\s*ke|sembl)y|u(?:thorization|dit)|vailability\\s*group)|c(?:r(?:yptographic\\s*provider|edential)|o(?:l(?:latio|um)|nversio)n|ertificate|luster)|s(?:e(?:rv(?:ice|er)| ..." at ARGS:password. [file "/etc/modsecurity.d/owasp-crs/rules/REQUEST-942-APPLICATION-ATTACK-SQLI.conf"] [line "471"] [id "942360"] [msg "Detects concatenated basic SQL injection and SQLLFI attempts"] [data "Matched Data: ;UNION SELECT found within ARGS:password: ;UNION SELECT 1, version() limit 1,1--"] [severity "CRITICAL"] [ver "OWASP_CRS/3.2.0"] [tag "application-multi"] [tag "language-multi"] [tag "platform-multi"] [tag "attack-sqli"] [tag "OWASP_CRS"] [tag "OWASP_CRS/WEB_ATTACK/SQL_INJECTION"] [tag "WASCTC/WASC-19"] [tag "OWASP_TOP_10/A1"] [tag "OWASP_AppSensor/CIE1"] [tag "PCI/6.5.2"]
              Message: Access denied with code 403 (phase 2). Operator GE matched 5 at TX:anomaly_score. [file "/etc/modsecurity.d/owasp-crs/rules/REQUEST-949-BLOCKING-EVALUATION.conf"] [line "91"] [id "949110"] [msg "Inbound Anomaly Score Exceeded (Total Score: 10)"] [severity "CRITICAL"] [tag "application-multi"] [tag "language-multi"] [tag "platform-multi"] [tag "attack-generic"]
              Message: Warning. Operator GE matched 5 at TX:inbound_anomaly_score. [file "/etc/modsecurity.d/owasp-crs/rules/RESPONSE-980-CORRELATION.conf"] [line "86"] [id "980130"] [msg "Inbound Anomaly Score Exceeded (Total Inbound Score: 10 - SQLI=10,XSS=0,RFI=0,LFI=0,RCE=0,PHPI=0,HTTP=0,SESS=0): individual paranoia level scores: 10, 0, 0, 0"] [tag "event-correlation"]
              Apache-Error: [file "apache2_util.c"] [line 273] [level 3] [client 10.8.0.4] ModSecurity: Warning. Pattern match "(?i:(?:[\\\\"'`](?:;?\\\\\\\\s*?(?:having|select|union)\\\\\\\\b\\\\\\\\s*?[^\\\\\\\\s]|\\\\\\\\s*?!\\\\\\\\s*?[\\\\"'`\\\\\\\\w])|(?:c(?:onnection_id|urrent_user)|database)\\\\\\\\s*?\\\\\\\\([^\\\\\\\\)]*?|u(?:nion(?:[\\\\\\\\w(\\\\\\\\s]*?select| select @)|ser\\\\\\\\s*?\\\\\\\\([^\\\\\\\\)]*?)|s(?:chema\\\\\\\\s*?\\\\\\\\([^\\\\\\\\)]*?|elect.*?\\\\\\\\w?user\\\\\\\\()|in ..." at ARGS:password. [file "/etc/modsecurity.d/owasp-crs/rules/REQUEST-942-APPLICATION-ATTACK-SQLI.conf"] [line "190"] [id "942190"] [msg "Detects MSSQL code execution and information gathering attempts"] [data "Matched Data: UNION SELECT found within ARGS:password: ;UNION SELECT 1, version() limit 1,1--"] [severity "CRITICAL"] [ver "OWASP_CRS/3.2.0"] [tag "application-multi"] [tag "language-multi"] [tag "platform-multi"] [tag "attack-sqli"] [tag "OWASP_CRS"] [tag "OWASP_CRS/WEB_ATTACK/SQL_INJECTION"] [tag "WASCTC/WASC-19"] [tag "OWASP_TOP_10/A1"] [tag "OWASP_AppSensor/CIE1"] [tag "PCI/6.5.2"] [hostname "db.default.svc.cluster.local"] [uri "/"] [unique_id "Xhe6OmUpgBRl4hgX8QIcmAAAAIE"]
              Apache-Error: [file "apache2_util.c"] [line 273] [level 3] [client 10.8.0.4] ModSecurity: Warning. Pattern match "(?i:(?:^[\\\\\\\\W\\\\\\\\d]+\\\\\\\\s*?(?:alter\\\\\\\\s*(?:a(?:(?:pplication\\\\\\\\s*rol|ggregat)e|s(?:ymmetric\\\\\\\\s*ke|sembl)y|u(?:thorization|dit)|vailability\\\\\\\\s*group)|c(?:r(?:yptographic\\\\\\\\s*provider|edential)|o(?:l(?:latio|um)|nversio)n|ertificate|luster)|s(?:e(?:rv(?:ice|er)| ..." at ARGS:password. [file "/etc/modsecurity.d/owasp-crs/rules/REQUEST-942-APPLICATION-ATTACK-SQLI.conf"] [line "471"] [id "942360"] [msg "Detects concatenated basic SQL injection and SQLLFI attempts"] [data "Matched Data: ;UNION SELECT found within ARGS:password: ;UNION SELECT 1, version() limit 1,1--"] [severity "CRITICAL"] [ver "OWASP_CRS/3.2.0"] [tag "application-multi"] [tag "language-multi"] [tag "platform-multi"] [tag "attack-sqli"] [tag "OWASP_CRS"] [tag "OWASP_CRS/WEB_ATTACK/SQL_INJECTION"] [tag "WASCTC/WASC-19"] [tag "OWASP_TOP_10/A1"] [tag "OWASP_AppSensor/CIE1"] [tag "PCI/6.5.2"] [hostname "db.default.svc.cluster.local"] [uri "/"] [unique_id "Xhe6OmUpgBRl4hgX8QIcmAAAAIE"]
              Apache-Error: [file "apache2_util.c"] [line 273] [level 3] [client 10.8.0.4] ModSecurity: Access denied with code 403 (phase 2). Operator GE matched 5 at TX:anomaly_score. [file "/etc/modsecurity.d/owasp-crs/rules/REQUEST-949-BLOCKING-EVALUATION.conf"] [line "91"] [id "949110"] [msg "Inbound Anomaly Score Exceeded (Total Score: 10)"] [severity "CRITICAL"] [tag "application-multi"] [tag "language-multi"] [tag "platform-multi"] [tag "attack-generic"] [hostname "db.default.svc.cluster.local"] [uri "/"] [unique_id "Xhe6OmUpgBRl4hgX8QIcmAAAAIE"]
              Apache-Error: [file "apache2_util.c"] [line 273] [level 3] [client 10.8.0.4] ModSecurity: Warning. Operator GE matched 5 at TX:inbound_anomaly_score. [file "/etc/modsecurity.d/owasp-crs/rules/RESPONSE-980-CORRELATION.conf"] [line "86"] [id "980130"] [msg "Inbound Anomaly Score Exceeded (Total Inbound Score: 10 - SQLI=10,XSS=0,RFI=0,LFI=0,RCE=0,PHPI=0,HTTP=0,SESS=0): individual paranoia level scores: 10, 0, 0, 0"] [tag "event-correlation"] [hostname "db.default.svc.cluster.local"] [uri "/"] [unique_id "Xhe6OmUpgBRl4hgX8QIcmAAAAIE"]
              Action: Intercepted (phase 2)
              Apache-Handler: proxy-server
              Stopwatch: 1578613306195047 3522 (- - -)
              Stopwatch2: 1578613306195047 3522; combined=2944, p1=904, p2=1734, p3=0, p4=0, p5=306, sr=353, sw=0, l=0, gc=0
              Response-Body-Transformed: Dechunked
              Producer: ModSecurity for Apache/2.9.3 (http://www.modsecurity.org/); OWASP_CRS/3.2.0.
              Server: Apache
              Engine-Mode: "ENABLED"
              
              --a05a312e-Z--

              Here we see modsecurity has blocked and logged the East/West SQLi attack from one of the www Pods to a db Pod. Sweet!

              Yet, we’re still not done. Even though we are now inspecting and protecting traffic at the application layer, we are not yet enforcing micro-segmentation between the services. That means that, even with the WAFs in place, any auth Pod can communicate with any db Pod. We can demonstrate this by opening a shell on any auth microsimserver container and attempting to send a request to a db Pod from it:

              /app # curl 'http://db:8080'
              ...JsHT4A8GK8H0Am47jSG7MppM3o7BOlTrRZl4EEA9bNzsjND
              Thu Jan  9 23:57:54 2020   hostname: db-59f8d84df-5csh8   ip: 10.8.2.5   remote: 127.0.0.1   hostheader: 127.0.0.1:8080   path: /

              Even worse, if I know the IP address of the db Pod, I can bypass the WAF entirely and send a successful SQLi attack:

              /app # curl 'http://10.8.2.5:8080/?username=joe%40example.com&password=%3BUNION+SELECT+1%2C+version%28%29+limit+1%2C1--'
              ...7Z7Kw2JxEgXipBnDZyyoZI4TK3RswBuZ509y2WY1wJTsERJFoRW6ZYY1QiA
              Fri Jan 10 00:01:37 2020   hostname: db-59f8d84df-5csh8   ip: 10.8.2.5   remote: 10.8.2.4   hostheader: 10.8.2.5:8080   path: /?username=joe%40example.com&password=%3BUNION+SELECT+1%2C+version%28%29+limit+1%2C1--

              Not good! Now, let’s add Network Policy to provide micro-segmentation and button this thing up.

              Adding Micro-segmentation

              Here is a simple Network Policy spec that will control the ingress to each internal service. I tried to keep the rules simple, but in a production deployment a tighter policy would likely be desired. For example, you would probably also want to include Egress policies (there is a quick sketch of one after the policy explanation below).

              apiVersion: networking.k8s.io/v1
              kind: NetworkPolicy
              metadata:
                name: auth-ingress
                namespace: default
              spec:
                podSelector:
                  matchLabels:
                    app: auth
                policyTypes:
                - Ingress
                ingress:
                - from:
                  - podSelector:
                      matchLabels:
                        app: www
                  ports:
                  - protocol: TCP
                    port: 80
              ---
              apiVersion: networking.k8s.io/v1
              kind: NetworkPolicy
              metadata:
                name: db-ingress
                namespace: default
              spec:
                podSelector:
                  matchLabels:
                    app: db
                policyTypes:
                - Ingress
                ingress:
                - from:
                  - podSelector:
                      matchLabels:
                        app: www
                  ports:
                  - protocol: TCP
                    port: 80

              Another big difference here is the simplicity of the Network Policy when compared to the Security Service Layer Pattern. We went from 104 lines of configuration down to 39.

              This policy says:

              • On the auth Pods, only accept traffic from the www Pods that is destined to TCP port 80
              • On the db Pods, only accept traffic from the www Pods that is destined to TCP port 80
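
              As mentioned above, a production deployment would probably add Egress rules as well. Here is a minimal sketch of what a default-deny egress policy for the db Pods might look like (my illustration, not part of the deployment; anything that needs to resolve service names would also need an egress rule allowing DNS). Applying it inline keeps it a one-liner:

              $ cat <<'EOF' | kubectl apply -f -
              apiVersion: networking.k8s.io/v1
              kind: NetworkPolicy
              metadata:
                name: db-egress-deny
                namespace: default
              spec:
                podSelector:
                  matchLabels:
                    app: db
                policyTypes:
                - Egress
                egress: []    # no egress rules defined = deny all egress from the db Pods
              EOF

              You can confirm what is applied at any time with kubectl get networkpolicy.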

              Let’s try it out. Copy the Network Policy text to a file named sidecar-network-policy.yaml in vi and apply the Network Policy to the cluster with kubectl:

              $ kubectl create -f sidecar-network-policy.yaml
              networkpolicy.networking.k8s.io/auth-ingress created
              networkpolicy.networking.k8s.io/db-ingress created

              Next, let’s try that simulated SQLi attack again from auth to db:

              $ kubectl exec auth-7559599f89-d8tnw -c microsimserver -it sh
              /app #
              /app # curl 'http://10.8.2.5:8080/?username=joe%40example.com&password=%3BUNION+SELECT+1%2C+version%28%29+limit+1%2C1--'
              curl: (7) Failed to connect to 10.8.2.5 port 8080: Operation timed out

              Good stuff – no matter how you try to connect from auth to db it will now fail.
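
              To be thorough, we can also confirm that the allowed path still works. From one of the www Pods (reusing a Pod name from earlier), the db service should still respond:

              # www -> db on the service port is still permitted by the policy:
              $ kubectl exec www-5bfc744996-6jbr7 -c microsimclient -- curl -s http://db:8080/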

              Finally, let’s ensure that the rest of the application is still working correctly by checking the db logs. If we are still getting legitimate requests then we should be good to go:

              $ kubectl logs -f db-59f8d84df-4kbvg microsimserver
              <snip>
              127.0.0.1 - - [10/Jan/2020 00:27:57] "POST / HTTP/1.1" 200 -
              127.0.0.1 - - [10/Jan/2020 00:27:58] "POST / HTTP/1.1" 200 -
              127.0.0.1 - - [10/Jan/2020 00:27:59] "POST / HTTP/1.1" 200 -
              127.0.0.1 - - [10/Jan/2020 00:28:02] "POST / HTTP/1.1" 200 -
              127.0.0.1 - - [10/Jan/2020 00:28:04] "POST / HTTP/1.1" 200 -
              {"Total": {"Requests": 6987, "Sent Bytes": 115648879, "Received Bytes": 7235424, "Attacks": 1, "SQLi": 1, "XSS": 0, "Directory Traversal": 0}, "Last 30 Seconds": {"Requests": 15, "Sent Bytes": 248280, "Received Bytes": 15540, "Attacks": 0, "SQLi": 0, "XSS": 0, "Directory Traversal": 0}}
              127.0.0.1 - - [10/Jan/2020 00:28:04] "POST / HTTP/1.1" 200 -

              The service is still getting requests with the Network Policy in place. We can even see the test SQLi request we sent earlier when we bypassed the WAF, but no SQLi attacks have been seen since the Network Policy was applied.

              Conclusion

              We have successfully secured the intra-cluster service communication (East/West communications) via micro-segmentation and WAF utilizing the Security Sidecar Pattern. This pattern is great for quickly and easily adding security to your cluster without creating a lot of overhead for the developers or DevOps teams, and the configuration is smaller and simpler than the Security Service Layer Pattern. It is also possible to automate the injection of the security sidecar with Mutating Webhooks. The nice thing about this pattern is that the security layer scales alongside the application automatically, though one downside is that you could waste cluster resources if the WAF containers are not being fully utilized.

              What’s next?

              My goal is to demonstrate the Service Mesh Security Plugin Pattern in a future post. There are a couple of commercial and open source projects that provide this option, but it’s still early days in this space. In my opinion this pattern makes the most sense since it tightly integrates security with the cluster and cleanly provides both micro-segmentation and application layer security as code, which is the direction everything is moving.

              I’m also looking at implementing a Security Sidecar Pattern in conjunction with Istio Service Mesh. This is effectively a Sidecar-on-Sidecar Pattern (the Envoy container and WAF container are both added to the application Pod). We’ll see how that goes, and if successful I’ll write that one up as well.

              I hope this series has been helpful and if you have suggestions for future topics, please feel free to let me know!

              Next in the series: Part 5

              Featured

              Tools of the Trade for Security Systems Engineers in 2020

              Happy New Year, everyone! As we begin a new decade and I reflect on the last quarter century of networking and security, I thought it would be cool to see how the tools of the trade for pre-sales Systems Engineers in the network security field have changed, and which tools the SE’s SE will need to be proficient with in 2020.

              As an SE in the 90’s and early 2000’s I remember carrying a heavy laptop bag filled with now obsolete dongles, serial converters, null-modem cables, Ethernet patch cables and crossover cables, screwdrivers, and papers and excerpts of manuals. I probably couldn’t get through TSA with that bag these days!

              Networking and security have changed so much since those years. My early days were spent learning the opaque details of Windows NT and the black art of IPv4 subnetting (and CIDR!). I was obsessed with Linux, OSPF, and BGP and made sure I understood the details of how encryption and key exchanges work for IPsec VPNs.

              Obviously all of those foundational skills have served all of us well, but in the past few years we’ve seen the security industry change quite dramatically. Stateful inspection firewalls have given way to Defense in Depth and Zero Trust, which encompass so much more. (EDR, NDR, IPS, VM/Cloud/Micro Services, UEBA, Deception, SOAR… whew!) To that end, here are a few tools that I have added to my toolbox in the past few years, and that I look for SEs on my high-performing teams to have at least some familiarity with.

              Cloud Providers

              Every SE should have accounts in all of the major cloud providers. Each has its own flavor, advantages, and APIs. Cloud accounts are perfect for setting up temporary labs to test out a configuration or a quick POC. You never know which combination of providers your customers will be using these days, so you really need to be familiar with at least these:

              • Amazon Web Services (AWS)
              • Microsoft Azure
              • Google Cloud Platform (GCP)

              The good news is that all of the providers have free signups and the monthly bill is usually very low for lab usage.

              Integrations and Automation

              A lot of SEs have at least some background in scripting and programming, and those skills are becoming more important now that everything is becoming more connected and integrated. Integrations are the name of the game, and if you can make a POC successful by building one yourself in a pinch, it will make you that much more valuable to the customer and your company.

              Python has become so popular in the past few years that it’s definitely something I look for in SE candidates, but Bash and PowerShell skills are still very relevant. Extra credit for learning Go! Here are some of the more important tools to help in this area:

              • Proper IDE or text editor (I like Sublime, but there are many options, including old-school vi!)
                • UPDATE: I now tend to use VSCode for most of my Python work, but I still use Sublime for smaller code snippets and as a scratch pad/staging area
              • git (open some sort of git account, like GitHub, and share your code)
                • I’m not a git expert, by any means, so I use Sourcetree to keep me sane. (UPDATE: I now tend to use the git source control features built-in to VSCode)
              • SOAR Platforms (Phantom, Demisto, FortiSOAR)
                • These typically have free community editions
              • SIEM (Elasticsearch, Splunk, etc.)
                • Again, set up the free community editions in your lab

              APIs

              In line with Integrations and Automation, one of the lower-level skills you will need is understanding the different flavors of APIs. You’ll find that RESTful or REST-like APIs are very common these days, which makes things easy, but you’ll definitely need to understand the JSON format.

              Here are some helpful tools for navigating APIs:

              • Online JSON pretty printer and validator
              • Online encoder/decoder (CyberChef)
              • Postman – I love using this tool to learn a new API or to share quick Python/Bash snippets with a customer.
              • jq – one of my favorite command line tools. It’s like sed or awk for JSON. Also, a quick and dirty JSON pretty printer/validator at the command line.
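
              For example, validating, pretty-printing, and plucking a value out of a JSON response is a one-liner (toy data, obviously):

              $ echo '{"user": {"name": "kelly", "id": 42}}' | jq '.user.name'
              "kelly"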

              Containers and Microservices

              Don’t worry, all of your legacy networking skills (OSI layers 1-7) aren’t obsolete, but a lot of the lower layers are becoming abstracted away and more emphasis is being placed on layer 7 for security.

              I think it’s a good exercise to write a small, simple app in Python and package it up as a Docker container running standalone or in a Kubernetes cluster. Extra credit for learning Service Mesh technologies like Istio/Envoy and CI/CD Pipelines and tools like Jenkins.
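
              If you want a concrete starting point, here is a rough sketch (the file names and image tag are placeholders, and it assumes you have a small app.py that listens on port 8080):

              # Package a tiny Python app as a container and run it locally:
              $ cat > Dockerfile <<'EOF'
              FROM python:3.8-slim
              WORKDIR /app
              COPY app.py .
              CMD ["python", "app.py"]
              EOF
              $ docker build -t my-micro-app .
              $ docker run --rm -p 8080:8080 my-micro-app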

              It’s a big topic and a lot of things are changing rapidly, so this is an opportunity to learn something that is a bit bleeding edge but quickly becoming mainstream. The SEs who understand these technologies will be the most relevant in 2020 and beyond as their customers transition to them.

              To get started, make sure these tools are in your tool belt:

              • Docker Desktop
              • Kubernetes Cluster (I use Google GKE, but you can also use something like Amazon EKS or Azure AKS)

              Penetration Testing/Hacking

              Of course, we can’t forget the basics of security, including pen testing and hacking tools that will enable you to test and demonstrate your technologies and solutions.

              • netcat (aka ncat or nc) – this is one of the first command line tools I install on my laptop. It’s a Swiss army knife for network testing.
              • nmap – another must have at the command line – tried and true for many years.
              • Kali Linux – here is a nice summary.
              • Application security test tools available from the OWASP site.
              • Virus Total – just be careful you don’t upload sensitive files or compromise an ongoing investigation by uploading a file the incident responders are still reversing.
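
              A couple of quick examples of the first two in action (the host name is a placeholder):

              # Check whether a single TCP port is open (zero-I/O mode, verbose):
              $ nc -zv www.example.com 443
              # Scan common ports and fingerprint service versions:
              $ nmap -sV www.example.com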

              There are so many more tools for this section but they will typically be dependent on the type of security products you support.

              2020 and Beyond!

              There’s no shortage of things to learn and tools in the toolbox, though I have noticed that my laptop bag is a lot lighter these days! What are your favorite tools that I have missed?

              Featured

              Microservice Security Design Patterns for Kubernetes (Part 3)

              The Security Service Layer Pattern

              In Part 1 of this series on microservices security patterns for Kubernetes we went over three design patterns that enable micro-segmentation and deep inspection of the application and API traffic between microservices:

              1. Security Service Layer Pattern
              2. Security Sidecar Pattern
              3. Service Mesh Security Plugin Pattern

              In Part 2 we set up a simple, insecure deployment and demonstrated application layer attacks and the lack of micro-segmentation. In this post we will take that insecure deployment and implement a Security Service Layer Pattern to block application layer attacks and enforce strict segmentation between services.

              The Insecure Deployment

              Let’s take a quick look at the insecure deployment from Part 2:

              Figure 1: Insecure Deployment


              As demonstrated before, all microsim services can communicate with each other and there is no deep inspection implemented to block application layer attacks like SQLi. In this post, we will be implementing this servicelayer.yaml deployment that adds modsecurity reverse proxy WAF Pods with the Core Rule Set in front of the microsim services. modsecurity will perform deep inspection on the JSON/HTTP traffic and block application layer attacks.

              Then we will add on a Kubernetes Network Policy to enforce segmentation between the services. In the end, the deployment will look like this:

              Figure 2: Security Service Layer Pattern

              Security Service Layer Deployment Spec

              You’ll notice that each original service has been split into two services: a modsecurity WAF service (in orange) and the original service (in blue). Let’s take a look at the deployment YAML file to understand how this pattern works.

              The Security Service Layer Pattern does add quite a few lines to our deployment file, but they are simple additions. We’ll just need to keep our port numbers and service names straight as we add the WAF layers into the deployment.

              Let’s take a closer look at the components that have changed from the insecure deployment.

              www Deployment

              apiVersion: apps/v1
              kind: Deployment
              metadata:
                name: www
              spec:
                replicas: 3
                selector:
                  matchLabels:
                    app: www
                template:
                  metadata:
                    labels:
                      app: www
                  spec:
                    containers:
                    - name: modsecurity
                      image: owasp/modsecurity-crs:v3.2-modsec2-apache
                      ports:
                      - containerPort: 80
                      env:
                      - name: SETPROXY
                        value: "True"
                      - name: PROXYLOCATION
                        value: "http://wwworigin.default.svc.cluster.local:8080/"

              We see three replicas of the official OWASP modsecurity container available on Docker Hub configured as a reverse proxy WAF listening on TCP port 80. All requests that go to any of these WAF instances will be inspected and proxied to the origin service, wwworigin, on TCP port 8080. wwworigin is the original Service and Deployment from the insecure deployment.

              These WAF containers are effectively impersonating the original service so the user or application does not need to modify its configuration. One nice thing about this design is that it allows you to scale the security layer independent from the application. For instance, you might only require two modsecurity Pods to secure 10 of your application Pods.
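
              For instance (a sketch, not from the original walkthrough), you could run the application at ten replicas while keeping the WAF layer at two:

              # The WAF layer and the application now scale independently:
              $ kubectl scale deployment wwworigin --replicas=10
              $ kubectl scale deployment www --replicas=2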

              Now, let’s take a look at the www Service that points to this WAF deployment.

              www Service

              apiVersion: v1
              kind: Service
              metadata:
                labels:
                  app: www
                name: www
              spec:
                externalTrafficPolicy: Local
                ports:
                - port: 80
                  targetPort: 80
                selector:
                  app: www
                sessionAffinity: None
                type: LoadBalancer

              Nothing too fancy here – just forwarding TCP port 80 application traffic to TCP port 80 on the modsecurity WAF Pods since that is the port they listen on. Since this is an externally facing service we are using type: LoadBalancer and externalTrafficPolicy: Local just like the original Service did.

              Next, let’s check out the wwworigin Deployment spec where the original application Pods are defined.

              wwworigin Deployment

              apiVersion: apps/v1
              kind: Deployment
              metadata:
                name: wwworigin
              spec:
                replicas: 3
                selector:
                  matchLabels:
                    app: wwworigin
                template:
                  metadata:
                    labels:
                      app: wwworigin
                  spec:
                    containers:
                    - name: microsimserver
                      image: kellybrazil/microsimserver
                      env:
                      - name: STATS_PORT
                        value: "5000"
                      ports:
                      - containerPort: 8080
                    - name: microsimclient
                      image: kellybrazil/microsimclient
                      env:
                      - name: REQUEST_URLS
                        value: "http://auth.default.svc.cluster.local:80,http://db.default.svc.cluster.local:80"
                      - name: SEND_SQLI
                        value: "True"
                      - name: STATS_PORT
                        value: "5001"

              There’s a lot going on here, but basically it’s nearly identical to what we had in the insecure deployment. The only things that have changed are the name of the Deployment (from www to wwworigin) and the REQUEST_URLS destination ports (from 8080 to 80). This is because the modsecurity WAF containers listen on port 80 and they are the true front end to the auth and db services.

              Next, let’s take a look at the wwworigin Service spec.

              wwworigin Service

              apiVersion: v1
              kind: Service
              metadata:
                labels:
                  app: wwworigin
                name: wwworigin
              spec:
                ports:
                - port: 8080
                  targetPort: 8080
                selector:
                  app: wwworigin
                sessionAffinity: None

              The only change to the original deployment here is that we changed the name from www to wwworigin and the port from 80 to 8080 since the origin Pods are now internal and not directly exposed to the internet.

              Now we need to repeat this process for the auth and db services. Since they are configured the same way, we will only go over the db Deployment and Service. Remember, there is now a db (WAF) and dborigin (application) Deployment and Service that we need to define.

              db Deployment

              apiVersion: apps/v1
              kind: Deployment
              metadata:
                name: db
              spec:
                replicas: 3
                selector:
                  matchLabels:
                    app: db
                template:
                  metadata:
                    labels:
                      app: db
                  spec:
                    containers:
                    - name: modsecurity
                      image: owasp/modsecurity-crs:v3.2-modsec2-apache
                      ports:
                      - containerPort: 80
                      env:
                      - name: SETPROXY
                        value: "True"
                      - name: PROXYLOCATION
                        value: "http://dborigin.default.svc.cluster.local:8080/"

              This is essentially the same as the www Deployment except we are proxying to dborigin. The WAF containers listen on port 80 and then they proxy the traffic to port 8080 on the origin application service.

              db Service

              apiVersion: v1
              kind: Service
              metadata:
                labels:
                  app: db
                name: db
              spec:
                ports:
                - port: 80
                  targetPort: 80
                selector:
                  app: db
                sessionAffinity: None

              Again, nothing fancy here – just listening on TCP port 80, which is what the modsecurity WAF containers listen on. This is an internal service so no need for type: LoadBalancer or externalTrafficPolicy: Local.

              Finally, let’s take a look at the dborigin Deployment and Service.

              dborigin Deployment

              apiVersion: apps/v1
              kind: Deployment
              metadata:
                name: dborigin
              spec:
                replicas: 3
                selector:
                  matchLabels:
                    app: dborigin
                template:
                  metadata:
                    labels:
                      app: dborigin
                  spec:
                    containers:
                    - name: microsimserver
                      image: kellybrazil/microsimserver
                      ports:
                      - containerPort: 8080
                      env:
                      - name: STATS_PORT
                        value: "5000"

              This Deployment is essentially the same as the original, except the name has been changed from db to dborigin.

              dborigin Service

              apiVersion: v1
              kind: Service
              metadata:
                labels:
                  app: dborigin
                name: dborigin
              spec:
                ports:
                - port: 8080
                  targetPort: 8080
                selector:
                  app: dborigin
                sessionAffinity: None

              Again, the only change from the original here is the name from db to dborigin.

              Now that we understand how the Deployment and Service specs work, let’s apply them on our Kubernetes cluster.

              See Part 2 for more information on setting up the cluster.

              Applying the Deployments and Services

              First, let’s delete the original insecure deployment in Cloud Shell if it is still running:

              $ kubectl delete -f simple.yaml

              Your Pods, Deployments, and Services should be empty before you proceed:

              $ kubectl get pods
              No resources found.
              $ kubectl get deploy
              No resources found.
              $ kubectl get services
              NAME         TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
              kubernetes   ClusterIP   10.12.0.1    <none>        443/TCP   3m46s

              Next, copy/paste the deployment text into a file called servicelayer.yaml using vi. Then apply the deployment with kubectl:

              $ kubectl apply -f servicelayer.yaml
              deployment.apps/www created
              deployment.apps/wwworigin created
              deployment.apps/auth created
              deployment.apps/authorigin created
              deployment.apps/db created
              deployment.apps/dborigin created
              service/www created
              service/auth created
              service/db created
              service/wwworigin created
              service/authorigin created
              service/dborigin created

              Testing the Deployment

              Once the www service has an external IP, you can send an HTTP GET or POST request to it from Cloud Shell or your laptop:

              $ kubectl get services
              NAME         TYPE           CLUSTER-IP     EXTERNAL-IP   PORT(S)        AGE
              auth         ClusterIP      10.12.14.41    <none>        80/TCP         52s
              authorigin   ClusterIP      10.12.5.222    <none>        8080/TCP       52s
              db           ClusterIP      10.12.9.224    <none>        80/TCP         52s
              dborigin     ClusterIP      10.12.13.80    <none>        8080/TCP       51s
              kubernetes   ClusterIP      10.12.0.1      <none>        443/TCP        7m43s
              www          LoadBalancer   10.12.13.193   34.66.99.16   80:30394/TCP   52s
              wwworigin    ClusterIP      10.12.6.122    <none>        8080/TCP       52s
              $ curl 34.66.99.16
              ...o7yXXg70Olfu2MvVsm9kos8ksEXyzX4oYnZ7wQh29FaqSF
              Thu Dec 19 00:58:15 2019   hostname: wwworigin-6c8fb48f79-frmk9   ip: 10.8.1.9   remote: 10.8.0.7   hostheader: wwworigin.default.svc.cluster.local:8080   path: /

              You can probably already see some interesting side effects of this deployment. The originating IP address is now the IP address of the WAF that handled the request (10.8.0.7 in this case). Since the WAF is deployed as a reverse proxy, the only way to get the originating IP information will be via HTTP headers, such as X-Forwarded-For (XFF). Also, the host header has now changed, so keep this in mind if the application is expecting certain values in the headers.

              We can do a quick check to see if the modsecurity WAF is inspecting traffic by sending an HTTP POST request with no data or size information. This will be seen as an anomalous request and blocked:

              $ curl -X POST http://34.66.99.16
              <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
              <html><head>
              <title>403 Forbidden</title>
              </head><body>
              <h1>Forbidden</h1>
              <p>You don't have permission to access /
              on this server.<br />
              </p>
              </body></html>

              That looks good! Now let’s take a look at the microsim stats to see if the WAF layers are blocking the East/West SQLi attacks. Let’s open two tabs in Cloud Shell: one for shell access to a wwworigin container and another for shell access to a dborigin container.

              In the first tab, use kubectl to find the name of one of the wwworigin pods and shell into the microsimclient container running in it:

              $ kubectl get pods
              NAME                          READY   STATUS    RESTARTS   AGE
              auth-865675dd7f-4nld7         1/1     Running   0          23m
              auth-865675dd7f-7xsks         1/1     Running   0          23m
              auth-865675dd7f-lzdzg         1/1     Running   0          23m
              authorigin-5f6b795dcd-47gwn   1/1     Running   0          23m
              authorigin-5f6b795dcd-r5lr2   1/1     Running   0          23m
              authorigin-5f6b795dcd-xb68n   1/1     Running   0          23m
              db-dc6f6f5f9-b2j2f            1/1     Running   0          23m
              db-dc6f6f5f9-kb5q9            1/1     Running   0          23m
              db-dc6f6f5f9-wmj4n            1/1     Running   0          23m
              dborigin-7dc8d69f86-6mj2d     1/1     Running   0          23m
              dborigin-7dc8d69f86-bvpdn     1/1     Running   0          23m
              dborigin-7dc8d69f86-n42vg     1/1     Running   0          23m
              www-7cdc675f9-bhrhp           1/1     Running   0          23m
              www-7cdc675f9-dldhq           1/1     Running   0          23m
              www-7cdc675f9-rlqwv           1/1     Running   0          23m
              wwworigin-6c8fb48f79-9tq5t    2/2     Running   0          23m
              wwworigin-6c8fb48f79-frmk9    2/2     Running   0          23m
              wwworigin-6c8fb48f79-tltzd    2/2     Running   0          23m
              $ kubectl exec wwworigin-6c8fb48f79-9tq5t -c microsimclient -it sh
              /app #

              Then curl to the microsimclient stats server on localhost:5001:

              /app # curl localhost:5001
              {
                "time": "Thu Dec 19 01:26:24 2019",
                "runtime": 1855,
                "hostname": "wwworigin-6c8fb48f79-9tq5t",
                "ip": "10.8.0.10",
                "stats": {
                  "Requests": 1848,
                  "Sent Bytes": 1914528,
                  "Received Bytes": 30650517,
                  "Internet Requests": 0,
                  "Attacks": 18,
                  "SQLi": 18,
                  "XSS": 0,
                  "Directory Traversal": 0,
                  "DGA": 0,
                  "Malware": 0,
                  "Error": 0
                },
                "config": {
                  "STATS_PORT": 5001,
                  "STATSD_HOST": null,
                  "STATSD_PORT": 8125,
                  "REQUEST_URLS": "http://auth.default.svc.cluster.local:80,http://db.default.svc.cluster.local:80",
                  "REQUEST_INTERNET": false,
                  "REQUEST_MALWARE": false,
                  "SEND_SQLI": true,
                  "SEND_DIR_TRAVERSAL": false,
                  "SEND_XSS": false,
                  "SEND_DGA": false,
                  "REQUEST_WAIT_SECONDS": 1.0,
                  "REQUEST_BYTES": 1024,
                  "STOP_SECONDS": 0,
                  "STOP_PADDING": false,
                  "TOTAL_STOP_SECONDS": 0,
                  "REQUEST_PROBABILITY": 1.0,
                  "EGRESS_PROBABILITY": 0.1,
                  "ATTACK_PROBABILITY": 0.01
                }
              }

              Here we see 18 SQLi attacks have been sent to the auth and db services in the last 1855 seconds.

              Now, let’s see if the attacks are getting through like they did in the insecure deployment. In the other tab, find the name of one of the dborigin pods and shell into the microsimserver container running in it:

              $ kubectl exec dborigin-7dc8d69f86-6mj2d -c microsimserver -it sh
              /app #

              Then curl to the microsimserver stats server on localhost:5000:

              /app # curl localhost:5000
              {
                "time": "Thu Dec 19 01:29:00 2019",
                "runtime": 2013,
                "hostname": "dborigin-7dc8d69f86-6mj2d",
                "ip": "10.8.2.10",
                "stats": {
                  "Requests": 1009,
                  "Sent Bytes": 16733599,
                  "Received Bytes": 1045324,
                  "Attacks": 0,
                  "SQLi": 0,
                  "XSS": 0,
                  "Directory Traversal": 0
                },
                "config": {
                  "LISTEN_PORT": 8080,
                  "STATS_PORT": 5000,
                  "STATSD_HOST": null,
                  "STATSD_PORT": 8125,
                  "RESPOND_BYTES": 16384,
                  "STOP_SECONDS": 0,
                  "STOP_PADDING": false,
                  "TOTAL_STOP_SECONDS": 0
                }
              }

              Remember, in the insecure deployment we saw the SQLi value incrementing. Now that the modsecurity WAF is inspecting the East/West traffic, the SQLi attacks are no longer getting through, though we still see normal Requests, Sent Bytes, and Received Bytes incrementing.

              modsecurity Logs

              Let’s check the modsecurity logs to see how the East/West application attacks are being identified. To see the modsecurity audit log we’ll need to shell into one of the WAF containers and look at the /var/log/modsec_audit.log file:

              $ kubectl exec db-dc6f6f5f9-b2j2f -it sh
              /app # grep -C 60 sql /var/log/modsec_audit.log
              <snip>
              --fa628b64-A--
              [19/Dec/2019:03:06:44 +0000] XfrpRArFgedF@mTDKh9QvAAAAI4 10.8.1.9 60612 10.8.2.9 80
              --fa628b64-B--
              GET /?username=joe%40example.com&password=%3BUNION+SELECT+1%2C+version%28%29+limit+1%2C1-- HTTP/1.1
              Host: db.default.svc.cluster.local
              User-Agent: python-requests/2.22.0
              Accept-Encoding: gzip, deflate
              Accept: */*
              Connection: keep-alive
              
              --fa628b64-F--
              HTTP/1.1 403 Forbidden
              Content-Length: 209
              Keep-Alive: timeout=5, max=100
              Connection: Keep-Alive
              Content-Type: text/html; charset=iso-8859-1
              
              --fa628b64-E--
              <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
              <html><head>
              <title>403 Forbidden</title>
              </head><body>
              <h1>Forbidden</h1>
              <p>You don't have permission to access /
              on this server.<br />
              </p>
              </body></html>
              
              --fa628b64-H--
              Message: Warning. Pattern match "(?i:(?:[\"'`](?:;?\\s*?(?:having|select|union)\\b\\s*?[^\\s]|\\s*?!\\s*?[\"'`\\w])|(?:c(?:onnection_id|urrent_user)|database)\\s*?\\([^\\)]*?|u(?:nion(?:[\\w(\\s]*?select| select @)|ser\\s*?\\([^\\)]*?)|s(?:chema\\s*?\\([^\\)]*?|elect.*?\\w?user\\()|in ..." at ARGS:password. [file "/etc/modsecurity.d/owasp-crs/rules/REQUEST-942-APPLICATION-ATTACK-SQLI.conf"] [line "190"] [id "942190"] [msg "Detects MSSQL code execution and information gathering attempts"] [data "Matched Data: UNION SELECT found within ARGS:password: ;UNION SELECT 1, version() limit 1,1--"] [severity "CRITICAL"] [ver "OWASP_CRS/3.2.0"] [tag "application-multi"] [tag "language-multi"] [tag "platform-multi"] [tag "attack-sqli"] [tag "OWASP_CRS"] [tag "OWASP_CRS/WEB_ATTACK/SQL_INJECTION"] [tag "WASCTC/WASC-19"] [tag "OWASP_TOP_10/A1"] [tag "OWASP_AppSensor/CIE1"] [tag "PCI/6.5.2"]
              Message: Warning. Pattern match "(?i:(?:^[\\W\\d]+\\s*?(?:alter\\s*(?:a(?:(?:pplication\\s*rol|ggregat)e|s(?:ymmetric\\s*ke|sembl)y|u(?:thorization|dit)|vailability\\s*group)|c(?:r(?:yptographic\\s*provider|edential)|o(?:l(?:latio|um)|nversio)n|ertificate|luster)|s(?:e(?:rv(?:ice|er)| ..." at ARGS:password. [file "/etc/modsecurity.d/owasp-crs/rules/REQUEST-942-APPLICATION-ATTACK-SQLI.conf"] [line "471"] [id "942360"] [msg "Detects concatenated basic SQL injection and SQLLFI attempts"] [data "Matched Data: ;UNION SELECT found within ARGS:password: ;UNION SELECT 1, version() limit 1,1--"] [severity "CRITICAL"] [ver "OWASP_CRS/3.2.0"] [tag "application-multi"] [tag "language-multi"] [tag "platform-multi"] [tag "attack-sqli"] [tag "OWASP_CRS"] [tag "OWASP_CRS/WEB_ATTACK/SQL_INJECTION"] [tag "WASCTC/WASC-19"] [tag "OWASP_TOP_10/A1"] [tag "OWASP_AppSensor/CIE1"] [tag "PCI/6.5.2"]
              Message: Access denied with code 403 (phase 2). Operator GE matched 5 at TX:anomaly_score. [file "/etc/modsecurity.d/owasp-crs/rules/REQUEST-949-BLOCKING-EVALUATION.conf"] [line "91"] [id "949110"] [msg "Inbound Anomaly Score Exceeded (Total Score: 10)"] [severity "CRITICAL"] [tag "application-multi"] [tag "language-multi"] [tag "platform-multi"] [tag "attack-generic"]
              Message: Warning. Operator GE matched 5 at TX:inbound_anomaly_score. [file "/etc/modsecurity.d/owasp-crs/rules/RESPONSE-980-CORRELATION.conf"] [line "86"] [id "980130"] [msg "Inbound Anomaly Score Exceeded (Total Inbound Score: 10 - SQLI=10,XSS=0,RFI=0,LFI=0,RCE=0,PHPI=0,HTTP=0,SESS=0): individual paranoia level scores: 10, 0, 0, 0"] [tag "event-correlation"]
              Apache-Error: [file "apache2_util.c"] [line 273] [level 3] [client 10.8.1.9] ModSecurity: Warning. Pattern match "(?i:(?:[\\\\"'`](?:;?\\\\\\\\s*?(?:having|select|union)\\\\\\\\b\\\\\\\\s*?[^\\\\\\\\s]|\\\\\\\\s*?!\\\\\\\\s*?[\\\\"'`\\\\\\\\w])|(?:c(?:onnection_id|urrent_user)|database)\\\\\\\\s*?\\\\\\\\([^\\\\\\\\)]*?|u(?:nion(?:[\\\\\\\\w(\\\\\\\\s]*?select| select @)|ser\\\\\\\\s*?\\\\\\\\([^\\\\\\\\)]*?)|s(?:chema\\\\\\\\s*?\\\\\\\\([^\\\\\\\\)]*?|elect.*?\\\\\\\\w?user\\\\\\\\()|in ..." at ARGS:password. [file "/etc/modsecurity.d/owasp-crs/rules/REQUEST-942-APPLICATION-ATTACK-SQLI.conf"] [line "190"] [id "942190"] [msg "Detects MSSQL code execution and information gathering attempts"] [data "Matched Data: UNION SELECT found within ARGS:password: ;UNION SELECT 1, version() limit 1,1--"] [severity "CRITICAL"] [ver "OWASP_CRS/3.2.0"] [tag "application-multi"] [tag "language-multi"] [tag "platform-multi"] [tag "attack-sqli"] [tag "OWASP_CRS"] [tag "OWASP_CRS/WEB_ATTACK/SQL_INJECTION"] [tag "WASCTC/WASC-19"] [tag "OWASP_TOP_10/A1"] [tag "OWASP_AppSensor/CIE1"] [tag "PCI/6.5.2"] [hostname "db.default.svc.cluster.local"] [uri "/"] [unique_id "XfrpRArFgedF@mTDKh9QvAAAAI4"]
              Apache-Error: [file "apache2_util.c"] [line 273] [level 3] [client 10.8.1.9] ModSecurity: Warning. Pattern match "(?i:(?:^[\\\\\\\\W\\\\\\\\d]+\\\\\\\\s*?(?:alter\\\\\\\\s*(?:a(?:(?:pplication\\\\\\\\s*rol|ggregat)e|s(?:ymmetric\\\\\\\\s*ke|sembl)y|u(?:thorization|dit)|vailability\\\\\\\\s*group)|c(?:r(?:yptographic\\\\\\\\s*provider|edential)|o(?:l(?:latio|um)|nversio)n|ertificate|luster)|s(?:e(?:rv(?:ice|er)| ..." at ARGS:password. [file "/etc/modsecurity.d/owasp-crs/rules/REQUEST-942-APPLICATION-ATTACK-SQLI.conf"] [line "471"] [id "942360"] [msg "Detects concatenated basic SQL injection and SQLLFI attempts"] [data "Matched Data: ;UNION SELECT found within ARGS:password: ;UNION SELECT 1, version() limit 1,1--"] [severity "CRITICAL"] [ver "OWASP_CRS/3.2.0"] [tag "application-multi"] [tag "language-multi"] [tag "platform-multi"] [tag "attack-sqli"] [tag "OWASP_CRS"] [tag "OWASP_CRS/WEB_ATTACK/SQL_INJECTION"] [tag "WASCTC/WASC-19"] [tag "OWASP_TOP_10/A1"] [tag "OWASP_AppSensor/CIE1"] [tag "PCI/6.5.2"] [hostname "db.default.svc.cluster.local"] [uri "/"] [unique_id "XfrpRArFgedF@mTDKh9QvAAAAI4"]
              Apache-Error: [file "apache2_util.c"] [line 273] [level 3] [client 10.8.1.9] ModSecurity: Access denied with code 403 (phase 2). Operator GE matched 5 at TX:anomaly_score. [file "/etc/modsecurity.d/owasp-crs/rules/REQUEST-949-BLOCKING-EVALUATION.conf"] [line "91"] [id "949110"] [msg "Inbound Anomaly Score Exceeded (Total Score: 10)"] [severity "CRITICAL"] [tag "application-multi"] [tag "language-multi"] [tag "platform-multi"] [tag "attack-generic"] [hostname "db.default.svc.cluster.local"] [uri "/"] [unique_id "XfrpRArFgedF@mTDKh9QvAAAAI4"]
              Apache-Error: [file "apache2_util.c"] [line 273] [level 3] [client 10.8.1.9] ModSecurity: Warning. Operator GE matched 5 at TX:inbound_anomaly_score. [file "/etc/modsecurity.d/owasp-crs/rules/RESPONSE-980-CORRELATION.conf"] [line "86"] [id "980130"] [msg "Inbound Anomaly Score Exceeded (Total Inbound Score: 10 - SQLI=10,XSS=0,RFI=0,LFI=0,RCE=0,PHPI=0,HTTP=0,SESS=0): individual paranoia level scores: 10, 0, 0, 0"] [tag "event-correlation"] [hostname "db.default.svc.cluster.local"] [uri "/"] [unique_id "XfrpRArFgedF@mTDKh9QvAAAAI4"]
              Action: Intercepted (phase 2)
              Apache-Handler: proxy-server
              Stopwatch: 1576724804853810 2752 (- - -)
              Stopwatch2: 1576724804853810 2752; combined=2296, p1=669, p2=1340, p3=0, p4=0, p5=287, sr=173, sw=0, l=0, gc=0
              Response-Body-Transformed: Dechunked
              Producer: ModSecurity for Apache/2.9.3 (http://www.modsecurity.org/); OWASP_CRS/3.2.0.
              Server: Apache
              Engine-Mode: "ENABLED"
              
              --fa628b64-Z--

Here we see ModSecurity has blocked and logged the East/West SQLi attack from one of the wwworigin containers to a dborigin container. Excellent!

But there’s still a bit more to do. Even though we are now inspecting and protecting traffic at the application layer, we are not yet enforcing micro-segmentation between the services. That means that, even with the WAFs in place, any authorigin container can communicate with any dborigin container. We can demonstrate this by opening a shell on an authorigin container and attempting to send a simulated SQLi to a dborigin container from it:

              # curl 'http://dborigin:8080/?username=joe%40example.com&password=%3BUNION+SELECT+1%2C+version%28%29+limit+1%2C1--'
              X7fJ4MnlHo5gzJFQ1...
              Thu Dec 19 04:54:25 2019   hostname: dborigin-7dc8d69f86-6mj2d   ip: 10.8.2.10   remote: 10.8.2.5   hostheader: dborigin:8080   path: /?username=joe%40example.com&password=%3BUNION+SELECT+1%2C+version%28%29+limit+1%2C1--

              Not only can they communicate – we have completely bypassed the WAF! Let’s fix this with Network Policy.

              Network Policy

              Here is a Network Policy spec that will control the ingress to each internal pod. I tried to keep the rules simple, but in a production deployment a tighter policy would likely be desired. For example, you would probably also want to include Egress policies.

              apiVersion: networking.k8s.io/v1
              kind: NetworkPolicy
              metadata:
                name: wwworigin-ingress
                namespace: default
              spec:
                podSelector:
                  matchLabels:
                    app: wwworigin
                policyTypes:
                - Ingress
                ingress:
                - from:
                  - podSelector:
                      matchLabels:
                        app: www
                  ports:
                  - protocol: TCP
                    port: 8080
              ---
              apiVersion: networking.k8s.io/v1
              kind: NetworkPolicy
              metadata:
                name: auth-ingress
                namespace: default
              spec:
                podSelector:
                  matchLabels:
                    app: auth
                policyTypes:
                - Ingress
                ingress:
                - from:
                  - podSelector:
                      matchLabels:
                        app: wwworigin
                  ports:
                  - protocol: TCP
                    port: 80
              ---
              apiVersion: networking.k8s.io/v1
              kind: NetworkPolicy
              metadata:
                name: db-ingress
                namespace: default
              spec:
                podSelector:
                  matchLabels:
                    app: db
                policyTypes:
                - Ingress
                ingress:
                - from:
                  - podSelector:
                      matchLabels:
                        app: wwworigin
                  ports:
                  - protocol: TCP
                    port: 80
              ---
              apiVersion: networking.k8s.io/v1
              kind: NetworkPolicy
              metadata:
                name: authorigin-ingress
                namespace: default
              spec:
                podSelector:
                  matchLabels:
                    app: authorigin
                policyTypes:
                - Ingress
                ingress:
                - from:
                  - podSelector:
                      matchLabels:
                        app: auth
                  ports:
                  - protocol: TCP
                    port: 8080
              ---
              apiVersion: networking.k8s.io/v1
              kind: NetworkPolicy
              metadata:
                name: dborigin-ingress
                namespace: default
              spec:
                podSelector:
                  matchLabels:
                    app: dborigin
                policyTypes:
                - Ingress
                ingress:
                - from:
                  - podSelector:
                      matchLabels:
                        app: db
                  ports:
                  - protocol: TCP
                    port: 8080

Even with a simple Network Policy you can see one of the downsides of the Security Services Layer Pattern: it can be tedious to define the proper micro-segmentation policy without making errors.

In plain language, these policies say:

• On the wwworigin containers, only accept traffic from the www containers destined for TCP port 8080
• On the auth containers, only accept traffic from the wwworigin containers destined for TCP port 80
• On the db containers, only accept traffic from the wwworigin containers destined for TCP port 80
• On the authorigin containers, only accept traffic from the auth containers destined for TCP port 8080
• On the dborigin containers, only accept traffic from the db containers destined for TCP port 8080

Not fun! In a large deployment with many services this can quickly get out of hand, and errors are easy to make as you trace the traffic flow between each service. That’s why a Service Mesh is probably a better choice for an application with more than a few services.

So let’s see if this works. Copy the Network Policy text into a file named servicelayer-network-policy.yaml with vi and apply it to the cluster with kubectl:

              $ kubectl create -f servicelayer-network-policy.yaml
              networkpolicy.networking.k8s.io/wwworigin-ingress created
              networkpolicy.networking.k8s.io/auth-ingress created
              networkpolicy.networking.k8s.io/db-ingress created
              networkpolicy.networking.k8s.io/authorigin-ingress created
              networkpolicy.networking.k8s.io/dborigin-ingress created
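
Before re-testing, you can confirm the policies are in place with kubectl:

$ kubectl get networkpolicy
$ kubectl describe networkpolicy dborigin-ingress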

              And now let’s try that simulated SQLi attack again from authorigin to dborigin:

              /var/log # curl 'http://dborigin:8080/?username=joe%40example.com&password=%3BUNION+SELECT+1%2C+version%28%29+limit+1%2C1--'
              curl: (7) Failed to connect to dborigin port 8080: Operation timed out

              Success!

Finally, let’s double-check that the rest of the application is still working by checking the dborigin logs. If we are still seeing legitimate requests then we should be good to go:

              $ kubectl logs -f dborigin-7dc8d69f86-6mj2d
              <snip>
              10.8.2.6 - - [19/Dec/2019 05:23:26] "POST / HTTP/1.1" 200 -
              10.8.2.6 - - [19/Dec/2019 05:23:28] "POST / HTTP/1.1" 200 -
              10.8.2.9 - - [19/Dec/2019 05:23:31] "POST / HTTP/1.1" 200 -
              10.8.2.6 - - [19/Dec/2019 05:23:33] "POST / HTTP/1.1" 200 -
              10.8.0.11 - - [19/Dec/2019 05:23:34] "POST / HTTP/1.1" 200 -
              10.8.2.9 - - [19/Dec/2019 05:23:34] "POST / HTTP/1.1" 200 -
              10.8.2.6 - - [19/Dec/2019 05:23:35] "POST / HTTP/1.1" 200 -
              10.8.2.9 - - [19/Dec/2019 05:23:39] "POST / HTTP/1.1" 200 -
              10.8.2.9 - - [19/Dec/2019 05:23:40] "POST / HTTP/1.1" 200 -
              10.8.2.9 - - [19/Dec/2019 05:23:41] "POST / HTTP/1.1" 200 -
              {"Total": {"Requests": 8056, "Sent Bytes": 133603375, "Received Bytes": 8342908, "Attacks": 1, "SQLi": 1, "XSS": 0, "Directory Traversal": 0}, "Last 30 Seconds": {"Requests": 17, "Sent Bytes": 281932, "Received Bytes": 17612, "Attacks": 0, "SQLi": 0, "XSS": 0, "Directory Traversal": 0}}
              10.8.2.6 - - [19/Dec/2019 05:23:43] "POST / HTTP/1.1" 200 -
              10.8.2.6 - - [19/Dec/2019 05:23:43] "POST / HTTP/1.1" 200 -

Nice! We see the service is still getting requests with the Network Policy in place. We can even see the test SQLi request we sent earlier when we bypassed the WAF, but no SQLi attacks have been seen since the Network Policy was applied.
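
Since those 30-second summary lines are JSON, we can even let jq pull the attack counters out for us instead of eyeballing the log. A quick sketch, using the pod name from above:

$ kubectl logs dborigin-7dc8d69f86-6mj2d | grep '^{' | tail -1 | \
    jq '{total_attacks: .Total.Attacks, last_30s_attacks: .["Last 30 Seconds"].Attacks}'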

              Conclusion

Whew – that was fun! As you can see, the Security Services Layer Pattern can lock down an application made up of a few microservices that need to communicate with each other, but with more than a handful of services things get complicated quickly. It does have the advantage, however, of allowing you to scale the security layers and the application layers independently.

              Stay tuned for the next post where we’ll go over the Security Sidecar Pattern and we’ll see the advantages and disadvantages of that approach.

              Next in the series: Part 4

              Featured

              JC Version 1.6.1 Released

              Try the jc web demo!

              I’m happy to announce that jc version 1.6.1 has been released and is available on github and pypi.

              To upgrade, run:

              $ pip install --upgrade jc

              New Parsers

jc now includes 32 parsers! New parsers (tested on Linux and OSX) include:

              • du
              • crontab files
              • pip list
              • pip show

              Updated Parsers

The ifconfig parser now outputs rx_bytes and tx_bytes as integers.
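
Since these fields are now real JSON integers, jq can do math on them directly. For example, a quick one-liner that totals received bytes across all interfaces (interfaces that don’t report counters show up as null, which jq’s add simply ignores):

$ ifconfig | jc --ifconfig | jq 'map(.rx_bytes) | add'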

              More OSX Support

              Version 1.6.1 provides more OSX support and testing for several existing parsers, including:

              • ifconfig
              • arp
              • df
              • mount
              • uname -a
              • ls
              • dig
              • ps
              • w
              • uptime

              About JC Information

jc now has an about option that shows the version of jc and all of the included parsers. Other information, including parser compatibility and authorship, is also shown in JSON format.

              $ jc -a -p
              {
                "name": "jc",
                "version": "1.6.1",
                "description": "jc cli output JSON conversion tool",
                "author": "Kelly Brazil",
                "author_email": "kellyjonbrazil@gmail.com",
                "parser_count": 32,
                "parsers": [
                  {
                    "name": "arp",
                    "argument": "--arp",
                    "version": "1.1",
                    "description": "arp parser",
                    "author": "Kelly Brazil",
                    "author_email": "kellyjonbrazil@gmail.com",
                    "compatible": [
                      "linux",
                      "aix",
                      "freebsd",
                      "darwin"
                    ]
                  },
                  {
                    "name": "crontab",
                    "argument": "--crontab",
                    "version": "1.0",
                    "description": "crontab file parser",
                    "author": "Kelly Brazil",
                    "author_email": "kellyjonbrazil@gmail.com",
                    "compatible": [
                      "linux",
                      "darwin",
                      "aix",
                      "freebsd"
                    ]
                  },
                  ...
                ]
              }
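
And since this output is itself JSON, you can query it with jq like anything else. For example, a sketch that lists the names of all parsers flagged as compatible with darwin:

$ jc -a | jq -r '.parsers[] | select(.compatible | index("darwin")) | .name'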

              Schema Changes

The ifconfig parser now prints the state value as a JSON array instead of a string. Also, as mentioned above, rx_bytes and tx_bytes are now output as integers.

              $ ifconfig lo | jc --ifconfig -p
              [
                {
                  "name": "lo",
                  "flags": 73,
                  "state": [
                    "UP",
                    "LOOPBACK",
                    "RUNNING"
                  ],
                  "mtu": 65536,
                  "ipv4_addr": "127.0.0.1",
                  "ipv4_mask": "255.0.0.0",
                  "ipv4_bcast": null,
                  "ipv6_addr": "::1",
                  "ipv6_mask": 128,
                  "ipv6_scope": "0x10",
                  "mac_addr": null,
                  "type": "Local Loopback",
                  "rx_packets": 0,
                  "rx_bytes": 0,
                  "rx_errors": 0,
                  "rx_dropped": 0,
                  "rx_overruns": 0,
                  "rx_frame": 0,
                  "tx_packets": 0,
                  "tx_bytes": 0,
                  "tx_errors": 0,
                  "tx_dropped": 0,
                  "tx_overruns": 0,
                  "tx_carrier": 0,
                  "tx_collisions": 0,
                  "metric": null
                }
              ]
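
The array format makes interface state checks much cleaner in jq. For example, a sketch that prints only the names of interfaces that are UP and RUNNING:

$ ifconfig | jc --ifconfig | jq -r '.[] | select(.state | contains(["UP", "RUNNING"])) | .name'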

              The df parser now uses an underscore instead of a dash in the “blocks” field name:

              $ df | jc --df -p
              [
                {
                  "filesystem": "devtmpfs",
                  "1k_blocks": 1918816,
                  "used": 0,
                  "available": 1918816,
                  "mounted_on": "/dev",
                  "use_percent": 0
                },
                ...
              ]

              Full Parser List

              • arp
              • crontab
              • df
              • dig
              • du
              • env
              • free
              • fstab
              • history
              • hosts
              • ifconfig
              • iptables
              • jobs
              • ls
              • lsblk
              • lsmod
              • lsof
              • mount
              • netstat
              • pip list
              • pip show
              • ps
              • route
              • ss
              • stat
              • systemctl
              • systemctl list-jobs
              • systemctl list-sockets
              • systemctl list-unit-files
              • uname -a
              • uptime
              • w

              For more information on the motivations for creating jc, see my blog post.

              Happy parsing!

              Featured

              Microservice Security Design Patterns for Kubernetes (Part 2)

              Setting Up the Insecure Deployment

              In Part 1 of this series on microservices security patterns for Kubernetes we went over three design patterns that enable micro-segmentation and deep inspection of the application and API traffic between microservices:

              1. Security Service Layer Pattern
              2. Security Sidecar Pattern
              3. Service Mesh Security Plugin Pattern

In this post we will lay the groundwork for a deep dive into the Security Service Layer Pattern with a live insecure deployment on Google Kubernetes Engine (GKE). By the end of this post you will be able to bring up an insecure deployment and demonstrate layer 7 attacks and unrestricted access between internal services. In the next post we will layer on a Security Service Layer Pattern to secure the application.

              The Base Deployment

Let’s first get our cluster up and running with a simple deployment that has no security and show what is possible in a nearly default state. We’ll use this simple.yaml deployment I created using my microsim app. microsim is a microservice simulator that can send simulated JSON/HTTP and application attack traffic between services. It also has logging and statistics reporting functionality that will allow us to see attacks being sent by the client and received or blocked by the server.
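
If you’d like to poke at microsim before involving Kubernetes, the images should also run standalone with Docker. A minimal sketch, using the same image and environment variable as the deployment below:

$ docker run -d -p 8080:8080 -e STATS_PORT=5000 kellybrazil/microsimserver
$ curl -X POST http://localhost:8080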

              Here is a diagram of the deployment.

              Figure 1: Simple Deployment

              insecure deployment

              In this microservice architecture we see three simulated services:

              1. Public Web interface service
              2. Internal Authentication service
              3. Internal Database service

              In the default state, all services are able to communicate with one another and there are no protections from application layer attacks. Let’s take a quick look at the Pod Deployments and Services in this application.

              www Deployment

              apiVersion: apps/v1
              kind: Deployment
              metadata:
                name: www
              spec:
                replicas: 3
                selector:
                  matchLabels:
                    app: www
                template:
                  metadata:
                    labels:
                      app: www
                  spec:
                    containers:
                    - name: microsimserver
                      image: kellybrazil/microsimserver
                      env:
                      - name: STATS_PORT
                        value: "5000"
                      ports:
                      - containerPort: 8080
                    - name: microsimclient
                      image: kellybrazil/microsimclient
                      env:
                      - name: REQUEST_URLS
                        value: "http://auth.default.svc.cluster.local:8080,http://db.default.svc.cluster.local:8080"
                      - name: SEND_SQLI
                        value: "True"
                      - name: STATS_PORT
                        value: "5001"

In the www deployment above we see three Pod replicas, each running two containers (microsimserver and microsimclient).

              The microsimserver container is configured to expose port 8080, which is the default port the service listens on. By default, the server will respond with 16KB of data and some diagnostic information in either plain HTTP or JSON/HTTP, depending on whether the request is an HTTP GET or POST.

The microsimclient container is configured to send a single 1KB JSON/HTTP POST request every second to http://auth.default.svc.cluster.local:8080 or http://db.default.svc.cluster.local:8080, which resolve to the internal auth and db Services via the default Kubernetes DNS resolver.

              We also see that microsimclient is configured to occasionally send SQLi attack traffic to the auth and db Services. There are many other behaviors that can be configured, but we’ll keep things simple.

              The stats server for microsimserver is configured to run on port 5000 and the stats server for microsimclient is configured to run on port 5001. These ports are not exposed to the cluster, so we will need to get shell access to the containers to see the stats.
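
Alternatively, kubectl port-forward can temporarily map a stats port to your local machine without exposing it to the cluster. A sketch, using one of the www Pod names shown later in this post:

$ kubectl port-forward www-5d89bcb54f-bcjm9 5001:5001
$ curl http://localhost:5001     # from a second terminal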

              Now, let’s look at the www service.

              www Service

              apiVersion: v1
              kind: Service
              metadata:
                labels:
                  app: www
                name: www
              spec:
                externalTrafficPolicy: Local
                ports:
                - port: 80
                  targetPort: 8080
                selector:
                  app: www
                sessionAffinity: None
                type: LoadBalancer

              The service is configured to publicly expose the www service via port 80 with a LoadBalancer type. The externalTrafficPolicy: Local option allows the originating IP address to be preserved within the cluster.

Now let’s take a look at the db deployment and service. The auth service is exactly the same as the db service, so we’ll skip going over that one.

              db Deployment

              apiVersion: apps/v1
              kind: Deployment
              metadata:
                name: db
              spec:
                replicas: 3
                selector:
                  matchLabels:
                    app: db
                template:
                  metadata:
                    labels:
                      app: db
                  spec:
                    containers:
                    - name: microsimserver
                      image: kellybrazil/microsimserver
                      env:
                      - name: STATS_PORT
                        value: "5000"
                      ports:
                      - containerPort: 8080

Just like the www service, there are three Pod replicas, but only one container (microsimserver) runs in each Pod. The default microsimserver listening port of 8080 is exposed, while the stats server listens on port 5000, which is not exposed, so we’ll need to shell into the container to view the stats.

              And here is the db Service:

              db Service

              apiVersion: v1
              kind: Service
              metadata:
                labels:
                  app: db
                name: db
              spec:
                ports:
                - port: 8080
                  targetPort: 8080
                selector:
                  app: db
                sessionAffinity: None

Since this is an internal service, we do not use the LoadBalancer type, which means the Service will be created with the default ClusterIP type; we also do not need to define externalTrafficPolicy.

              Firing up the Cluster

              Let’s bring up the cluster from within the GKE console. Create a standard cluster using the n1-standard-2 machine type with the Enable network policy option checked under the advanced Network security options:

              Figure 2: Enable network policy in GKE

              enable network policy

              Note: you can also create a cluster with network policy enabled at the command line with the --enable-network-policy argument:

              $ gcloud container clusters create test --machine-type=n1-standard-2 --enable-network-policy

Once the cluster is up and running, we can spin up the deployment using kubectl locally (after configuring it with the gcloud command) or use the Google Cloud Shell terminal. For simplicity, let’s use Cloud Shell and connect to the cluster:

              Figure 3: Connect to the Cluster via Cloud Shell

              run in Cloud Shell

Within Cloud Shell, copy and paste the deployment text into a new file called simple.yaml with vi.

              Then create the deployment:

              $ kubectl create -f simple.yaml
              deployment.apps/www created
              deployment.apps/auth created
              deployment.apps/db created
              service/www created
              service/auth created
              service/db created

              You will see the deployments and services start up. You can verify the application is running successfully with the following commands:

              $ kubectl get pods
              NAME                    READY   STATUS    RESTARTS   AGE
              auth-5f964774bd-mvtcl   1/1     Running   0          67s
              auth-5f964774bd-sn4cw   1/1     Running   0          66s
              auth-5f964774bd-xtt54   1/1     Running   0          66s
              db-578757bf68-dzjdq     1/1     Running   0          66s
              db-578757bf68-kkwzr     1/1     Running   0          66s
              db-578757bf68-mlf5t     1/1     Running   0          66s
              www-5d89bcb54f-bcjm9    2/2     Running   0          67s
              www-5d89bcb54f-bzpwl    2/2     Running   0          67s
              www-5d89bcb54f-vbdf6    2/2     Running   0          67s
              $ kubectl get deploy
              NAME   READY   UP-TO-DATE   AVAILABLE   AGE
              auth   3/3     3            3           92s
              db     3/3     3            3           92s
              www    3/3     3            3           92s
              $ kubectl get service
              NAME         TYPE           CLUSTER-IP    EXTERNAL-IP     PORT(S)        AGE
              auth         ClusterIP      10.0.13.227   <none>          8080/TCP       2m1s
              db           ClusterIP      10.0.3.1      <none>          8080/TCP       2m1s
              kubernetes   ClusterIP      10.0.0.1      <none>          443/TCP        10m
              www          LoadBalancer   10.0.6.39     35.188.221.11   80:32596/TCP   2m1s

              Find the external address assigned to the www service and send an HTTP GET request to it to verify the service is responding. You can do this from Cloud Shell or your laptop:

              $ curl http://35.188.221.11
              FPGpqiVZivddHQvkvDHFErFiW2WK8Kl3ky9cEeI7TA6vH8PYmA1obaZGd1AR3avz3SqPZlcrbXFOn3hVlFQdFm9S07ca
              <snip>
              jYbD5jNA62JEQbUSqk9V0JGgYLATbYe2rv3XeFQIEayJD4qeGnPp7UbEESPBmxrw
              Wed Dec 11 20:07:08 2019   hostname: www-5d89bcb54f-vbdf6   ip: 10.56.0.4   remote: 35.197.46.124   hostheader: 35.188.221.11   path: /

You should see a long block of random text and some client and server information on the last line. Notice that if you send the request as an HTTP POST, the response comes back as JSON. Here I have run the response through jq to pretty-print it:

              $ curl -X POST http://35.188.221.11 | jq .
              {
                "data": "hhV9jogGrM7FMxsQCUAcjdsLQRgjgpCoO...",
                "time": "Wed Dec 11 20:14:20 2019",
                "hostname": "www-5d89bcb54f-vbdf6",
                "ip": "10.56.0.4",
                "remote": "46.18.117.38",
                "hostheader": "35.188.221.11",
                "path": "/"
              }

              Testing the Deployment

              Now, let’s prove that any Pod can communicate with any other Pod and that the SQLi attacks are being received by the internal services. We can do this by opening a shell to one of the www pods and one of the db pods.

              Open two new tabs in Cloud Shell and find the Pod names from the kubectl get pods command output above.

              In one tab, run the following to get a shell on the microsimclient container in the www Pod:

              $ kubectl exec www-5d89bcb54f-bcjm9 -c microsimclient -it sh
              /app #

              In the other tab, run the following to get a shell on the microsimserver container in the db Pod:

              $ kubectl exec db-578757bf68-dzjdq -c microsimserver -it sh
              /app #

              From the microsimclient shell, run the following curl command to see the application stats. This will show us how many normal and attack requests have been sent:

              /app # curl http://localhost:5001
              {
                "time": "Wed Dec 11 20:21:30 2019",
                "runtime": 1031,
                "hostname": "www-5d89bcb54f-bcjm9",
                "ip": "10.56.1.3",
                "stats": {
                  "Requests": 1026,
                  "Sent Bytes": 1062936,
                  "Received Bytes": 17006053,
                  "Internet Requests": 0,
                  "Attacks": 9,
                  "SQLi": 9,
                  "XSS": 0,
                  "Directory Traversal": 0,
                  "DGA": 0,
                  "Malware": 0,
                  "Error": 1
                },
                "config": {
                  "STATS_PORT": 5001,
                  "STATSD_HOST": null,
                  "STATSD_PORT": 8125,
                  "REQUEST_URLS": "http://auth.default.svc.cluster.local:8080,http://db.default.svc.cluster.local:8080",
                  "REQUEST_INTERNET": false,
                  "REQUEST_MALWARE": false,
                  "SEND_SQLI": true,
                  "SEND_DIR_TRAVERSAL": false,
                  "SEND_XSS": false,
                  "SEND_DGA": false,
                  "REQUEST_WAIT_SECONDS": 1.0,
                  "REQUEST_BYTES": 1024,
                  "STOP_SECONDS": 0,
                  "STOP_PADDING": false,
                  "TOTAL_STOP_SECONDS": 0,
                  "REQUEST_PROBABILITY": 1.0,
                  "EGRESS_PROBABILITY": 0.1,
                  "ATTACK_PROBABILITY": 0.01
                }
              }

Run the command a few times until you see that a number of SQLi attacks have been sent. Here we see that this microsimclient instance has sent 9 SQLi attacks over its 1031 seconds of runtime.
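
Since the stats endpoint returns JSON, jq can also do the math for us. For example, the average number of seconds between SQLi attacks (once at least one attack has been sent):

/app # curl -s http://localhost:5001 | jq '.runtime / .stats.SQLi'

With the numbers above that works out to roughly one SQLi every 115 seconds, which is in the neighborhood you would expect from the configured ATTACK_PROBABILITY of 0.01 at one request per second.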

              From the microsimserver shell, curl the server stats to see if any SQLi attacks have been detected:

              /app # curl http://localhost:5000
              {
                "time": "Wed Dec 11 20:23:52 2019",
                "runtime": 1177,
                "hostname": "db-578757bf68-dzjdq",
                "ip": "10.56.2.11",
                "stats": {
                  "Requests": 610,
                  "Sent Bytes": 10110236,
                  "Received Bytes": 629888,
                  "Attacks": 2,
                  "SQLi": 2,
                  "XSS": 0,
                  "Directory Traversal": 0
                },
                "config": {
                  "LISTEN_PORT": 8080,
                  "STATS_PORT": 5000,
                  "STATSD_HOST": null,
                  "STATSD_PORT": 8125,
                  "RESPOND_BYTES": 16384,
                  "STOP_SECONDS": 0,
                  "STOP_PADDING": false,
                  "TOTAL_STOP_SECONDS": 0
                }
              }

Here we see that this particular server has detected two SQLi attacks coming from the clients within the cluster (East/West traffic). Remember, there are also five other db and auth Pods receiving attacks, so you will see the attack load shared amongst them.
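
To see that distribution, a small Bash loop can collect the counter from every db Pod. A sketch, assuming jq is available in your shell (Cloud Shell includes it):

$ for pod in $(kubectl get pods -l app=db -o jsonpath='{.items[*].metadata.name}'); do
    kubectl exec "$pod" -c microsimserver -- curl -s http://localhost:5000 | \
      jq -r '"\(.hostname)  SQLi: \(.stats.SQLi)"'
  done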

              Let’s also demonstrate that the db server can directly communicate with the auth service:

              /app # curl http://auth:8080
              firOXAY4hktZLjHvbs41JhReCWHqs... <snip>
              Wed Dec 11 20:26:38 2019   hostname: auth-5f964774bd-mvtcl   ip: 10.56.1.4   remote: 10.56.2.11   hostheader: auth:8080   path: /

              Since we get a response it is clear that there is no micro-segmentation in place between the db and auth Services and Pods.

              Microservice logging

As with most services in Kubernetes, both microsimclient and microsimserver log each request and response to stdout, which means the logs can be viewed with the kubectl logs command. Every 30 seconds a JSON summary is also logged:

              microsimclient logs

              $ kubectl logs www-5d89bcb54f-bcjm9 microsimclient
              2019-12-11T20:04:19   Request to http://auth.default.svc.cluster.local:8080/   Request size: 1036   Response size: 16577
              2019-12-11T20:04:20   Request to http://db.default.svc.cluster.local:8080/   Request size: 1036   Response size: 16573
              2019-12-11T20:04:21   Request to http://auth.default.svc.cluster.local:8080/   Request size: 1036   Response size: 16577
              2019-12-11T20:04:22   Request to http://auth.default.svc.cluster.local:8080/   Request size: 1036   Response size: 16577
              2019-12-11T20:04:23   Request to http://auth.default.svc.cluster.local:8080/   Request size: 1036   Response size: 16577
              2019-12-11T20:04:23   SQLi sent: http://auth.default.svc.cluster.local:8080/?username=joe%40example.com&password=%3BUNION+SELECT+1%2C+version%28%29+limit+1%2C1--
              2019-12-11T20:04:24   Request to http://db.default.svc.cluster.local:8080/   Request size: 1036   Response size: 16574
              2019-12-11T20:04:25   Request to http://auth.default.svc.cluster.local:8080/   Request size: 1036   Response size: 16577
              2019-12-11T20:04:26   Request to http://db.default.svc.cluster.local:8080/   Request size: 1036   Response size: 16573
              2019-12-11T20:04:27   Request to http://db.default.svc.cluster.local:8080/   Request size: 1036   Response size: 16573
              2019-12-11T20:04:28   Request to http://auth.default.svc.cluster.local:8080/   Request size: 1036   Response size: 16577
              2019-12-11T20:04:29   Request to http://auth.default.svc.cluster.local:8080/   Request size: 1036   Response size: 16577
              2019-12-11T20:04:30   Request to http://auth.default.svc.cluster.local:8080/   Request size: 1036   Response size: 16577
              2019-12-11T20:04:31   Request to http://auth.default.svc.cluster.local:8080/   Request size: 1036   Response size: 16577
              2019-12-11T20:04:32   Request to http://auth.default.svc.cluster.local:8080/   Request size: 1036   Response size: 16577
              2019-12-11T20:04:33   Request to http://db.default.svc.cluster.local:8080/   Request size: 1036   Response size: 16573
              2019-12-11T20:04:34   Request to http://db.default.svc.cluster.local:8080/   Request size: 1036   Response size: 16573
              2019-12-11T20:04:35   Request to http://auth.default.svc.cluster.local:8080/   Request size: 1036   Response size: 16577
              2019-12-11T20:04:36   Request to http://auth.default.svc.cluster.local:8080/   Request size: 1036   Response size: 16577
              2019-12-11T20:04:37   Request to http://auth.default.svc.cluster.local:8080/   Request size: 1036   Response size: 16577
              2019-12-11T20:04:38   Request to http://auth.default.svc.cluster.local:8080/   Request size: 1036   Response size: 16577
              2019-12-11T20:04:39   Request to http://db.default.svc.cluster.local:8080/   Request size: 1036   Response size: 16573
              2019-12-11T20:04:40   Request to http://auth.default.svc.cluster.local:8080/   Request size: 1036   Response size: 16577
              2019-12-11T20:04:41   Request to http://db.default.svc.cluster.local:8080/   Request size: 1036   Response size: 16573
              2019-12-11T20:04:42   Request to http://db.default.svc.cluster.local:8080/   Request size: 1036   Response size: 16573
              2019-12-11T20:04:43   Request to http://auth.default.svc.cluster.local:8080/   Request size: 1036   Response size: 16577
              2019-12-11T20:04:44   Request to http://auth.default.svc.cluster.local:8080/   Request size: 1036   Response size: 16577
              2019-12-11T20:04:45   Request to http://auth.default.svc.cluster.local:8080/   Request size: 1036   Response size: 16577
              2019-12-11T20:04:46   Request to http://db.default.svc.cluster.local:8080/   Request size: 1036   Response size: 16573
              2019-12-11T20:04:47   Request to http://db.default.svc.cluster.local:8080/   Request size: 1036   Response size: 16573
              2019-12-11T20:04:48   Request to http://auth.default.svc.cluster.local:8080/   Request size: 1036   Response size: 16577
              {"Total": {"Requests": 30, "Sent Bytes": 31080, "Received Bytes": 497267, "Internet Requests": 0, "Attacks": 1, "SQLi": 1, "XSS": 0, "Directory Traversal": 0, "DGA": 0, "Malware": 0, "Error": 0}, "Last 30 Seconds": {"Requests": 30, "Sent Bytes": 31080, "Received Bytes": 497267, "Internet Requests": 0, "Attacks": 1, "SQLi": 1, "XSS": 0, "Directory Traversal": 0, "DGA": 0, "Malware": 0, "Error": 0}}
              2019-12-11T20:04:49   Request to http://db.default.svc.cluster.local:8080/   Request size: 1036   Response size: 16573
              ...

              microsimserver logs

              $ kubectl logs db-578757bf68-dzjdq microsimserver
              10.56.1.5 - - [11/Dec/2019 20:04:22] "POST / HTTP/1.1" 200 -
              10.56.0.4 - - [11/Dec/2019 20:04:22] "POST / HTTP/1.1" 200 -
              10.56.1.3 - - [11/Dec/2019 20:04:24] "POST / HTTP/1.1" 200 -
              10.56.1.5 - - [11/Dec/2019 20:04:25] "POST / HTTP/1.1" 200 -
              10.56.0.4 - - [11/Dec/2019 20:04:26] "POST / HTTP/1.1" 200 -
              10.56.1.5 - - [11/Dec/2019 20:04:27] "POST / HTTP/1.1" 200 -
              10.56.0.4 - - [11/Dec/2019 20:04:33] "POST / HTTP/1.1" 200 -
              10.56.0.4 - - [11/Dec/2019 20:04:35] "POST / HTTP/1.1" 200 -
              10.56.0.4 - - [11/Dec/2019 20:04:41] "POST / HTTP/1.1" 200 -
              10.56.0.4 - - [11/Dec/2019 20:04:43] "POST / HTTP/1.1" 200 -
              {"Total": {"Requests": 10, "Sent Bytes": 165740, "Received Bytes": 10360, "Attacks": 0, "SQLi": 0, "XSS": 0, "Directory Traversal": 0}, "Last 30 Seconds": {"Requests": 10, "Sent Bytes": 165740, "Received Bytes": 10360, "Attacks": 0, "SQLi": 0, "XSS": 0, "Directory Traversal": 0}}
              10.56.1.5 - - [11/Dec/2019 20:04:47] "POST / HTTP/1.1" 200 -
              ...

              You can see how the traffic is automatically being load balanced by the Kubernetes cluster by inspecting the request sources in the microsimserver logs.
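
For example, counting the request sources (and skipping the JSON summary lines) shows how the requests are spread across the client Pods:

$ kubectl logs db-578757bf68-dzjdq microsimserver | grep -v '^{' | cut -d' ' -f1 | sort | uniq -c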

              Adding Micro-segmentation and Application Layer Protection

Stay tuned for the next post, where we will take this simple, insecure deployment and implement a Security Services Layer Pattern. Then we’ll show how the internal application layer attacks are blocked with this approach. Finally, we will demonstrate micro-segmentation, which restricts access between microservices (for example, traffic between the auth and db services).

              Note: Depending on your Google Cloud account status you may incur charges for the cluster, so remember to delete it from the GKE console when you are done. You may also need to delete any load balancer objects that were created by the deployment within GCP to avoid residual charges to your account.

              Next in the series: Part 3