Practical JSON at the Command Line

Prefer Python syntax over jq? See the newer version of this article, which uses jello instead.

I’m a big fan of using JSON at the command line instead of filtering and piping unstructured text between processes. My article, Bringing the Unix Philosophy to the 21st Century, explains many of the benefits of using JSON instead of plain text. I also created jc, which converts the output of dozens of commands and file types to JSON and opens up many new possibilities for automation at the command line.

There are many blog posts on how to use tools like jq to filter JSON at the command line. But I would like to write about how you can actually use that JSON to make your life easier in Bash.

How do you get that beautifully filtered JSON data into a usable form, such as a list or array, in Bash? What are some best practices when working with JSON data in Bash? Let’s start simple and work our way up.

In this article we will be processing the output of rpm -qia so we can get a nice list of RPM package metadata objects to play around with. We’ll use jc to convert the rpm command output to JSON so we can process it in jq and then use it in our scripts.

We’ll look at three scenarios:

  • Assigning a Bash variable from a single JSON attribute
  • Assigning a simple list Bash variable from a JSON array
  • Assigning a Bash array from a JSON array of objects

Assigning a Variable from a Single Attribute

The simplest scenario is to pull a single value from the JSON data we are interested in. If we run rpm -qia | jc --rpm-qi we will get a JSON array of rpm metadata objects to work with. I’ll use the -p option in jc to pretty-print the JSON:

$ rpm -qia | jc --rpm-qi -p
[
  {
    "name": "make",
    "epoch": 1,
    "version": "3.82",
    "release": "24.el7",
    "architecture": "x86_64",
    "install_date": "Wed 16 Oct 2019 09:21:42 AM PDT",
    "group": "Development/Tools",
    "size": 1160660,
    "license": "GPLv2+",
    "signature": "RSA/SHA256, Thu 22 Aug 2019 02:34:59 PM PDT, Key ID 24c6a8a7f4a80eb5",
    "source_rpm": "make-3.82-24.el7.src.rpm",
    "build_date": "Thu 08 Aug 2019 05:47:25 PM PDT",
    "build_host": "x86-01.bsys.centos.org",
    "relocations": "(not relocatable)",
    "packager": "CentOS BuildSystem <http://bugs.centos.org>",
    "vendor": "CentOS",
    "url": "http://www.gnu.org/software/make/",
    "summary": "A GNU tool which simplifies the build process for users",
    "description": "A GNU tool for controlling the generation of executables and other non-source files of a program from the program's source files. Make allows users to build and install packages without any significant knowledge about the details of the build process. The details about how the program should be built are provided for make in the program's makefile.",
    "build_epoch": 1565311645,
    "build_epoch_utc": null
  },
  {
    "name": "kbd-legacy",
    "version": "1.15.5",
    "release": "15.el7",
    "architecture": "noarch",
    "install_date": "Thu 15 Aug 2019 10:53:08 AM PDT",
    "group": "System Environment/Base",
    "size": 503608,
    "license": "GPLv2+",
    "signature": "RSA/SHA256, Mon 12 Nov 2018 07:17:49 AM PST, Key ID 24c6a8a7f4a80eb5",
    "source_rpm": "kbd-1.15.5-15.el7.src.rpm",
    "build_date": "Tue 30 Oct 2018 03:40:00 PM PDT",
    "build_host": "x86-01.bsys.centos.org",
    "relocations": "(not relocatable)",
    "packager": "CentOS BuildSystem <http://bugs.centos.org>",
    "vendor": "CentOS",
    "url": "http://ftp.altlinux.org/pub/people/legion/kbd",
    "summary": "Legacy data for kbd package",
    "description": "The kbd-legacy package contains original keymaps for kbd package. Please note that kbd-legacy is not helpful without kbd.",
    "build_epoch": 1540939200,
    "build_epoch_utc": null
  },
  ...
]

Ok, that is a long JSON array of objects. Let’s narrow it down to only packages that use the MIT license with jq:

$ rpm -qia | jc --rpm-qi | jq '.[] | select(.license == "MIT")'
{
  "name": "ncurses-base",
  "version": "5.9",
  "release": "14.20130511.el7_4",
  "architecture": "noarch",
  "install_date": "Thu 15 Aug 2019 10:53:08 AM PDT",
  "group": "System Environment/Base",
  "size": 223432,
  "license": "MIT",
  "signature": "RSA/SHA256, Thu 07 Sep 2017 05:43:15 AM PDT, Key ID 24c6a8a7f4a80eb5",
  "source_rpm": "ncurses-5.9-14.20130511.el7_4.src.rpm",
  "build_date": "Wed 06 Sep 2017 03:08:29 PM PDT",
  "build_host": "c1bm.rdu2.centos.org",
  "relocations": "(not relocatable)",
  "packager": "CentOS BuildSystem <http://bugs.centos.org>",
  "vendor": "CentOS",
  "url": "http://invisible-island.net/ncurses/ncurses.html",
  "summary": "Descriptions of common terminals",
  "description": "This package contains descriptions of common terminals. Other terminal descriptions are included in the ncurses-term package.",
  "build_epoch": 1504735709,
  "build_epoch_utc": null
}
{
  "name": "ncurses-libs",
  "version": "5.9",
  "release": "14.20130511.el7_4",
  "architecture": "x86_64",
  "install_date": "Thu 15 Aug 2019 10:53:16 AM PDT",
  "group": "System Environment/Libraries",
  "size": 1028216,
  "license": "MIT",
  "signature": "RSA/SHA256, Thu 07 Sep 2017 05:43:31 AM PDT, Key ID 24c6a8a7f4a80eb5",
  "source_rpm": "ncurses-5.9-14.20130511.el7_4.src.rpm",
  "build_date": "Wed 06 Sep 2017 03:08:29 PM PDT",
  "build_host": "c1bm.rdu2.centos.org",
  "relocations": "(not relocatable)",
  "packager": "CentOS BuildSystem <http://bugs.centos.org>",
  "vendor": "CentOS",
  "url": "http://invisible-island.net/ncurses/ncurses.html",
  "summary": "Ncurses libraries",
  "description": "The curses library routines are a terminal-independent method of updating character screens with reasonable optimization.  The ncurses (new curses) library is a freely distributable replacement for the discontinued 4.4 BSD classic curses library. This package contains the ncurses libraries.",
  "build_epoch": 1504735709,
  "build_epoch_utc": null
}
...

Now the list is much smaller. Also, notice that jq unpacked the JSON objects from the array for us. (There is no longer a set of square brackets around the output.) In this form, the data is not exactly usable in a Bash script. In fact, it is no longer even a single valid JSON document, but a stream of separate JSON objects. We’ll need to get this data into a format that Bash can use.
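To make the difference between a stream of objects and a single array concrete, here is a minimal sketch. It uses a small inline sample instead of real rpm output; the sample variable and its three made-up package records are assumptions for illustration only. Using map(select(...)) instead of .[] | select(...) keeps the result as one valid JSON array:

```shell
#!/bin/bash

# Hypothetical sample standing in for `rpm -qia | jc --rpm-qi` output.
sample='[{"name":"make","license":"GPLv2+"},{"name":"ncurses-base","license":"MIT"},{"name":"ncurses-libs","license":"MIT"}]'

# .[] unpacks the array, so jq emits a stream of separate objects
# (one per line because of -c).
stream=$(jq -c '.[] | select(.license == "MIT")' <<< "$sample")

# map(select(...)) filters the array in place, so the result is
# still a single JSON array.
array=$(jq -c 'map(select(.license == "MIT"))' <<< "$sample")

echo "Stream of objects:"
echo "$stream"
echo "Single array:"
echo "$array"
```

Which form is more convenient depends on what comes next: the stream is handy for reading line by line in Bash, while the array is easier to pass to another JSON tool.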

In this first, simple example, we just want a single attribute from a single object. Let’s sort the packages by build_epoch, keep only the MIT-licensed ones, and select the name field of the newest one:

$ rpm -qia | jc --rpm-qi | jq 'sort_by(.build_epoch)[] | select(.license == "MIT")' | jq -sr '.[-1].name'
jc

The particulars of the jq query itself are outside the scope of this article. For more information on how to properly structure a jq query, see here, here, and here.

Not a fan of jq syntax? Already know how to work with JSON in Python? Try out jello, which works just like jq, but uses Python syntax!

Well, isn’t that convenient? jc is the most recently built MIT-licensed package on the system. In the second jq command, the -s (slurp) option collects the stream of objects back into an array so that .[-1] can pick the last one, and the -r option strips the quotation marks from the string result. Since that jq query spit out a single word, it’s pretty straightforward to assign it to a Bash variable:

$ package_name=$(rpm -qia | jc --rpm-qi | jq 'sort_by(.build_epoch)[] | select(.license == "MIT")' | jq -sr '.[-1].name')
$ echo "$package_name"
jc
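The two jq invocations used above can also be collapsed into one. The following is a sketch under stated assumptions rather than a drop-in replacement: it runs against a small inline sample (the sample variable and its build_epoch values are made up) so that it can be tried without rpm or jc installed.

```shell
#!/bin/bash

# Hypothetical sample data; the build_epoch values are made up for illustration.
sample='[
  {"name":"ncurses-base","license":"MIT","build_epoch":1504735709},
  {"name":"jc","license":"MIT","build_epoch":1600000000},
  {"name":"make","license":"GPLv2+","build_epoch":1565311645}
]'

# Keep the MIT packages, sort them by build time, and take the name
# of the last (newest) one, all within a single jq process.
package_name=$(jq -r 'map(select(.license == "MIT")) | sort_by(.build_epoch) | .[-1].name' <<< "$sample")

echo "$package_name"
```

Because the filter never unpacks the array with .[], there is no need for a second jq -s step to put it back together. Note that if no package matches, jq prints null, so a script may want to check for that value before using the variable.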

This is a good start if we just need a single attribute, but many times in our scripts we have multiple items we need to deal with. Assigning JSON attributes to Bash variables one at a time can get tedious and slow if we need to iterate over a large dataset.

Now, let’s look at assigning more than one item to a Bash variable to use it as a list in a for loop.

Assigning a List from a JSON Array

In our next example, we’ll get a list of MIT licensed packages from our rpm -qia query and do something with the output. In this case, we’ll just create a text file for each package, using the name attribute as the filename and writing a line of text that includes the package name. First, let’s see the output of the jq filter:

$ rpm -qia | jc --rpm-qi | jq -r '.[] | select(.license == "MIT") | .name'
curl
dbus-python
expat
jansson
...

And now, let’s use that filter in a script by assigning its output to a Bash variable that will act as a word list:

#!/bin/bash

packages=$(rpm -qia | jc --rpm-qi | jq -r '.[] | select(.license == "MIT") | .name')

for package in $packages; do
    echo "Package name is ${package}" > "${package}".txt
done

After running this script, we get a list of files named after the package names. Inside of the files is a bit of text:

$ ls
create_files.sh  jc.txt                libcom_err.txt   libpciaccess.txt    libyaml.txt       popt.txt
curl.txt         json-c.txt            libcurl.txt      libss.txt           lua.txt           python-iniparse.txt
dbus-python.txt  krb5-devel.txt        libdrm.txt       libverto-devel.txt  ncurses-base.txt  python-pytoml.txt
expat.txt        krb5-libs.txt         libfastjson.txt  libverto.txt        ncurses-libs.txt  PyYAML.txt
jansson.txt      libcom_err-devel.txt  libkadm5.txt     libxml2.txt         ncurses.txt       rubygem-psych.txt
$ cat jc.txt 
Package name is jc

That was easy enough, but remember this only works when each item is a single word and you just want to iterate over the same JSON attribute over and over again in a Bash for loop.
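The single-word restriction comes from Bash word splitting. As a rough illustration, the sketch below uses printf as a stand-in for jq -r output (the three values are made up, and one of them contains a space) and compares an unquoted for loop with a line-by-line while loop:

```shell
#!/bin/bash

# Stand-in for `jq -r` output: one value per line, the second containing a space.
# These values are made up for illustration.
values=$(printf '%s\n' 'curl' 'System Environment/Base' 'expat')

# An unquoted expansion is split on every space, tab, and newline.
split_count=0
for item in $values; do
    split_count=$((split_count + 1))
done

# Reading line by line keeps each value intact, spaces included.
line_count=0
while IFS= read -r item; do
    line_count=$((line_count + 1))
done <<< "$values"

echo "for loop saw ${split_count} items; while loop saw ${line_count} items"
```

The for loop counts four items because the value with a space is split in two, while the while loop counts the three lines that were actually emitted.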

What if I want to include other metadata, like the description, in the text file? One way would be to create another list Bash variable from another jq query and then iterate over the list again. Or, inside the for loop, we could do another rpm -qi query and grab the attribute we want just-in-time:

#!/bin/bash

packages=$(rpm -qia | jc --rpm-qi | jq -r '.[] | select(.license == "MIT") | .name')

for package in $packages; do
    description=$(rpm -qi "${package}" | jc --rpm-qi | jq -r '.[0].description')
    echo "Package name is ${package}" > "${package}".txt
    echo "The description is:  ${description}" >> "${package}".txt
done

This works:

$ ./create_files.sh 
$ ls
create_files.sh  jc.txt                libcom_err.txt   libpciaccess.txt    libyaml.txt       popt.txt
curl.txt         json-c.txt            libcurl.txt      libss.txt           lua.txt           python-iniparse.txt
dbus-python.txt  krb5-devel.txt        libdrm.txt       libverto-devel.txt  ncurses-base.txt  python-pytoml.txt
expat.txt        krb5-libs.txt         libfastjson.txt  libverto.txt        ncurses-libs.txt  PyYAML.txt
jansson.txt      libcom_err-devel.txt  libkadm5.txt     libxml2.txt         ncurses.txt       rubygem-psych.txt
$ cat jc.txt 
Package name is jc
The description is:  This tool serializes the output of popular gnu linux command line tools and file types to structured JSON output

But it is a little inefficient since we need to run the rpm -qi [package] query many times during the script. A better method would be to do the rpm -qia query one time, which will give us all of the package data at once and then just select the attributes we want in our script. We’ll do that next!

Assigning a Bash Array from a JSON Array of Objects

In other programming languages, like Python, it is pretty straightforward to load a JSON string of any depth and complexity and use it as a dictionary or list. Unfortunately, Bash does not have the same native capability, but we can do some useful things by assigning JSON objects to a Bash array.

At first glance, this seems like it should be pretty easy with a single variable assignment statement, but in fact, we’ll need to use a while loop and read lines from jq so Bash can ingest the JSON Lines data into the Bash array. This way we can iterate through the data much as we would in Python.

In this example, we’ll take the filtered JSON output of the rpm -qia command, iterate over all of the objects (each object is a package), and pull the attributes we want to use in a for loop. This should be a more efficient version of the last script we created, since we are only running the rpm -qia command once. First, let’s just iterate and print the raw Bash array elements so we can see what they look like:

#!/bin/bash

# pull the rpm package objects into a bash array from jq
packages=()
while read -r value; do
    packages+=("$value")
done < <(rpm -qia | jc --rpm-qi | jq -c '.[] | select(.license == "MIT")')

# iterate over the bash array
for package in "${packages[@]}"; do
    echo "${package}"
    echo
done

There are a few interesting things going on in this script:

  • A Bash array variable named packages is created with packages=()
  • A while loop reads in all of the JSON objects created by jq into the packages Bash array.
    • Note: mapfile -t packages < <( ... ) can be substituted for the while loop when using Bash 4.0 and higher.
  • The jq command uses the -c option which prints each JSON object on a single line. This is the magic that allows the object to be read in as a Bash array element.
  • Then we use a standard for loop to iterate over each package object, which contains all of the attributes we will later extract into variables.
  • Finally, we do something with each element. For now, we simply print it, followed by a blank line.
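As a side note on the mapfile alternative mentioned in the list, here is a minimal sketch of what that substitution might look like. It assumes Bash 4.0 or higher and uses a small made-up inline sample in place of the rpm and jc pipeline:

```shell
#!/bin/bash

# Hypothetical sample standing in for `rpm -qia | jc --rpm-qi` output.
sample='[{"name":"ncurses-base","license":"MIT"},{"name":"make","license":"GPLv2+"},{"name":"ncurses-libs","license":"MIT"}]'

# mapfile (Bash 4.0+) reads each line of input into one array element;
# -t strips the trailing newline from each line.
mapfile -t packages < <(jq -c '.[] | select(.license == "MIT")' <<< "$sample")

echo "Loaded ${#packages[@]} package objects"
echo "First element: ${packages[0]}"
```

The result is the same kind of array the while loop builds, so the for loop that follows it does not need to change.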

When we run the while loop script, saved here as print_array.sh, we see the following output:

$ ./print_array.sh 
{"name":"ncurses-base","version":"5.9","release":"14.20130511.el7_4","architecture":"noarch","install_date":"Thu 15 Aug 2019 10:53:08 AM PDT","group":"System Environment/Base","size":223432,"license":"MIT","signature":"RSA/SHA256, Thu 07 Sep 2017 05:43:15 AM PDT, Key ID 24c6a8a7f4a80eb5","source_rpm":"ncurses-5.9-14.20130511.el7_4.src.rpm","build_date":"Wed 06 Sep 2017 03:08:29 PM PDT","build_host":"c1bm.rdu2.centos.org","relocations":"(not relocatable)","packager":"CentOS BuildSystem <http://bugs.centos.org>","vendor":"CentOS","url":"http://invisible-island.net/ncurses/ncurses.html","summary":"Descriptions of common terminals","description":"This package contains descriptions of common terminals. Other terminal descriptions are included in the ncurses-term package.","build_epoch":1504735709,"build_epoch_utc":null}

{"name":"ncurses-libs","version":"5.9","release":"14.20130511.el7_4","architecture":"x86_64","install_date":"Thu 15 Aug 2019 10:53:16 AM PDT","group":"System Environment/Libraries","size":1028216,"license":"MIT","signature":"RSA/SHA256, Thu 07 Sep 2017 05:43:31 AM PDT, Key ID 24c6a8a7f4a80eb5","source_rpm":"ncurses-5.9-14.20130511.el7_4.src.rpm","build_date":"Wed 06 Sep 2017 03:08:29 PM PDT","build_host":"c1bm.rdu2.centos.org","relocations":"(not relocatable)","packager":"CentOS BuildSystem <http://bugs.centos.org>","vendor":"CentOS","url":"http://invisible-island.net/ncurses/ncurses.html","summary":"Ncurses libraries","description":"The curses library routines are a terminal-independent method of updating character screens with reasonable optimization.  The ncurses (new curses) library is a freely distributable replacement for the discontinued 4.4 BSD classic curses library. This package contains the ncurses libraries.","build_epoch":1504735709,"build_epoch_utc":null}
...

Very cool! Now we can use jq to pull any attribute we want into a variable within the for loop:

#!/bin/bash

# pull the rpm package objects into a bash array from jq
packages=()
while read -r value; do
    packages+=("$value")
done < <(rpm -qia | jc --rpm-qi | jq -c '.[] | select(.license == "MIT")')

# iterate over the bash array
for package in "${packages[@]}"; do
    name=$(jq -r '.name' <<< "${package}")
    description=$(jq -r '.description' <<< "${package}")
    version=$(jq -r '.version' <<< "${package}")
    
    echo "Package name is ${name}" > "${name}".txt
    echo "The description is:  ${description}" >> "${name}".txt
    echo "The version is:  ${version}" >> "${name}".txt
done

And here’s what it does:

$ ./create_files.sh 
$ ls
create_files.sh  jc.txt                libcom_err.txt   libpciaccess.txt    libyaml.txt       popt.txt
curl.txt         json-c.txt            libcurl.txt      libss.txt           lua.txt           python-iniparse.txt
dbus-python.txt  krb5-devel.txt        libdrm.txt       libverto-devel.txt  ncurses-base.txt  python-pytoml.txt
expat.txt        krb5-libs.txt         libfastjson.txt  libverto.txt        ncurses-libs.txt  PyYAML.txt
jansson.txt      libcom_err-devel.txt  libkadm5.txt     libxml2.txt         ncurses.txt       rubygem-psych.txt
$ cat jc.txt 
Package name is jc
The description is:  This tool serializes the output of popular gnu linux command line tools and file types to structured JSON output
The version is:  1.15.0

As you can see, this is more efficient and allows you to pull in any attribute you would like from each Bash array element. Each element is acting like a JSON object that jq can query.
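The loop above still starts three jq processes per package. If that becomes a bottleneck, one possible variation is to have a single jq process emit one tab-separated row per package and let read split the fields. This is only a sketch: the sample data is made up, and it assumes that no field is empty, because Bash treats consecutive tabs as a single delimiter and the remaining fields would shift.

```shell
#!/bin/bash

# Hypothetical sample data; names, versions, and descriptions are made up.
sample='[{"name":"jc","version":"1.15.0","description":"Serializes command output to JSON"},{"name":"popt","version":"1.13","description":"C library for parsing command line parameters"}]'

report=""
# One jq process emits a tab-separated row per package;
# read splits each row on tabs into the three variables.
while IFS=$'\t' read -r name version description; do
    report+="${name} ${version}: ${description}"$'\n'
done < <(jq -r '.[] | [.name, .version, .description] | @tsv' <<< "$sample")

printf '%s' "$report"
```

The array-of-objects approach remains the more flexible one, since each element can be queried for any attribute later. The TSV variation trades that flexibility for fewer processes.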

Conclusion

We went through a few scenarios of how to assign JSON data to Bash variables and arrays with jc and jq. Using JSON instead of plain text allows you to be more expressive in your queries. Also, JSON has the advantage of allowing new fields to be added at any time without breaking your existing query.
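As a small illustration of that last point, the sketch below runs the same jq filter against two made-up versions of a record, where the second has gained an extra field. The variable names and record contents are assumptions for demonstration only:

```shell
#!/bin/bash

# Two hypothetical versions of the same record; the second gains an extra field.
old='{"name":"jc","license":"MIT"}'
new='{"name":"jc","license":"MIT","build_epoch_utc":null}'

# The filter selects by key name, so the added field does not affect it.
old_name=$(jq -r '.name' <<< "$old")
new_name=$(jq -r '.name' <<< "$new")

echo "old record: ${old_name}"
echo "new record: ${new_name}"
```

Both lookups return the same value, whereas a script that picks the Nth column out of plain text can break when a column is added.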

We can assign a single string to a Bash variable, assign a list of words to a variable and loop over it, or assign entire JSON objects to Bash array elements, which jq can query further within a loop. These are powerful ways JSON data can help you write better scripts.

Published by kellyjonbrazil

I'm a cybersecurity and cloud computing nerd.
