Prefer Python syntax over jq? Please see a new version of this article that uses jello instead.
I’m a big fan of using JSON at the command line instead of filtering and piping unstructured text between processes. My article on Bringing the Unix Philosophy to the 21st Century explains many of the benefits of using JSON instead of plain text. I also created jc, which converts the output of dozens of commands and file types to JSON and opens up many new possibilities for automation at the command line.
There are many blog posts on how to use tools like jq to filter JSON at the command line. But I would like to write about how you can actually use that JSON to make your life easier in Bash.
How do you get that beautifully filtered JSON data into a usable form, such as a list or array, in Bash? What are some best practices when working with JSON data in Bash? Let’s start simple and work our way up.
In this article we will be processing the output of rpm -qia so we can get a nice list of RPM package metadata objects to play around with. We’ll use jc to convert the rpm command output to JSON so we can process it with jq and then use it in our script.
We’ll look at three scenarios:
- Assigning a Bash variable from a single JSON attribute
- Assigning a simple list Bash variable from a JSON array
- Assigning a Bash array from a JSON array of objects
Assigning a Variable from a Single Attribute
The simplest scenario is to pull a single value from the JSON data we are interested in. If we run rpm -qia | jc --rpm-qi we will get a JSON array of rpm metadata objects to work with. I’ll use the -p option in jc to pretty-print the JSON:
    $ rpm -qia | jc --rpm-qi -p
    [
      {
        "name": "make",
        "epoch": 1,
        "version": "3.82",
        "release": "24.el7",
        "architecture": "x86_64",
        "install_date": "Wed 16 Oct 2019 09:21:42 AM PDT",
        "group": "Development/Tools",
        "size": 1160660,
        "license": "GPLv2+",
        "signature": "RSA/SHA256, Thu 22 Aug 2019 02:34:59 PM PDT, Key ID 24c6a8a7f4a80eb5",
        "source_rpm": "make-3.82-24.el7.src.rpm",
        "build_date": "Thu 08 Aug 2019 05:47:25 PM PDT",
        "build_host": "x86-01.bsys.centos.org",
        "relocations": "(not relocatable)",
        "packager": "CentOS BuildSystem <http://bugs.centos.org>",
        "vendor": "CentOS",
        "url": "http://www.gnu.org/software/make/",
        "summary": "A GNU tool which simplifies the build process for users",
        "description": "A GNU tool for controlling the generation of executables and other non-source files of a program from the program's source files. Make allows users to build and install packages without any significant knowledge about the details of the build process. The details about how the program should be built are provided for make in the program's makefile.",
        "build_epoch": 1565311645,
        "build_epoch_utc": null
      },
      {
        "name": "kbd-legacy",
        "version": "1.15.5",
        "release": "15.el7",
        "architecture": "noarch",
        "install_date": "Thu 15 Aug 2019 10:53:08 AM PDT",
        "group": "System Environment/Base",
        "size": 503608,
        "license": "GPLv2+",
        "signature": "RSA/SHA256, Mon 12 Nov 2018 07:17:49 AM PST, Key ID 24c6a8a7f4a80eb5",
        "source_rpm": "kbd-1.15.5-15.el7.src.rpm",
        "build_date": "Tue 30 Oct 2018 03:40:00 PM PDT",
        "build_host": "x86-01.bsys.centos.org",
        "relocations": "(not relocatable)",
        "packager": "CentOS BuildSystem <http://bugs.centos.org>",
        "vendor": "CentOS",
        "url": "http://ftp.altlinux.org/pub/people/legion/kbd",
        "summary": "Legacy data for kbd package",
        "description": "The kbd-legacy package contains original keymaps for kbd package. Please note that kbd-legacy is not helpful without kbd.",
        "build_epoch": 1540939200,
        "build_epoch_utc": null
      },
      ...
    ]
Ok, that is a long JSON array of objects. Let’s narrow it down to only packages that use the MIT license with jq:
    $ rpm -qia | jc --rpm-qi | jq '.[] | select(.license == "MIT")'
    {
      "name": "ncurses-base",
      "version": "5.9",
      "release": "14.20130511.el7_4",
      "architecture": "noarch",
      "install_date": "Thu 15 Aug 2019 10:53:08 AM PDT",
      "group": "System Environment/Base",
      "size": 223432,
      "license": "MIT",
      "signature": "RSA/SHA256, Thu 07 Sep 2017 05:43:15 AM PDT, Key ID 24c6a8a7f4a80eb5",
      "source_rpm": "ncurses-5.9-14.20130511.el7_4.src.rpm",
      "build_date": "Wed 06 Sep 2017 03:08:29 PM PDT",
      "build_host": "c1bm.rdu2.centos.org",
      "relocations": "(not relocatable)",
      "packager": "CentOS BuildSystem <http://bugs.centos.org>",
      "vendor": "CentOS",
      "url": "http://invisible-island.net/ncurses/ncurses.html",
      "summary": "Descriptions of common terminals",
      "description": "This package contains descriptions of common terminals. Other terminal descriptions are included in the ncurses-term package.",
      "build_epoch": 1504735709,
      "build_epoch_utc": null
    }
    {
      "name": "ncurses-libs",
      "version": "5.9",
      "release": "14.20130511.el7_4",
      "architecture": "x86_64",
      "install_date": "Thu 15 Aug 2019 10:53:16 AM PDT",
      "group": "System Environment/Libraries",
      "size": 1028216,
      "license": "MIT",
      "signature": "RSA/SHA256, Thu 07 Sep 2017 05:43:31 AM PDT, Key ID 24c6a8a7f4a80eb5",
      "source_rpm": "ncurses-5.9-14.20130511.el7_4.src.rpm",
      "build_date": "Wed 06 Sep 2017 03:08:29 PM PDT",
      "build_host": "c1bm.rdu2.centos.org",
      "relocations": "(not relocatable)",
      "packager": "CentOS BuildSystem <http://bugs.centos.org>",
      "vendor": "CentOS",
      "url": "http://invisible-island.net/ncurses/ncurses.html",
      "summary": "Ncurses libraries",
      "description": "The curses library routines are a terminal-independent method of updating character screens with reasonable optimization. The ncurses (new curses) library is a freely distributable replacement for the discontinued 4.4 BSD classic curses library. This package contains the ncurses libraries.",
      "build_epoch": 1504735709,
      "build_epoch_utc": null
    }
    ...
Now the list is much smaller. Also, notice that jq unpacked the JSON objects from the array for us (there is no longer a set of square brackets around the output). In this form, the output is not exactly usable in a Bash script. In fact, it is no longer even a single valid JSON document, but a series of smaller JSON objects. We’ll need to get this data into a format that Bash can use.
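If a later step needs one valid JSON document rather than a stream of objects, one option is to filter inside the array instead of unpacking it. Here is a minimal sketch; the sample variable is a hypothetical, trimmed-down stand-in for the real rpm -qia | jc --rpm-qi output, and the variable names are mine:

```bash
#!/bin/bash

# Hypothetical sample data standing in for `rpm -qia | jc --rpm-qi` output
# (only three fields per package are kept for brevity).
sample='[
  {"name": "make", "license": "GPLv2+", "build_epoch": 1565311645},
  {"name": "ncurses-base", "license": "MIT", "build_epoch": 1504735709},
  {"name": "ncurses-libs", "license": "MIT", "build_epoch": 1504735709}
]'

# '.[] | select(...)' unpacks the array and emits a stream of separate objects.
stream=$(jq -c '.[] | select(.license == "MIT")' <<< "$sample")

# 'map(select(...))' filters in place, so the result is still one JSON array.
array=$(jq -c 'map(select(.license == "MIT"))' <<< "$sample")

echo "Stream of objects:"
echo "$stream"
echo "Single array:"
echo "$array"
```

Both forms have their uses: the array form can be piped into another JSON tool as-is, while the stream form (one object per line with -c) is the one that is easier for Bash to read line by line.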
In this first, simple example, we just want a single attribute from a single object. So let’s sort the packages by build_epoch, keep only the MIT-licensed ones, and select the name field of the most recently built package:
    $ rpm -qia | jc --rpm-qi | jq 'sort_by(.build_epoch)[] | select(.license == "MIT")' | jq -sr '.[-1].name'
    jc
The particulars of the jq query itself are outside the scope of this article. For more information on how to properly structure a jq query, see the jq manual.
Not a fan of jq syntax? Already know how to work with JSON in Python? Try out jello, which works just like jq, but uses Python syntax!
Well, isn’t that convenient? jc is the most recently built MIT-licensed package on the system. Notice that we use the -r option in jq to strip the quotation marks from the string result. Since that jq query spit out a single word, it’s pretty straightforward to assign it to a Bash variable:
    $ package_name=$(rpm -qia | jc --rpm-qi | jq 'sort_by(.build_epoch)[] | select(.license == "MIT")' | jq -sr '.[-1].name')
    $ echo "$package_name"
    jc
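The two jq calls above can also be collapsed into one. This is only a sketch of an alternative, not the query used in the rest of this article; it runs against a small hypothetical sample instead of real rpm data, and the build_epoch values are made up for illustration:

```bash
#!/bin/bash

# Hypothetical sample data standing in for `rpm -qia | jc --rpm-qi` output.
# The newest package overall (make) is deliberately not MIT-licensed.
sample='[
  {"name": "ncurses-base", "license": "MIT", "build_epoch": 1504735709},
  {"name": "jc", "license": "MIT", "build_epoch": 1565311645},
  {"name": "make", "license": "GPLv2+", "build_epoch": 1594417650}
]'

# Keep the MIT packages, pick the one with the largest build_epoch,
# and print its name without quotation marks (-r).
package_name=$(jq -r 'map(select(.license == "MIT")) | max_by(.build_epoch) | .name' <<< "$sample")

echo "$package_name"
```

One thing to watch for: if no package matches the filter, max_by receives an empty array and this query prints the string null, so a real script would probably want to check for that before using the variable.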
This is a good start if we just need a single attribute, but many times in our scripts we have multiple items we need to deal with. Assigning JSON attributes to Bash variables one at a time can get tedious and slow if we need to iterate over a large dataset.
Now, let’s look at assigning more than one item to a Bash variable to use it as a list in a for loop.
Assigning a List from a JSON Array
In our next example, we’ll get a list of MIT-licensed packages from our rpm -qia query and do something with the output. In this case, we’ll just create a text file for each package, using the name attribute as the filename and some text, including the package name, as the contents. First, let’s see the output of the jq filter:
    $ rpm -qia | jc --rpm-qi | jq -r '.[] | select(.license == "MIT") | .name'
    curl
    dbus-python
    expat
    jansson
    ...
And now, let’s use that filter in a script by assigning it to a Bash variable that will act as a word list:
    #!/bin/bash

    packages=$(rpm -qia | jc --rpm-qi | jq -r '.[] | select(.license == "MIT") | .name')

    for package in $packages; do
        echo "Package name is ${package}" > "${package}".txt
    done
After running this script, we get a list of files named after the package names. Inside of the files is a bit of text:
    $ ls
    create_files.sh  jc.txt                libcom_err.txt   libpciaccess.txt    libyaml.txt       popt.txt
    curl.txt         json-c.txt            libcurl.txt      libss.txt           lua.txt           python-iniparse.txt
    dbus-python.txt  krb5-devel.txt        libdrm.txt       libverto-devel.txt  ncurses-base.txt  python-pytoml.txt
    expat.txt        krb5-libs.txt         libfastjson.txt  libverto.txt        ncurses-libs.txt  PyYAML.txt
    jansson.txt      libcom_err-devel.txt  libkadm5.txt     libxml2.txt         ncurses.txt       rubygem-psych.txt
    $ cat jc.txt
    Package name is jc
That was easy enough, but remember this only works when each item is a single word and you just want to iterate over the same JSON attribute over and over again in a Bash for loop.
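To see why the single-word restriction matters, here is a small sketch. The print_names function is a hypothetical stand-in for a jq -r pipeline that prints one value per line, and "My Package" is a made-up value containing a space:

```bash
#!/bin/bash

# Hypothetical stand-in for a `jq -r` pipeline: prints one value per line,
# and one of the values contains a space.
print_names() {
    printf '%s\n' "ncurses-base" "My Package" "libyaml"
}

# An unquoted expansion in a for loop splits on spaces as well as newlines,
# so "My Package" is seen as two separate items.
names=$(print_names)
split_count=0
for name in $names; do
    split_count=$((split_count + 1))
done

# Reading line by line keeps each value intact.
line_count=0
while IFS= read -r name; do
    line_count=$((line_count + 1))
done < <(print_names)

echo "for loop saw ${split_count} items; while read saw ${line_count} items"
```

Package names don’t contain spaces, so the word-list approach is fine for this article’s data, but for attributes like summary or description a line-by-line read is the safer choice.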
What if I want to include other metadata, like the description, in the text file? One way would be to create another list Bash variable from another jq query and then iterate over the list again. Or, inside the for loop, we could do another rpm -qi query and grab the attribute we want just-in-time:
    #!/bin/bash

    packages=$(rpm -qia | jc --rpm-qi | jq -r '.[] | select(.license == "MIT") | .name')

    for package in $packages; do
        # quote the jq filter so the shell never tries to expand the brackets
        description=$(rpm -qi "${package}" | jc --rpm-qi | jq -r '.[0].description')
        echo "Package name is ${package}" > "${package}".txt
        echo "The description is: ${description}" >> "${package}".txt
    done
This works:
    $ ./create_files.sh
    $ ls
    create_files.sh  jc.txt                libcom_err.txt   libpciaccess.txt    libyaml.txt       popt.txt
    curl.txt         json-c.txt            libcurl.txt      libss.txt           lua.txt           python-iniparse.txt
    dbus-python.txt  krb5-devel.txt        libdrm.txt       libverto-devel.txt  ncurses-base.txt  python-pytoml.txt
    expat.txt        krb5-libs.txt         libfastjson.txt  libverto.txt        ncurses-libs.txt  PyYAML.txt
    jansson.txt      libcom_err-devel.txt  libkadm5.txt     libxml2.txt         ncurses.txt       rubygem-psych.txt
    $ cat jc.txt
    Package name is jc
    The description is: This tool serializes the output of popular gnu linux command line tools and file types to structured JSON output
But it is a little inefficient since we need to run the rpm -qi [package] query many times during the script. A better method would be to do the rpm -qia query one time, which will give us all of the package data at once, and then just select the attributes we want in our script. We’ll do that next!
Assigning a Bash Array from a JSON Array of Objects
In other programming languages, like Python, it is pretty straightforward to load a JSON string of any depth and complexity and use it as a dictionary or list. Unfortunately, Bash does not have the same native capability, but we can do some useful things by assigning JSON objects to a Bash array.
At first glance, this seems like it should be pretty easy with a single variable assignment statement, but in fact, we’ll need to use a while loop and read lines from jq so Bash can ingest the JSON Lines data into the Bash array. This way we can iterate through the data in much the same way we would in Python.
In this example, we’ll take the filtered JSON output of the rpm -qia command, iterate over all of the objects (each object is a package), and pull the attributes we want to use in a for loop. This should be a more efficient version of the last script we created since we are only running the rpm -qia command once. First, let’s just iterate and print the raw Bash array elements so we can see what they look like:
    #!/bin/bash

    # pull the rpm package objects into a bash array from jq
    packages=()
    while read -r value; do
        packages+=("$value")
    done < <(rpm -qia | jc --rpm-qi | jq -c '.[] | select(.license == "MIT")')

    # iterate over the bash array
    for package in "${packages[@]}"; do
        echo "${package}"
        echo
    done
There are a few interesting things going on in this script:
- A Bash array variable named packages is created with packages=()
- A while loop reads in all of the JSON objects created by jq into the packages Bash array.
  - Note: mapfile -t packages < <( ... ) can be substituted for the while loop when using Bash 4.0 and higher.
- The jq command uses the -c option which prints each JSON object on a single line. This is the magic that allows the object to be read in as a Bash array element.
- Then we use a standard for loop to iterate over each package object, which contains all of the attributes we want to extract into variables.
- Finally, we do something with those variables.
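For reference, the mapfile variant mentioned in the note could look something like this. It is a minimal sketch that assumes Bash 4.0 or higher and uses a small hypothetical sample in place of the rpm -qia | jc --rpm-qi output:

```bash
#!/bin/bash

# Hypothetical sample data standing in for `rpm -qia | jc --rpm-qi` output.
sample='[
  {"name": "make", "license": "GPLv2+"},
  {"name": "ncurses-base", "license": "MIT"},
  {"name": "ncurses-libs", "license": "MIT"}
]'

# mapfile reads one array element per input line; -t strips the trailing
# newline. Each compact object printed by `jq -c` becomes one element.
mapfile -t packages < <(jq -c '.[] | select(.license == "MIT")' <<< "$sample")

echo "Loaded ${#packages[@]} package objects"
for package in "${packages[@]}"; do
    echo "${package}"
done
```

The result is the same packages array the while loop builds, so everything that follows works with either approach. The rest of this article sticks with the while loop since it also runs on older Bash versions.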
When we run the while loop script as print_array.sh, we see the following output:
    $ ./print_array.sh
    {"name":"ncurses-base","version":"5.9","release":"14.20130511.el7_4","architecture":"noarch","install_date":"Thu 15 Aug 2019 10:53:08 AM PDT","group":"System Environment/Base","size":223432,"license":"MIT","signature":"RSA/SHA256, Thu 07 Sep 2017 05:43:15 AM PDT, Key ID 24c6a8a7f4a80eb5","source_rpm":"ncurses-5.9-14.20130511.el7_4.src.rpm","build_date":"Wed 06 Sep 2017 03:08:29 PM PDT","build_host":"c1bm.rdu2.centos.org","relocations":"(not relocatable)","packager":"CentOS BuildSystem <http://bugs.centos.org>","vendor":"CentOS","url":"http://invisible-island.net/ncurses/ncurses.html","summary":"Descriptions of common terminals","description":"This package contains descriptions of common terminals. Other terminal descriptions are included in the ncurses-term package.","build_epoch":1504735709,"build_epoch_utc":null}

    {"name":"ncurses-libs","version":"5.9","release":"14.20130511.el7_4","architecture":"x86_64","install_date":"Thu 15 Aug 2019 10:53:16 AM PDT","group":"System Environment/Libraries","size":1028216,"license":"MIT","signature":"RSA/SHA256, Thu 07 Sep 2017 05:43:31 AM PDT, Key ID 24c6a8a7f4a80eb5","source_rpm":"ncurses-5.9-14.20130511.el7_4.src.rpm","build_date":"Wed 06 Sep 2017 03:08:29 PM PDT","build_host":"c1bm.rdu2.centos.org","relocations":"(not relocatable)","packager":"CentOS BuildSystem <http://bugs.centos.org>","vendor":"CentOS","url":"http://invisible-island.net/ncurses/ncurses.html","summary":"Ncurses libraries","description":"The curses library routines are a terminal-independent method of updating character screens with reasonable optimization. The ncurses (new curses) library is a freely distributable replacement for the discontinued 4.4 BSD classic curses library. This package contains the ncurses libraries.","build_epoch":1504735709,"build_epoch_utc":null}

    ...
Very cool! Now we can use jq to pull any attribute we want into a variable within the for loop:
    #!/bin/bash

    # pull the rpm package objects into a bash array from jq
    packages=()
    while read -r value; do
        packages+=("$value")
    done < <(rpm -qia | jc --rpm-qi | jq -c '.[] | select(.license == "MIT")')

    # iterate over the bash array
    for package in "${packages[@]}"; do
        name=$(jq -r '.name' <<< "${package}")
        description=$(jq -r '.description' <<< "${package}")
        version=$(jq -r '.version' <<< "${package}")

        echo "Package name is ${name}" > "${name}".txt
        echo "The description is: ${description}" >> "${name}".txt
        echo "The version is: ${version}" >> "${name}".txt
    done
And here’s what it does:
    $ ./create_files.sh
    $ ls
    create_files.sh  jc.txt                libcom_err.txt   libpciaccess.txt    libyaml.txt       popt.txt
    curl.txt         json-c.txt            libcurl.txt      libss.txt           lua.txt           python-iniparse.txt
    dbus-python.txt  krb5-devel.txt        libdrm.txt       libverto-devel.txt  ncurses-base.txt  python-pytoml.txt
    expat.txt        krb5-libs.txt         libfastjson.txt  libverto.txt        ncurses-libs.txt  PyYAML.txt
    jansson.txt      libcom_err-devel.txt  libkadm5.txt     libxml2.txt         ncurses.txt       rubygem-psych.txt
    $ cat jc.txt
    Package name is jc
    The description is: This tool serializes the output of popular gnu linux command line tools and file types to structured JSON output
    The version is: 1.15.0
As you can see, this is more efficient and allows you to pull in any attribute you would like from each Bash array element. Each element holds one JSON object as a string that jq can query.
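If calling jq once per attribute inside the loop becomes a bottleneck, a possible variation is to have a single jq call print the wanted fields as tab-separated values with @tsv and let read split them. This is a sketch of that idea rather than the approach used above; the sample data is a hypothetical, trimmed-down stand-in for the rpm output, and the report array is just there to collect results:

```bash
#!/bin/bash

# Hypothetical sample data standing in for `rpm -qia | jc --rpm-qi` output.
sample='[
  {"name": "ncurses-base", "version": "5.9", "license": "MIT", "summary": "Descriptions of common terminals"},
  {"name": "make", "version": "3.82", "license": "GPLv2+", "summary": "A GNU tool which simplifies the build process for users"},
  {"name": "ncurses-libs", "version": "5.9", "license": "MIT", "summary": "Ncurses libraries"}
]'

report=()

# One jq call prints a tab-separated line per MIT package; read splits each
# line on tabs into three variables, so spaces inside a field are preserved.
while IFS=$'\t' read -r name version summary; do
    report+=("${name} ${version}: ${summary}")
done < <(jq -r '.[] | select(.license == "MIT") | [.name, .version, .summary] | @tsv' <<< "$sample")

printf '%s\n' "${report[@]}"
```

There is a trade-off to keep in mind. Because a tab counts as whitespace for read, an empty field in the middle of a line is skipped and the remaining fields shift left, so this sketch assumes every selected field is non-empty. Also, @tsv writes tabs and newlines embedded in a value as the escape sequences \t and \n, which read -r leaves as literal text.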
Conclusion
We went through a few scenarios of how to assign JSON data to Bash variables and arrays with jc and jq. Using JSON instead of plain text allows you to be more expressive in your queries. Also, JSON has the advantage of allowing new fields to be added at any time without breaking your existing query.
JSON data can be used by assigning a single word to a Bash variable, by assigning a list of words to a variable and looping over it, or by assigning entire JSON objects to Bash array elements, which can be further queried by jq within a loop. These are powerful ways JSON data can help you write better scripts.