Built on jello:
- Jello Explorer (jellex): TUI interactive JSON filter using Python syntax
- jello web demo
I’m a big fan of using structured data at the command line. So much so that I’ve written a couple of utilities to promote JSON in the CLI:
- jc to JSONify command line output of scores of commands and file-types
- jtbl to convert JSON output into table format in the terminal
Typically I use jq to filter and process the JSON output into submission until I get what I want. But if you’re anything like me, you spend a lot of time googling how to do what you want in jq because the syntax can get a little out of hand. In fact, I keep notes with example jq queries I’ve used before in case I need those techniques again.
jq is great for simple things, but sometimes when I want to iterate through a deeply nested structure with arrays of objects I find python’s list and dictionary syntax easier to comprehend.
Hello jello
That’s why I created jello. jello works similarly to jq but uses the python interpreter, so you can iterate with loops, comprehensions, variables, expressions, etc. just like you would in a full-fledged python script.
The nice thing about jello is that it removes a lot of the boilerplate code you would need to ingest and output the JSON or JSON Lines data so you can focus on the logic.
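To make that boilerplate concrete, here is a rough sketch of what a standalone Python filter script has to handle by hand: reading the input, deciding whether it is a single JSON document or JSON Lines, running the logic, and printing the result. This is not how jello is implemented; the names `load_json_or_lines` and `run_filter` and the sample input are made up for illustration.

```python
import io
import json
import sys


def load_json_or_lines(text):
    """Parse text as one JSON document, falling back to JSON Lines."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        # Fallback: treat every non-empty line as its own JSON document.
        return [json.loads(line) for line in text.splitlines() if line.strip()]


def run_filter(logic, stdin=sys.stdin, stdout=sys.stdout, lines=False):
    """Read JSON from stdin, apply logic(data), and write the result as JSON."""
    data = load_json_or_lines(stdin.read())
    result = logic(data)
    if lines and isinstance(result, list):
        # JSON Lines output: one compact document per line.
        for item in result:
            stdout.write(json.dumps(item) + "\n")
    else:
        stdout.write(json.dumps(result, indent=2) + "\n")


# Demo with in-memory streams instead of a real pipe (hypothetical input).
demo_in = io.StringIO('{"id": 1, "tags": ["a"]}\n{"id": 2, "tags": ["b"]}\n')
demo_out = io.StringIO()
run_filter(lambda rows: [r["id"] for r in rows if "a" in r["tags"]], demo_in, demo_out)
print(demo_out.getvalue())
```

Only the lambda passed to `run_filter` is the actual query; everything else is plumbing of the kind you would otherwise rewrite for every script.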
Let’s take the following output from jc -ap:
$ jc -ap
{
  "name": "jc",
  "version": "1.9.2",
  "description": "jc cli output JSON conversion tool",
  "author": "Kelly Brazil",
  "author_email": "kellyjonbrazil@gmail.com",
  "parser_count": 50,
  "parsers": [
    {
      "name": "airport",
      "argument": "--airport",
      "version": "1.0",
      "description": "airport -I command parser",
      "author": "Kelly Brazil",
      "author_email": "kellyjonbrazil@gmail.com",
      "compatible": [
        "darwin"
      ],
      "magic_commands": [
        "airport -I"
      ]
    },
    {
      "name": "airport_s",
      "argument": "--airport-s",
      "version": "1.0",
      "description": "airport -s command parser",
      "author": "Kelly Brazil",
      "author_email": "kellyjonbrazil@gmail.com",
      "compatible": [
        "darwin"
      ],
      "magic_commands": [
        "airport -s"
      ]
    },
    ...
  ]
}
Let’s say I want a list of the parser names that are compatible with macOS. Here is a jq query that will get down to that level:
$ jc -a | jq '[.parsers[] | select(.compatible[] | contains("darwin")) | .name]'
[
  "airport",
  "airport_s",
  "arp",
  "crontab",
  "crontab_u",
  "csv",
  ...
]
This is not too terribly bad, but you need to be careful about bracket and parenthesis placements. Here’s the same query in jello:
$ jc -a | jello '[parser.name for parser in _.parsers if "darwin" in parser.compatible]'
[
  "airport",
  "airport_s",
  "arp",
  "crontab",
  "crontab_u",
  "csv",
  ...
]
As you can see, jello gives you the JSON or JSON Lines input as a dictionary or list of dictionaries assigned to ‘_’. Then you process it as you’d like using standard python syntax, with the convenience of dot notation. jello automatically takes care of slurping input and printing valid JSON or JSON Lines depending on the value of the last expression.
The example above is not quite as terse as using jq, but it’s more readable to someone who is familiar with python list comprehension. As with any programming language, there are multiple ways to skin a cat. We can also do a similar query with a for loop:
$ jc -a | jello '\
result = []
for parser in _.parsers:
    if "darwin" in parser.compatible:
        result.append(parser.name)
result'
[
  "airport",
  "airport_s",
  "arp",
  "crontab",
  "crontab_u",
  "csv",
  ...
]
Advanced JSON Processing
These are very simple examples and jq syntax might be ok here (though I prefer python syntax). But what if we try to do something more complex? Let’s take one of the advanced examples from the excellent jq tutorial by Matthew Lincoln.
Under Grouping and Counting, Matthew describes an advanced jq filter against a sample Twitter dataset that includes JSON Lines data. There he describes the following query:
“We can now create a table of users. Let’s create a table with columns for the user id, user name, followers count, and a column of their tweet ids separated by a semicolon.”
https://programminghistorian.org/en/lessons/json-and-jq
Here is the final jq query:
$ cat twitterdata.jlines | jq -s 'group_by(.user) |
    .[] |
    {
      user_id: .[0].user.id,
      user_name: .[0].user.screen_name,
      user_followers: .[0].user.followers_count,
      tweet_ids: [.[].id | tostring] | join(";")
    }'
...
{
  "user_id": 47073035,
  "user_name": "msoltanm",
  "user_followers": 63,
  "tweet_ids": "619172275741298700"
}
{
  "user_id": 2569107372,
  "user_name": "SlavinOleg",
  "user_followers": 35,
  "tweet_ids": "501064198973960200;501064202794971140;501064214467731460;501064215759568900;501064220121632800"
}
{
  "user_id": 2369225023,
  "user_name": "SkogCarla",
  "user_followers": 10816,
  "tweet_ids": "501064217667960800"
}
{
  "user_id": 2477475030,
  "user_name": "bennharr",
  "user_followers": 151,
  "tweet_ids": "501064201503113200"
}
{
  "user_id": 42226593,
  "user_name": "shirleycolleen",
  "user_followers": 2114,
  "tweet_ids": "619172281294655500;619172179960328200"
}
...
This is a fantastic query! It’s actually deceptively simple looking – it takes quite a few paragraphs for Matthew to describe how it works and there are some tricky brackets, braces, and parentheses in there that need to be set just right. Let’s see how we could tackle this task with jello using standard python syntax:
$ cat twitterdata.jlines | jello -l '\
user_ids = set()
for tweet in _:
    user_ids.add(tweet.user.id)
result = []
for user in user_ids:
    user_profile = {}
    tweet_ids = []
    for tweet in _:
        if tweet.user.id == user:
            user_profile.update({
                "user_id": user,
                "user_name": tweet.user.screen_name,
                "user_followers": tweet.user.followers_count})
            tweet_ids.append(str(tweet.id))
    user_profile["tweet_ids"] = ";".join(tweet_ids)
    result.append(user_profile)
result'
...
{"user_id": 2696111005, "user_name": "EGEVER142", "user_followers": 1433, "tweet_ids": "619172303654518784"}
{"user_id": 42226593, "user_name": "shirleycolleen", "user_followers": 2114, "tweet_ids": "619172281294655488;619172179960328192"}
{"user_id": 106948003, "user_name": "MrKneeGrow", "user_followers": 172, "tweet_ids": "501064228627705857"}
{"user_id": 18270633, "user_name": "ahhthatswhy", "user_followers": 559, "tweet_ids": "501064204661850113"}
{"user_id": 14331818, "user_name": "edsu", "user_followers": 4220, "tweet_ids": "615973042443956225;618602288781860864"}
{"user_id": 2569107372, "user_name": "SlavinOleg", "user_followers": 35, "tweet_ids": "501064198973960192;501064202794971136;501064214467731457;501064215759568897;501064220121632768"}
{"user_id": 22668719, "user_name": "nodehyena", "user_followers": 294, "tweet_ids": "501064222772445187"}
...
So there are 17 lines of python. Again, that is not as terse as jq, but for pythonistas it is probably a lot easier to see what is going on. This is a pretty simple and naive implementation, and there are probably better approaches that are shorter, simpler, or faster, but the point is that I can come back six months from now and understand what is going on if I need to debug or tweak it.
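As one example of a shorter approach, the grouping can be done in a single pass with a dictionary keyed by user id, instead of rescanning every tweet for every user. The sketch below is plain Python using bracket syntax on ordinary dicts, meant to be run as a standalone script; `summarize_users` is a hypothetical name, and the sample tweets are hand-made records modelled on the output above (the real dataset has many more fields).

```python
import json


def summarize_users(tweets):
    """Group tweets by user id in one pass and join each user's tweet ids."""
    users = {}
    for tweet in tweets:
        user = tweet["user"]
        if user["id"] not in users:
            # First tweet seen for this user: record the profile fields.
            users[user["id"]] = {
                "user_id": user["id"],
                "user_name": user["screen_name"],
                "user_followers": user["followers_count"],
                "tweet_ids": [],
            }
        users[user["id"]]["tweet_ids"].append(str(tweet["id"]))
    # Replace each list of ids with a semicolon-separated string.
    return [{**p, "tweet_ids": ";".join(p["tweet_ids"])} for p in users.values()]


# Hand-made sample records (assumption: only the fields the query reads).
sample_tweets = [
    {"id": 615973042443956225,
     "user": {"id": 14331818, "screen_name": "edsu", "followers_count": 4220}},
    {"id": 501064204661850113,
     "user": {"id": 18270633, "screen_name": "ahhthatswhy", "followers_count": 559}},
    {"id": 618602288781860864,
     "user": {"id": 14331818, "screen_name": "edsu", "followers_count": 4220}},
]

summary = summarize_users(sample_tweets)
for row in summary:
    print(json.dumps(row))
```

Two behaviors differ slightly from the 17-line version: users come out in first-seen order rather than set order, and the name and follower count are taken from a user's first tweet rather than their last.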
Just for fun, let’s pipe this result through jtbl to see what it looks like:
   user_id  user_name          user_followers  tweet_ids
----------  ---------------  ----------------  ----------------------------------------------------------------------------------------------
...
2481812382  SadieODoyle                    42  501064200035516416
2696111005  EGEVER142                    1433  619172303654518784
  42226593  shirleycolleen               2114  619172281294655488;619172179960328192
 106948003  MrKneeGrow                    172  501064228627705857
  18270633  ahhthatswhy                   559  501064204661850113
  14331818  edsu                         4220  615973042443956225;618602288781860864
2569107372  SlavinOleg                     35  501064198973960192;501064202794971136;501064214467731457;501064215759568897;501064220121632768
  22668719  nodehyena                     294  501064222772445187
  23598003  victoriasview                1163  501064228288364546
 851336634  20mUsa                      15643  50106414 ...
Very cool! Find more examples at https://github.com/kellyjonbrazil/jello. I hope you find jello useful in your command line pipelines.
Try Jello Explorer and the jello web demo!