A New Way to Parse Plain Text Tables

Every so often, someone on a sysadmin forum asks how to parse and filter data from a plain text table. For example:

+----+-----------------------+--------------------------------+---------+
| id | name                  | url                            | version |
+----+-----------------------+--------------------------------+---------+
| 25 | example.com           | http://www.example.com/        | 3.8     |
| 34 | anotherexample.com    | https://anotherexample.com/    | 3.2     |
| 62 | yetanotherexample.com | https://yetanotherexample.com/ | 3.9     |
+----+-----------------------+--------------------------------+---------+

Traditionally, you would use tools like grep, sed, and/or awk to grab the data you want from a table like this. Now there is a new way, with jc! As of version 1.18.6, jc can convert single-line and multi-line ASCII and Unicode tables to JSON with the asciitable and asciitable-m parsers. You can then use JSON filters like jq or jello to filter the data and use it in your Bash scripts or other applications.

Here’s how to use the new parsers:

$ echo '
> +----+-----------------------+--------------------------------+---------+
> | id | name                  | url                            | version |
> +----+-----------------------+--------------------------------+---------+
> | 25 | example.com           | http://www.example.com/        | 3.8     |
> | 34 | anotherexample.com    | https://anotherexample.com/    | 3.2     |
> | 62 | yetanotherexample.com | https://yetanotherexample.com/ | 3.9     |
> +----+-----------------------+--------------------------------+---------+
> ' | jc --asciitable -p
[
  {
    "id": "25",
    "name": "example.com",
    "url": "http://www.example.com/",
    "version": "3.8"
  },
  {
    "id": "34",
    "name": "anotherexample.com",
    "url": "https://anotherexample.com/",
    "version": "3.2"
  },
  {
    "id": "62",
    "name": "yetanotherexample.com",
    "url": "https://yetanotherexample.com/",
    "version": "3.9"
  }
]
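
Note that every value comes back as a string, so a numeric comparison needs an explicit conversion first. Here is a minimal sketch of that kind of filter in plain Python, assuming the JSON above has been captured into a string. The variable names and the 3.8 threshold are only illustrative, and treating a version as a float is a shortcut that only holds up for simple major.minor values like these:

```python
import json

# JSON text as printed by `jc --asciitable` for the table above
jc_output = '''
[
  {"id": "25", "name": "example.com", "url": "http://www.example.com/", "version": "3.8"},
  {"id": "34", "name": "anotherexample.com", "url": "https://anotherexample.com/", "version": "3.2"},
  {"id": "62", "name": "yetanotherexample.com", "url": "https://yetanotherexample.com/", "version": "3.9"}
]
'''

rows = json.loads(jc_output)

# Cells are strings, so convert before comparing numerically.
# Assumption: versions are simple "major.minor" values that float() can read.
recent = [row["name"] for row in rows if float(row["version"]) >= 3.8]

print(recent)
```

The same selection could be written as a jq or jello filter. The point is only that the string-to-number conversion has to happen somewhere.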

If the table has multi-line rows, be sure to use the asciitable-m parser instead:

$ echo '
> ╒══════════╤═════════╤════════╕
> │ foo      │ bar     │ baz    │
> │          │         │ buz    │
> ╞══════════╪═════════╪════════╡
> │ good day │ 12345   │        │
> │ mate     │         │        │
> ├──────────┼─────────┼────────┤
> │ hi there │ abc def │ 3.14   │
> │          │         │        │
> ╘══════════╧═════════╧════════╛' | jc --asciitable-m -p
[
  {
    "foo": "good day\nmate",
    "bar": "12345",
    "baz_buz": null
  },
  {
    "foo": "hi there",
    "bar": "abc def",
    "baz_buz": "3.14"
  }
]
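
In the output above, a multi-line cell keeps its line break as "\n" and an empty cell becomes null. If you would rather have single-line values downstream, one possible cleanup step looks like this. It is just a sketch: flatten_cell is a hypothetical helper of mine, not part of jc.

```python
# Rows as shown in the asciitable-m output above (JSON null becomes None in Python)
rows = [
    {"foo": "good day\nmate", "bar": "12345", "baz_buz": None},
    {"foo": "hi there", "bar": "abc def", "baz_buz": "3.14"},
]


def flatten_cell(value):
    """Join a multi-line cell into one line; leave empty (None) cells untouched."""
    if value is None:
        return None
    return " ".join(value.splitlines())


# Apply the helper to every cell of every row
flat_rows = [{key: flatten_cell(value) for key, value in row.items()} for row in rows]

print(flat_rows[0]["foo"])
```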

Many different table styles are supported, as long as there is a header row at the top of the table.

Of course, you can also use the parsers as a Python library:

>>> import jc
>>> table = '''
... Protocol  Address     Age (min)  Hardware Addr   Type   Interface
... Internet  10.12.13.1        98   0950.5785.5cd1  ARPA   FastEthernet2.13
... Internet  10.12.13.3       131   0150.7685.14d5  ARPA   GigabitEthernet2.13
... Internet  10.12.13.4       198   0950.5C8A.5c41  ARPA   GigabitEthernet2.17
... '''
>>> jc.parse('asciitable', table)
[{'protocol': 'Internet', 'address': '10.12.13.1', 'age_min': '98', 'hardware_addr': '0950.5785.5cd1', 'type': 'ARPA', 'interface': 'FastEthernet2.13'}, {'protocol': 'Internet', 'address': '10.12.13.3', 'age_min': '131', 'hardware_addr': '0150.7685.14d5', 'type': 'ARPA', 'interface': 'GigabitEthernet2.13'}, {'protocol': 'Internet', 'address': '10.12.13.4', 'age_min': '198', 'hardware_addr': '0950.5C8A.5c41', 'type': 'ARPA', 'interface': 'GigabitEthernet2.17'}]
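
Since the result is an ordinary list of dictionaries, it is easy to reshape once it is in Python. As a rough sketch, assuming the rows above are already in a variable (by_address is just an illustrative name), you could index the entries by IP address and convert the age to an integer along the way:

```python
# Rows as returned by the asciitable parser in the example above
rows = [
    {'protocol': 'Internet', 'address': '10.12.13.1', 'age_min': '98',
     'hardware_addr': '0950.5785.5cd1', 'type': 'ARPA', 'interface': 'FastEthernet2.13'},
    {'protocol': 'Internet', 'address': '10.12.13.3', 'age_min': '131',
     'hardware_addr': '0150.7685.14d5', 'type': 'ARPA', 'interface': 'GigabitEthernet2.13'},
    {'protocol': 'Internet', 'address': '10.12.13.4', 'age_min': '198',
     'hardware_addr': '0950.5C8A.5c41', 'type': 'ARPA', 'interface': 'GigabitEthernet2.17'},
]

# Key each row by its address and turn the age string into an int
by_address = {row['address']: {**row, 'age_min': int(row['age_min'])} for row in rows}

print(by_address['10.12.13.3']['interface'])
print(by_address['10.12.13.3']['age_min'])
```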

The parsers can also handle the output of commands that print plain text tables. For example, the virsh command:

# virsh list --all
 Id   Name          State
------------------------------
 3    rh8-vm01      running
 -    crc           shut off
 -    rh8-tower01   shut off
#
# virsh list --all | jc --asciitable -p
[
  {
    "id": "3",
    "name": "rh8-vm01",
    "state": "running"
  },
  {
    "id": "-",
    "name": "crc",
    "state": "shut off"
  },
  {
    "id": "-",
    "name": "rh8-tower01",
    "state": "shut off"
  }
]
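
Once the virsh output is structured like this, summarizing it takes only a few lines. Here is a small sketch that groups the domain names by state, assuming the rows above have been loaded into Python (domains_by_state is an illustrative name):

```python
from collections import defaultdict

# Rows as shown in the JSON output above
rows = [
    {"id": "3", "name": "rh8-vm01", "state": "running"},
    {"id": "-", "name": "crc", "state": "shut off"},
    {"id": "-", "name": "rh8-tower01", "state": "shut off"},
]

# Collect domain names under their state, keeping table order
domains_by_state = defaultdict(list)
for row in rows:
    domains_by_state[row["state"]].append(row["name"])

print(dict(domains_by_state))
```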

Here’s how you can do the above in an Ansible playbook using the community.general.jc filter plugin:

- name: Get virsh state
  hosts: ubuntu
  tasks:
  - shell: virsh list --all
    register: result
  - set_fact:
      virsh_data: "{{ result.stdout | community.general.jc('asciitable') }}"
  - debug:
      msg: "The virsh state is: {{ virsh_data[0].state }}"

For more information on jc, check out my post on Bringing the UNIX Philosophy to the 21st Century. See these posts for tips on how to use JSON in your Bash scripts.

Happy parsing!

Published by kellyjonbrazil

I'm a cybersecurity and cloud computing nerd.
