Life is too short to learn Bash

by Pedro Santos

February 5, 2023


As an experienced software developer, I often find writing scripts in sh or Bash challenging. Missing environment variables, the awkwardness of tools like xargs and jq, and the need to constantly re-learn how to write a for loop make the process frustrating. These issues are not inherent limitations of the language, but Bash is not the only option either: other programming languages can also call external programs and make decisions based on their output.

In some situations, Python may be a better alternative to Bash. It is installed by default on most modern Linux environments and supports many other operating systems and CPU architectures. Additionally, its “batteries included” approach provides a large standard library to work with.

Note that this advice is not universal. For an experienced SysOps engineer, the effort invested in learning Bash is well worth it, while a JavaScript developer may find Node.js a better option for scripting.

Scripting vs Programming

The difference between scripting and programming can be subjective, but for the purposes of this text I will use the following definitions:

  • Scripting refers to writing a series of commands that automate tasks by chaining together external programs.
  • Programming refers to writing code using libraries to perform specific tasks.

This distinction matters because programming in Python is not the same as scripting in Python. When scripting, I propose the following rules:

  • Target a specific Python version that is available on the oldest operating system the script needs to run on.
  • Use only Python's standard library, avoiding external libraries.
  • Assume that external programs like curl, kubectl, systemctl, etc., are available and accessible, just like in a Bash script.

By adhering to these guidelines, your script will be portable and self-contained, without the need to manage dependencies.
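
As a rough illustration of these rules, the sketch below wraps subprocess.run in a small helper that uses only the standard library (Python 3.7 or newer, because of capture_output and text). The helper name run is my own choice for illustration, and sys.executable merely stands in for whatever external program (curl, kubectl, ...) a real script would call.

```python
import subprocess
import sys


def run(*args: str) -> str:
    """Run an external program and return its stdout as text.

    Raises subprocess.CalledProcessError if the program exits
    with a non-zero status.
    """
    result = subprocess.run(args, capture_output=True, text=True, check=True)
    return result.stdout


if __name__ == "__main__":
    try:
        # Stand-in for any external program the script depends on
        output = run(sys.executable, "--version")
    except subprocess.CalledProcessError as error:
        print("command failed:", error.stderr, file=sys.stderr)
        sys.exit(error.returncode)
    print(output.strip())
```

Passing the arguments as separate strings avoids shell quoting altogether, and check=True turns a failed command into an exception, which plays a role similar to set -e in a Bash script.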

Example: Automated Docker Image Deletion

In this example, we’ll create a script using Python 3 to delete all Docker images created (not downloaded) more than 2 weeks ago. There are several ways to accomplish this task, but the following script provides a straightforward solution.

#!/usr/bin/env python3

import json
import subprocess
import sys
from datetime import datetime, timedelta

# Pass the arguments as a list so that no shell (and no shell quoting) is involved
docker_list = subprocess.run(
    ["docker", "images", "--format", "{{json .}}"],
    capture_output=True, text=True)

if docker_list.returncode != 0:
    print("error listing docker images:", docker_list.stderr, file=sys.stderr)
    sys.exit(1)

# Each line of the output is a JSON object
lines = docker_list.stdout.splitlines()

for line in lines:
    parsed = json.loads(line)
    # Docker outputs a non-standard format, so we'll do a small conversion
    #   from 2023-01-18 14:06:21 +0100 CET to 2023-01-18T14:06:21 .
    # We ignore the timezone for simplicity
    created_at = "T".join(parsed["CreatedAt"].split(" ")[0:2])

    delta = datetime.now() - datetime.fromisoformat(created_at)
    if delta > timedelta(days=14):
        print(f"deleting image {parsed['Repository']}:{parsed['Tag']}")
        docker_remove = subprocess.run(
            ["docker", "image", "rm", "--force", parsed["ID"]],
            capture_output=True, text=True)
        # An error in the removal should not stop the loop
        if docker_remove.returncode != 0:
            print("error removing docker image:", docker_remove.stderr)

Here is the same script in Bash.

#!/usr/bin/env bash

set -e  # Ensure a failed command stops the script

docker images --format '{{json .}}' | while IFS= read -r line
do
  # Docker outputs a non-standard format, so we'll do a small conversion
  #   from 2023-01-18 14:06:21 +0100 CET to 2023-01-18T14:06:21 .
  # We ignore the timezone for simplicity
  created_epoch=$(echo "$line" \
    | jq --raw-output '.CreatedAt' \
    | awk 'BEGIN { OFS = "T" } { print $1, $2 }' \
    | date -f - +%s)
  now_epoch=$(date +%s)
  delta=$((now_epoch - created_epoch))

  if [ "$delta" -gt 1209600 ]; then # 14 days
    repository=$(echo "$line" | jq --raw-output '.Repository')
    tag=$(echo "$line" | jq --raw-output '.Tag')
    id=$(echo "$line" | jq --raw-output '.ID')

    echo "deleting image ${repository}:${tag}"
    # An error in the removal should not stop the loop
    docker image rm --force "$id" || true 
  fi
done

The two implementations have a comparable level of functionality and line count. However, I believe that the Python version is more elegant and easier to read. In contrast to Bash, where I would have to rely on external programs like jq or awk (adding another syntax to learn), Python provides everything I need through its standard library. Additionally, error handling is more straightforward in Python.
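
One more point on the standard library: both scripts above ignore the timezone for simplicity. If that matters, a sketch like the following keeps the numeric offset using only datetime. It assumes that CreatedAt always has the layout shown in the comments above (date, time, numeric offset, zone abbreviation). The function names are mine, chosen for illustration.

```python
from datetime import datetime, timedelta, timezone


def parse_docker_created_at(value: str) -> datetime:
    """Parse a string such as '2023-01-18 14:06:21 +0100 CET'.

    The trailing zone abbreviation is dropped; the numeric offset is kept,
    so the result is a timezone-aware datetime.
    """
    date_part, time_part, offset = value.split(" ")[:3]
    return datetime.strptime(
        f"{date_part} {time_part} {offset}", "%Y-%m-%d %H:%M:%S %z")


def is_older_than(created: datetime, days: int, now: datetime) -> bool:
    """Return True if 'created' lies more than 'days' days before 'now'."""
    return now - created > timedelta(days=days)


created = parse_docker_created_at("2023-01-18 14:06:21 +0100 CET")
print(created.isoformat())  # 2023-01-18T14:06:21+01:00
print(is_older_than(created, 14, datetime.now(timezone.utc)))
```

Comparing against datetime.now(timezone.utc) instead of a naive datetime.now() keeps the age calculation correct even when the image was built in a different timezone.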

Lastly, here is my solution as a Python program:

import docker
from datetime import datetime, timedelta, timezone
import dateutil.parser

client = docker.from_env()

images = client.images.list()
for image in images:
    # datetime.fromisoformat from Python versions < 3.11 
    #   cannot parse the docker timestamp string
    created = dateutil.parser.isoparse(image.attrs["Created"])
    delta = datetime.now(tz=timezone.utc) - created
    if delta > timedelta(days=14):
        # Untagged images have no tags, so fall back to the short image ID
        print(f"deleting image {image.tags[0] if image.tags else image.short_id}")
        client.images.remove(image.id, force=True)

The program relies on the docker library for the heavy lifting, and on dateutil for parsing the timestamp. Because of these external libraries, the program is no longer self-contained: it requires maintaining a requirements file and an extra step to pip install the dependencies.

While this additional complexity may be acceptable in some cases, it may not be necessary for simple scripts that are being transitioned from Bash. In these scenarios, it may be preferable to forgo the added complexity.

Conclusion

Bash and Unix scripting offer immense power to experienced developers, but the same can be said of other programming languages such as Python or JavaScript. While it may require a change in programming approach, scripting in these languages is both possible and worthwhile.