Olivier 'reivilibre 1be4b00d67
Some checks failed
ci/woodpecker/push/main Pipeline failed
fixup! CI: Try to fix DockerHub login
2025-10-09 22:26:01 +01:00
2025-10-08 22:17:14 +00:00
2025-09-05 21:48:49 +01:00
2025-09-05 21:59:30 +01:00
2025-09-05 21:59:30 +01:00
2025-09-05 21:59:30 +01:00
2025-09-05 21:40:31 +01:00

pgCrab

Note: this is alpha-grade software, bodged together quickly because I needed it quickly. Some modules were generated by LLM.

A simple but helpful linter and documentation tool for Postgres database schemas.

Helps cure crabbiness around schema exploration and some avoidable performance production issues. Don't get woken up in the middle of the night for something you could have avoided with a bit of static analysis.

What can pgCrab do?

Generate documentation for your schema

pgCrab can connect to a database loaded with your schema and generate documentation for it.

Comments for each table and column will be collected from the database's built-in description fields (set using COMMENT ON), but you can also harvest missing comments from SQL comments (--) in your schema files.

You can provide a custom template, but by default Markdown will be emitted, suitable for embedding in mdBook or other Markdown-based tools.

The documentation page includes all tables, with their:

  • comments;
  • columns (with datatypes and comments);
  • indices;
  • primary keys;
  • foreign keys; and
  • check constraints.

Foreign key relationships are documented in both directions, so it's easy to navigate via backreferences.

# uses libpq-compatible environment variables
pgcrab doc gen [--template <jinja template file>] [--harvest <path to SQL directory or files>]

# Postgres connection URI
pgcrab [-C postgres://user:pass@host/dbname] doc gen [--template <jinja template file>] [--harvest <path to SQL directory or files>]

A basic Markdown template is included in the repository: basic_md_template.md.j2.

List uncommented items

Follows the same command structure as pgcrab doc gen but instead provides a list of uncommented items in your database.

Exit code 1 when there is at least one uncommented item.

# uses libpq-compatible environment variables
pgcrab doc uncommented [--from-sql <path to SQL directory or files>]

# Postgres connection URI
pgcrab [-C postgres://user:pass@host/dbname] doc uncommented [--from-sql <path to SQL directory or files>] [postgres://user:pass@host/dbname]

Lint your database schema for problems and suspicious things

You will need to set up a local Postgres instance with your schema loaded for linting.

# uses libpq-compatible environment variables
pgcrab lint-schema [--add-concessions]

# Postgres connection URI
pgcrab [-C postgres://user:pass@host/dbname] lint-schema [--add-concessions]

Many thanks to pg-index-health-sql, from which most of the lint rules were taken!

List of schema lint rules (click to expand):

Lint ID Description References
dont_use_timestamp_without_time_zone Don't use timestamp (without time zone) - use timestamptz instead Postgres Wiki
dont_use_money Don't use the money type; there are usually more appropriate types. Postgres Wiki
dont_use_serial Don't use serial types - prefer GENERATED BY DEFAULT AS IDENTITY. Postgres Wiki
dont_use_varchar_n_by_default Don't use VARCHAR(n) by default (no benefits in Postgres) Postgres Wiki
column_requires_quotation Column name always requires quotation due to naming convention Postgres Wiki
object_requires_quotation Object name always requires quotation due to naming convention Postgres Wiki
possible_object_name_truncation Object name may have been truncated due to length limits Postgres Docs
duplicate_indexes Duplicate indexes: waste of space and disk bandwidth .
duplicate_foreign_keys Duplicate foreign keys: identical constraints .
foreign_key_without_index Foreign key without index: can cause poor delete performance .
foreign_key_with_unmatched_column_type Foreign key with mismatched column types: causes type coercions .
index_with_redundant_where_clause Index with redundant WHERE IS NOT NULL clause on NOT NULL column .
overlapping_indexes Overlapping indexes: one index is a prefix of another. .
overlapping_foreign_keys Overlapping foreign keys: one constraint is made redundant by another. .
table_without_primary_key Table without primary key: should almost always have one. .
btree_index_on_array_column B-tree index on array column: consider GIN index instead. .
index_with_boolean Non-unique index on just a boolean column: low selectivity. .
table_without_replica_identity Table without replica identity: affects logical replication. Postgres Docs

Have a lot of triggered lint rules from your existing schema and won't change them? Use --add-concessions to add concession rules to the pgCrab config file.

Lint SQL queries in your Rust or Python source code

This feature is still quite primitive.

pgCrab can read your Rust and Python source code, looking for embedded SQL.

It can then check for a very small number of antipatterns.

pgcrab lint-sql path/to/code/directory

List of SQL lint rules (click to expand):

Lint ID Description References
TODO TODO TODO

Known issues and caveats:

  • the reporting of error locations is not entirely exact, but hopefully close enough to give you an idea. Essentially: escape sequences such as \n will

Convert SQL comments (--) to real schema COMMENTs (descriptions)

It's quite painful to use the COMMENT ON feature in Postgres, because you have to write a statement for every item you are describing. You also can't do it in the same statement as creating every item.

To help with this, you can have pgCrab harvest comments from your SQL migrations and convert them to COMMENT ON statements.

This is not as portable as using COMMENT ON but it might help some projects that currently don't use COMMENT ON anyway because it is verbose and not interoperable with SQLite.

pgcrab doc convert path/to/schema/migration/directory/ > comment.sql

If you want to skip lines that would apply a COMMENT that already exists, then use the --compare flag and ensure to either pass a database connection via -C or by libpq environment variables.

pgcrab [-C postgres://user:pass@host/dbname] doc convert path/to/schema/migration/directory/ --compare > comments.sql

Installation

Nix

via Nix flake

There is a Nix flake in this repository, with a pgcrab package in it.

nix build will produce a build of pgCrab at ./result/bin/pgcrab.

TODO How to add to a NixOS system.

Docker/Podman/Containers

from container registry

TODO

self-build

You can build with:

<podman|docker> build -t localhost/pgcrab .

Run using:

<podman|docker> run localhost/pgcrab --help

Use in GitHub Actions

TODO

Build Locally from source

With cargo installed, you can choose one of the following:

  • cargo install --path . to install to your user;
  • or cargo build --release to just get a release build in your target directory.

You can use cargo run -- <args> to run from the source checkout.

Configuration

Normally you should keep your configuration file in the root of your project repository, under .config/pgcrab.toml. However, pgCrab will look for a configuration file by ascending directories, looking for .config/pgcrab.toml at any level until hitting either a repository root, filesystem root or your user's home directory.

[schema.concessions]
# You can ignore diagnostics produced by a rule on some tables, indexes or foreign keys.
# The `--add-concessions` flag will automatically produce these rules for you, if desired.
rule_id = [
  "my_table",
  # globs are supported
  "legacy_table_*"
]
rule_id2 = [
  # to specify an index, use `.`
  "my_table.my_index"
  # to specify all indexes on a table, use `.*`
  "my_table.*",
  # to specify all indexes whatsoever, use `**`
  "**"
]
# you can also disable all rules for a given object:
"*" = [
  "my_table"
]

Licence

pgCrab is available under the GPL 3.0 or later; see attached licence source.

Some of the SQL lint rules are available under Apache 2.0.

Contributions

TODO

Thanks

Description
A linter and documentation generator for Postgres schemas.
Readme 321 KiB
Languages
Rust 95.9%
Jinja 2%
Nix 1.4%
Dockerfile 0.7%