pgCrab
Note: this is alpha-grade software, bodged together quickly because I needed it quickly. Some modules were generated by LLM.
A simple but helpful linter and documentation tool for Postgres database schemas.
Helps cure crabbiness around schema exploration and some avoidable performance production issues. Don't get woken up in the middle of the night for something you could have avoided with a bit of static analysis.
What can pgCrab do?
Generate documentation for your schema
pgCrab can connect to a database loaded with your schema and generate documentation for it.
Comments for each table and column will be collected from the database's built-in description
fields (set using COMMENT ON), but you can also harvest missing comments
from SQL comments (--) in your schema files.
You can provide a custom template, but by default Markdown will be emitted, suitable for embedding in mdBook or other Markdown-based tools.
The documentation page includes all tables, with their:
- comments;
 - columns (with datatypes and comments);
 - indices;
 - primary keys;
 - foreign keys; and
 - check constraints.
 
Foreign key relationships are documented in both directions, so it's easy to navigate via backreferences.
# uses libpq-compatible environment variables
pgcrab doc gen [--template <jinja template file>] [--harvest <path to SQL directory or files>]
# Postgres connection URI
pgcrab [-C postgres://user:pass@host/dbname] doc gen [--template <jinja template file>] [--harvest <path to SQL directory or files>]
A basic Markdown template is included in the repository: basic_md_template.md.j2.
List uncommented items
Follows the same command structure as pgcrab doc gen but instead provides a list of uncommented items
in your database.
Exit code 1 when there is at least one uncommented item.
# uses libpq-compatible environment variables
pgcrab doc uncommented [--from-sql <path to SQL directory or files>]
# Postgres connection URI
pgcrab [-C postgres://user:pass@host/dbname] doc uncommented [--from-sql <path to SQL directory or files>] [postgres://user:pass@host/dbname]
Lint your database schema for problems and suspicious things
You will need to set up a local Postgres instance with your schema loaded for linting.
# uses libpq-compatible environment variables
pgcrab lint-schema [--add-concessions]
# Postgres connection URI
pgcrab [-C postgres://user:pass@host/dbname] lint-schema [--add-concessions]
Many thanks to pg-index-health-sql, from which most of the lint rules were taken!
List of schema lint rules (click to expand):
| Lint ID | Description | References | 
|---|---|---|
dont_use_timestamp_without_time_zone | 
Don't use timestamp (without time zone) - use timestamptz instead | 
Postgres Wiki | 
dont_use_money | 
Don't use the money type; there are usually more appropriate types. | 
Postgres Wiki | 
dont_use_serial | 
Don't use serial types - prefer GENERATED BY DEFAULT AS IDENTITY. | 
Postgres Wiki | 
dont_use_varchar_n_by_default | 
Don't use VARCHAR(n) by default (no benefits in Postgres) | 
Postgres Wiki | 
column_requires_quotation | 
Column name always requires quotation due to naming convention | Postgres Wiki | 
object_requires_quotation | 
Object name always requires quotation due to naming convention | Postgres Wiki | 
possible_object_name_truncation | 
Object name may have been truncated due to length limits | Postgres Docs | 
duplicate_indexes | 
Duplicate indexes: waste of space and disk bandwidth | . | 
duplicate_foreign_keys | 
Duplicate foreign keys: identical constraints | . | 
foreign_key_without_index | 
Foreign key without index: can cause poor delete performance | . | 
foreign_key_with_unmatched_column_type | 
Foreign key with mismatched column types: causes type coercions | . | 
index_with_redundant_where_clause | 
Index with redundant WHERE IS NOT NULL clause on NOT NULL column | 
. | 
overlapping_indexes | 
Overlapping indexes: one index is a prefix of another. | . | 
overlapping_foreign_keys | 
Overlapping foreign keys: one constraint is made redundant by another. | . | 
table_without_primary_key | 
Table without primary key: should almost always have one. | . | 
btree_index_on_array_column | 
B-tree index on array column: consider GIN index instead. | . | 
index_with_boolean | 
Non-unique index on just a boolean column: low selectivity. | . | 
table_without_replica_identity | 
Table without replica identity: affects logical replication. | Postgres Docs | 
Have a lot of triggered lint rules from your existing schema and won't change them?
Use --add-concessions to add concession rules to the pgCrab config file.
Lint SQL queries in your Rust or Python source code
This feature is still quite primitive.
pgCrab can read your Rust and Python source code, looking for embedded SQL.
It can then check for a very small number of antipatterns.
pgcrab lint-sql path/to/code/directory
List of SQL lint rules (click to expand):
| Lint ID | Description | References | 
|---|---|---|
| TODO | TODO | TODO | 
Known issues and caveats:
- the reporting of error locations is not entirely exact,
but hopefully close enough to give you an idea.
Essentially: escape sequences such as 
\nwill 
Convert SQL comments (--) to real schema COMMENTs (descriptions)
It's quite painful to use the COMMENT ON feature in Postgres, because you
have to write a statement for every item you are describing.
You also can't do it in the same statement as creating every item.
To help with this, you can have pgCrab harvest comments from your SQL migrations
and convert them to COMMENT ON statements.
This is not as portable as using COMMENT ON but it might help some projects
that currently don't use COMMENT ON anyway because it is verbose and not
interoperable with SQLite.
pgcrab doc convert path/to/schema/migration/directory/ > comment.sql
If you want to skip lines that would apply a COMMENT that already exists, then
use the --compare flag and ensure to either pass a database connection via -C
or by libpq environment variables.
pgcrab [-C postgres://user:pass@host/dbname] doc convert path/to/schema/migration/directory/ --compare > comments.sql
Installation
Nix
via Nix flake
There is a Nix flake in this repository, with a pgcrab package in it.
nix build will produce a build of pgCrab at ./result/bin/pgcrab.
TODO How to add to a NixOS system.
Docker/Podman/Containers
from container registry
TODO
self-build
You can build with:
<podman|docker> build -t localhost/pgcrab .
Run using:
<podman|docker> run localhost/pgcrab --help
Use in GitHub Actions
TODO
Build Locally from source
With cargo installed, you can choose one of the following:
cargo install --path .to install to your user;- or 
cargo build --releaseto just get a release build in your target directory. 
You can use cargo run -- <args> to run from the source checkout.
Configuration
Normally you should keep your configuration file in the root of your project repository, under .config/pgcrab.toml.
However, pgCrab will look for a configuration file by ascending directories, looking for .config/pgcrab.toml at any level
until hitting either a repository root, filesystem root or your user's home directory.
[schema.concessions]
# You can ignore diagnostics produced by a rule on some tables, indexes or foreign keys.
# The `--add-concessions` flag will automatically produce these rules for you, if desired.
rule_id = [
  "my_table",
  # globs are supported
  "legacy_table_*"
]
rule_id2 = [
  # to specify an index, use `.`
  "my_table.my_index"
  # to specify all indexes on a table, use `.*`
  "my_table.*",
  # to specify all indexes whatsoever, use `**`
  "**"
]
# you can also disable all rules for a given object:
"*" = [
  "my_table"
]
Licence
pgCrab is available under the GPL 3.0 or later; see attached licence source.
Some of the SQL lint rules are available under Apache 2.0.
Contributions
TODO
Thanks
- pg-index-health-sql, from which many lint rules were takens