Configuration

Non-secret settings come from CLI flags (bound into Viper); sensitive fields come from environment variables. The environment prefix is P2PG, and nested keys replace dots with underscores (for example, db.password → P2PG_DB_PASSWORD).
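The naming rule above can be sketched in a few lines of Go. This is an illustration only: envName is a hypothetical helper, not a function from the project. With Viper, a mapping of this kind is commonly set up through SetEnvPrefix and SetEnvKeyReplacer, but whether this project wires it that way is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// envName is a hypothetical helper that applies the documented rule:
// prefix, underscore, then the key upper-cased with dots replaced by
// underscores.
func envName(prefix, key string) string {
	return strings.ToUpper(prefix + "_" + strings.ReplaceAll(key, ".", "_"))
}

func main() {
	keys := []string{"db.password", "db.user", "s3.access.key", "s3.secret.access.key"}
	for _, key := range keys {
		fmt.Printf("%s -> %s\n", key, envName("P2PG", key))
	}
}
```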

Environment variables

These are required; the process exits with an error if any are empty:

| Variable | Maps to | Description |
| --- | --- | --- |
| P2PG_DB_PASSWORD | db.password | PostgreSQL password |
| P2PG_DB_USER | db.user | PostgreSQL user |
| P2PG_S3_ACCESS_KEY | s3.access.key | S3 access key |
| P2PG_S3_SECRET_ACCESS_KEY | s3.secret.access.key | S3 secret key |

No other configuration is read exclusively from the environment in the current code path; non-secret options are set via flags below.
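A minimal sketch of the required-variable check, assuming the four variables listed above and treating an unset variable the same as an empty one. The function name checkRequiredEnv and the error wording are hypothetical; the actual tool's messages may differ.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// requiredEnv holds the variable names documented as required.
var requiredEnv = []string{
	"P2PG_DB_PASSWORD",
	"P2PG_DB_USER",
	"P2PG_S3_ACCESS_KEY",
	"P2PG_S3_SECRET_ACCESS_KEY",
}

// checkRequiredEnv returns an error naming every required variable
// for which lookup yields an empty string, or nil if all are set.
func checkRequiredEnv(lookup func(string) string) error {
	var missing []string
	for _, name := range requiredEnv {
		if lookup(name) == "" {
			missing = append(missing, name)
		}
	}
	if len(missing) > 0 {
		return fmt.Errorf("missing required environment variables: %s",
			strings.Join(missing, ", "))
	}
	return nil
}

func main() {
	// A real command would call os.Exit(1) here; this sketch only reports.
	if err := checkRequiredEnv(os.Getenv); err != nil {
		fmt.Println("would exit with error:", err)
		return
	}
	fmt.Println("all required environment variables are set")
}
```

Passing the lookup function as a parameter keeps the check testable without touching the real process environment.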

CLI flags (dataloader)

Database

| Flag | Default | Description |
| --- | --- | --- |
| --db-host | localhost | PostgreSQL host |
| --db-port | 5432 | PostgreSQL port |
| --db-name | postgres | Database name |
| --db-pool-size | 1 | Connection pool size; also caps how many Parquet files are processed in parallel |
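To illustrate how a pool size can cap parallel file processing, here is a sketch that uses a buffered channel as a semaphore. It is not the dataloader's actual implementation: processBounded and its return value (the peak concurrency it observed) are hypothetical and exist only to make the limit visible.

```go
package main

import (
	"fmt"
	"sync"
)

// processBounded calls work once per file, running at most limit calls
// concurrently. It returns the highest concurrency it observed.
func processBounded(files []string, limit int, work func(file string)) int {
	if limit < 1 {
		limit = 1
	}
	sem := make(chan struct{}, limit) // one slot per pooled connection
	var (
		wg      sync.WaitGroup
		mu      sync.Mutex
		running int
		peak    int
	)
	for _, f := range files {
		wg.Add(1)
		sem <- struct{}{} // blocks while all slots are taken
		go func(file string) {
			defer wg.Done()
			defer func() { <-sem }() // free the slot when done

			mu.Lock()
			running++
			if running > peak {
				peak = running
			}
			mu.Unlock()

			work(file)

			mu.Lock()
			running--
			mu.Unlock()
		}(f)
	}
	wg.Wait()
	return peak
}

func main() {
	files := []string{"a.parquet", "b.parquet", "c.parquet"}
	peak := processBounded(files, 2, func(file string) {
		fmt.Println("loading", file)
	})
	fmt.Println("peak concurrency:", peak)
}
```

With the default pool size of 1, a scheme like this processes files one at a time; raising the pool size raises the ceiling but does not guarantee that many files run at once.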

S3-compatible storage

| Flag | Default | Description |
| --- | --- | --- |
| --s3-endpoint | localhost:9000 | Endpoint as host:port (no scheme) |
| --s3-region | none | Region string passed to the client |
| --s3-secure | false | Use TLS when talking to the object store |
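The relationship between the endpoint and the TLS flag can be sketched as follows. This assumes that the endpoint must contain both host and port and that the scheme is derived solely from --s3-secure. The helper endpointURL is hypothetical, and the tool may well hand the raw endpoint to its S3 client instead of building a URL.

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// endpointURL validates a host:port endpoint and returns the base URL
// implied by the secure flag (https when true, http otherwise).
func endpointURL(endpoint string, secure bool) (string, error) {
	if strings.Contains(endpoint, "://") {
		return "", fmt.Errorf("endpoint %q must not include a scheme", endpoint)
	}
	host, port, err := net.SplitHostPort(endpoint)
	if err != nil {
		return "", fmt.Errorf("endpoint %q is not host:port: %w", endpoint, err)
	}
	if host == "" || port == "" {
		return "", fmt.Errorf("endpoint %q needs both host and port", endpoint)
	}
	scheme := "http"
	if secure {
		scheme = "https"
	}
	return scheme + "://" + net.JoinHostPort(host, port), nil
}

func main() {
	for _, secure := range []bool{false, true} {
		u, err := endpointURL("localhost:9000", secure)
		fmt.Println(u, err)
	}
	_, err := endpointURL("https://localhost:9000", true)
	fmt.Println(err)
}
```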

Data loader

| Flag | Default | Required | Description |
| --- | --- | --- | --- |
| --schema | (empty) | Yes | Target PostgreSQL schema |
| --table | (empty) | Yes | Target table name |
| --bucket | (empty) | Yes | Bucket containing Parquet objects |
| --path | / | No | Key prefix inside the bucket; if it does not end with /, a trailing slash is appended internally |
| --batch-size | 10000 | No | Maximum rows per read/upload batch (must be > 0) |
| --table-create | false | No | If set, create the table with inferred columns when it does not already exist |
| --table-truncate | false | No | If set and the table exists, truncate it before loading |
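Two of the behaviours above, the trailing slash on --path and the row cap from --batch-size, can be sketched as below. Both helpers (normalizePrefix, batchBounds) are hypothetical names used for illustration, and the row count in the example is invented.

```go
package main

import (
	"fmt"
	"strings"
)

// normalizePrefix appends a trailing slash when the prefix lacks one.
func normalizePrefix(prefix string) string {
	if strings.HasSuffix(prefix, "/") {
		return prefix
	}
	return prefix + "/"
}

// batchBounds splits total rows into half-open [start, end) ranges of
// at most batchSize rows each. It returns nil for a non-positive size.
func batchBounds(total, batchSize int) [][2]int {
	if batchSize <= 0 {
		return nil
	}
	var bounds [][2]int
	for start := 0; start < total; start += batchSize {
		end := start + batchSize
		if end > total {
			end = total
		}
		bounds = append(bounds, [2]int{start, end})
	}
	return bounds
}

func main() {
	fmt.Println(normalizePrefix("exports/2024"))
	fmt.Println(normalizePrefix("/"))
	// 25000 rows with the default batch size give two full batches
	// and one partial batch.
	fmt.Println(batchBounds(25000, 10000))
}
```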

Validation rules

  • data-loader.batch-size must be greater than zero.
  • data-loader.path must not be empty after configuration (the CLI default / satisfies this).
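The two rules above amount to a small validation step. The sketch below assumes a hypothetical loaderConfig struct and uses errors.Join (Go 1.20 or later) to report every violation at once; the real configuration type and error text may differ.

```go
package main

import (
	"errors"
	"fmt"
)

// loaderConfig is a hypothetical stand-in for the data-loader settings.
type loaderConfig struct {
	BatchSize int    // data-loader.batch-size
	Path      string // data-loader.path
}

// validate returns nil when both rules hold, otherwise an error that
// lists each violated rule.
func (c loaderConfig) validate() error {
	var errs []error
	if c.BatchSize <= 0 {
		errs = append(errs, fmt.Errorf(
			"data-loader.batch-size must be greater than zero, got %d", c.BatchSize))
	}
	if c.Path == "" {
		errs = append(errs, errors.New("data-loader.path must not be empty"))
	}
	return errors.Join(errs...)
}

func main() {
	defaults := loaderConfig{BatchSize: 10000, Path: "/"}
	fmt.Println("defaults:", defaults.validate())

	bad := loaderConfig{BatchSize: 0, Path: ""}
	fmt.Println("invalid:", bad.validate())
}
```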

If no Parquet files are found under the prefix, the command logs a warning and exits successfully without loading data.