Overview

Purpose

parquet2postgres-go is a small CLI for copying analytics-style data from object storage into Postgres. You point it at a bucket and key prefix where one or more .parquet files live; it loads their rows into a target schema and table.

Typical use cases:

Landing zone or warehouse files in MinIO, AWS S3, or another S3 API–compatible store
One-off or scheduled loads without a separate ETL framework
Quick hydration of a Postgres table that should mirror the layout of the Parquet files

What you can do

Capability	Description
List Parquet objects	Lists every object under the configured S3 prefix (`--path`).
Infer DDL	Reads the first Parquet file and derives column names and types for Postgres.
Create table	With `--table-create`, runs `CREATE TABLE IF NOT EXISTS`, so the table is created only when it is missing.
Truncate	With `--table-truncate`, truncates the table before load if it already exists.
Batch load	Streams rows in chunks of `--batch-size` rows and inserts each chunk through the Postgres data loader, a batched, copy-style pipeline.
Parallel files	Processes multiple Parquet files concurrently; concurrency is capped at `--db-pool-size` so it stays within the connection pool limit.
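
The batching and bounded-parallelism rows above can be illustrated with a minimal Go sketch. It is not the tool's actual code: `chunk`, `loadFiles`, and the in-memory integer "rows" are hypothetical stand-ins, and the point where a real loader would write to Postgres is only marked with a comment. It shows one common way to split rows into batches of a fixed size and to cap the number of files processed at once with a counting semaphore sized like `--db-pool-size`.

```go
package main

import (
	"fmt"
	"sync"
)

// chunk splits rows into consecutive batches of at most batchSize rows,
// which is what a setting such as --batch-size implies.
func chunk(rows []int, batchSize int) [][]int {
	if batchSize <= 0 {
		batchSize = 1
	}
	var batches [][]int
	for start := 0; start < len(rows); start += batchSize {
		end := start + batchSize
		if end > len(rows) {
			end = len(rows)
		}
		batches = append(batches, rows[start:end])
	}
	return batches
}

// loadFiles starts one goroutine per file but lets at most poolSize of them
// do work at the same time, so no more than poolSize connections are needed.
// It returns the number of batches processed per file.
func loadFiles(files map[string][]int, batchSize, poolSize int) map[string]int {
	if poolSize < 1 {
		poolSize = 1
	}
	sem := make(chan struct{}, poolSize) // counting semaphore
	var (
		mu      sync.Mutex
		wg      sync.WaitGroup
		results = make(map[string]int, len(files))
	)
	for name, rows := range files {
		wg.Add(1)
		go func(name string, rows []int) {
			defer wg.Done()
			sem <- struct{}{}        // acquire a slot
			defer func() { <-sem }() // release it when this file is done
			n := 0
			for range chunk(rows, batchSize) {
				n++ // a real loader would send this batch to Postgres here
			}
			mu.Lock()
			results[name] = n
			mu.Unlock()
		}(name, rows)
	}
	wg.Wait()
	return results
}

func main() {
	files := map[string][]int{
		"part-0.parquet": {1, 2, 3, 4, 5},
		"part-1.parquet": {6, 7},
	}
	got := loadFiles(files, 2, 2)
	fmt.Println(got["part-0.parquet"], got["part-1.parquet"]) // prints: 3 1
}
```

Sizing the semaphore to the pool size means a worker never waits on a connection that another worker is holding, which is presumably why the tool ties file concurrency to `--db-pool-size`.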

Constraints

Same schema for all files: Every Parquet file under the prefix must share the same schema. The tool does not merge incompatible layouts.
Secrets via environment: The database user and password and the S3 access keys must be supplied through `P2PG_*` environment variables (see Configuration).
S3-compatible endpoint: The client expects a MinIO/S3-style endpoint and credentials.
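
Since the tool does not reconcile differing layouts, it can help to verify the same-schema constraint before starting a load. The following Go sketch shows one way such a pre-flight check might look. The `Column` type, `sameSchema`, and `checkSchemas` are hypothetical names introduced for illustration, and the schemas are passed in as plain values; reading them from real Parquet files is left out.

```go
package main

import "fmt"

// Column is a hypothetical, simplified view of one Parquet column.
type Column struct {
	Name string
	Type string
}

// sameSchema reports whether two schemas list identical columns
// (same names and types) in the same order.
func sameSchema(a, b []Column) bool {
	if len(a) != len(b) {
		return false
	}
	for i := range a {
		if a[i] != b[i] {
			return false
		}
	}
	return true
}

// checkSchemas compares every file's schema with that of the first file
// and returns an error naming the first file that differs.
func checkSchemas(names []string, schemas map[string][]Column) error {
	if len(names) == 0 {
		return nil
	}
	ref := schemas[names[0]]
	for _, n := range names[1:] {
		if !sameSchema(ref, schemas[n]) {
			return fmt.Errorf("schema of %s differs from %s", n, names[0])
		}
	}
	return nil
}

func main() {
	schemas := map[string][]Column{
		"a.parquet": {{"id", "INT64"}, {"name", "BYTE_ARRAY"}},
		"b.parquet": {{"id", "INT64"}, {"name", "BYTE_ARRAY"}},
		"c.parquet": {{"id", "INT32"}, {"name", "BYTE_ARRAY"}},
	}
	fmt.Println(checkSchemas([]string{"a.parquet", "b.parquet"}, schemas))
	fmt.Println(checkSchemas([]string{"a.parquet", "c.parquet"}, schemas))
}
```

Using the first file as the reference matches the way the table definition is inferred from the first Parquet file, so a mismatch is reported against the schema the table would actually be created from.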

Command

The main subcommand is `dataloader`:

parquet2postgres-go dataloader [flags]

See Quick start and Examples for concrete invocations.