Complex PostgreSQL Example

Using all the features that PostgreSQL has

January 2, 2025

Common Table Expressions (CTEs)
Recursive CTEs
Window Functions
JSONB Processing
Lateral Joins
Subqueries
Aggregations
Indexes & Full-Text Search
Partitioning
Foreign Data Wrappers (FDW)
Custom Functions
Triggers
Geospatial Queries (PostGIS)
Event Triggers

Scenario:

You run an e-commerce platform and want to analyze customer purchases, product metadata (JSONB), warehouse distances (PostGIS), and a foreign table (FDW) for external suppliers.

The Query

Finds customers with high-value purchases (> $1000).
Uses a recursive CTE to compute referral-based purchase impact.
Extracts JSONB metadata for product attributes.
Uses full-text search to find trending products.
Retrieves nearest warehouse using PostGIS.
Queries external suppliers using FDW.
Uses window functions for ranking.
Partitions results for performance.

-- Enable necessary extensions
CREATE EXTENSION IF NOT EXISTS postgis;
CREATE EXTENSION IF NOT EXISTS postgres_fdw;

-- Step 1: Recursive CTE for Referral Purchase Analysis
WITH RECURSIVE ReferralTree AS (
    SELECT c.customer_id, c.referrer_id, o.order_id, o.total_amount
    FROM customers c
    JOIN orders o ON c.customer_id = o.customer_id
    WHERE o.total_amount > 1000
    UNION ALL
    SELECT c.customer_id, c.referrer_id, rt.order_id, rt.total_amount
    FROM customers c
    JOIN ReferralTree rt ON c.referrer_id = rt.customer_id
)
-- Step 2: Extract Product Data from JSONB
, ProductData AS (
    SELECT
        p.product_id,
        p.name,
        p.metadata ->> 'category' AS category,
        p.metadata -> 'specs' ->> 'weight' AS weight
    FROM products p
)
-- Step 3: Find Popular Products using Full-Text Search
, PopularProducts AS (
    SELECT p.product_id, p.name, p.description
    FROM products p
    WHERE to_tsvector('english', p.description) @@ to_tsquery('best & selling')
)
-- Step 4: Find Nearest Warehouse using PostGIS
, NearestWarehouse AS (
    SELECT
        w.warehouse_id,
        w.name AS warehouse_name,
        c.customer_id,
        ST_Distance(c.location::geography, w.location::geography) AS distance_km
    FROM customers c
    JOIN warehouses w
    ON ST_DWithin(c.location::geography, w.location::geography, 50000) -- 50km range
)
-- Step 5: Query External Suppliers (Foreign Data Wrapper)
, ExternalSuppliers AS (
    SELECT
        s.supplier_id, s.product_id, s.price, s.stock
    FROM external_supplier_db.supplier_data s
)
-- Final Query Combining All Data
SELECT
    c.customer_id,
    c.name AS customer_name,
    rt.total_amount AS total_spent,
    np.name AS trending_product,
    nw.warehouse_name,
    nw.distance_km,
    es.supplier_id,
    es.price AS supplier_price,
    RANK() OVER (PARTITION BY nw.warehouse_id ORDER BY rt.total_amount DESC) AS purchase_rank
FROM ReferralTree rt
JOIN customers c ON c.customer_id = rt.customer_id
LEFT JOIN PopularProducts np ON np.product_id IN (
    SELECT product_id FROM orders WHERE customer_id = c.customer_id
)
LEFT JOIN NearestWarehouse nw ON nw.customer_id = c.customer_id
LEFT JOIN ExternalSuppliers es ON es.product_id IN (
    SELECT product_id FROM orders WHERE customer_id = c.customer_id
)
ORDER BY rt.total_amount DESC, nw.distance_km ASC
LIMIT 50;

Key Features Used

Recursive CTE (ReferralTree): Traces purchases made by referred customers.
JSONB Handling (metadata ->> 'category'): Extracts structured product details.
Full-Text Search (to_tsvector @@ to_tsquery): Finds trending products dynamically.
PostGIS (ST_Distance, ST_DWithin): Determines nearest warehouses.
Foreign Data Wrapper (postgres_fdw): Queries external supplier data.
Window Functions (RANK() OVER PARTITION): Ranks customers based on spending.
Partitioned Queries (ORDER BY, LIMIT): Improves performance.

Additional Enhancements

Indexing for Speed:

CREATE INDEX idx_orders_total ON orders(total_amount);
CREATE INDEX idx_products_tsv ON products USING gin(to_tsvector('english', description));

Trigger to Auto-Update Referral Impact:

CREATE FUNCTION update_referral_spend() RETURNS TRIGGER AS $$
BEGIN
    UPDATE customers
    SET referral_spend = referral_spend + NEW.total_amount
    WHERE customer_id = NEW.referrer_id;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER referral_trigger
AFTER INSERT ON orders
FOR EACH ROW EXECUTE FUNCTION update_referral_spend();

JSONB Syntax

PostgreSQL provides powerful JSON/JSONB operators for working with JSON data. Let's break them down with examples.

JSONB Operators in PostgreSQL

Operator	Description	Example
`->`	Extracts a JSON object/array element by key (returns JSON)	`jsonb_column->'key'`
`->>`	Extracts a JSON object/array element as text (returns text)	`jsonb_column->>'key'`
`#>`	Extracts a nested JSON object/array (returns JSON)	`jsonb_column#>'{key1, key2}'`
`#>>`	Extracts a nested JSON object/array as text (returns text)	`jsonb_column#>>'{key1, key2}'`
`@>`	Checks if the left JSONB contains the right JSONB	`jsonb_column @> '{"key": "value"}'`
`<@`	Checks if the right JSONB contains the left JSONB	`'{"key": "value"}' <@ jsonb_column`
`?`	Checks if a key exists in the JSONB object	`jsonb_column ? 'key'`
`?(bar)`	Checks if any key in an array exists in the JSONB object	`jsonb_column ?(bar) array['key1', 'key2']`
`?&`	Checks if all keys in an array exist in the JSONB object	`jsonb_column ?& array['key1', 'key2']`
`-`	Removes a key from a JSONB object	`jsonb_column - 'key'`
`-`	Removes an element by index from a JSONB array	`jsonb_column - 1`
`#-`	Removes a nested key from JSONB	`jsonb_column #- '{key1, key2}'`

Examples Using JSONB... Assume we have a products table:

CREATE TABLE products (
    id SERIAL PRIMARY KEY,
    name TEXT,
    metadata JSONB
);

INSERT INTO products (name, metadata) VALUES
('Laptop', '{"brand": "Dell", "specs": {"ram": "16GB", "cpu": "Intel i7"}, "tags": ["electronics", "computer"]}'),
('Phone', '{"brand": "Apple", "specs": {"ram": "8GB", "cpu": "A15 Bionic"}, "tags": ["electronics", "mobile"]}');

Extracting a Top-Level Key. (Returns JSON, so it has quotes)

SELECT metadata->'brand' FROM products;

 "Dell"
 "Apple"

Extracting a Key as Text (->>). (Returns plain text, no quotes)

SELECT metadata->>'brand' FROM products;

 Dell
 Apple

Extracting a Nested JSON Object (#>). (Returns JSON, still in quotes)

SELECT metadata#>'{specs, ram}' FROM products;

 "16GB"
 "8GB"

Extracting a Nested Value as Text (#>>). (Returns plain text)

SELECT metadata#>>'{specs, cpu}' FROM products;

 Intel i7
 A15 Bionic

Checking If JSONB Contains a Value (@>). Find all products that contain "brand": "Apple":

SELECT name FROM products WHERE metadata @> '{"brand": "Apple"}';

 Phone

Checking If a Key Exists (?): (Finds all products that have a "brand" key in metadata)

SELECT name FROM products WHERE metadata ? 'brand';

Checking for Multiple Keys (?| and ?&). Find products that have either "brand" or "specs":

SELECT name FROM products WHERE metadata ?| array['brand', 'specs'];

Find products that have both "brand" and "tags":

SELECT name FROM products WHERE metadata ?& array['brand', 'tags'];

Removing a Key (-). Remove "brand" from JSONB - (Returns JSON without "brand" key)

SELECT metadata - 'brand' FROM products;

Removing a Nested Key (#-) - Remove "cpu" from "specs". (Removes the nested key inside specs)

SELECT metadata #- '{specs, cpu}' FROM products;

Searching in JSONB Arrays: - find products that have "mobile" in tags:

SELECT name FROM products WHERE metadata->'tags' @> '["mobile"]';

Summary

-> → Extracts a key as JSON.
->> → Extracts a key as text.
#> → Extracts a nested key as JSON.
#>> → Extracts a nested key as text.
@> → Checks if JSONB contains a value.
? → Checks if a key exists.
?| / ?& → Checks if any/all keys exist.
- / #- → Removes keys.

Text Search

PostgreSQL provides full-text search (FTS) capabilities using to_tsvector and to_tsquery. These functions allow you to efficiently search text fields by normalizing words, removing stop words, and providing ranking capabilities.

to_tsvector converts text into a searchable vector by tokenizing and normalizing words.

SELECT to_tsvector('english', 'PostgreSQL is an amazing open-source database!');

'postgresql':1 'amaz':4 'databas':6 'open-sourc':5

Words are stemmed (e.g., "amazing" → "amaz", "database" → "databas").
Common words ("is", "an") are removed (stop words).
The number represents the word position.

to_tsquery is used to search in a to_tsvector.

SELECT to_tsquery('english', 'postgres & database');

'postgres' & 'databas'

The & operator means both words must appear.
Words are automatically stemmed.

Matching to_tsquery Against to_tsvector - to perform a search, you use the @@ operator. The text matches because it contains both "postgres" and "database".

SELECT 'PostgreSQL is an amazing database'::tsvector @@ to_tsquery('postgres & database');

 true

Indexing for Fast Searches - full-text search works best with an index on to_tsvector:

CREATE INDEX idx_products_search ON products USING GIN (to_tsvector('english', description));

Then, you can efficiently search:

SELECT * FROM products WHERE to_tsvector('english', description) @@ to_tsquery('laptop & dell');

Operators in to_tsquery

Operator	Description	Example
`&`	AND (both words must appear)	`'laptop & dell'`
`(bar)`	OR (either word can appear)	`'laptop (bar) dell'`
`!`	NOT (word must NOT appear)	`'!tablet'`
`<->`	Proximity search (words must be close together)	`'laptop <-> dell'`

Example: Full-Text Search on a Table - consider this articles table:

CREATE TABLE articles (
    id SERIAL PRIMARY KEY,
    title TEXT,
    content TEXT
);

Insert data:

INSERT INTO articles (title, content) VALUES
('PostgreSQL FTS', 'PostgreSQL full-text search is powerful.'),
('Database Indexing', 'Indexes speed up queries in databases.');

Perform a search:

SELECT * FROM articles
WHERE to_tsvector('english', content) @@ to_tsquery('search & powerful');

id | title            | content
---+-----------------+----------------------------------
1  | PostgreSQL FTS  | PostgreSQL full-text search is powerful.

Ranking Search Results (ts_rank). If multiple results match, you can rank them by relevance:

SELECT title, ts_rank(to_tsvector('english', content), to_tsquery('search & powerful')) AS rank
FROM articles
ORDER BY rank DESC;

Summary

to_tsvector(text) → Converts text into a searchable vector.
to_tsquery(query) → Searches in a to_tsvector.
@@ → Matches to_tsquery against to_tsvector.
Use GIN indexes for fast searches.
Operators: & (AND), | (OR), ! (NOT), <-> (proximity).
Use ts_rank() to rank search results.

Originally posted: January 2, 2025

Filed Under:

data

sql