Complex PostgreSQL Example

Using all the features that PostgreSQL has

  • Common Table Expressions (CTEs)
  • Recursive CTEs
  • Window Functions
  • JSONB Processing
  • Lateral Joins
  • Subqueries
  • Aggregations
  • Indexes & Full-Text Search
  • Partitioning
  • Foreign Data Wrappers (FDW)
  • Custom Functions
  • Triggers
  • Geospatial Queries (PostGIS)
  • Event Triggers

Scenario:

You run an e-commerce platform and want to analyze customer purchases, product metadata (JSONB), warehouse distances (PostGIS), and a foreign table (FDW) for external suppliers.

The Query

  1. Finds customers with high-value purchases (> $1000).
  2. Uses a recursive CTE to compute referral-based purchase impact.
  3. Extracts JSONB metadata for product attributes.
  4. Uses full-text search to find trending products.
  5. Retrieves nearest warehouse using PostGIS.
  6. Queries external suppliers using FDW.
  7. Uses window functions for ranking.
  8. Partitions results for performance.
-- Enable necessary extensions CREATE EXTENSION IF NOT EXISTS postgis; CREATE EXTENSION IF NOT EXISTS postgres_fdw; -- Step 1: Recursive CTE for Referral Purchase Analysis WITH RECURSIVE ReferralTree AS ( SELECT c.customer_id, c.referrer_id, o.order_id, o.total_amount FROM customers c JOIN orders o ON c.customer_id = o.customer_id WHERE o.total_amount > 1000 UNION ALL SELECT c.customer_id, c.referrer_id, rt.order_id, rt.total_amount FROM customers c JOIN ReferralTree rt ON c.referrer_id = rt.customer_id ) -- Step 2: Extract Product Data from JSONB , ProductData AS ( SELECT p.product_id, p.name, p.metadata ->> 'category' AS category, p.metadata -> 'specs' ->> 'weight' AS weight FROM products p ) -- Step 3: Find Popular Products using Full-Text Search , PopularProducts AS ( SELECT p.product_id, p.name, p.description FROM products p WHERE to_tsvector('english', p.description) @@ to_tsquery('best & selling') ) -- Step 4: Find Nearest Warehouse using PostGIS , NearestWarehouse AS ( SELECT w.warehouse_id, w.name AS warehouse_name, c.customer_id, ST_Distance(c.location::geography, w.location::geography) AS distance_km FROM customers c JOIN warehouses w ON ST_DWithin(c.location::geography, w.location::geography, 50000) -- 50km range ) -- Step 5: Query External Suppliers (Foreign Data Wrapper) , ExternalSuppliers AS ( SELECT s.supplier_id, s.product_id, s.price, s.stock FROM external_supplier_db.supplier_data s ) -- Final Query Combining All Data SELECT c.customer_id, c.name AS customer_name, rt.total_amount AS total_spent, np.name AS trending_product, nw.warehouse_name, nw.distance_km, es.supplier_id, es.price AS supplier_price, RANK() OVER (PARTITION BY nw.warehouse_id ORDER BY rt.total_amount DESC) AS purchase_rank FROM ReferralTree rt JOIN customers c ON c.customer_id = rt.customer_id LEFT JOIN PopularProducts np ON np.product_id IN ( SELECT product_id FROM orders WHERE customer_id = c.customer_id ) LEFT JOIN NearestWarehouse nw ON nw.customer_id = c.customer_id LEFT JOIN ExternalSuppliers es ON es.product_id IN ( SELECT product_id FROM orders WHERE customer_id = c.customer_id ) ORDER BY rt.total_amount DESC, nw.distance_km ASC LIMIT 50;

Key Features Used

  1. Recursive CTE (ReferralTree): Traces purchases made by referred customers.
  2. JSONB Handling (metadata ->> 'category'): Extracts structured product details.
  3. Full-Text Search (to_tsvector @@ to_tsquery): Finds trending products dynamically.
  4. PostGIS (ST_Distance, ST_DWithin): Determines nearest warehouses.
  5. Foreign Data Wrapper (postgres_fdw): Queries external supplier data.
  6. Window Functions (RANK() OVER PARTITION): Ranks customers based on spending.
  7. Partitioned Queries (ORDER BY, LIMIT): Improves performance.

Additional Enhancements

  1. Indexing for Speed:

    CREATE INDEX idx_orders_total ON orders(total_amount); CREATE INDEX idx_products_tsv ON products USING gin(to_tsvector('english', description));
  2. Trigger to Auto-Update Referral Impact:

    CREATE FUNCTION update_referral_spend() RETURNS TRIGGER AS $$ BEGIN UPDATE customers SET referral_spend = referral_spend + NEW.total_amount WHERE customer_id = NEW.referrer_id; RETURN NEW; END; $$ LANGUAGE plpgsql; CREATE TRIGGER referral_trigger AFTER INSERT ON orders FOR EACH ROW EXECUTE FUNCTION update_referral_spend();

JSONB Syntax

PostgreSQL provides powerful JSON/JSONB operators for working with JSON data. Let's break them down with examples.

JSONB Operators in PostgreSQL

OperatorDescriptionExample
->Extracts a JSON object/array element by key (returns JSON)jsonb_column->'key'
->>Extracts a JSON object/array element as text (returns text)jsonb_column->>'key'
#>Extracts a nested JSON object/array (returns JSON)jsonb_column#>'{key1, key2}'
#>>Extracts a nested JSON object/array as text (returns text)jsonb_column#>>'{key1, key2}'
@>Checks if the left JSONB contains the right JSONBjsonb_column @> '{"key": "value"}'
<@Checks if the right JSONB contains the left JSONB'{"key": "value"}' <@ jsonb_column
?Checks if a key exists in the JSONB objectjsonb_column ? 'key'
?(bar)Checks if any key in an array exists in the JSONB objectjsonb_column ?(bar) array['key1', 'key2']
?&Checks if all keys in an array exist in the JSONB objectjsonb_column ?& array['key1', 'key2']
-Removes a key from a JSONB objectjsonb_column - 'key'
-Removes an element by index from a JSONB arrayjsonb_column - 1
#-Removes a nested key from JSONBjsonb_column #- '{key1, key2}'

Examples Using JSONB... Assume we have a products table:

CREATE TABLE products ( id SERIAL PRIMARY KEY, name TEXT, metadata JSONB ); INSERT INTO products (name, metadata) VALUES ('Laptop', '{"brand": "Dell", "specs": {"ram": "16GB", "cpu": "Intel i7"}, "tags": ["electronics", "computer"]}'), ('Phone', '{"brand": "Apple", "specs": {"ram": "8GB", "cpu": "A15 Bionic"}, "tags": ["electronics", "mobile"]}');

Extracting a Top-Level Key. (Returns JSON, so it has quotes)

SELECT metadata->'brand' FROM products;
"Dell" "Apple"

Extracting a Key as Text (->>). (Returns plain text, no quotes)

SELECT metadata->>'brand' FROM products;
Dell Apple

Extracting a Nested JSON Object (#>). (Returns JSON, still in quotes)

SELECT metadata#>'{specs, ram}' FROM products;
"16GB" "8GB"

Extracting a Nested Value as Text (#>>). (Returns plain text)

SELECT metadata#>>'{specs, cpu}' FROM products;
Intel i7 A15 Bionic

Checking If JSONB Contains a Value (@>). Find all products that contain "brand": "Apple":

SELECT name FROM products WHERE metadata @> '{"brand": "Apple"}';
Phone

Checking If a Key Exists (?): (Finds all products that have a "brand" key in metadata)

SELECT name FROM products WHERE metadata ? 'brand';

Checking for Multiple Keys (?| and ?&). Find products that have either "brand" or "specs":

SELECT name FROM products WHERE metadata ?| array['brand', 'specs'];

Find products that have both "brand" and "tags":

SELECT name FROM products WHERE metadata ?& array['brand', 'tags'];

Removing a Key (-). Remove "brand" from JSONB - (Returns JSON without "brand" key)

SELECT metadata - 'brand' FROM products;

Removing a Nested Key (#-) - Remove "cpu" from "specs". (Removes the nested key inside specs)

SELECT metadata #- '{specs, cpu}' FROM products;

Searching in JSONB Arrays: - find products that have "mobile" in tags:

SELECT name FROM products WHERE metadata->'tags' @> '["mobile"]';

Summary

  • -> → Extracts a key as JSON.
  • ->> → Extracts a key as text.
  • #> → Extracts a nested key as JSON.
  • #>> → Extracts a nested key as text.
  • @> → Checks if JSONB contains a value.
  • ? → Checks if a key exists.
  • ?| / ?& → Checks if any/all keys exist.
  • - / #- → Removes keys.

Text Search

PostgreSQL provides full-text search (FTS) capabilities using to_tsvector and to_tsquery. These functions allow you to efficiently search text fields by normalizing words, removing stop words, and providing ranking capabilities.

to_tsvector converts text into a searchable vector by tokenizing and normalizing words.

SELECT to_tsvector('english', 'PostgreSQL is an amazing open-source database!');
'postgresql':1 'amaz':4 'databas':6 'open-sourc':5
  • Words are stemmed (e.g., "amazing" → "amaz", "database" → "databas").
  • Common words ("is", "an") are removed (stop words).
  • The number represents the word position.

to_tsquery is used to search in a to_tsvector.

SELECT to_tsquery('english', 'postgres & database');
'postgres' & 'databas'
  • The & operator means both words must appear.
  • Words are automatically stemmed.

Matching to_tsquery Against to_tsvector - to perform a search, you use the @@ operator. The text matches because it contains both "postgres" and "database".

SELECT 'PostgreSQL is an amazing database'::tsvector @@ to_tsquery('postgres & database');
true

Indexing for Fast Searches - full-text search works best with an index on to_tsvector:

CREATE INDEX idx_products_search ON products USING GIN (to_tsvector('english', description));

Then, you can efficiently search:

SELECT * FROM products WHERE to_tsvector('english', description) @@ to_tsquery('laptop & dell');

Operators in to_tsquery

OperatorDescriptionExample
&AND (both words must appear)'laptop & dell'
(bar)OR (either word can appear)'laptop (bar) dell'
!NOT (word must NOT appear)'!tablet'
<->Proximity search (words must be close together)'laptop <-> dell'

Example: Full-Text Search on a Table - consider this articles table:

CREATE TABLE articles ( id SERIAL PRIMARY KEY, title TEXT, content TEXT );

Insert data:

INSERT INTO articles (title, content) VALUES ('PostgreSQL FTS', 'PostgreSQL full-text search is powerful.'), ('Database Indexing', 'Indexes speed up queries in databases.');

Perform a search:

SELECT * FROM articles WHERE to_tsvector('english', content) @@ to_tsquery('search & powerful');
id | title | content ---+-----------------+---------------------------------- 1 | PostgreSQL FTS | PostgreSQL full-text search is powerful.

Ranking Search Results (ts_rank). If multiple results match, you can rank them by relevance:

SELECT title, ts_rank(to_tsvector('english', content), to_tsquery('search & powerful')) AS rank FROM articles ORDER BY rank DESC;

Summary

  • to_tsvector(text) → Converts text into a searchable vector.
  • to_tsquery(query)Searches in a to_tsvector.
  • @@ → Matches to_tsquery against to_tsvector.
  • Use GIN indexes for fast searches.
  • Operators: & (AND), | (OR), ! (NOT), <-> (proximity).
  • Use ts_rank() to rank search results.
Originally posted:
Filed Under:
data
sql