- Common Table Expressions (CTEs)
- Recursive CTEs
- Window Functions
- JSONB Processing
- Lateral Joins
- Subqueries
- Aggregations
- Indexes & Full-Text Search
- Partitioning
- Foreign Data Wrappers (FDW)
- Custom Functions
- Triggers
- Geospatial Queries (PostGIS)
- Event Triggers
Scenario:
You run an e-commerce platform and want to analyze customer purchases, product metadata (JSONB), warehouse distances (PostGIS), and a foreign table (FDW) for external suppliers.
The Query
- Finds customers with high-value purchases (
> $1000
). - Uses a recursive CTE to compute referral-based purchase impact.
- Extracts JSONB metadata for product attributes.
- Uses full-text search to find trending products.
- Retrieves nearest warehouse using PostGIS.
- Queries external suppliers using FDW.
- Uses window functions for ranking.
- Partitions results for performance.
-- Enable necessary extensions CREATE EXTENSION IF NOT EXISTS postgis; CREATE EXTENSION IF NOT EXISTS postgres_fdw; -- Step 1: Recursive CTE for Referral Purchase Analysis WITH RECURSIVE ReferralTree AS ( SELECT c.customer_id, c.referrer_id, o.order_id, o.total_amount FROM customers c JOIN orders o ON c.customer_id = o.customer_id WHERE o.total_amount > 1000 UNION ALL SELECT c.customer_id, c.referrer_id, rt.order_id, rt.total_amount FROM customers c JOIN ReferralTree rt ON c.referrer_id = rt.customer_id ) -- Step 2: Extract Product Data from JSONB , ProductData AS ( SELECT p.product_id, p.name, p.metadata ->> 'category' AS category, p.metadata -> 'specs' ->> 'weight' AS weight FROM products p ) -- Step 3: Find Popular Products using Full-Text Search , PopularProducts AS ( SELECT p.product_id, p.name, p.description FROM products p WHERE to_tsvector('english', p.description) @@ to_tsquery('best & selling') ) -- Step 4: Find Nearest Warehouse using PostGIS , NearestWarehouse AS ( SELECT w.warehouse_id, w.name AS warehouse_name, c.customer_id, ST_Distance(c.location::geography, w.location::geography) AS distance_km FROM customers c JOIN warehouses w ON ST_DWithin(c.location::geography, w.location::geography, 50000) -- 50km range ) -- Step 5: Query External Suppliers (Foreign Data Wrapper) , ExternalSuppliers AS ( SELECT s.supplier_id, s.product_id, s.price, s.stock FROM external_supplier_db.supplier_data s ) -- Final Query Combining All Data SELECT c.customer_id, c.name AS customer_name, rt.total_amount AS total_spent, np.name AS trending_product, nw.warehouse_name, nw.distance_km, es.supplier_id, es.price AS supplier_price, RANK() OVER (PARTITION BY nw.warehouse_id ORDER BY rt.total_amount DESC) AS purchase_rank FROM ReferralTree rt JOIN customers c ON c.customer_id = rt.customer_id LEFT JOIN PopularProducts np ON np.product_id IN ( SELECT product_id FROM orders WHERE customer_id = c.customer_id ) LEFT JOIN NearestWarehouse nw ON nw.customer_id = c.customer_id LEFT JOIN ExternalSuppliers es ON es.product_id IN ( SELECT product_id FROM orders WHERE customer_id = c.customer_id ) ORDER BY rt.total_amount DESC, nw.distance_km ASC LIMIT 50;
Key Features Used
- Recursive CTE (
ReferralTree
): Traces purchases made by referred customers. - JSONB Handling (
metadata ->> 'category'
): Extracts structured product details. - Full-Text Search (
to_tsvector @@ to_tsquery
): Finds trending products dynamically. - PostGIS (
ST_Distance
,ST_DWithin
): Determines nearest warehouses. - Foreign Data Wrapper (
postgres_fdw
): Queries external supplier data. - Window Functions (
RANK() OVER PARTITION
): Ranks customers based on spending. - Partitioned Queries (
ORDER BY, LIMIT
): Improves performance.
Additional Enhancements
-
Indexing for Speed:
CREATE INDEX idx_orders_total ON orders(total_amount); CREATE INDEX idx_products_tsv ON products USING gin(to_tsvector('english', description));
-
Trigger to Auto-Update Referral Impact:
CREATE FUNCTION update_referral_spend() RETURNS TRIGGER AS $$ BEGIN UPDATE customers SET referral_spend = referral_spend + NEW.total_amount WHERE customer_id = NEW.referrer_id; RETURN NEW; END; $$ LANGUAGE plpgsql; CREATE TRIGGER referral_trigger AFTER INSERT ON orders FOR EACH ROW EXECUTE FUNCTION update_referral_spend();
JSONB Syntax
PostgreSQL provides powerful JSON/JSONB operators for working with JSON data. Let's break them down with examples.
JSONB Operators in PostgreSQL
Operator | Description | Example |
---|---|---|
-> | Extracts a JSON object/array element by key (returns JSON) | jsonb_column->'key' |
->> | Extracts a JSON object/array element as text (returns text) | jsonb_column->>'key' |
#> | Extracts a nested JSON object/array (returns JSON) | jsonb_column#>'{key1, key2}' |
#>> | Extracts a nested JSON object/array as text (returns text) | jsonb_column#>>'{key1, key2}' |
@> | Checks if the left JSONB contains the right JSONB | jsonb_column @> '{"key": "value"}' |
<@ | Checks if the right JSONB contains the left JSONB | '{"key": "value"}' <@ jsonb_column |
? | Checks if a key exists in the JSONB object | jsonb_column ? 'key' |
?(bar) | Checks if any key in an array exists in the JSONB object | jsonb_column ?(bar) array['key1', 'key2'] |
?& | Checks if all keys in an array exist in the JSONB object | jsonb_column ?& array['key1', 'key2'] |
- | Removes a key from a JSONB object | jsonb_column - 'key' |
- | Removes an element by index from a JSONB array | jsonb_column - 1 |
#- | Removes a nested key from JSONB | jsonb_column #- '{key1, key2}' |
Examples Using JSONB... Assume we have a products
table:
CREATE TABLE products ( id SERIAL PRIMARY KEY, name TEXT, metadata JSONB ); INSERT INTO products (name, metadata) VALUES ('Laptop', '{"brand": "Dell", "specs": {"ram": "16GB", "cpu": "Intel i7"}, "tags": ["electronics", "computer"]}'), ('Phone', '{"brand": "Apple", "specs": {"ram": "8GB", "cpu": "A15 Bionic"}, "tags": ["electronics", "mobile"]}');
Extracting a Top-Level Key. (Returns JSON, so it has quotes)
SELECT metadata->'brand' FROM products;
"Dell" "Apple"
Extracting a Key as Text (->>
). (Returns plain text, no quotes)
SELECT metadata->>'brand' FROM products;
Dell Apple
Extracting a Nested JSON Object (#>
). (Returns JSON, still in quotes)
SELECT metadata#>'{specs, ram}' FROM products;
"16GB" "8GB"
Extracting a Nested Value as Text (#>>
). (Returns plain text)
SELECT metadata#>>'{specs, cpu}' FROM products;
Intel i7 A15 Bionic
Checking If JSONB Contains a Value (@>
). Find all products that contain "brand": "Apple"
:
SELECT name FROM products WHERE metadata @> '{"brand": "Apple"}';
Phone
Checking If a Key Exists (?
): (Finds all products that have a "brand"
key in metadata
)
SELECT name FROM products WHERE metadata ? 'brand';
Checking for Multiple Keys (?|
and ?&
). Find products that have either "brand" or "specs":
SELECT name FROM products WHERE metadata ?| array['brand', 'specs'];
Find products that have both "brand" and "tags":
SELECT name FROM products WHERE metadata ?& array['brand', 'tags'];
Removing a Key (-
). Remove "brand"
from JSONB - (Returns JSON without "brand"
key)
SELECT metadata - 'brand' FROM products;
Removing a Nested Key (#-
) - Remove "cpu"
from "specs"
. (Removes the nested key inside specs
)
SELECT metadata #- '{specs, cpu}' FROM products;
Searching in JSONB Arrays: - find products that have "mobile"
in tags
:
SELECT name FROM products WHERE metadata->'tags' @> '["mobile"]';
Summary
->
→ Extracts a key as JSON.->>
→ Extracts a key as text.#>
→ Extracts a nested key as JSON.#>>
→ Extracts a nested key as text.@>
→ Checks if JSONB contains a value.?
→ Checks if a key exists.?|
/?&
→ Checks if any/all keys exist.-
/#-
→ Removes keys.
Text Search
PostgreSQL provides full-text search (FTS) capabilities using to_tsvector
and to_tsquery
. These functions allow you to efficiently search text fields by normalizing words, removing stop words, and providing ranking capabilities.
to_tsvector
converts text into a searchable vector by tokenizing and normalizing words.
SELECT to_tsvector('english', 'PostgreSQL is an amazing open-source database!');
'postgresql':1 'amaz':4 'databas':6 'open-sourc':5
- Words are stemmed (e.g., "amazing" → "amaz", "database" → "databas").
- Common words ("is", "an") are removed (stop words).
- The number represents the word position.
to_tsquery
is used to search in a to_tsvector
.
SELECT to_tsquery('english', 'postgres & database');
'postgres' & 'databas'
- The
&
operator means both words must appear. - Words are automatically stemmed.
Matching to_tsquery
Against to_tsvector
- to perform a search, you use the @@
operator. The text matches because it contains both "postgres" and "database".
SELECT 'PostgreSQL is an amazing database'::tsvector @@ to_tsquery('postgres & database');
true
Indexing for Fast Searches - full-text search works best with an index on to_tsvector
:
CREATE INDEX idx_products_search ON products USING GIN (to_tsvector('english', description));
Then, you can efficiently search:
SELECT * FROM products WHERE to_tsvector('english', description) @@ to_tsquery('laptop & dell');
Operators in to_tsquery
Operator | Description | Example |
---|---|---|
& | AND (both words must appear) | 'laptop & dell' |
(bar) | OR (either word can appear) | 'laptop (bar) dell' |
! | NOT (word must NOT appear) | '!tablet' |
<-> | Proximity search (words must be close together) | 'laptop <-> dell' |
Example: Full-Text Search on a Table - consider this articles
table:
CREATE TABLE articles ( id SERIAL PRIMARY KEY, title TEXT, content TEXT );
Insert data:
INSERT INTO articles (title, content) VALUES ('PostgreSQL FTS', 'PostgreSQL full-text search is powerful.'), ('Database Indexing', 'Indexes speed up queries in databases.');
Perform a search:
SELECT * FROM articles WHERE to_tsvector('english', content) @@ to_tsquery('search & powerful');
id | title | content ---+-----------------+---------------------------------- 1 | PostgreSQL FTS | PostgreSQL full-text search is powerful.
Ranking Search Results (ts_rank
). If multiple results match, you can rank them by relevance:
SELECT title, ts_rank(to_tsvector('english', content), to_tsquery('search & powerful')) AS rank FROM articles ORDER BY rank DESC;
Summary
to_tsvector(text)
→ Converts text into a searchable vector.to_tsquery(query)
→ Searches in ato_tsvector
.@@
→ Matchesto_tsquery
againstto_tsvector
.- Use GIN indexes for fast searches.
- Operators:
&
(AND),|
(OR),!
(NOT),<->
(proximity). - Use
ts_rank()
to rank search results.