Note for following along

Before we get to macros, there is an exercise from the previous post to discuss. Let’s look at the final league table positions for season 1996-1997:

SELECT
  ccode,
  club_name,
  league_position
FROM vw_ltables v
JOIN clubs c ON v.ccode = c.club_code
WHERE season = '1996_1997'
  AND league_position > 17
ORDER BY 3;

Produces the following:

ccode club_name league_position
COV Coventry City 18
SUN Sunderland 19
NFO Nottingham Forest 20

However, if we compare this to the OUTER JOIN query on season 1997-1997 to season 1998-1999, we can see that Coventry City was not relegated, instead, it was Middlesbrough (club code MID):

WITH s1 AS(
  SELECT ccode
  FROM vw_ltables
WHERE season = '1996_1997'
),
s2 AS(
  SELECT ccode
  FROM vw_ltables
  WHERE season = '1997_1998'
)
SELECT
  COALESCE(s1.ccode, 'Promoted') season1,
  COALESCE(s2.ccode, 'Relegated') season2
FROM s1
FULL OUTER JOIN s2 ON s1.ccode = s2.ccode
WHERE s1.ccode IS NULL OR s2.ccode IS NULL;

Produces:

season1 season2
SUN Relegated
NFO Relegated
MID Relegated
Promoted BOL
Promoted BAR
Promoted CRY

At first I thought the discrepancy was a data error but the Wikipedia page for the season states that:

“Middlesbrough finished in 19th place, but would have been placed 14th without a three-point deduction imposed for unilaterally postponing a 21 December 1996 fixture at Blackburn Rovers, with the Middlesbrough board making the decision due to the absence of 23 players ill or injured.”.

I have not changed the source for this points deduction and left Middlesbrough with their 42 points but this highlights a general point about data: there are always exceptions to business rules due special cases like this. Therefore discrepancies cannot always be attributed to data errors.

Macros

In the exercise above we have different queries to inform us of club relegations and promotions between seasons. However, we have to hard-code the season text values for the seasons we are comparing. Hard-coding values that have to be changed frequently should be avoided if possible. Thankfully, DuckDB introduced macros which provide a type of user-defined functionality to DuckDB. Macros in DuckDB are similar to SQL user-defined functions in PostgreSQL, that is when you set LANGUAGE SQL in the function definition. Where DuckDB’s macros are especially useful is in defining what in Microsoft Access is called a parameter query. We could create a view for relegations and promotions for each season but that would be extremely repetitive and a maintenance headache; this is where macros come to the rescue so let’s see how they work and how we can use them.

From the documentation:

“The CREATE MACRO statement can create a scalar or table macro (function) in the catalog. A macro may only be a single SELECT statement (similar to a VIEW), but it has the benefit of accepting parameters.”

We are going to create a TABLE MACRO for our promotions/relegations and we will use the outer join version for this. We will create the macro in a schema we will call macros, I am using a PostgreSQL practice here where I create functions and procedures in their own schemas. I have enhanced the query to produce the more readable club names rather than the slightly cryptic club codes by using inner joins in both the CTEs.

CREATE SCHEMA macros;
USE macros;
CREATE OR REPLACE MACRO releg_promo(p_season1, p_season2) AS TABLE
WITH s1 AS(
  SELECT c.club_name
  FROM main.vw_ltables v
  JOIN main.clubs c
    ON v.ccode = c.club_code
WHERE season = p_season1
),
s2 AS(
  SELECT c.club_name
  FROM main.vw_ltables v
  JOIN main.clubs c
    ON v.ccode = c.club_code
  WHERE season = p_season2
)
SELECT
  COALESCE(s1.club_name, 'Promoted') season1,
  COALESCE(s2.club_name, 'Relegated') season2
FROM s1
FULL OUTER JOIN s2 ON s1.club_name = s2.club_name
WHERE s1.club_name IS NULL OR s2.club_name IS NULL;

Code notes

  • I use the p_ prefix for parameters; another PostgreSQL habit that helps me distinguish parameters from, say, column names.
  • Note, unlike in PostgreSQL functions, the parameters are not qualified with a data type. An equivalent SQL function in PostgreSQL would define them as “p_season1 TEXT, p_season2 TEXT”
  • We can now call our macro like so:
USE main;
SELECT * FROM macros.releg_promo('1996_1997', '1997_1998');

And we get the following output:

season1 season2
Middlesbrough Relegated
Nottingham Forest Relegated
Sunderland Relegated
Promoted Crystal Palace
Promoted Barnsley
Promoted Bolton Wanderers
  • We now have convenient mechanism for generating the same reports for any seasons we choose.

Wrap-up

Macros are a great feature of DuckDB and they really help cut down on repetive code (always a good thing). It will be interesting to see if DuckDB goes on to implement a full stored procedure language like PL/pgSQL in PostgreSQL. That said, the macros are really nice to have.

Next up

Generating running totals using window functions.


<
Previous Post
Promotions and relegations - Sets and Joins 🦆⚽
>
Next Post
Running totals with a window function 🦆⚽