結構化查詢語言/遞迴
有時,一個表的行以這樣一種方式結構化,即它們表示此表內部的層次結構或網路。典型用例是管理結構、物料清單(一臺機器由多臺更小的機器組成,…)或網路結構(例如:航班計劃)。
為了檢索特定行以及與它們相關的全部行,可以使用集合運算結合子查詢將它們合併到一個結果集中。但這種技術是有限的,因為必須精確地知道級別數。除了級別數因情況而異之外,子查詢語法在不同級別之間也不同。為了克服這些限制,SQL 提供了一種語法以遞迴方式表達查詢。它們檢索所有受影響級別的行,而與它們的級別數無關。
SQL 標準使用其WITH 子句的一種特殊形式(在上一頁中進行了解釋)來定義遞迴查詢。該子句出現在 SELECT、INSERT、UPDATE 或 DELETE 關鍵字之前,並且是相應命令的一部分。
提示:WITH 子句(帶或不帶“RECURSIVE”關鍵字)通常被稱為“公用表表達式 (CTE)”。
提示:Oracle 從 11.2 版本開始支援 SQL 標準的語法。MySQL 8.0 支援 RECURSIVE 關鍵字。早期的 MySQL 版本根本不支援遞迴,並建議使用過程式解決方法。
-- The command starts with a 'with clause', which contains the optional 'RECURSIVE' key word.
WITH [RECURSIVE] intermediate_table (temp_column_name [,...]) AS
(SELECT ... FROM real_table -- initial query to a real table (1)
UNION ALL (3)
SELECT ... FROM intermediate_table -- repetitive query using the intermediate table (2)
)
-- The 'with clause' is part of a regular SELECT.
-- This SELECT refers to the final result of the 'with clause'. (4)
SELECT ... FROM intermediate_table
; -- consider the semicolon: the command runs from the 'WITH' up to here.
評估順序如下
- 對真實表或檢視的初始查詢將被執行,併為步驟 2 建立起點。
- 通常,重複查詢包含對真實表或檢視與迄今為止構建的結果集的連線。重複此步驟,直到沒有找到新行。
- 將步驟 1 和 2 中的結果集合並在一起。
- 最終的 SELECT 對步驟 3 的結果進行操作。
為了演示遞迴查詢,我們定義了一個示例表。它儲存有關人員及其祖先的資訊。由於祖先始終是人,因此所有內容都儲存在同一張表中。father_id 和 mother_id 充當對儲存父親和母親資訊的行的引用。father_id、mother_id 和 firstname 的組合充當標準,根據這三個值唯一標識行(我們假設父母給孩子起了不同的名字)。
CREATE TABLE family_tree (
-- define columns (name / type / default value / nullable)
id DECIMAL NOT NULL,
firstname VARCHAR(50) NOT NULL,
lastname VARCHAR(50) NOT NULL,
year_of_birth DECIMAL NOT NULL,
year_of_death DECIMAL,
father_id DECIMAL,
mother_id DECIMAL,
-- the primary key
CONSTRAINT family_tree_pk PRIMARY KEY (id),
-- an additional criterion to uniquely distinguish rows from each other
CONSTRAINT family_tree_uniq UNIQUE (father_id, mother_id, firstname),
-- two foreign keys (to the same table in this special case) to ensure that no broken links arise
CONSTRAINT family_tree_fk1 FOREIGN KEY (father_id) REFERENCES family_tree(id),
CONSTRAINT family_tree_fk2 FOREIGN KEY (mother_id) REFERENCES family_tree(id),
-- plausibility checks
CONSTRAINT family_tree_check1 CHECK ( year_of_birth >= 1800 AND year_of_birth < 2100),
CONSTRAINT family_tree_check2 CHECK ((year_of_death >= 1800 AND year_of_death < 2100) OR year_of_death IS NULL)
);
-- a fictional couple
INSERT INTO family_tree VALUES ( 1, 'Karl', 'Miller', 1855, 1905, null, null);
INSERT INTO family_tree VALUES ( 2, 'Lisa', 'Miller', 1851, 1912, null, null);
-- their children
INSERT INTO family_tree VALUES ( 3, 'Ruth', 'Miller', 1878, 1888, 1, 2);
INSERT INTO family_tree VALUES ( 4, 'Helen', 'Miller', 1880, 1884, 1, 2);
INSERT INTO family_tree VALUES ( 5, 'Carl', 'Miller', 1882, 1935, 1, 2);
INSERT INTO family_tree VALUES ( 6, 'John', 'Miller', 1883, 1900, 1, 2);
-- some more people; some of them are descendants of the Millers
INSERT INTO family_tree VALUES ( 7, 'Emily', 'Newton', 1880, 1940, null, null);
INSERT INTO family_tree VALUES ( 8, 'Charly', 'Miller', 1908, 1978, 5, 7);
INSERT INTO family_tree VALUES ( 9, 'Deborah','Brown', 1910, 1980, null, null);
INSERT INTO family_tree VALUES (10, 'Chess', 'Miller', 1948, null, 8, 9);
COMMIT;
作為第一個例子,我們檢索卡爾·米勒先生及其所有後代。為此,我們必須檢索他自己的行並定義一條規則,說明如何在此家譜中從一級到一級“導航”。
-- Choose a name for the intermediate table and its columns. The column names may differ from the names in the real table.
WITH intermediate_table (id, firstname, lastname) AS
(
-- Retrieve the starting row (or rows)
SELECT id, firstname, lastname
FROM family_tree
WHERE firstname = 'Karl'
AND lastname = 'Miller'
UNION ALL
-- Define the rule for querying the next level. In most cases this is done with a join operation.
SELECT f.id, f.firstname, f.lastname -- the alias 'f' refers to the real table
FROM intermediate_table i -- the alias 'i' refers to the intermediate table
JOIN family_tree f ON f.father_id = i.id OR f.mother_id = i.id -- the join operation defines, how to reach the next level
)
-- The final SELECT
SELECT * FROM intermediate_table;
可以使用 SQL 的所有語言功能來進一步處理中間表。(它不是一個真正的表,它只是一個具有表結構的中間結果)。例如,要計算後代的數量。
WITH ... -- The 'with clause' as above
-- The final SELECT
SELECT count(*) FROM intermediate_table
;
為了演示在沒有遞迴 SELECT 的情況下出現的問題,我們將顯示一個使用子查詢的語法。
-- This query retrieves only Mr. Karl Miller ...
SELECT *
FROM family_tree
WHERE firstname = 'Karl'
AND lastname = 'Miller'
UNION ALL
-- ... and his children
SELECT *
FROM family_tree
WHERE father_id IN (SELECT id
FROM family_tree
WHERE firstname = 'Karl'
AND lastname = 'Miller'
)
;
每一級都有自己的語法,例如,要檢索孫子輩,我們需要在子查詢中使用子查詢。
作為第二個例子,我們反向遍歷層次結構:從一個人到他們的父系(男性血統)祖先。與第一個例子相比,兩件事發生了變化。查詢的起點不再是卡爾·米勒先生,因為他在我們的示例表中沒有祖先。我們必須透過交換 id 和 father_id 來更改連線條件。
-- Retrieve ancestors
WITH intermediate_table (id, father_id, firstname, lastname) AS
(
-- Retrieve the starting row (or rows)
SELECT id, father_id, firstname, lastname -- now we need the 'father_id'
FROM family_tree
WHERE firstname = 'Chess'
AND lastname = 'Miller'
UNION ALL
-- Define the rule for querying the next level.
SELECT f.id, f.father_id, f.firstname, f.lastname
FROM intermediate_table i
JOIN family_tree f ON f.id = i.father_id -- compared with the first example this join operation defines the opposite direction
)
-- The final SELECT
SELECT * FROM intermediate_table
;
有時我們需要知道行屬於層次結構或網路中的哪個級別。為了顯示此級別,我們在查詢中包含一個帶任意名稱的偽列。我們選擇名稱 hier_level(因為 level 是儲存點上下文中保留的詞)。
-- We extend the above example to show the hierarchy level
WITH intermediate_table (id, firstname, lastname, hier_level) AS
( SELECT id, firstname, lastname, 0 as hier_level -- set the level of the start point to a fix number
FROM family_tree
WHERE firstname = 'Karl'
AND lastname = 'Miller'
UNION ALL
SELECT f.id, f.firstname, f.lastname, i.hier_level + 1 -- increment the level
FROM intermediate_table i
JOIN family_tree f ON f.father_id = i.id OR f.mother_id = i.id
)
SELECT * FROM intermediate_table;
現在可以使用該級別,並且可以將其用作附加條件,例如,限制為前兩個級別。
-- The with clause remains unchanged
...
SELECT * FROM intermediate_table WHERE hier_level < 2; -- restrict the result to the first two levels
-- or, as with the above solution the intermediate result set is computed over ALL levels and later restricted to the first two:
WITH intermediate_table (id, firstname, lastname, hier_level) AS
( SELECT id, firstname, lastname, 0 as hier_level
FROM family_tree
WHERE firstname = 'Karl'
AND lastname = 'Miller'
UNION ALL
SELECT f.id, f.firstname, f.lastname, i.hier_level + 1
FROM intermediate_table i
JOIN family_tree f ON f.father_id = i.id OR f.mother_id = i.id
WHERE hier_level < 1 -- restrict the join to the expected result
)
SELECT * FROM intermediate_table;
有時我們想從層次結構或網路的起點構建到實際行的路徑,例如,對於多面分類(如 1.5.3)或簡單地對訪問的節點進行編號。這可以透過與計算級別類似的方式實現。我們需要一個帶任意名稱的偽列,並將實際值附加到已經形成的值。
-- Save the path from person to person in an additional column. We choose the name 'hier_path' as its name.
WITH intermediate_table (id, firstname, lastname, hier_level, hier_path) AS
( SELECT id, firstname, lastname, 0 as hier_level, firstname as hier_path -- we collect the given names
FROM family_tree
WHERE firstname = 'Karl'
AND lastname = 'Miller'
UNION ALL
-- The SQL standard knows only a two-parameter function concat(). We us it twice.
SELECT f.id, f.firstname, f.lastname, i.hier_level + 1, concat (concat (i.hier_path, ' / ' ), f.firstname)
FROM intermediate_table i
JOIN family_tree f ON f.father_id = i.id OR f.mother_id = i.id
)
SELECT * FROM intermediate_table;
有兩種遍歷層次結構和網路的方式。必須決定要首先處理哪種節點:子節點(下一級的節點)或兄弟節點(同一級的節點)。這兩種方法分別稱為深度優先 和 廣度優先。使用關鍵字DEPTH FIRST 和 BREADTH FIRST(預設值)可以在兩種變體之間進行選擇。
<with_clause>
SEARCH [DEPTH FIRST|BREADTH FIRST] BY <column_name> SET <sequence_number>
<select_clause>
這些關鍵字出現在WITH 子句和SELECT 子句之間。由於與 JAVA 或 C++ 等程式語言中的樹不同,或者與 XML 例項不同,表的行沒有隱式順序,因此必須為節點定義其級別內的順序。這在BY 關鍵字後面完成。在SET 關鍵字之後,定義一個額外的偽列的名稱,其中自動儲存對所有行的編號。
WITH intermediate_table (id, firstname, lastname, hier_level) AS
( SELECT id, firstname, lastname, 0 AS hier_level
FROM family_tree
WHERE firstname = 'Karl'
AND lastname = 'Miller'
UNION ALL
SELECT f.id, f.firstname, f.lastname, i.hier_level + 1
FROM intermediate_table i
JOIN family_tree f ON f.father_id = i.id OR f.mother_id = i.id
)
-- SEARCH BREADTH FIRST BY firstname SET sequence_number
SEARCH DEPTH FIRST BY firstname SET sequence_number
SELECT * FROM intermediate_table;
對上述查詢有一些值得注意的說明
- 與本頁上的其他查詢(我們隱式使用了預設的
BREADTH FIRST)相反,家譜以這樣一種方式遍歷:在每行之後,都處理其“子”行。這在級別 1 處很重要。 - 如果每級有多行,則根據
BY定義對這些行進行排序:在本例中為 firstname。 - 這些行有一個序列號:在本例中為 sequence_number。可以使用此號碼進行任何額外的處理。
檢索切斯·米勒及其所有女性祖先。
WITH intermediate_table (id, mother_id, firstname, lastname) AS
(
SELECT id, mother_id, firstname, lastname
FROM family_tree
WHERE firstname = 'Chess'
AND lastname = 'Miller'
UNION ALL
SELECT f.id, f.mother_id, f.firstname, f.lastname
FROM intermediate_table i
JOIN family_tree f ON f.id = i.mother_id
)
SELECT * FROM intermediate_table;
檢索切斯·米勒及其所有祖先:男性和女性。
WITH intermediate_table (id, father_id, mother_id, firstname, lastname) AS
(
SELECT id, father_id, mother_id, firstname, lastname
FROM family_tree
WHERE firstname = 'Chess'
AND lastname = 'Miller'
UNION ALL
SELECT f.id, f.father_id, f.mother_id, f.firstname, f.lastname
FROM intermediate_table i
-- extend the JOIN condition!
JOIN family_tree f ON (f.id = i.mother_id OR f.id = i.father_id)
)
SELECT * FROM intermediate_table;
為了使情況更清晰,在前面的查詢中新增一個數字,顯示實際級別。
WITH intermediate_table (id, father_id, mother_id, firstname, lastname, hier_level) AS
(
SELECT id, father_id, mother_id, firstname, lastname, 0 -- we start with '0'
FROM family_tree
WHERE firstname = 'Chess'
AND lastname = 'Miller'
UNION ALL
SELECT f.id, f.father_id, f.mother_id, f.firstname, f.lastname, i.hier_level + 1
FROM intermediate_table i
JOIN family_tree f ON (f.id = i.mother_id OR f.id = i.father_id)
)
SELECT * FROM intermediate_table;
為了使情況完全透明,用某種路徑(孩子/父母/祖父母/...)替換級別。
WITH intermediate_table (id, father_id, mother_id, firstname, lastname, ancestry) AS
(
SELECT id, father_id, mother_id, firstname, lastname, firstname
FROM family_tree
WHERE firstname = 'Chess'
AND lastname = 'Miller'
UNION ALL
SELECT f.id, f.father_id, f.mother_id, f.firstname, f.lastname, concat (concat (i.ancestry, ' / '), f.firstname)
FROM intermediate_table i
JOIN family_tree f ON (f.id = i.mother_id OR f.id = i.father_id)
)
SELECT * FROM intermediate_table;
檢索卡爾·米勒的所有孫輩。
WITH intermediate_table (id, father_id, mother_id, firstname, lastname, hier_level) AS
(
SELECT id, father_id, mother_id, firstname, lastname, 0 -- we start with '0'
FROM family_tree
WHERE firstname = 'Karl'
AND lastname = 'Miller'
UNION ALL
SELECT f.id, f.father_id, f.mother_id, f.firstname, f.lastname, i.hier_level + 1
FROM intermediate_table i
JOIN family_tree f ON (f.father_id = i.id AND hier_level < 2) -- performance: abort joining after the second level
)
SELECT * FROM intermediate_table WHERE hier_level = 2; -- this is the restriction to the grandchildren
檢索 family_tree 表中的每個人,並顯示其姓氏及其男性血統中第一個已知祖先的姓氏。
WITH intermediate_table (id, father_id, firstname, lastname, initial_row, hier_level) AS
( -- The starting points are persons (more than one in our example table) for which no father is known.
SELECT id, father_id, firstname, lastname, firstname, 0
FROM family_tree
WHERE father_id IS NULL
UNION ALL
-- The start name is preserved from level to level
SELECT f.id, f.father_id, f.firstname, f.lastname, i.initial_row, i.hier_level + 1
FROM intermediate_table i
JOIN family_tree f ON f.father_id = i.id
)
SELECT * FROM intermediate_table;
-- or:
... unchanged 'with clause'
SELECT id, firstname, '-->', initial_row, 'in ', hier_level, 'generation(s)' FROM intermediate_table;
a) 示例表中儲存了多少卡爾·米勒的後代?
b) 與之前的問題相同,但按級別區分。
-- a)
WITH intermediate_table (id, firstname, lastname, hier_level) AS
( SELECT id, firstname, lastname, 0 AS hier_level
FROM family_tree
WHERE firstname = 'Karl'
AND lastname = 'Miller'
UNION ALL
SELECT f.id, f.firstname, f.lastname, i.hier_level + 1
FROM intermediate_table i
JOIN family_tree f ON f.father_id = i.id
)
SELECT count(*) FROM intermediate_table where hier_level > 0;
-- b) Use the same WITH clause. Only the final SELECT changes.
...
SELECT hier_level, count(hier_level) FROM intermediate_table WHERE hier_level > 0 GROUP BY hier_level;