跳轉到內容

結構化查詢語言/遞迴

來自華夏公益教科書,開放的書籍,開放的世界

有時,一個表的行以這樣一種方式結構化,即它們表示此表內部的層次結構或網路。典型用例是管理結構、物料清單(一臺機器由多臺更小的機器組成,…)或網路結構(例如:航班計劃)。

為了檢索特定行以及與它們相關的全部行,可以使用集合運算結合子查詢將它們合併到一個結果集中。但這種技術是有限的,因為必須精確地知道級別數。除了級別數因情況而異之外,子查詢語法在不同級別之間也不同。為了克服這些限制,SQL 提供了一種語法以遞迴方式表達查詢。它們檢索所有受影響級別的行,而與它們的級別數無關。

SQL 標準使用其WITH 子句的一種特殊形式(在上一頁中進行了解釋)來定義遞迴查詢。該子句出現在 SELECT、INSERT、UPDATE 或 DELETE 關鍵字之前,並且是相應命令的一部分。

提示:WITH 子句(帶或不帶“RECURSIVE”關鍵字)通常被稱為“公用表表達式 (CTE)”。

提示:Oracle 從 11.2 版本開始支援 SQL 標準的語法。MySQL 8.0 支援 RECURSIVE 關鍵字。早期的 MySQL 版本根本不支援遞迴,並建議使用過程式解決方法。

-- The command starts with a 'with clause', which contains the optional 'RECURSIVE' key word.
WITH [RECURSIVE] intermediate_table (temp_column_name [,...]) AS 
  (SELECT ... FROM real_table          -- initial query to a real table                          (1)
     UNION ALL                                                                                   (3)
   SELECT ... FROM intermediate_table  -- repetitive query using the intermediate table          (2)
  )
-- The 'with clause' is part of a regular SELECT.
-- This SELECT refers to the final result of the 'with clause'.                                  (4)
SELECT ... FROM intermediate_table
; -- consider the semicolon: the command runs from the 'WITH' up to here.

評估順序如下

  1. 對真實表或檢視的初始查詢將被執行,併為步驟 2 建立起點。
  2. 通常,重複查詢包含對真實表或檢視與迄今為止構建的結果集的連線。重複此步驟,直到沒有找到新行。
  3. 將步驟 1 和 2 中的結果集合並在一起。
  4. 最終的 SELECT 對步驟 3 的結果進行操作。

示例表

[編輯 | 編輯原始碼]

為了演示遞迴查詢,我們定義了一個示例表。它儲存有關人員及其祖先的資訊。由於祖先始終是人,因此所有內容都儲存在同一張表中。father_idmother_id 充當對儲存父親和母親資訊的行的引用。father_idmother_idfirstname 的組合充當標準,根據這三個值唯一標識行(我們假設父母給孩子起了不同的名字)。

CREATE TABLE family_tree (
  -- define columns (name / type / default value / nullable)
  id             DECIMAL      NOT NULL,
  firstname      VARCHAR(50)  NOT NULL,
  lastname       VARCHAR(50)  NOT NULL,
  year_of_birth  DECIMAL      NOT NULL,
  year_of_death  DECIMAL,
  father_id      DECIMAL,
  mother_id      DECIMAL,
  -- the primary key
  CONSTRAINT family_tree_pk   PRIMARY KEY (id),
  -- an additional criterion to uniquely distinguish rows from each other
  CONSTRAINT family_tree_uniq UNIQUE (father_id, mother_id, firstname),
  -- two foreign keys (to the same table in this special case) to ensure that no broken links arise
  CONSTRAINT family_tree_fk1  FOREIGN KEY (father_id) REFERENCES family_tree(id),
  CONSTRAINT family_tree_fk2  FOREIGN KEY (mother_id) REFERENCES family_tree(id),
  -- plausibility checks
  CONSTRAINT family_tree_check1 CHECK ( year_of_birth >= 1800 AND year_of_birth < 2100),
  CONSTRAINT family_tree_check2 CHECK ((year_of_death >= 1800 AND year_of_death < 2100) OR year_of_death IS NULL)
);

-- a fictional couple
INSERT INTO family_tree VALUES ( 1, 'Karl',   'Miller', 1855, 1905, null, null);
INSERT INTO family_tree VALUES ( 2, 'Lisa',   'Miller', 1851, 1912, null, null);
-- their children
INSERT INTO family_tree VALUES ( 3, 'Ruth',   'Miller', 1878, 1888, 1,    2);
INSERT INTO family_tree VALUES ( 4, 'Helen',  'Miller', 1880, 1884, 1,    2);
INSERT INTO family_tree VALUES ( 5, 'Carl',   'Miller', 1882, 1935, 1,    2);
INSERT INTO family_tree VALUES ( 6, 'John',   'Miller', 1883, 1900, 1,    2);
-- some more people; some of them are descendants of the Millers
INSERT INTO family_tree VALUES ( 7, 'Emily',  'Newton', 1880, 1940, null, null);
INSERT INTO family_tree VALUES ( 8, 'Charly', 'Miller', 1908, 1978, 5,    7);
INSERT INTO family_tree VALUES ( 9, 'Deborah','Brown',  1910, 1980, null, null);
INSERT INTO family_tree VALUES (10, 'Chess',  'Miller', 1948, null, 8,    9);
COMMIT;

基本查詢

[編輯 | 編輯原始碼]

作為第一個例子,我們檢索卡爾·米勒先生及其所有後代。為此,我們必須檢索他自己的行並定義一條規則,說明如何在此家譜中從一級到一級“導航”。

-- Choose a name for the intermediate table and its columns. The column names may differ from the names in the real table.
WITH intermediate_table (id, firstname, lastname) AS
(
  -- Retrieve the starting row (or rows)
  SELECT id, firstname, lastname
  FROM   family_tree
  WHERE  firstname = 'Karl'
  AND    lastname  = 'Miller'
    UNION ALL
  -- Define the rule for querying the next level. In most cases this is done with a join operation.
  SELECT f.id, f.firstname, f.lastname        -- the alias 'f' refers to the real table
  FROM   intermediate_table i                 -- the alias 'i' refers to the intermediate table
  JOIN   family_tree f ON f.father_id = i.id OR f.mother_id = i.id  -- the join operation defines, how to reach the next level
)
-- The final SELECT
SELECT * FROM intermediate_table;


可以使用 SQL 的所有語言功能來進一步處理中間表。(它不是一個真正的表,它只是一個具有表結構的中間結果)。例如,要計算後代的數量。

WITH ...  -- The 'with clause' as above
-- The final SELECT
SELECT count(*) FROM intermediate_table
;

為了演示在沒有遞迴 SELECT 的情況下出現的問題,我們將顯示一個使用子查詢的語法。

-- This query retrieves only Mr. Karl Miller ...
SELECT * 
FROM   family_tree
WHERE  firstname = 'Karl'
AND    lastname  = 'Miller'
  UNION ALL
-- ... and his children
SELECT * 
FROM   family_tree
WHERE  father_id IN (SELECT id
                     FROM   family_tree
                     WHERE  firstname = 'Karl'
                     AND    lastname  = 'Miller'
                    )
;

每一級都有自己的語法,例如,要檢索孫子輩,我們需要在子查詢中使用子查詢。

作為第二個例子,我們反向遍歷層次結構:從一個人到他們的父系(男性血統)祖先。與第一個例子相比,兩件事發生了變化。查詢的起點不再是卡爾·米勒先生,因為他在我們的示例表中沒有祖先。我們必須透過交換 id 和 father_id 來更改連線條件。

-- Retrieve ancestors
WITH intermediate_table (id, father_id, firstname, lastname) AS
(
  -- Retrieve the starting row (or rows)
  SELECT id, father_id, firstname, lastname  -- now we need the 'father_id'
  FROM   family_tree
  WHERE  firstname = 'Chess'
  AND    lastname  = 'Miller'
    UNION ALL
  -- Define the rule for querying the next level.
  SELECT f.id, f.father_id, f.firstname, f.lastname
  FROM   intermediate_table i
  JOIN   family_tree f ON f.id = i.father_id  -- compared with the first example this join operation defines the opposite direction
)
-- The final SELECT
SELECT * FROM intermediate_table
;

注意級別

[編輯 | 編輯原始碼]

有時我們需要知道行屬於層次結構或網路中的哪個級別。為了顯示此級別,我們在查詢中包含一個帶任意名稱的偽列。我們選擇名稱 hier_level(因為 level 是儲存點上下文中保留的詞)。

-- We extend the above example to show the hierarchy level
WITH intermediate_table (id, firstname, lastname, hier_level) AS
( SELECT id, firstname, lastname, 0 as hier_level         -- set the level of the start point to a fix number
  FROM   family_tree
  WHERE  firstname = 'Karl'
  AND    lastname  = 'Miller'
    UNION ALL
  SELECT f.id, f.firstname, f.lastname, i.hier_level + 1  -- increment the level
  FROM   intermediate_table i
  JOIN   family_tree f ON f.father_id = i.id OR f.mother_id = i.id
)
SELECT * FROM intermediate_table;

現在可以使用該級別,並且可以將其用作附加條件,例如,限制為前兩個級別。

-- The with clause remains unchanged
...
SELECT * FROM intermediate_table WHERE hier_level < 2; -- restrict the result to the first two levels

-- or, as with the above solution the intermediate result set is computed over ALL levels and later restricted to the first two:
WITH intermediate_table (id, firstname, lastname, hier_level) AS
( SELECT id, firstname, lastname, 0 as hier_level
  FROM   family_tree
  WHERE  firstname = 'Karl'
  AND    lastname  = 'Miller'
    UNION ALL
  SELECT f.id, f.firstname, f.lastname, i.hier_level + 1
  FROM   intermediate_table i
  JOIN   family_tree f ON f.father_id = i.id OR f.mother_id = i.id
  WHERE  hier_level < 1   -- restrict the join to the expected result
)
SELECT * FROM intermediate_table;

建立路徑

[編輯 | 編輯原始碼]

有時我們想從層次結構或網路的起點構建到實際行的路徑,例如,對於多面分類(如 1.5.3)或簡單地對訪問的節點進行編號。這可以透過與計算級別類似的方式實現。我們需要一個帶任意名稱的偽列,並將實際值附加到已經形成的值。

-- Save the path from person to person in an additional column. We choose the name 'hier_path' as its name. 
WITH intermediate_table (id, firstname, lastname, hier_level, hier_path) AS
( SELECT id, firstname, lastname, 0 as hier_level, firstname as hier_path -- we collect the given names
  FROM   family_tree
  WHERE  firstname = 'Karl'
  AND    lastname  = 'Miller'
    UNION ALL
  -- The SQL standard knows only a two-parameter function concat(). We us it twice.
  SELECT f.id, f.firstname, f.lastname, i.hier_level + 1, concat (concat (i.hier_path, ' / ' ), f.firstname)
  FROM   intermediate_table i
  JOIN   family_tree f ON f.father_id = i.id OR f.mother_id = i.id
)
SELECT * FROM intermediate_table;

深度優先/廣度優先

[編輯 | 編輯原始碼]

有兩種遍歷層次結構和網路的方式。必須決定要首先處理哪種節點:子節點(下一級的節點)或兄弟節點(同一級的節點)。這兩種方法分別稱為深度優先廣度優先。使用關鍵字DEPTH FIRSTBREADTH FIRST(預設值)可以在兩種變體之間進行選擇。

<with_clause>
SEARCH [DEPTH FIRST|BREADTH FIRST] BY <column_name> SET <sequence_number>
<select_clause>

這些關鍵字出現在WITH 子句SELECT 子句之間。由於與 JAVA 或 C++ 等程式語言中的樹不同,或者與 XML 例項不同,表的行沒有隱式順序,因此必須為節點定義其級別內的順序。這在BY 關鍵字後面完成。在SET 關鍵字之後,定義一個額外的偽列的名稱,其中自動儲存對所有行的編號。

WITH intermediate_table (id, firstname, lastname, hier_level) AS
( SELECT id, firstname, lastname, 0 AS hier_level
  FROM   family_tree
  WHERE  firstname = 'Karl'
  AND    lastname  = 'Miller'
    UNION ALL
  SELECT f.id, f.firstname, f.lastname, i.hier_level + 1
  FROM   intermediate_table i
  JOIN   family_tree f ON f.father_id = i.id OR f.mother_id = i.id
)
-- SEARCH BREADTH FIRST BY firstname SET sequence_number
SEARCH DEPTH FIRST BY firstname SET sequence_number
SELECT * FROM intermediate_table;

對上述查詢有一些值得注意的說明

  1. 與本頁上的其他查詢(我們隱式使用了預設的BREADTH FIRST)相反,家譜以這樣一種方式遍歷:在每行之後,都處理其“子”行。這在級別 1 處很重要。
  2. 如果每級有多行,則根據BY 定義對這些行進行排序:在本例中為 firstname
  3. 這些行有一個序列號:在本例中為 sequence_number。可以使用此號碼進行任何額外的處理。

檢索切斯·米勒及其所有女性祖先。

點選檢視解決方案
WITH intermediate_table (id, mother_id, firstname, lastname) AS
(
  SELECT id, mother_id, firstname, lastname
  FROM   family_tree
  WHERE  firstname = 'Chess'
  AND    lastname  = 'Miller'
    UNION ALL
  SELECT f.id, f.mother_id, f.firstname, f.lastname
  FROM   intermediate_table i
  JOIN   family_tree f ON f.id = i.mother_id
)
SELECT * FROM intermediate_table;

檢索切斯·米勒及其所有祖先:男性和女性。

點選檢視解決方案
WITH intermediate_table (id, father_id, mother_id, firstname, lastname) AS
(
  SELECT id, father_id, mother_id, firstname, lastname
  FROM   family_tree
  WHERE  firstname = 'Chess'
  AND    lastname  = 'Miller'
    UNION ALL
  SELECT f.id, f.father_id, f.mother_id, f.firstname, f.lastname
  FROM   intermediate_table i
  -- extend the JOIN condition!
  JOIN   family_tree f ON (f.id = i.mother_id OR f.id = i.father_id)
)
SELECT * FROM intermediate_table;

為了使情況更清晰,在前面的查詢中新增一個數字,顯示實際級別。

點選檢視解決方案
WITH intermediate_table (id, father_id, mother_id, firstname, lastname, hier_level) AS
(
  SELECT id, father_id, mother_id, firstname, lastname, 0  -- we start with '0'
  FROM   family_tree
  WHERE  firstname = 'Chess'
  AND    lastname  = 'Miller'
    UNION ALL
  SELECT f.id, f.father_id, f.mother_id, f.firstname, f.lastname, i.hier_level + 1
  FROM   intermediate_table i
  JOIN   family_tree f ON (f.id = i.mother_id OR f.id = i.father_id)
)
SELECT * FROM intermediate_table;

為了使情況完全透明,用某種路徑(孩子/父母/祖父母/...)替換級別。

點選檢視解決方案
WITH intermediate_table (id, father_id, mother_id, firstname, lastname, ancestry) AS
(
  SELECT id, father_id, mother_id, firstname, lastname, firstname
  FROM   family_tree
  WHERE  firstname = 'Chess'
  AND    lastname  = 'Miller'
    UNION ALL
  SELECT f.id, f.father_id, f.mother_id, f.firstname, f.lastname, concat (concat (i.ancestry, ' / '), f.firstname)
  FROM   intermediate_table i
  JOIN   family_tree f ON (f.id = i.mother_id OR f.id = i.father_id)
)
SELECT * FROM intermediate_table;

檢索卡爾·米勒的所有孫輩。

點選檢視解決方案
WITH intermediate_table (id, father_id, mother_id, firstname, lastname, hier_level) AS
(
  SELECT id, father_id, mother_id, firstname, lastname, 0   -- we start with '0'
  FROM   family_tree
  WHERE  firstname = 'Karl'
  AND    lastname  = 'Miller'
    UNION ALL
  SELECT f.id, f.father_id, f.mother_id, f.firstname, f.lastname, i.hier_level + 1
  FROM   intermediate_table i
  JOIN   family_tree f ON (f.father_id = i.id AND hier_level < 2) -- performance: abort joining after the second level
)
SELECT * FROM intermediate_table WHERE hier_level = 2; -- this is the restriction to the grandchildren

檢索 family_tree 表中的每個人,並顯示其姓氏及其男性血統中第一個已知祖先的姓氏。

點選檢視解決方案
WITH intermediate_table (id, father_id, firstname, lastname, initial_row, hier_level) AS
( -- The starting points are persons (more than one in our example table) for which no father is known.
  SELECT id, father_id, firstname, lastname, firstname, 0
  FROM   family_tree
  WHERE  father_id IS NULL
    UNION ALL
  -- The start name is preserved from level to level
  SELECT f.id, f.father_id, f.firstname, f.lastname, i.initial_row, i.hier_level + 1
  FROM   intermediate_table i
  JOIN   family_tree f ON f.father_id = i.id
)
SELECT * FROM intermediate_table;
-- or:
... unchanged 'with clause'
SELECT id, firstname, '-->', initial_row, 'in ', hier_level, 'generation(s)' FROM intermediate_table;

a) 示例表中儲存了多少卡爾·米勒的後代?
b) 與之前的問題相同,但按級別區分。

點選檢視解決方案
-- a)
WITH intermediate_table (id, firstname, lastname, hier_level) AS
( SELECT id, firstname, lastname, 0 AS hier_level
  FROM   family_tree
  WHERE  firstname = 'Karl'
  AND    lastname  = 'Miller'
    UNION ALL
  SELECT f.id, f.firstname, f.lastname, i.hier_level + 1
  FROM   intermediate_table i
  JOIN   family_tree f ON f.father_id = i.id
)
SELECT count(*) FROM intermediate_table where hier_level > 0;
-- b) Use the same WITH clause. Only the final SELECT changes.
...
SELECT hier_level, count(hier_level) FROM intermediate_table WHERE hier_level > 0 GROUP BY hier_level;


華夏公益教科書