MySQL Window Functions
1. Window functions
1.1 What is a window function?
1.2 Basic syntax
2. Function details
2.1 Aggregate functions
2.2 Ranking functions
2.3 Offset functions
2.4 Value functions
3. Advanced usage
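1. Window functions
1.1 What is a window function?
MySQL window functions are a powerful tool for running complex statistical analysis inside a query without changing the table structure or its data. MySQL has supported window functions since version 8.0. They are also called analytic functions, because they handle relatively complex reporting and statistics scenarios.
A "window" is a group of rows: the data is partitioned, and each partition is one window. This resembles the grouping that GROUP BY does for aggregate functions, with one key difference:
With GROUP BY, an aggregate function (for example SUM, AVG, MIN, MAX) collapses each group into a single result, so each group returns one row.
A window function computes a value for every row and does not reduce the number of rows returned, so each row returns one result.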
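To make the contrast concrete, here is a minimal sketch. The temporary table demo_scores and its three rows are hypothetical, made up purely for illustration; the queries assume MySQL 8.0 or later.

```sql
-- Hypothetical demo data (not from the article); requires MySQL 8.0+.
DROP TEMPORARY TABLE IF EXISTS demo_scores;
CREATE TEMPORARY TABLE demo_scores (
    name   VARCHAR(20),
    course VARCHAR(20),
    score  INT
);
INSERT INTO demo_scores VALUES
    ('Ann', 'math', 90),
    ('Bob', 'math', 80),
    ('Cid', 'art',  70);

-- GROUP BY collapses each course into one row: (art, 70), (math, 170).
SELECT course, SUM(score) AS total
FROM demo_scores
GROUP BY course
ORDER BY course;

-- The window version keeps all three rows and repeats the course total on each:
-- (Cid, art, 70, 70), (Ann, math, 90, 170), (Bob, math, 80, 170).
SELECT name, course, score,
       SUM(score) OVER (PARTITION BY course) AS course_total
FROM demo_scores
ORDER BY course, name;
```

The first query returns two rows and the second returns three, which is the whole difference in a nutshell.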
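1.2 Basic syntax
-- Inline (anonymous) window
SELECT
    <window function> OVER (PARTITION BY <partition column> ORDER BY <sort column>)
FROM `table_name`;
-- Named window
SELECT
    <window function> OVER w
FROM `table_name`
WINDOW w AS (PARTITION BY <partition column> ORDER BY <sort column>);
Two kinds of function can take the place of <window function>:
Aggregate functions: SUM, AVG, COUNT, MAX, MIN and so on. Used with OVER, they compute an aggregate value for every row without merging rows.
Dedicated window functions:
Ranking functions: RANK, DENSE_RANK, ROW_NUMBER and so on, which assign each row a rank or sequence number within its partition.
Offset functions: LAG and LEAD, which read a value from the row a given offset before or after the current row.
Value functions: FIRST_VALUE and LAST_VALUE return the value from the first or last row of the window frame, and NTH_VALUE returns the value from the Nth row of the frame.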
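Window functions operate on the rows that remain after the WHERE, GROUP BY, and HAVING clauses have been processed, so they may appear only in the SELECT list and the ORDER BY clause. WHERE, GROUP BY, and HAVING cannot reference a window function column, because those clauses run before SELECT, when the function has not yet been evaluated.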
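As a small illustration of the ORDER BY case, the sketch below sorts by a window function without selecting it. The inline demo data is hypothetical and exists only so the query runs on its own in MySQL 8.0 or later.

```sql
-- Hypothetical inline data, so no existing table is needed.
WITH demo (name, course, score) AS (
    SELECT 'Ann', 'math', 90 UNION ALL
    SELECT 'Bob', 'math', 80 UNION ALL
    SELECT 'Cid', 'art',  70 UNION ALL
    SELECT 'Dan', 'art',  60
)
SELECT name, course, score
FROM demo
-- The window function is evaluated after WHERE/GROUP BY/HAVING,
-- so it is available here: best of each course first, then the runners-up.
ORDER BY RANK() OVER (PARTITION BY course ORDER BY score DESC), course;
-- Expected order: Cid, Ann (rank 1 in art and math), then Dan, Bob (rank 2).
```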
2. Function details
The examples below query a table named class and use its course and score columns.
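For readers who want to run the queries in this section, here is one possible class table. Only the table name and the course and score columns come from the queries in this article; the id and name columns and every row are assumptions made up for illustration. The tied math scores are deliberate, so that the tie handling discussed later is visible.

```sql
-- Hypothetical schema and rows; only `class`, course, and score appear in the article.
CREATE TABLE `class` (
    id     INT         NOT NULL AUTO_INCREMENT PRIMARY KEY,
    name   VARCHAR(20) NOT NULL,
    course VARCHAR(20) NOT NULL,
    score  INT         NOT NULL
);

INSERT INTO `class` (name, course, score) VALUES
    ('Ann', 'math', 90),
    ('Bob', 'math', 85),
    ('Cid', 'math', 85),  -- ties with Bob on purpose
    ('Dee', 'math', 70),
    ('Ann', 'art',  95),
    ('Bob', 'art',  88),
    ('Cid', 'art',  60);
```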
2.1 Aggregate functions
A window operation does not collapse groups of query rows into a single output row. Instead, it produces a result for every row:
SELECT
    *,
    -- Grand total over all rows
    SUM(score) OVER () AS sum1,
    -- Total per course
    SUM(score) OVER (PARTITION BY course) AS sum2,
    -- Running total per course, in descending score order
    SUM(score) OVER (PARTITION BY course ORDER BY score DESC) AS sum3
FROM `class`;
SELECT
    *,
    SUM(score) OVER w AS `sum`,
    AVG(score) OVER w AS `avg`,
    MIN(score) OVER w AS `min`,
    MAX(score) OVER w AS `max`,
    COUNT(score) OVER w AS `count`
FROM `class`
WINDOW w AS (PARTITION BY course ORDER BY score DESC);
Note how the running aggregates handle tied scores: rows with the same score in a course receive the same cumulative value (see Section 3, Advanced usage).
2.2 Ranking functions
SELECT
    *,
    ROW_NUMBER() OVER w AS `row_number`,
    RANK() OVER w AS `rank`,
    DENSE_RANK() OVER w AS `dense_rank`
FROM `class`
WINDOW w AS (PARTITION BY course ORDER BY score DESC);
The three differ as follows:
ROW_NUMBER() never repeats a number, even for ties; rows are numbered consecutively: 1, 2, 3, 4.
RANK() gives tied rows the same rank and leaves gaps: 1, 2, 2, 4.
DENSE_RANK() gives tied rows the same rank without gaps: 1, 2, 2, 3.
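The sequences above can be reproduced with a tiny self-contained query. The inline demo rows are hypothetical; Bob and Cid share a score so that the tie handling shows up.

```sql
-- Hypothetical inline data with one tie (Bob and Cid both score 85).
WITH demo (name, score) AS (
    SELECT 'Ann', 90 UNION ALL
    SELECT 'Bob', 85 UNION ALL
    SELECT 'Cid', 85 UNION ALL
    SELECT 'Dee', 70
)
SELECT
    name,
    score,
    ROW_NUMBER() OVER w AS rn,         -- 1, 2, 3, 4
    RANK()       OVER w AS rnk,        -- 1, 2, 2, 4
    DENSE_RANK() OVER w AS dense_rnk   -- 1, 2, 2, 3
FROM demo
WINDOW w AS (ORDER BY score DESC)
ORDER BY rn;
```

Which of the two tied rows gets rn 2 and which gets rn 3 is not guaranteed, because the window's ORDER BY does not distinguish them.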
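Find the top two scores in each course:
SELECT * FROM (
    SELECT
        *,
        RANK() OVER (PARTITION BY course ORDER BY score DESC) AS `rank`
    FROM `class`
) f
WHERE `rank` <= 2;
-- The alias of a window function column cannot be used in the WHERE, GROUP BY,
-- or HAVING clause of the same query, because those clauses run before SELECT,
-- when the function has not yet been evaluated.
-- The following query is therefore invalid:
SELECT
    *,
    RANK() OVER (PARTITION BY course ORDER BY score DESC) AS `rank`
FROM `class`
WHERE `rank` <= 2;
Because RANK() repeats for ties, a course can return more than two rows. If each course needs exactly two rows, replace RANK() with ROW_NUMBER().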
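A minimal sketch of the ROW_NUMBER() variant, written with a common table expression instead of a derived table. The inline demo rows are hypothetical, and using name as a tie-breaker is an assumption added here so that the result is deterministic.

```sql
-- Hypothetical inline data; Bob and Cid tie in math.
WITH demo (name, course, score) AS (
    SELECT 'Ann', 'math', 90 UNION ALL
    SELECT 'Bob', 'math', 85 UNION ALL
    SELECT 'Cid', 'math', 85 UNION ALL
    SELECT 'Eve', 'art',  95 UNION ALL
    SELECT 'Gus', 'art',  88 UNION ALL
    SELECT 'Fay', 'art',  60
),
ranked AS (
    SELECT
        name, course, score,
        -- name breaks ties, so exactly one row gets each number
        ROW_NUMBER() OVER (PARTITION BY course ORDER BY score DESC, name) AS rn
    FROM demo
)
SELECT name, course, score
FROM ranked
WHERE rn <= 2          -- allowed here: rn is an ordinary column of the CTE
ORDER BY course, rn;
-- Returns Eve 95 and Gus 88 for art, then Ann 90 and Bob 85 for math.
```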
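2.3 Offset functions
Syntax: LAG(column, offset, default) and LEAD(column, offset, default). The offset defaults to 1, and default (the fill value used when no such row exists) defaults to NULL.
SELECT
    *,
    -- score from the previous row
    LAG(score) OVER w AS `lag`,
    -- score from two rows ahead; 0 when there is no such row
    LEAD(score, 2, 0) OVER w AS `lead`
FROM `class`
WINDOW w AS (PARTITION BY course ORDER BY score DESC);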
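A common use of LAG is comparing each row with the one before it. The sketch below computes the gap to the next-higher score; the inline demo rows and the column aliases are hypothetical.

```sql
-- Hypothetical inline data, no ties.
WITH demo (name, score) AS (
    SELECT 'Ann', 90 UNION ALL
    SELECT 'Bob', 85 UNION ALL
    SELECT 'Dee', 70
)
SELECT
    name,
    score,
    score - LAG(score) OVER w AS gap_to_prev,  -- NULL, -5, -15
    LEAD(score, 1, 0)  OVER w AS next_score    -- 85, 70, 0
FROM demo
WINDOW w AS (ORDER BY score DESC)
ORDER BY score DESC;
```

The first row has no predecessor, so LAG returns NULL and the subtraction is NULL as well; the last row has no successor, so LEAD falls back to the supplied default of 0.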
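2.4 Value functions
SELECT
    *,
    -- score of the first row in the frame
    FIRST_VALUE(score) OVER w AS `first`,
    -- score of the last row in the frame, which by default ends at the current row
    LAST_VALUE(score) OVER w AS `last`,
    -- score of the 2nd row in the frame; NULL while the frame has fewer than 2 rows
    NTH_VALUE(score, 2) OVER w AS `second`,
    -- score of the 3rd row in the frame; NULL while the frame has fewer than 3 rows
    NTH_VALUE(score, 3) OVER w AS `third`
FROM `class`
WINDOW w AS (PARTITION BY course ORDER BY score DESC);
Note: FIRST_VALUE() behaves as expected and returns the first value, but LAST_VALUE() and NTH_VALUE() may not return what you expect. They are evaluated over the frame, which by default ends at the current row, not over the whole partition. They therefore return the last or Nth value seen so far rather than the last or Nth value of the entire group (see Section 3, Advanced usage).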
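3. Advanced usage
<window function> OVER (
    PARTITION BY <partition column>
    ORDER BY <sort column>
    ROWS | RANGE <frame clause>
)
ROWS/RANGE: the frame clause, which limits the rows or the value range that the function sees within a partition (window).
The frame clause is normally used together with ORDER BY. If ORDER BY is specified without a frame clause, the default frame is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, that is, from the start of the partition through the current row and its peers. Without ORDER BY, the default frame is the whole partition.
The row-comparison functions LEAD and LAG take no frame clause; they always work on the whole partition.
Common frame clause keywords:
CURRENT ROW: the current row
UNBOUNDED: no bound (the start or the end of the partition)
PRECEDING: rows before the current row
FOLLOWING: rows after the current row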
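To show how these keywords combine, here is a minimal sketch of a three-row moving average next to a running total. The inline demo data and its day_no and amount columns are hypothetical.

```sql
-- Hypothetical inline data: one amount per day.
WITH demo (day_no, amount) AS (
    SELECT 1, 10 UNION ALL
    SELECT 2, 20 UNION ALL
    SELECT 3, 30 UNION ALL
    SELECT 4, 40
)
SELECT
    day_no,
    amount,
    -- previous row, current row, next row (fewer at the edges)
    AVG(amount) OVER (ORDER BY day_no
                      ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING) AS moving_avg,
    -- everything from the first row through the current row
    SUM(amount) OVER (ORDER BY day_no
                      ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS running_total
FROM demo
ORDER BY day_no;
-- moving_avg:    15, 20, 30, 35 (returned as DECIMAL values)
-- running_total: 10, 30, 60, 100
```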
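As noted in Section 2.4 (Value functions), LAST_VALUE() and NTH_VALUE() work on the frame. To make them cover the whole window, extend the frame to the entire partition:
SELECT
    *,
    -- score of the first row in the partition
    FIRST_VALUE(score) OVER w AS `first`,
    -- score of the last row in the partition
    LAST_VALUE(score) OVER w AS `last`,
    -- score of the 2nd row in the partition
    NTH_VALUE(score, 2) OVER w AS `second`,
    -- score of the 3rd row in the partition
    NTH_VALUE(score, 3) OVER w AS `third`
FROM `class`
WINDOW w AS (
    PARTITION BY course
    ORDER BY score DESC
    ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
);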
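The effect of the wider frame is easiest to see side by side. This sketch runs LAST_VALUE and NTH_VALUE with the default frame and with the full-partition frame; the inline demo rows and the aliases are hypothetical.

```sql
-- Hypothetical inline data, no ties.
WITH demo (name, score) AS (
    SELECT 'Ann', 90 UNION ALL
    SELECT 'Bob', 85 UNION ALL
    SELECT 'Dee', 70
)
SELECT
    name,
    score,
    -- default frame: ends at the current row
    LAST_VALUE(score)   OVER (ORDER BY score DESC) AS last_default,    -- 90, 85, 70
    NTH_VALUE(score, 2) OVER (ORDER BY score DESC) AS second_default,  -- NULL, 85, 85
    -- full frame: covers the whole partition
    LAST_VALUE(score)   OVER (ORDER BY score DESC
        ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS last_full,    -- 70, 70, 70
    NTH_VALUE(score, 2) OVER (ORDER BY score DESC
        ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) AS second_full   -- 85, 85, 85
FROM demo
ORDER BY score DESC;
```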
Difference between ROWS and RANGE:
ROWS is a physical frame: after sorting by the ORDER BY clause, it is bounded by row position (N rows before and N rows after the current row), regardless of the values in those rows.
RANGE is a logical frame: after sorting by the ORDER BY clause, it is bounded by the ORDER BY value, so every row with the same ORDER BY value as the current row (its peers) is included.
For example, the running totals in Section 2.1 (Aggregate functions) repeat for tied scores because the default frame is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW. Changing RANGE to ROWS accumulates row by row:
SELECT
    *,
    -- default frame (RANGE)
    SUM(score) OVER w AS sum1,
    -- explicit ROWS frame
    SUM(score) OVER (w ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS sum2,
    -- default frame (RANGE)
    COUNT(score) OVER w AS count1,
    -- explicit ROWS frame
    COUNT(score) OVER (w ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS count2
FROM `class`
WINDOW w AS (PARTITION BY course ORDER BY score DESC);
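Here is a worked version with concrete numbers, as a sketch only. The inline demo rows are hypothetical, with one tie at 85 so that the two frame types produce different running totals.

```sql
-- Hypothetical inline data with one tie (Bob and Cid both score 85).
WITH demo (name, score) AS (
    SELECT 'Ann', 90 UNION ALL
    SELECT 'Bob', 85 UNION ALL
    SELECT 'Cid', 85 UNION ALL
    SELECT 'Dee', 70
)
SELECT
    name,
    score,
    -- default RANGE frame: tied rows are peers and share one total
    SUM(score) OVER (ORDER BY score DESC) AS sum_range,          -- 90, 260, 260, 330
    -- ROWS frame: strictly one more row at a time
    SUM(score) OVER (ORDER BY score DESC
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS sum_rows  -- 90, 175, 260, 330
FROM demo
ORDER BY score DESC, sum_rows;
```

With ROWS, which of the two tied rows receives 175 and which receives 260 is not guaranteed, since the ORDER BY inside the window does not distinguish them.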