柚子快報激活碼778899分享:sql 窗口函數(shù)-詳細講解分析
柚子快報激活碼778899分享:sql 窗口函數(shù)-詳細講解分析
目錄
窗口函數(shù)
1.窗口函數(shù)概述
2.窗口函數(shù)的語法
1.窗口函數(shù)語法解釋-Function(arg1,..., argn)
1.聚合函數(shù)
2.排序函數(shù)
3.跨行函數(shù)
2.窗口函數(shù)語法解釋-OVER [PARTITION BY <...>]?
3.窗口函數(shù)語法解釋-?[ORDER BY <....>]?
4.窗口函數(shù)語法解釋-[window_expression]
窗口函數(shù)練習
建庫建表語句:
題目:
答案:
窗口函數(shù)
1.窗口函數(shù)概述
????????窗口函數(shù)是一種SQL函數(shù),非常適合于數(shù)據(jù)分析,其最大的特點就是:輸入值是從SELECT語句的結(jié)果集中的一行或者多行的"窗口"中獲取的,也可以理解為窗口有大有?。ㄐ袛?shù)有多有少)。
? ? ? ? 通過OVER子句,窗口函數(shù)與其他的SQL函數(shù)有所區(qū)別,如果函數(shù)具有OVER子句,則它是窗口函數(shù)。如果它缺少了OVER子句,則他就是個普通的聚合函數(shù)。
? ? ? ? 窗口函數(shù)可以簡單地解釋為類似于聚合函數(shù)的計算函數(shù),但是通過GROUP BY子句組合的常規(guī)聚合會隱去正在聚合的各個行,最終輸出稱為一行。但是窗口函數(shù)聚合完之后還可以訪問當前行的其他數(shù)據(jù),并且可以將這些行的某些屬性添加到結(jié)果當中去。
下面可以通過兩個圖來區(qū)分普通的聚合函數(shù)和窗口函數(shù)
?
為了更加直觀的反映窗口函數(shù)和普通聚合函數(shù)的區(qū)別,讓我們通過代碼的形式感受一下?
?首先讓我們先添加測試數(shù)據(jù),并查看表。
CREATE DATABASE IF NOT EXISTS EmployeeDB;
USE EmployeeDB;
CREATE TABLE Employees (
EmployeeID INT AUTO_INCREMENT PRIMARY KEY,
Name VARCHAR(100),
DepartmentID INT,
Salary DECIMAL(10, 2)
);
INSERT INTO Employees (Name, DepartmentID, Salary) VALUES
('Alice', 1, 50000),
('Bob', 1, 55000),
('Charlie', 2, 60000),
('David', 1, 50000),
('Eve', 2, 65000),
('Frank', 3, 45000),
('Grace', 3, 47000),
('Hannah', 3, 48000),
('Ian', 2, 70000),
('Jack', 1, 52000);
-------------------------------------sum+group by常規(guī)的聚合函數(shù)操作--------------------------------------
select DepartmentID, sum(salary) as total
from employees
group by DepartmentID;
我們可以看的出來,常規(guī)聚合函數(shù)把id進行分組然后把每組的薪資綜合計算出來放在最后面。
--------------------------------------------sum+窗口函數(shù)的聚合操作-------------------------------------------
select *,
sum(Salary) over (partition by DepartmentID) total
from employees
?我們可以通過這兩個例子看出來,聚合函數(shù)和窗口聚合函數(shù)的區(qū)別。就是窗口函數(shù)會進行分組,但不會把行進行合并。對于每一組窗口函數(shù)返回出來的結(jié)果都會重復的放在最后面。
?2.窗口函數(shù)的語法
Function(arg1,..., argn) OVER ([PARTITION BY <...>] [ORDER BY <....>] [
對于以上的窗口函數(shù)的語法[? ?]中的語法是可以根據(jù)自己的需求進行選擇(非必須寫入語法),并且此語法嚴格按照上面的順序來規(guī)定。
Function(arg1,..., argn)是表示函數(shù)的分類,可以是下面分類中的任何一組。
----------聚合函數(shù),例如sum,min,avg,count等函數(shù)(常用)
----------排序函數(shù),例如rank row_number? dense_rank()等函數(shù)(常用)
----------跨行函數(shù),lag? lead 函數(shù)
OVER [PARTITION BY <...>] 類似于group by 用于指定分組
--每個分組你可以把它叫做窗口
--不分組的情況可以寫成partition by null 或者直接不寫partition by,所有列為一個大組
--分組的情況下,partition by 后面可以跟多個列,例如partition by cid,cname
[ORDER BY <....>] ?用于指定每個分組內(nèi)的數(shù)據(jù)排序規(guī)則 支持ASC、DESC
[
1.窗口函數(shù)語法解釋-Function(arg1,..., argn)
通常和partition by分組使用。當然也可以不分組使用,但也不分組使用通常沒有意義。
----------聚合函數(shù),例如sum,min,avg,count等函數(shù)(常用)
----------排序函數(shù),例如rank row_number? dense_rank()等函數(shù)(常用)
----------跨行函數(shù),lag? lead 函數(shù)
?1.聚合函數(shù)
我們還通過上文的測試數(shù)據(jù)進行演示。我們就演示2個函數(shù),其他的聚合類函數(shù)都是相同的用法。
sum函數(shù):求和
select *,
sum(Salary) over (partition by DepartmentID) total
from employees
min函數(shù) :最小值
select *,
min(Salary) over (partition by DepartmentID) total
from employees;
其他的聚合函數(shù)都是同樣的用法。
2.排序函數(shù)
rank? ?row_number? ?dense_rank()等函數(shù),通常與order by函數(shù)一起使用。
row_number()函數(shù):對分組之后按照某些規(guī)則從高到低或者從低到高進行排序(order by),然后打上序號,不考慮并列的情況。
select *,
row_number() over (partition by DepartmentID order by Salary desc ) total
from employees;
rank()函數(shù):對分組之后按照某些規(guī)則從高到低或者從低到高進行排序(order by),然后打上序號,考慮并列情況并且跳躍排名,對此我們需要增添一組數(shù)據(jù)。
INSERT INTO Employees (Name, DepartmentID, Salary) VALUES('css',1,45000);
select *,
rank() over (partition by DepartmentID order by Salary desc ) total
from employees;
dense_rank()函數(shù):
select *,
dense_rank() over (partition by DepartmentID order by Salary desc ) total
from employees;
3.跨行函數(shù)
LAG(col,n,DEFAULT)?用于統(tǒng)計窗口內(nèi)往上第n行值
第一個參數(shù)為列名,第二個參數(shù)為往上第n行(可選,默認為1),第三個參數(shù)為默認值(當往上第n行為NULL時候,取默認值,如不指定,則為NULL);
select *,
lag(Salary,1) over (partition by DepartmentID order by Salary desc ) total
from employees;
LEAD(col,n,DEFAULT)?用于統(tǒng)計窗口內(nèi)往下第n行值
第一個參數(shù)為列名,第二個參數(shù)為往下第n行(可選,默認為1),第三個參數(shù)為默認值(當往下第n行為NULL時候,取默認值,如不指定,則為NULL);
select *,
lead(Salary,1) over (partition by DepartmentID order by Salary desc ) total
from employees;
FIRST_VALUE?取分組內(nèi)排序后,截止到當前行,第一個值;
select *,
first_value(Salary) over (partition by DepartmentID order by Salary desc ) total
from employees;
LAST_VALUE?取分組內(nèi)排序后,截止到當前行,最后一個值;
select *,
last_value(Salary) over (partition by DepartmentID order by Salary desc ) total
from employees;
從這個數(shù)據(jù)我們有個疑問,為啥不是去分組內(nèi)的最后一個值呢?
在這里我給大家解釋一下,對于我們分的窗口(比如部門id=1)里面還有個小窗口row函數(shù)
對于我們沒有指定小窗口默認是當之前所有行到當前行,這樣理解可以很抽象,我們舉個例子。對于部門id=1來說,我們從第一行來看(心里默念從之前所有行到當前行)從之前所有行到當前行來看確實輸出的值應(yīng)該是55000.00,那么我們看第二行(心里默念從之前所有行到當前行)那么確實輸出的是52000.00。這樣我們通過row函數(shù)來改變一下小窗口的范圍。更清晰的感受一下這個函數(shù)。
select *,
last_value(Salary) over (partition by DepartmentID order by Salary desc
rows between unbounded preceding and unbounded following ) total
from employees;
解釋一下設(shè)置小窗口的含義:rows between unbounded preceding and unbounded following
之前所有的行到之后所有的行,那么讓我們輸出一下。
我們可以很清晰的看出來,輸出的是每一組里面最后一個的薪資。
?2.窗口函數(shù)語法解釋-OVER [PARTITION BY <...>]?
over是窗口函數(shù)的標志,partition by 用來指定分組,把partition by 后面跟的字段相同的放在一起
3.窗口函數(shù)語法解釋-?[ORDER BY <....>]?
用于指定每個分組內(nèi)的數(shù)據(jù)排序規(guī)則 支持ASC、DESC, 跟group by 中的order by是一樣的用法
4.窗口函數(shù)語法解釋-[window_expression]
用于指定每個窗口中 操作的數(shù)據(jù)范圍 默認是窗口中所有行
窗口子句操作的數(shù)據(jù)范圍:1)起始行:N preceding/unbounded preceding 2)當前行:currentrow 3)終止行:N following/unbounded following 舉例:rows between unbounded preceding and current row 從之前所有的行到當前行
rows between 2 preceding and current row 從前面兩行到當前行
rows between current row and unbounded following 從當前行到之后所有的行
rows between current row and 1following 從當前行到后面一行
注意: 排序子句后面缺少窗口子句,窗口規(guī)范默認是rows between unbounded preceding and current row
排序子句和窗口子句都缺失,窗口規(guī)范默認是 rows between unbounded preceding and unbounded following
總體流程 1)通過partition by和 order by 子句確定大窗口(定義出上界unbounded preceding和下界unbounded following)
2)通過row 子句針對每一行數(shù)據(jù)確定小窗口(滑動窗口) 3)對每行的小窗口內(nèi)的數(shù)據(jù)執(zhí)行函數(shù)并生成新的列
窗口函數(shù)練習
建庫建表語句:
CREATE DATABASE IF NOT EXISTS sales_db;
USE sales_db;
CREATE TABLE IF NOT EXISTS sales (
id INT AUTO_INCREMENT PRIMARY KEY,
product_id INT comment '商品id',
sale_date DATE comment '銷售日期',
amount DECIMAL(10, 2)comment '銷售額',
region VARCHAR(50) comment '地區(qū)'
)comment '銷售';
-- 插入一些示例數(shù)據(jù)
INSERT INTO sales (product_id, sale_date, amount, region) VALUES
(1, '2023-01-01', 100.00, 'East'),
(2, '2023-01-01', 150.00, 'West'),
(1, '2023-01-02', 200.00, 'East'),
(3, '2023-01-02', 120.00, 'South'),
(2, '2023-01-03', 180.00, 'West'),
(1, '2023-01-03', 220.00, 'East'),
(3, '2023-01-04', 140.00, 'South'),
(4, '2023-01-04', 300.00, 'North'),
(2, '2023-01-05', 250.00, 'West'),
(1, '2023-01-05', 280.00, 'East');
insert into sales(product_id, sale_date, amount, region) values
(1,'2023-01-01',200.00,'East');
題目:
1.計算每日銷售額總和(分區(qū)按日期)
2.計算每個區(qū)域的總銷售額
3.為每個產(chǎn)品計算其銷售排名(按銷售額降序)
4.計算每個區(qū)域每天的平均銷售額
5.計算每個產(chǎn)品的銷售累計總額
6.計算每個區(qū)域每個產(chǎn)品的銷售總額
7.計算每個區(qū)域最近7天的平均銷售額
8.為每個產(chǎn)品的銷售記錄添加序列號(按日期排序)
9.計算每個區(qū)域每天相對于前一天的銷售額變化
10.計算每個產(chǎn)品的銷售占比(銷售額/總銷售額)
自己可以先把這些最基礎(chǔ)的窗口函數(shù)做完之后,再看下面的解析
?答案:
# 計算每日銷售額總和(分區(qū)按日期)
SELECT *,sale_date, SUM(amount) OVER (PARTITION BY sale_date order by sale_date) AS total_daily_sales
FROM sales;
# 計算每個區(qū)域的總銷售額
SELECT region, SUM(amount) OVER (PARTITION BY region) AS total_regional_sales
FROM sales;
# 為每個產(chǎn)品計算其銷售排名(按銷售額降序)
select product_id,rank() over (order by sum(amount) desc )as '銷售排名' from sales group by product_id ;
# 計算每個區(qū)域每天的平均銷售額
select *,avg(amount)over (partition by region,sale_date rows between unbounded preceding and unbounded following) from sales;
# 計算每個產(chǎn)品的銷售累計總額
select *,sum(amount)over (partition by product_id) from sales;
# 計算每個區(qū)域每個產(chǎn)品的銷售總額
select *,sum(amount)over (partition by product_id,region)from sales;
# 計算每個區(qū)域最近7天的平均銷售額
with t1 as ( select *,dense_rank() over(partition by region order by sale_date)as ttime from sales )
select *,avg(amount)over(partition by region) from t1 where ttime<7;
;
# 為每個產(chǎn)品的銷售記錄添加序列號(按日期排序)
select *,dense_rank() over (partition by product_id order by sale_date)from sales;
# 計算每個區(qū)域每天相對于前一天的銷售額變化
SELECT
a.sale_date,
a.region,
a.amount,
a.amount - LAG(a.amount) OVER (PARTITION BY a.region ORDER BY a.sale_date) AS daily_change
FROM sales a;
# 計算每個產(chǎn)品的銷售占比(按總銷售額
select *,sales.amount/sum(amount)over (partition by product_id)from sales;
窗口函數(shù)很重要,希望大家可以好好琢磨琢磨,也希望發(fā)布的文章可以幫到大家。
柚子快報激活碼778899分享:sql 窗口函數(shù)-詳細講解分析
推薦閱讀
本文內(nèi)容根據(jù)網(wǎng)絡(luò)資料整理,出于傳遞更多信息之目的,不代表金鑰匙跨境贊同其觀點和立場。
轉(zhuǎn)載請注明,如有侵權(quán),聯(lián)系刪除。