柚子快報邀請碼778899分享:hive 列轉(zhuǎn)行和行轉(zhuǎn)列
柚子快報邀請碼778899分享:hive 列轉(zhuǎn)行和行轉(zhuǎn)列
數(shù)據(jù)準備
CREATE table student_score(
class_id string comment '班級id',
stu_id string comment '學號',
stu_name string comment '姓名',
courses string comment '各個科目',
scores string comment '各個分數(shù)'
) comment '班級學生成績表';
insert into student_score values
("A", "1001", "張三","語文,數(shù)學,英語,歷史,地理", "88,87,94,86,84"),
("A", "1002", "李四", "語文,數(shù)學,英語,歷史,地理", "78,89,75,79,68"),
("B", "1003", "王五", "語文,數(shù)學,英語,歷史,地理", "98,97,91,93,92"),
("B", "1004", "朱六", "語文,數(shù)學,英語,歷史,地理", "66,63,64,67,68");
CREATE table tmp_pv_uv(
hour string comment 'hour',
uv bigint comment 'uv',
pv bigint comment 'pv',
ip bigint comment 'ip'
) comment 'pu uv數(shù)據(jù)';
insert into tmp_pv_uv values
("00", 100, 200,50),
("01", 101, 201,51),
("02", 102, 202,52),
("03", 103, 203,53)
;
列轉(zhuǎn)行:單列轉(zhuǎn)多行
explode
explode 爆炸的意思,可用于list 和map字段的列轉(zhuǎn)行 explode(array)將array列表里的每個元素生成一行; explode(map)將map里的每一對元素作為一行,其中key為一列,value\
select explode(`array`(11,22,33)) as item;
輸出:
select explode(`map`("id",10086,"name","zhangsan","age",18));
輸出:
select 1, explode(`map`("id",10086,"name","zhangsan","age",18));
報錯: Semantic analysis exception - only a single expression in the SELECT clause is supported with UDTF’s 錯誤原因是: 1.explode函數(shù)屬于UDTF函數(shù),既表生成函數(shù) 2.explode函數(shù)執(zhí)行返回的結(jié)果可以理解為一張?zhí)摂M的表,起數(shù)據(jù)來源于表 3.在select中只查詢源數(shù)據(jù)沒有問題,只查詢explode生成的虛擬表數(shù)據(jù)也沒問題 4.但是不能在只查詢源表的時候,既想返回源表字段又想返回explode生成的虛擬表字段 5.通俗點講,有兩張表,不能只查詢一張表但是返回屬于兩張表的字段 6.從SQL層面上來說應該對兩張表進行關聯(lián)查詢 7.hive專門提供了語法lateral view 側(cè)視圖,專門用于搭配explode這樣的UDTF函數(shù),以滿足上述需要
posexplode
對一列進行炸裂可以使用 explode()函數(shù),但是如果想實現(xiàn)對兩列都進行多行轉(zhuǎn)換, 那么用explode()函數(shù)就不能實現(xiàn)了,可以用posexplode()函數(shù),因為該函數(shù)可以將index和數(shù)據(jù)都取出來, 使用兩次posexplode并令兩次取到的index相等就行了。
select a.stu_id, a.stu_name,
table_view1.course, table_view2.score
from student_score a
lateral view explode(split(courses, ',')) table_view1 as `course`
lateral view explode(split(scores, ',')) table_view2 as `score`
where stu_id='1001'
;
輸出:
select stu_id, stu_name, course, score
from student_score
lateral view posexplode(split(courses, ',')) table_view1 as a, course
lateral view posexplode(split(scores, ',')) table_view2 as b, score
where a = b and stu_id='1001'
;
輸出:
行列互轉(zhuǎn)
with p as(
select hour, key, value
from (
select
hour
,map('uv', uv, 'pv', pv, 'ip', ip) m
from tmp_pv_uv
) LATERAL view explode(m) t1 as key, value
)
select
key
, sum(case when `hour`='00' then value else 0 end) `00`
, sum(case when `hour`='01' then value else 0 end) `01`
, sum(case when `hour`='02' then value else 0 end) `02`
, sum(case when `hour`='03' then value else 0 end) `03`
from p
group by key
;
輸出過程如下:
解析多個json字段 json_tuple
select
json,device,deviceType,signal,stime
from tb_json_test1
lateral view json_tuple(json,"device","deviceType","signal","time") b
as device,deviceType,signal,stime;
另一種寫法是使用GET_JSON_OBJECT()
select
json,GET_JSON_OBJECT(json,'$.device') device
,GET_JSON_OBJECT(json,'$.deviceType') deviceType
,GET_JSON_OBJECT(json,'$.signal') signal
,GET_JSON_OBJECT(json,'$.time') time
from tb_json_test1
lateral view json_tuple(json,"device","deviceType","signal","time") b
as device,deviceType,signal,stime;
行轉(zhuǎn)列:多行轉(zhuǎn)單列
concat(element1,element2,element3……):用于實現(xiàn)字符串拼接,不可指定分隔符 concat_ws(SplitChar,element1,element2……) :實現(xiàn)字符串拼接,可以指定分隔符 collect_list(colName):用于將一列中的多行合并為一行,不進行去重 collect_set(colName):用于將一列中的多行合并為一行,并進行去重
參考文獻
https://zhuanlan.zhihu.com/p/514996706https://blog.csdn.net/qq_36039236/article/details/108446974?spm=1001.2101.3001.6650.8&utm_medium=distribute.pc_relevant.none-task-blog-2%7Edefault%7EBlogCommendFromBaidu%7ERate-8-108446974-blog-124449548.235%5Ev43%5Epc_blog_bottom_relevance_base8&depth_1-utm_source=distribute.pc_relevant.none-task-blog-2%7Edefault%7EBlogCommendFromBaidu%7ERate-8-108446974-blog-124449548.235%5Ev43%5Epc_blog_bottom_relevance_base8&utm_relevant_index=17https://zhuanlan.zhihu.com/p/686089605
柚子快報邀請碼778899分享:hive 列轉(zhuǎn)行和行轉(zhuǎn)列
精彩鏈接
本文內(nèi)容根據(jù)網(wǎng)絡資料整理,出于傳遞更多信息之目的,不代表金鑰匙跨境贊同其觀點和立場。
轉(zhuǎn)載請注明,如有侵權,聯(lián)系刪除。