#2 - Z Chart Analysis

์‹œ์ž‘ํ•˜๊ธฐ ์ „์—

๊ทœ๋ชจ์žˆ๋Š” ์ฟผ๋ฆฌ๋ฅผ ์ž‘์„ฑํ•˜๊ธฐ ์œ„ํ•ด์„œ๋Š” ์ผ๊ด€๋œ ์ฝ”๋“œ ๊ตฌ์กฐ์™€ ์Šคํƒ€์ผ์˜ ์œ ์ง€๊ฐ€ ์ค‘์š”ํ•˜๋‹ค. ์ด ๊ธ€์˜ ์˜ˆ์‹œ๋Š” ๋‹ค์Œ BigQuery ์Šคํƒ€์ผ ๊ฐ€์ด๋“œ๋ฅผ ์ฐธ๊ณ ํ•˜๊ณ  ์žˆ๋‹ค.

BI ๋ถ„์„ ํ•ธ์ฆˆ์˜จ

๊ธฐ์—…์˜ ๋งค์ถœ๊ณผ ๊ฐ™์ด ์‹œ๊ฐ„์˜ ๊ฒฝ๊ณผ์— ๋”ฐ๋ผ ๋ณ€ํ™”ํ•˜๋Š” ๋ฐ์ดํ„ฐ๋กœ๋ถ€ํ„ฐ ๊ณผ๊ฑฐ ์‹ค์ ์ด๋‚˜ ์ถ”์ด ๋“ฑ์„ ํŒŒ์•…ํ•˜๊ณ  ๋ฏธ๋ž˜์˜ ์‹ค์ ์„ ์˜ˆ์ธกํ•˜๊ธฐ ์œ„ํ•œ ๋ชฉ์ ์œผ๋กœ ์‹œ๊ณ„์—ด ๋ถ„์„์ด ํ™œ์šฉ๋œ๋‹ค.

์ด๋ฒˆ ํ•ธ์ฆˆ์˜จ์€ ์‹œ๊ณ„์—ด ๋ถ„์„ ๊ธฐ๋ฒ• ์ค‘์˜ ํ•˜๋‚˜์ธ Z-Chart ๋ถ„์„์„ BigQuery ์–ธ์–ด๋กœ ์ง„ํ–‰ํ•˜๋ฉฐ SQL ํ™œ์šฉ ๋Šฅ๋ ฅ์„ ํ‚ค์šฐ๋Š” ๊ฒƒ์„ ๋ชฉํ‘œ๋กœ ํ•œ๋‹ค.

์•„๋ž˜์˜ ๋‹จ๊ณ„๋ณ„๋กœ Z-Chart ๋ถ„์„์— ์š”๊ตฌ๋˜๋Š” ์ง€ํ‘œ๋“ค์„ ์‚ฐ์ถœํ•˜๊ณ  ์‹œ๊ฐํ™”ํ•˜๋Š” ๊ณผ์ •์„ ๋”ฐ๋ผ๊ฐ€ ๋ณด์ž.

raw data --> data mart --> visualization

๋ฐ์ดํ„ฐ ์ค€๋น„

ํ•ธ์ฆˆ์˜จ์— ์‚ฌ์šฉํ•  ๋งค์ถœ ๋ฐ์ดํ„ฐ๋Š” ์•„๋ž˜ ์ฐธ๊ณ ๋„์„œ์—์„œ ๊ณต๊ฐœํ•œ ๋ถ€๋ก/์˜ˆ์ œ์†Œ์Šค๋ฅผ ์ฐธ๊ณ ํ•˜์—ฌ BigQuery ํ™˜๊ฒฝ์—์„œ ์‹ค์Šต ๊ฐ€๋Šฅํ•˜๋„๋ก ์ •๋ฆฌํ•˜์˜€๋‹ค.

[์ฐธ๊ณ ๋„์„œ] ๋ฐ์ดํ„ฐ ๋ถ„์„์„ ์œ„ํ•œ SQL ๋ ˆ์‹œํ”ผ - 4์žฅ

๋งค์ถœ ๋ฐ์ดํ„ฐ ๋ถ„์„

1. ์˜ˆ์‹œ ํ…Œ์ด๋ธ” ์ƒ์„ฑ

์šฐ์„  ์˜ˆ์ œ ์†Œ์Šค ์ค‘ 9-1-data.sql ์„ ์ตœ์†Œํ•œ์˜ ๋ณ€๊ฒฝ์œผ๋กœ BigQuery ํ…Œ์ด๋ธ”๋กœ ์ƒ์„ฑํ•ด ๋ณด๋„๋ก ํ•˜์ž.

CREATE OR REPLACE TABLE learning_club.purchase_log ( 
    dt              STRING,
    order_id        INT64,
    user_id         STRING,
    purchase_amount INT64,
);
INSERT INTO learning_club.purchase_log VALUES 
  ('2014-01-01',  1, 'rhwpvvitou', 13900),
  ('2014-01-01',  2, 'hqnwoamzic', 10616),
  ('2014-01-02',  3, 'tzlmqryunr', 21156),
  ('2014-01-10', 26, 'cyxfgumkst', 11339)
-- ...

๋ฌด๋ฃŒ๋กœ ์‚ฌ์šฉ์ค‘์ธ GCP Project free tier ์—์„œ ์œ„์˜ ์ฟผ๋ฆฌ๋Š” INSERT ๋ฌธ์„ ์‚ฌ์šฉํ•˜์ง€ ๋ชปํ•œ๋‹ค๋Š” ์—๋Ÿฌ๋ฅผ ๋Œ๋ ค์ค€๋‹ค.

๋ฐฐ์›€์˜ ๊ธธ์€ ๋ฉ€๊ณ ๋„ ํ—˜ํ•˜๋‹ค. ์–ด์ฉ” ์ˆ˜ ์—†๋‹ค. ์•ฝ๊ฐ„์˜ ๊ณ ๊ธ‰์ง„ ๊ธฐ๋Šฅ์„ ์‚ฌ์šฉํ•˜์—ฌ INSERT๊ฐ™์€ DML ๋ฌธ์„ ์‚ฌ์šฉํ•˜์ง€ ์•Š๊ณ  ์˜ˆ์‹œ ํ…Œ์ด๋ธ”์„ ๋งŒ๋“ค์–ด ๋ณด์ž.

BigQuery Scripting์€ ๋ฒ”์šฉ ํ”„๋กœ๊ทธ๋ž˜๋ฐ ์–ธ์–ด์ฒ˜๋Ÿผ ๋ฌธ์žฅ๋‚ด์—์„œ ๋ณ€์ˆ˜๋ฅผ ์„ ์–ธํ•˜๊ณ  ์‚ฌ์šฉํ•  ์ˆ˜ ์žˆ๋„๋ก ํ•ด์ค€๋‹ค.

์ด ๊ฐ€์šด๋ฐ DECLARE๋Š” ๋ณ€์ˆ˜๋ฅผ ์„ ์–ธํ•˜๋Š” ๊ตฌ๋ฌธ์ธ๋ฐ ์ด๋Ÿฐ๊ฒŒ ์žˆ๊ตฌ๋‚˜ ์ •๋„๋กœ ๋„˜์–ด๊ฐ€์ž. BigQuery Scripting์€ ๋ณ„๋„์˜ ๋ฌธ๋ฒ• ๊ฐ•์˜์—์„œ ๋‹ค๋ฃฐ ์˜ˆ์ •์ด๋‹ค.

๋‹ค๋งŒ, ๋ฐ์ดํ„ฐ ํ˜•์‹์ด ๊ตฌ์กฐ์ฒด์˜ ๋ฐฐ์—ด(Array of Struct)์ธ ๋ณ€์ˆ˜๋Š” BigQuery๋‚ด์—์„œ ํ…Œ์ด๋ธ”๊ณผ ๊ฑฐ์˜ ๋™๋“ฑํ•˜๊ฒŒ ๋‹ค๋ฃจ์–ด์ง„๋‹ค๋Š” ๊ฒƒ๋งŒ ๊ธฐ์–ตํ•ด ๋‘๋ฉด ์ข‹๊ฒ ๋‹ค.

์•„๋ž˜ ๊ทธ๋ฆผ์—์„œ ๋ฐฐ์—ด์˜ ๊ฐ ์š”์†Œ(element)๋Š” ํ…Œ์ด๋ธ”์—์„œ ํ•˜๋‚˜์˜ ํ–‰(row)์œผ๋กœ, ๊ฐ ์š”์†Œ์— ํ•ด๋‹นํ•˜๋Š” ๊ตฌ์กฐ์ฒด๋Š” 4๊ฐœ์˜ ๋ฉค๋ฒ„์„ ๊ฐ€์ง€๋Š”๋ฐ ํ…Œ์ด๋ธ”์˜ ์ปฌ๋Ÿผ(column)์œผ๋กœ ์ƒ๊ฐํ•  ์ˆ˜ ์žˆ๋‹ค.

[๊ทธ๋ฆผ 1.] ARRAY<STRUCT<fruit, coffee, cake>> ๊ตฌ์กฐ์ฒด ๋ฐฐ์—ด

Nested Table ์ƒ์„ฑ ์ฟผ๋ฆฌ

DECLARE ๊ตฌ๋ฌธ์„ ํ†ตํ•ด์„œ purchase_log๋ผ๋Š” ๊ตฌ์กฐ์ฒด ๋ฐฐ์—ด ๋ณ€์ˆ˜๋ฅผ ํ•˜๋‚˜ ์„ ์–ธํ•˜์ž. ๊ตฌ์กฐ์ฒด๋Š” ๊ตฌ๋งค๋‚ ์งœ, ์ฃผ๋ฌธ๋ฒˆํ˜ธ, ์‚ฌ์šฉ์ž์‹๋ณ„์ž, ๊ตฌ๋งค๊ธˆ์•ก์„ ํฌํ•จํ•˜๋Š” ๋ ˆ์ฝ”๋“œ์ด๋‹ค.

DECLARE purchase_log ARRAY<STRUCT<
    dt STRING,
    order_id INT64,
    user_id STRING,
    purchase_amount INT64
>>
DEFAULT [
  ('2014-01-01',  1, 'rhwpvvitou', 13900),
  ('2014-01-01',  2, 'hqnwoamzic', 10616),
  ('2014-01-02',  3, 'tzlmqryunr', 21156),
  ('2014-01-02',  4, 'wkmqqwbyai', 14893),
  ('2014-01-03',  5, 'ciecbedwbq', 13054),
  ('2014-01-03',  6, 'svgnbqsagx', 24384),
  ('2014-01-03',  7, 'dfgqftdocu', 15591),
  ('2014-01-04',  8, 'sbgqlzkvyn',  3025),
  ('2014-01-04',  9, 'lbedmngbol', 24215),
  ('2014-01-04', 10, 'itlvssbsgx',  2059),
  ('2014-01-05', 11, 'jqcmmguhik',  4235),
  ('2014-01-05', 12, 'jgotcrfeyn', 28013),
  ('2014-01-05', 13, 'pgeojzoshx', 16008),
  ('2014-01-06', 14, 'msjberhxnx',  1980),
  ('2014-01-06', 15, 'tlhbolohte', 23494),
  ('2014-01-06', 16, 'gbchhkcotf',  3966),
  ('2014-01-07', 17, 'zfmbpvpzvu', 28159),
  ('2014-01-07', 18, 'yauwzpaxtx',  8715),
  ('2014-01-07', 19, 'uyqboqfgex', 10805),
  ('2014-01-08', 20, 'hiqdkrzcpq',  3462),
  ('2014-01-08', 21, 'zosbvlylpv', 13999),
  ('2014-01-08', 22, 'bwfbchzgnl',  2299),
  ('2014-01-09', 23, 'zzgauelgrt', 16475),
  ('2014-01-09', 24, 'qrzfcwecge',  6469),
  ('2014-01-10', 25, 'njbpsrvvcq', 16584),
  ('2014-01-10', 26, 'cyxfgumkst', 11339)
];
-- ํ…Œ์ด๋ธ” ์ƒ์„ฑ ์ฟผ๋ฆฌ
CREATE OR REPLACE TABLE learning_club.purchase_log AS
SELECT log.* FROM UNNEST(purchase_log) log;

UNNEST() ํ•จ์ˆ˜๋กœ ๊ตฌ์กฐ์ฒด ๋ฐฐ์—ด(Array of Struct)์˜ ์š”์†Œ์ธ ๊ตฌ์กฐ์ฒด๋ฅผ ๊ฐ๊ฐ์˜ ํ–‰(row)์œผ๋กœ ํ’€์–ด๋‚ด๊ณ , ์ด์–ด ๊ตฌ์กฐ์ฒด ์•ˆ์˜ ํ•„๋“œ๋“ค์„ dot operator ๋˜๋Š” member field access operator(๋ฉค๋ฒ„ ์ ‘๊ทผ ์—ฐ์‚ฐ์ž) ๋ผ๊ณ  ๋ถˆ๋ฆฌ์šฐ๋Š” ํ‘œ๊ธฐ๋ฒ•๋ฅผ ํ†ตํ•ด์„œ ์—ด(column)๋กœ ๋ถ„๋ฆฌ์‹œํ‚ค๋ฉด ์ผ๋ฐ˜ ํ…Œ์ด๋ธ”์„ SELECT ํ•˜๋Š” ๊ฒƒ๊ณผ ๋™์ผํ•œ ๊ฒฐ๊ณผ๊ฐ€ ๋งŒ๋“ค์–ด์ง„๋‹ค.

์œ„ FROM ์ ˆ์˜ UNNEST() ๊ตฌ๋ฌธ๊ณผ SELECT ์ ˆ์˜ log.* ํ‘œ๊ธฐ์— ์ฃผ๋ชฉํ•˜์ž.

์ดํ›„์— CTAS(Create Table As Select) ๊ตฌ๋ฌธ์œผ๋กœ Nested Table์ด BigQuery์˜ Permanant Table์ด ๋˜๋„๋ก ๋งŒ๋“ ๋‹ค.

์ด์ œ ๋ฐ์ดํ„ฐ๊ฐ€ ์ค€๋น„๋˜์—ˆ์œผ๋‹ˆ ๋ถ„์„์„ ์œ„ํ•œ ๋‹ค์Œ ๋‹จ๊ณ„๋กœ ๋„˜์–ด๊ฐ€์ž.

2. ๋‚ ์งœ๋ณ„ ๋งค์ถœ ์ง‘๊ณ„

๋‹ค์Œ์œผ๋กœ ์•ž์„œ ๋งŒ๋“ค์–ด์ง„ ๊ตฌ๋งค์ด๋ ฅ ๋ฐ์ดํ„ฐ๋ฅผ ํƒ์ƒ‰ํ•˜๊ณ  ์ง‘๊ณ„ํ•จ์ˆ˜(aggregate function)์™€ ๋ถ„์„ํ•จ์ˆ˜(analytic function)์„ ์‚ฌ์šฉํ•˜์—ฌ ๋ช‡๊ฐ€์ง€ ์ง€ํ‘œ๋“ค์„ ์‚ฐ์ถœํ•ด ๋ณด์ž.

GROUP BY์™€ ์ง‘๊ณ„ํ•จ์ˆ˜์„ ์‚ฌ์šฉํ•˜์—ฌ ๋‚ ์งœ๋ณ„ ๋งค์ถœ ์ง€ํ‘œ๋ฅผ ์‚ฐ์ถœํ•˜๋Š” ์ฟผ๋ฆฌ๋Š” ๋‹ค์Œ๊ณผ ๊ฐ™๋‹ค.

dt๋Š” ๋‚ ์งœ๋ฅผ ๋‚˜ํƒ€๋‚ด๋Š” ํ•˜๋‚˜์˜ ์ฐจ์›(dimension)์ด ๋˜๊ณ , COUNT(), SUM(), AVG() ์ง‘๊ณ„ํ•จ์ˆ˜์˜ ๊ฒฐ๊ณผ๋“ค์€ ๊ฐ๊ฐ ํ•ด๋‹น ์ฐจ์›์—์„œ ๊ด€์ธก๋œ ๊ฐ’์ธ ๋ฉ”ํŠธ๋ฆญ(metric)์ด ๋œ๋‹ค.

SELECT dt,
       COUNT(1) AS purchase_count,
       SUM(purchase_amount) AS total_amount, -- ์ผ๋ณ„ ๋งค์ถœ
       AVG(purchase_amount) AS avg_amount, -- ํ‰๊ท  ๊ตฌ๋งค์•ก
  FROM learning_club.purchase_log
 GROUP BY 1
 ORDER BY dt;

์ฟผ๋ฆฌ ์ˆ˜ํ–‰์˜ ๊ฒฐ๊ณผ๋ฅผ Explore Data (๋ฐ์ดํ„ฐ ํƒ์ƒ‰) ๋ฅผ ํ†ตํ•ด Data Studio์—์„œ ๊ฐ„๋‹จํ•˜๊ฒŒ ์‹œ๊ฐํ™”๋ฅผ ํ•˜๋ฉด ๋‹ค์Œ๊ณผ ๊ฐ™๋‹ค.

3. ์ด๋™ํ‰๊ท ์„ ์‚ฌ์šฉํ•œ ๋‚ ์งœ๋ณ„ ์ถ”์ด

์•ž์„  ์‹œ๊ฐํ™”๋ฅผ ํ†ตํ•ด์„œ ์ฃผ์ค‘๊ณผ ์ฃผ๋ง์˜ ๋งค์ถœ ์ฐจ์ด๋ฅผ ํ™•์ธํ•  ์ˆ˜ ์žˆ๋‹ค. ์ด์ œ ์‹œ๊ณ„์—ด ๋ฐ์ดํ„ฐ์˜ ๊ณ„์ ˆ์„ฑ(seasonality)์„ ์™„ํ™”์‹œํ‚ค๊ณ  ์ถ”์„ธ๋ฅผ ํ™•์ธํ•  ์ˆ˜ ์žˆ๋„๋ก 7์ผ ์ด๋™ํ‰๊ท  ์œผ๋กœ ํ‘œํ˜„ํ•ด ๋ณด์ž.

SELECT dt,
       COUNT(1) AS purchase_count,
       SUM(purchase_amount) AS purchase_amount,
       AVG(SUM(purchase_amount)) OVER (ORDER BY dt ROWS 6 PRECEDING) AS avg_seven_day,
  FROM learning_club.purchase_log
 GROUP BY 1
 ORDER BY dt;

์ด๋™ ํ‰๊ท ์„ ๊ตฌํ•˜๊ธฐ ์œ„ํ•ด์„œ๋Š” ๋ถ„์„ํ•จ์ˆ˜(๋˜๋Š” ์œˆ๋„์šฐํ•จ์ˆ˜)๋ฅผ ํ•„์š”๋กœ ํ•˜๋Š”๋ฐ ์•„๋ž˜ ๊ตฌ๋ฌธ์€ ๋ญ”๊ฐ€ ๋ณต์žกํ•˜๋‹ค.

AVG(SUM(purchase_amount)) OVER (ORDER BY dt ROWS 6 PRECEDING)

  1. purchase_amount๋Š” purchase_log ํ…Œ์ด๋ธ”์˜ ์ปฌ๋Ÿผ์ผ๊นŒ ์•„๋‹ˆ๋ฉด ์ง์ „์˜ AS purchase_amount์— ์˜ํ•ด์„œ ์ƒˆ๋กœ ์ด๋ฆ„์ง€์–ด์ง„ ๋ณ„์นญ(alias)์ผ๊นŒ? ์ •๋‹ต: ํ…Œ์ด๋ธ”์˜ ์ปฌ๋Ÿผ

  2. AVG(SUM(....)) ์—์„œ SUM()์€ ์ง‘๊ณ„ํ•จ์ˆ˜์ผ๊นŒ? ๋ถ„์„ํ•จ์ˆ˜์ผ๊นŒ? ์ •๋‹ต: ์ง‘๊ณ„ํ•จ์ˆ˜

  3. AVG()๋Š” ๊ทธ๋Ÿผ ๋ฌด์Šจ ํ•จ์ˆ˜์ผ๊นŒ? ์ •๋‹ต: ๋ถ„์„ํ•จ์ˆ˜

SELECT ๋ฌธ(statment)์„ ์ด๋ฃจ๋Š” ๊ฐ ์ ˆ(clause)์˜ ํ•ด์„์ˆœ์„œ์™€ ์ปฌ๋Ÿผ ์ด๋ฆ„์˜ ํ†ต์šฉ๋ฒ”์œ„(scope) ๋•Œ๋ฌธ์ธ๋ฐ ๋ณต์žกํ•˜๋‹ˆ ์ผ๋‹จ ๋„˜์–ด๊ฐ€์ž. BigQuery ๋ฌธ๋ฒ• ๊ฐ•์˜์—์„œ ์ด๊ฒƒ๋„ ๋‹ค๋ค„๋ณด๋„๋ก ํ•˜๊ฒ ๋‹ค.

์œ„ ๊ตฌ๋ฌธ์€ ํ…Œ์ด๋ธ” ์ „์ฒด๋ฅผ ๋‚ ์งœ๋ณ„๋กœ ์˜ค๋ฆ„ ์ฐจ์ˆœ์œผ๋กœ ์ •๋ ฌ์‹œํ‚จ ํ›„์— ๊ฐ ํ–‰์˜ ์—ฌ์„ฏ ๋ฒˆ์งธ ์ „์— ์œ„์น˜ํ•œ ํ–‰๋ถ€ํ„ฐ ํ˜„์žฌ ํ–‰๊นŒ์ง€ 7์ผ๊ฐ„์˜ SUM(purchase_amount) ๊ฐ’์˜ ํ‰๊ท ์„ ๊ตฌํ•˜๋Š” ๊ตฌ๋ฌธ์ด๋‹ค. ์—ฌ๊ธฐ์„œ SUM(purchase_amount)๋Š” GROUP BY์— ์˜ํ•ด์„œ ๋จผ์ € ๊ณ„์‚ฐ๋˜์–ด์ง€๋Š” ๋‚ ์งœ๋ณ„ ๊ตฌ๋งค์•ก์— ํ•ด๋‹นํ•œ๋‹ค.

Data Studio์—์„œ ์œ„์˜ ๊ฒฐ๊ณผ๋ฅผ ์‹œ๊ฐํ™”ํ•˜๋ฉด ๋‹ค์Œ๊ณผ ๊ฐ™๋‹ค.

4. ๋‹น์›” ๋ˆ„๊ณ„ ๋งค์ถœ

Z Chart ๋ถ„์„์—์„œ ์š”๊ตฌํ•˜๋Š” ์ง€ํ‘œ๋“ค์„ ๊ตฌํ•˜๋Š” ๋ฐฉ๋ฒ•๋“ค์„ ํ•˜๋‚˜์”ฉ ์‚ดํŽด๋ณด์ž.

๋‹ค์Œ์€ ์ผ๋ณ„ ๋งค์ถœ ๋ฐ์ดํ„ฐ๊ฐ€ ์ฃผ์–ด์กŒ์„ ๋•Œ ํ•ด๋‹น ์›”์˜ ๋ˆ„๊ณ„ ๋งค์ถœ์„ ๊ตฌํ•˜๋Š” ์ฟผ๋ฆฌ์ด๋‹ค.

์œˆ๋„์šฐ ํ•จ์ˆ˜ ๊ตฌ๋ฌธ์—์„œ PARTITION BY SUBSTR(dt, 1, 7)์€ ์œˆ๋„์šฐ ํ•จ์ˆ˜๊ฐ€ ์ ์šฉ๋  ํŒŒํ‹ฐ์…˜์„ ์›” ๋‹จ์œ„๋กœ ๋‚˜๋ˆˆ๋‹ค๋Š” ์˜๋ฏธ์ด๊ณ , ROWS UNBOUNDED PRECEDING์€ ์›” ๋‹จ์œ„๋กœ ๋‚˜๋‰˜์–ด์ง„ ํŒŒํ‹ฐ์…˜์˜ ์ฒซ๋ฒˆ์งธ ํ–‰ (๋งค์›” 1์ผ์— ํ•ด๋‹น)๋ถ€ํ„ฐ ํ˜„์žฌ ํ–‰ (๋งค์›” ํ˜„์žฌ์ผ)๊นŒ์ง€ ์œˆ๋„์šฐ๋ฅผ ์„ค์ •ํ•œ๋‹ค. ์„ค์ •๋œ ์œˆ๋„์šฐ์— ํฌํ•จ๋œ ํ–‰๋“ค์„ SUM() ๋ถ„์„ํ•จ์ˆ˜๋ฅผ ํ˜ธ์ถœํ•˜๊ฒŒ ๋˜๋Š”๋ฐ ์—ฌ๊ธฐ์„œ๋Š” ์ผ๋ณ„ ๋งค์ถœ(SUM(purchase_amount))์„ ๋‹ค์‹œ SUM() ๋ถ„์„ํ•จ์ˆ˜๋กœ ๋ˆ„๊ณ„์‹œํ‚ค๊ณ  ์žˆ๋‹ค.

-- total_amount๋Š” ์ผ๋ณ„ ๋งค์ถœ์•ก, agg_amount๋Š” ํ•ด๋‹น ์›”์˜ ํ•ด๋‹น ์ผ๊นŒ์ง€์˜ ๋งค์ถœ ๋ˆ„๊ณ„
SELECT dt,
       SUBSTR(dt, 1, 7) AS year_month,
       SUM(purchase_amount) AS total_amount,
       SUM(SUM(purchase_amount)) OVER (
         PARTITION BY SUBSTR(dt, 1, 7) ORDER BY dt ROWS UNBOUNDED PRECEDING
       ) AS agg_amount
  FROM learning_club.purchase_log
 GROUP BY dt
 ORDER BY dt       
;

-- ์œ„์™€ ๋™๋“ฑํ•œ ๊ฒฐ๊ณผ์˜ ์ฟผ๋ฆฌ, daily_purchase ์„œ๋ธŒ์ฟผ๋ฆฌ์—์„œ ์ผ๋ณ„ ๋งค์ถœ์•ก์„ ๋จผ์ € ๊ตฌํ•˜๊ณ ,
-- ๋ฉ”์ธ ์ฟผ๋ฆฌ์—์„œ ์œˆ๋„์šฐ ํ•จ์ˆ˜๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ ํ•ด๋‹น์ผ๊นŒ์ง€์˜ ๋งค์ถœ ๋ˆ„๊ณ„ ์‚ฐ์ถœ
WITH daily_purchase AS (
  SELECT dt,
         SUBSTR(dt, 1, 4) AS year,
         SUBSTR(dt, 6, 2) AS month,
         SUBSTR(dt, 9, 2) AS day,
         SUM(purchase_amount) AS purchase_amount,
         COUNT(order_id) AS orders
    FROM learning_club.purchase_log
   GROUP BY 1
)
SELECT dt,
       year || month AS year_month,
       purchase_amount,
       SUM(purchase_amount) OVER (
         PARTITION BY year, month ORDER BY dt ROWS UNBOUNDED PRECEDING
       ) AS agg_amount
  FROM daily_purchase
;

5. ์›”๋ณ„ ๋งค์ถœ์˜ ์ž‘๋Œ€๋น„(์ž‘๋…„๋Œ€๋น„๋น„์œจ)

์ง€๊ธˆ๊นŒ์ง€ ์‚ฌ์šฉํ•œ ๋ฐ์ดํ„ฐ๋Š” ์—ดํ˜๊ฐ„์˜ ๋งค์ถœ ์‹ค์  ๋ฐ์ดํ„ฐ๋กœ ์›”๋ณ„ ๋งค์ถœ ์ถ”์ด๋ฅผ ๋ถ„์„ํ•˜๊ธฐ์—๋Š” ๋ถ€์กฑํ•˜๋‹ค.

๋‹ค์Œ์€ 2014๋…„ 1์›”๋ถ€ํ„ฐ 2015๋…„ 12์›”๊นŒ์ง€ ๋งค์ถœ ๋ฐ์ดํ„ฐ๋กœ ์ด๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ ์›”๋ณ„ ๋งค์ถœ์˜ ์ž‘๋Œ€๋น„๋ฅผ ๊ตฌํ•ด๋ณด๋„๋ก ํ•˜์ž.

์˜ˆ์‹œ ๋ฐ์ดํ„ฐ๋Š” ๋„์ž…๋ถ€์—์„œ ์–ธ๊ธ‰ํ•œ ๋„์„œ์˜ ๊ณต๊ฐœ ๋ฐ์ดํ„ฐ๋ฅผ ์‚ฌ์šฉํ•˜์˜€๋‹ค.

DECLARE purchase_log_yr ARRAY<STRUCT<
  dt STRING,
  order_id INT64,
  user_id STRING,
  purchase_amount INT64
>> DEFAULT [
  ('2014-01-01',    1, 'rhwpvvitou', 13900),
  ('2014-02-08',   95, 'chtanrqtzj', 28469),
  ('2014-03-09',  168, 'bcqgtwxdgq', 18899),
  ('2014-04-11',  250, 'kdjyplrxtk', 12394),
  ('2014-05-11',  325, 'pgnjnnapsc',  2282),
  ('2014-06-12',  400, 'iztgctnnlh', 10180),
  ('2014-07-11',  475, 'eucjmxvjkj',  4027),
  ('2014-08-10',  550, 'fqwvlvndef',  6243),
  ('2014-09-10',  625, 'mhwhxfxrxq',  3832),
  ('2014-10-11',  700, 'wyrgiyvaia',  6716),
  ('2014-11-10',  775, 'cwpdvmhhwh', 16444),
  ('2014-12-10',  850, 'eqeaqvixkf', 29199),
  ('2015-01-09',  925, 'efmclayfnr', 22111),
  ('2015-02-10', 1000, 'qnebafrkco', 11965),
  ('2015-03-12', 1075, 'gsvqniykgx', 20215),
  ('2015-04-12', 1150, 'ayzvjvnocm', 11792),
  ('2015-05-13', 1225, 'knhevkibbp', 18087),
  ('2015-06-10', 1291, 'wxhxmzqxuw', 18859),
  ('2015-07-10', 1366, 'krrcpumtzb', 14919),
  ('2015-08-08', 1441, 'lpglkecvsl', 12906),
  ('2015-09-07', 1516, 'mgtlsfgfbj',  5696),
  ('2015-10-07', 1591, 'trgjscaajt', 13398),
  ('2015-11-06', 1666, 'ccfbjyeqrb',  6213),
  ('2015-12-05', 1741, 'onooskbtzp', 26024)
];
CREATE OR REPLACE TABLE learning_club.purchase_log_yr AS
SELECT l.* FROM UNNEST(purchase_log_yr) l;

์œ„์™€ ๊ฐ™์ด UNNESTํ•จ์ˆ˜์™€ ๋ฉค๋ฒ„์ ‘๊ทผ์—ฐ์‚ฐ์ž๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ on-the-flyํ•˜๊ฒŒ ์ •์˜๋œ nested table์„ BigQuery ์˜๊ตฌํ…Œ์ด๋ธ”๋กœ ๋จผ์ € ๋งŒ๋“ ๋‹ค.

LAG(col, n) ๋„ค๋น„๊ฒŒ์ด์…˜ ํ•จ์ˆ˜๋Š” ํŒŒํ‹ฐ์…˜์˜ ํ˜„์žฌ ํ–‰์„ ๊ธฐ์ค€์œผ๋กœ n๋งŒํผ ์•ž์„  ํ–‰์˜ col๊ฐ’์„ ๊ฐ€์ ธ์˜จ๋‹ค. ๋”ฐ๋ผ์„œ LAG(purchase_amount, 12)๋Š” ๋‚ ์งœ์ˆœ์œผ๋กœ ์ •๋ ฌ๋œ ์›”๋ณ„ ๋งค์ถœ์•ก ํ…Œ์ด๋ธ”์—์„œ ์ผ๋…„ ์ „์˜ ๋งค์ถœ์•ก์„ ๋Œ๋ ค์ฃผ๊ฒŒ ๋œ๋‹ค. ์ด ๊ฐ’์œผ๋กœ ํ˜„์žฌ ์›”๋งค์ถœ์•ก์„ ๋‚˜๋ˆ„๊ฒŒ ๋˜๋ฉด ์ž‘๋Œ€๋น„(purchase_amount_rate)๊ฐ€ ์‚ฐ์ถœ๋œ๋‹ค.

CREATE OR REPLACE TABLE learning_club.purchase_log_dm AS
WITH daily_purchase AS (  
  SELECT dt,
         SUBSTR(dt, 1, 4) AS year,
         SUBSTR(dt, 6, 2) AS month,
         SUBSTR(dt, 9, 2) AS day,
         SUM(purchase_amount) AS purchase_amount,
         COUNT(order_id) AS orders
    FROM learning_club.purchase_log_yr
   GROUP BY 1
),
year_month_purchase AS (
  SELECT year,
         month,
         SUM(purchase_amount) AS purchase_amount,
         SUM(orders) AS orders
    FROM daily_purchase
   GROUP BY 1, 2
)
SELECT year, month, purchase_amount,
       LAG(purchase_amount, 12) OVER (ORDER BY year, month) AS purchase_amount_prev,
       ROUND(purchase_amount / LAG(purchase_amount, 12) OVER (ORDER BY year, month), 2) AS purchase_amount_rate,
  FROM year_month_purchase
 ORDER BY year, month;

๋ชธํ’€๊ธฐ๋ฅผ ๋งˆ์ณค์œผ๋‹ˆ ์ด์ œ ๋ณธ๊ฒฉ์ ์œผ๋กœ Z-Chart ๋ถ„์„์— ๋“ค์–ด๊ฐ€ ๋ณด๋„๋ก ํ•˜์ž.

Z-Chart ๋ถ„์„

Z-Chart ๋ถ„์„์€ ์›”๋ณ„๋งค์ถœ, (๋…„)๋งค์ถœ๋ˆ„๊ณ„, ์ด๋™(๋…„๊ฐ„)ํ•ฉ๊ณ„ ๋ผ๋Š” 3๊ฐ€์ง€ ์ง€ํ‘œ๋ฅผ ๊บพ์€์„  ๊ทธ๋ž˜ํ”„๋กœ ๋‚˜ํƒ€๋‚ธ ๊ฒƒ์œผ๋กœ Z ํ˜•ํƒœ๋ฅผ ์ด๋ฃฌ๋‹ค ํ•˜์—ฌ Z-Chart ๋กœ ๋ถˆ๋ฆฐ๋‹ค.

๊ฒฝ์˜ ๋ถ„์„์ด๋‚˜ ๋งˆ์ผ€ํŒ… ๋ถ„์•ผ์—์„œ ์›” ๋ณ€๋™์ด๋‚˜ ๊ณ„์ ˆ ๋ณ€๋™์„ ๋ฐฐ์ œํ•œ ๋งค์ถœ ๊ฒฝํ–ฅ์„ ๋ถ„์„ํ•˜๊ธฐ ์œ„ํ•ด ์ž์ฃผ ์ด์šฉ๋œ๋‹ค. Z-Chart ๋ถ„์„์„ ์œ„ํ•ด์„œ๋Š” ์•„๋ž˜ 3๊ฐ€์ง€ ์ง€ํ‘œ๊ฐ€ ํ•„์š”ํ•˜๋‹ค.

  1. ์›”๋ณ„๋งค์ถœ

  2. ๋งค์ถœ๋ˆ„๊ณ„

  3. ์ด๋™๋…„๊ณ„

๋งค์ถœ ์ด๋ ฅ ๋ฐ์ดํ„ฐ๋กœ๋ถ€ํ„ฐ ๊ฐ๊ฐ์˜ ์ง€ํ‘œ๋ฅผ ์‚ฐ์ถœํ•˜๋Š” ๊ณผ์ •์€ ๋‹ค์Œ๊ณผ ๊ฐ™๋‹ค.

1. Z-Chart ๋ถ„์„ ์ฟผ๋ฆฌ

CREATE OR REPLACE TABLE learning_club.zchart_dm AS
WITH daily_purchase AS (  
  SELECT dt,
         SUBSTR(dt, 1, 4) AS year,
         SUBSTR(dt, 6, 2) AS month,
         SUBSTR(dt, 9, 2) AS day,
         SUM(purchase_amount) AS purchase_amount,
         COUNT(order_id) AS orders
    FROM learning_club.purchase_log_yr
   GROUP BY 1
   ORDER BY 1
),
year_month_purchase AS (
  SELECT year,
         month,
         SUM(purchase_amount) AS purchase_amount,
         SUM(orders) AS orders
    FROM daily_purchase
   GROUP BY 1, 2
)
SELECT year, month,
       purchase_amount,
       SUM(purchase_amount) OVER (
         PARTITION BY year ORDER BY month ROWS UNBOUNDED PRECEDING
       ) AS purchase_cumulative,
       SUM(purchase_amount) OVER (
         ORDER BY year, month ROWS 11 PRECEDING
       ) AS purchase_moving_yr,
  FROM year_month_purchase
 ORDER BY year, month; 

[๊ทธ๋ฆผ] Z Chart Analysis Query

์›”๋ณ„๋งค์ถœ์€ GROUP BY ์ง‘๊ณ„๋ฅผ ํ†ตํ•ด ์‰ฝ๊ฒŒ ๊ณ„์‚ฐ๋œ๋‹ค. ๋งค์ถœ๋ˆ„๊ณ„์™€ ์ด๋™๋…„๊ณ„๋Š” ์•ž์„œ ๋ชธํ’€๊ธฐ์—์„œ ์‚ดํŽด๋ดค๋˜ ์ผ๋ณ„ ๋งค์ถœ ๋ˆ„๊ณ„์™€ ์ด๋™ํ‰๊ท ๋ฅผ ์‚ฐ์ถœํ•  ๋•Œ ์‚ฌ์šฉํ–ˆ๋˜ ๋ถ„์„ํ•จ์ˆ˜(์œˆ๋„์šฐํ•จ์ˆ˜)๋ฅผ ์‚ฌ์šฉํ•˜์—ฌ ์‚ฐ์ถœ ๊ฐ€๋Šฅํ•˜๋‹ค.

์œˆ๋„์šฐ ํ•จ์ˆ˜๋ฅผ ์‚ฌ์šฉํ•  ๋•Œ๋Š” ํ•ญ์ƒ ํŒŒํ‹ฐ์…˜์„ ์–ด๋–ป๊ฒŒ ๋‚˜๋ˆŒ์ง€, ํŒŒํ‹ฐ์…˜ ์•ˆ์—์„œ ์œˆ๋„์šฐ์˜ ๋ฒ”์œ„๋ฅผ ์–ด๋–ป๊ฒŒ ์žก์„์ง€๋ฅผ ๊ณ ๋ฏผํ•ด์•ผ ํ•œ๋‹ค. ์•ž์˜ ๋‚ด์šฉ์„ ๋ฐ”ํƒ•์œผ๋กœ ๊ฐ์ž ์ƒ๊ฐํ•ด ๋ณด์ž.

2. Z-Chart ๋ถ„์„ ์‹œ๊ฐํ™”

์œ„์—์„œ ์‚ฐ์ถœํ•œ 3๊ฐ€์ง€ ์ง€ํ‘œ๋“ค๋“ค์„ ๋งˆํŠธํ™”ํ•˜์—ฌ BigQuery ํ…Œ์ด๋ธ”๋กœ ์ƒ์„ฑํ•œ ํ›„ Data Studio์—์„œ ์‹œ๊ฐํ™”๋ฅผ ํ•˜๋ฉด ์•„๋ž˜์™€ ๊ฐ™์ด Z ์ž ๋ชจ์–‘์˜ ๊ทธ๋ž˜ํ”„๊ฐ€ ๋งŒ๋“ค์–ด์ง„๋‹ค.

Last updated