Share
Go down
avatar
Admin
Posts : 155
Join date : 2017-11-11
Age : 27
View user profilehttp://net-media.yoo7.com

Smart grouping in Sql server

on Wed Dec 13, 2017 11:07 am
I need to make sort of cluster, tried to use row_number, rank and dense_rank but nothing worked.
Code:
USER    START_DATE  ID_TO_CLUSTER  END_DATE    NEW FIELD_SESSION
262 01/10/2017 00:01    4  01/10/2017 00:03    S1
262 01/10/2017 00:02    4  01/10/2017 00:03    S1
262 01/10/2017 00:03    0  01/10/2017 00:03    NO SESSION
262 01/10/2017 00:03    1  01/10/2017 00:03    NO SESSION
262 01/10/2017 00:03    7  01/10/2017 00:03    NO SESSION
262 01/10/2017 00:03    2  01/10/2017 00:07    NO SESSION
262 01/10/2017 00:07    3  01/10/2017 00:07    NO SESSION
262 01/10/2017 00:07    4  01/10/2017 00:11    S2
262 01/10/2017 00:07    4  01/10/2017 00:11    S2
262 01/10/2017 00:11    7  01/10/2017 00:11    NO SESSION
262 01/10/2017 00:11    9  01/10/2017 00:11    NO SESSION
262 01/10/2017 16:28    0  01/10/2017 16:30    NO SESSION
262 01/10/2017 16:28    1  01/10/2017 16:28    NO SESSION
262 01/10/2017 16:30    2  01/10/2017 16:30    NO SESSION
262 01/10/2017 16:30    3  01/10/2017 16:30    NO SESSION
262 01/10/2017 16:30    4  01/10/2017 16:36    S3
262 01/10/2017 16:30    4  01/10/2017 16:36    S3
262 01/10/2017 16:36    4  01/10/2017 16:36    S3

Basically I need to group ID_TO_CLUSTER in a Session (new_field_session) when id_to_cluster is duplicated in a subsequent time, in order to get the min start_date and the max end date for each cluster.

Answer
Code:
SELECT
  GroupNum,
  ID_TO_CLUSTER,
  NEW_FIELD_SESSION,
  MIN(START_DATE) MIN_START_DATE,
  MAX(END_DATE) MAX_END_DATE
FROM
  (
    SELECT
      *,
      CASE WHEN NEW_FIELD_SESSION=PrevSession THEN LAG(RowNum)OVER(ORDER BY RowNum) ELSE RowNum END GroupNum
    FROM
      (
        SELECT
          *,
          LAG(NEW_FIELD_SESSION)OVER(ORDER BY START_DATE,ID_TO_CLUSTER) PrevSession,
          LEAD(NEW_FIELD_SESSION)OVER(ORDER BY START_DATE,ID_TO_CLUSTER) NextSession,
          ROW_NUMBER()OVER(ORDER BY START_DATE,ID_TO_CLUSTER) RowNum
        FROM [Your table]
      ) q
    WHERE (PrevSession IS NULL OR NextSession IS NULL OR NEW_FIELD_SESSION<>PrevSession OR NEW_FIELD_SESSION<>NextSession)
  ) q
GROUP BY GroupNum,ID_TO_CLUSTER,NEW_FIELD_SESSION
Back to top
Permissions in this forum:
You can reply to topics in this forum