-
Notifications
You must be signed in to change notification settings - Fork 11
Description
Issue
In users_sessions_this_run we filter start_tstamp from the sessions on lower_limit and upper_limit from user_limits. By joining on start_tstamp we attempt to pull info from the first session per user:
FROM {{.output_schema}}.sessions{{.entropy}} a
INNER JOIN {{.scratch_schema}}.users_userids_this_run{{.entropy}} b
ON a.domain_userid = b.domain_userid
WHERE a.start_tstamp >= (SELECT lower_limit FROM {{.scratch_schema}}.users_limits{{.entropy}})
AND a.start_tstamp <= (SELECT upper_limit FROM {{.scratch_schema}}.users_limits{{.entropy}})The lower and upper limits are created by taking the MIN and MAX of start_tstamp from users_userids_this_run, where the start_tstamp is the first timestamp for each user based on all of their sessions (taken from {mobile_}sessions --> {mobile_}sessions_userid_manifest_staged -> users_userids_this_run).
In cases when we have very few users, the upper_limit can have a lower value than the start_tstamp of sessions that we need to process. As a result, the users_sessions_this_run table is artificially truncated, which leads to incorrect information being calculated downstream in users_aggs, users_lasts and users.
Proposed Fix 1 (breaking changes)
- In
{mobile_}sessions_userid_manifest_this_runadd a newend_tstamp TIMESTAMPcolumn and fill it asMAX(start_tstamp) AS max_tstamp - In
{mobile_}users_manifesttable, add a newend_tstamp TIMESTAMPcolumn and fill it as{{start_date}}::TIMESTAMP - In
{mobile_}users_limitschange the definition ofupper_limittoMAX(end_tstamp) AS upper_limit
Proposed Fix 2 (no breaking changes)
- In
users_sessions_this_runremove start_tstamp filter related to theupper_limit:
AND a.start_tstamp <= (SELECT upper_limit FROM {{.scratch_schema}}.{mobile_}users_limits{{.entropy}})