During a data loss investigation, we identified that the audit steps have both missing steps and duplicated steps
Description
Issue Component(s) on Oct 25, 2023, 2:44:05 PM: CONNECTOR | DATA_INGESTION
PRDE - Bug default text according to the team DoR (Definition of Ready)
01 - PERSON OF CONTACT (PERSON THAT CAN ANSWER QUESTIONS ABOUT THE PROBLEM):
02 - PROBLEM (WHAT'S THE ISSUE?):
Report:
select
  c.createdReportDateTime,
  c.publish_time,
  c.auditId,
  string_agg(s.stepSequence, ", " order by s.stepSequence) as stepSequences
from (
  -- audit ids from the most recent divergence report (max createdReportDateTime)
  select distinct
    clockinsStruct.auditId as auditId,
    createdReportDateTime,
    clockinsStruct.publish_time as publish_time
  from `labs-app-mdm-production.a_poffo.divergences_clockinrecords`
  where createdReportDateTime = (
    select max(createdReportDateTime)
    from `labs-app-mdm-production.a_poffo.divergences_clockinrecords`
  )
) c
left join (
  -- prefix each step with a two-digit ordinal so string_agg sorts the steps in that order
  select
    (case
      when step = 'LANDING' then '01. LANDING'
      when step = 'SPLIT_RECORDS' then '02. SPLIT_RECORDS'
      when step = 'PRODUCE_MESSAGE' then '03. PRODUCE_MESSAGE'
      when step = 'NATS_PRODUCE' then '04. NATS_PRODUCE'
      when step = 'NATS_SENT' then '05. NATS_SENT'
      when step = 'NATS_SUCCESS' then '06. NATS_SUCCESS'
      when step = 'NATS_ERROR' then '07. NATS_ERROR'
      when step = 'PUBSUB_SENT' then '08. PUBSUB_SENT'
      when step = 'DATAFLOW_SENT' then '09. DATAFLOW_SENT'
      when step = 'DATAFLOW_SENT_ERROR' then '10. DATAFLOW_SENT_ERROR'
      when step = 'STAGING_FLOW_PIPELINE' then '11. STAGING_FLOW_PIPELINE'
      when step = 'CDS_FLOW_PIPELINE' then '12. CDS_FLOW_PIPELINE'
      when step = 'STAGING_PARQUET_WRITER' then '13. STAGING_PARQUET_WRITER'
      else 'new-step'
    end) as stepSequence,
    *
  from `labs-app-mdm-production.intake.records_steps`
) s on s.auditId = c.auditId
group by 1, 2, 3
- Clients:
- Clockin
- Tembici
- The report above was generated based on the issues describing data loss.
- Issues:
- CAPL-4909: Record did not land in staging due to Schema Changes on Big Query (Canceled)
- CAPL-4859: Record did not land in staging (part2) (Done)
- CAPL-4819: Record did not land in staging (Done)
03 - STEPS TO REPRODUCE (STEP (1...N), VIDEO, SCREENSHOTS, LOGS FOLDER, HEARTBEAT, ETC. – IF IS NOT POSSIBLE TO REPRODUCE EXPLAIN THE REASON):
04 - LINKS (ADD A LINK TO THE BUG OR TO THE TENANT):
05 - EXPECTED BEHAVIOR (LIST THE EXPECTED BEHAVIORS TO CONSIDER THIS BUG AS DONE):
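- The step sequence should be saved correctly in BigQuery (BQ): for each auditId, no audit step should be missing and no audit step should be duplicated.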
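
To make the expected behavior easier to verify, here is a minimal Python sketch of a per-auditId check that could be run over the step names returned by the report query. It makes several assumptions that are not confirmed in this ticket. The function name `check_audit_steps` and the `EXPECTED_STEPS` list are hypothetical. The expected order is taken from the CASE expression in the report, with the two error steps (NATS_ERROR, DATAFLOW_SENT_ERROR) left out. Every remaining step is assumed to appear exactly once per auditId, so if some steps may legitimately repeat, the list and the duplicate rule would need adjusting.

```python
from collections import Counter

# Assumption: happy-path steps in the order used by the report's CASE expression,
# with the two error steps left out. The real pipeline contract may differ.
EXPECTED_STEPS = [
    "LANDING",
    "SPLIT_RECORDS",
    "PRODUCE_MESSAGE",
    "NATS_PRODUCE",
    "NATS_SENT",
    "NATS_SUCCESS",
    "PUBSUB_SENT",
    "DATAFLOW_SENT",
    "STAGING_FLOW_PIPELINE",
    "CDS_FLOW_PIPELINE",
    "STAGING_PARQUET_WRITER",
]


def check_audit_steps(recorded_steps, expected_steps=EXPECTED_STEPS):
    """Return (missing, duplicated) step names for a single auditId.

    recorded_steps: step names read from records_steps for one auditId.
    missing keeps the order of expected_steps; duplicated is sorted by name.
    """
    counts = Counter(recorded_steps)
    missing = [step for step in expected_steps if counts[step] == 0]
    duplicated = sorted(step for step, n in counts.items() if n > 1)
    return missing, duplicated


if __name__ == "__main__":
    # Hypothetical sample: SPLIT_RECORDS saved twice, later steps never saved.
    recorded = ["LANDING", "SPLIT_RECORDS", "SPLIT_RECORDS", "PRODUCE_MESSAGE"]
    missing, duplicated = check_audit_steps(recorded)
    print("missing:", missing)
    print("duplicated:", duplicated)  # duplicated: ['SPLIT_RECORDS']
```

An auditId would satisfy the expected behavior under these assumptions when both returned lists are empty.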