[Observability] IntakeBatchSummary timeout control through creation time on platform

Description

CAPL - Bug default text according to the team DoR (Definition of Ready)

01 - PERSON OF CONTACT (PERSON THAT CAN ANSWER QUESTIONS ABOUT THE PROBLEM):

@Geny Isam Hamud Herrera @Renan Schroeder
02 - PROBLEM (WHAT'S THE ISSUE?):

Each batchId must have its timeout controlled by the intake time on the platform, not by the startTime in the payload received in {{/intake/batch/{batchId}/summary}} requests.

  • This allows batches whose summary has a startTime older than 30 minutes to receive new data and run the next pipelines.

The current approach does not use the batch summary's creation time on the platform to control the timeout. This affects cases where SmartLink sends summaries to Carol for batches that are more than 30 minutes late and then, a few seconds or minutes later, continues sending intake requests for those batches.

Risk:

This causes the observability event CarolPipelinesExecutionSummary with status NO_PIPELINE_EXECUTED to be sent to SmartLink prematurely, after which pipelines can no longer be processed for the batch in question even if new data arrives later.
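The difference between the two timeout bases described above can be sketched as follows. This is only an illustration, not the platform's code: the function names and the `BATCH_TIMEOUT` constant are hypothetical, and only the 30-minute window comes from this ticket.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the 30-minute window is taken from the ticket text;
# the constant name itself is hypothetical.
BATCH_TIMEOUT = timedelta(minutes=30)


def is_timed_out_by_start_time(start_time: datetime, now: datetime) -> bool:
    """Behaviour reported as the bug: the window is counted from the
    startTime carried in the summary payload."""
    return now - start_time >= BATCH_TIMEOUT


def is_timed_out_by_creation_time(created_at: datetime, now: datetime) -> bool:
    """Requested behaviour: the window is counted from the moment the
    summary record was created on the platform."""
    return now - created_at >= BATCH_TIMEOUT


# A summary that arrives 45 minutes after its startTime.
now = datetime(2023, 12, 15, 12, 0, tzinfo=timezone.utc)
late_start_time = now - timedelta(minutes=45)
created_at = now - timedelta(seconds=10)  # record created seconds ago

expired_old = is_timed_out_by_start_time(late_start_time, now)
expired_new = is_timed_out_by_creation_time(created_at, now)
```

With the payload-based check the late batch is already expired when its summary arrives, while the creation-time check still leaves it almost the full window to receive more intake requests.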

03 - STEPS TO REPRODUCE (STEP (1...N), VIDEO, SCREENSHOTS, LOGS FOLDER, HEARTBEAT, ETC. – IF IS NOT POSSIBLE TO REPRODUCE EXPLAIN THE REASON):
04 - LINKS (ADD A LINK TO THE BUG OR TO THE TENANT):
05 - EXPECTED BEHAVIOR (LIST THE EXPECTED BEHAVIORS TO CONSIDER THIS BUG AS DONE):

  • Batches whose summary has a startTime older than 30 minutes will only trigger the observability event CarolSummaryValidated with status TIMEOUT once 30 minutes have passed since the creation time of the corresponding input_batch_summary record. In other words, receiving a summary should put the batch in OPEN regardless of startTime and endTime.
  • Only send the observability event CarolPipelinesExecutionSummary with status NO_PIPELINE_EXECUTED when no pipeline has been executed for a given batch and its previous status was READY.
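The two expected behaviors above can be sketched as a pair of small decision functions. This is a minimal sketch under stated assumptions: the `BatchSummary` class, its fields, and the function names are hypothetical and do not reflect the real input_batch_summary entity; only the event names, statuses, and the 30-minute window come from this ticket.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional, Tuple

BATCH_TIMEOUT = timedelta(minutes=30)  # window stated in the ticket

Event = Tuple[str, str]  # (event name, status)


@dataclass
class BatchSummary:
    # Hypothetical stand-in for an input_batch_summary record.
    batch_id: str
    created_at: datetime
    status: str = "OPEN"


def receive_summary(batch_id: str, start_time: datetime, now: datetime) -> BatchSummary:
    """Receiving a summary opens the batch; start_time is deliberately
    ignored when choosing the status."""
    return BatchSummary(batch_id=batch_id, created_at=now)


def timeout_event(summary: BatchSummary, now: datetime) -> Optional[Event]:
    """Emit TIMEOUT only once 30 minutes have passed since the record
    was created on the platform."""
    if now - summary.created_at >= BATCH_TIMEOUT:
        return ("CarolSummaryValidated", "TIMEOUT")
    return None


def execution_summary_event(previous_status: str, pipelines_executed: int) -> Optional[Event]:
    """Emit NO_PIPELINE_EXECUTED only when nothing ran and the batch
    was previously READY."""
    if previous_status == "READY" and pipelines_executed == 0:
        return ("CarolPipelinesExecutionSummary", "NO_PIPELINE_EXECUTED")
    return None
```

Keeping the two decisions in separate functions mirrors the two bullets: the timeout depends only on the record's creation time, and the execution summary depends only on the previous status and the pipeline count.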

Activity

Automation for Jira 18 December 2023, 21:05 Jira Internal Users

This issue was automatically transitioned to TESTED & MERGED, as its PR was just merged into the develop branch in GitHub. PR approved by Damore, douglascoimbra.

Automation for Jira 18 December 2023, 21:02 Jira Internal Users

This issue was automatically transitioned to QA REVIEW, as its PR was just approved in Github.

Douglas Coimbra Lopes 18 December 2023, 21:01 Jira Internal Users

Invalid table sending data + summary

DUPLICATED DELIVERED MESSAGES ALSO FIXED

Automation for Jira 18 December 2023, 13:39 Jira Internal Users

Good morning everyone. Card validated with the reported scenario
• 🚨 Update: observability events arriving duplicated/out of delivery order

Sent by Slack - platform-internal - Douglas Coimbra Lopes

Douglas Coimbra Lopes 17 December 2023, 14:37 Jira Internal Users

@Gabriel DAmore Marciano @Geny Isam Hamud Herrera Retesting the fixed scenario

IT IS FIXED

SUMMARY SENT

Douglas Coimbra Lopes 17 December 2023, 01:43 Jira Internal Users

@Geny Isam Hamud Herrera @Gabriel DAmore Marciano

TEST FAILED

  1. When an intake is sent and, after 30 minutes, the batch is submitted, the batch is updated to READY and is then set back to TIMEOUT

  2. Once a batch is submitted, after 30 minutes, it is not updated to TIMEOUT

Douglas Coimbra Lopes 17 December 2023, 00:26 Jira Internal Users

@Gabriel DAmore Marciano TESTING FIXES

HAPPY PATH PASSED

  • TRYING TO SUBMIT AN OPEN/READY SUMMARY PASSED

TRYING TO SUBMIT A TIMEOUT_PROCESSED/TIMEOUT_PROCESSING BATCH WITHOUT SUMMARY (NO OPEN STATUS) PASSED

TRYING TO SUBMIT A TIMEOUT BATCH WITH SUMMARY SUBMITTED PASSED

Gabriel DAmore Marciano 16 December 2023, 18:14 Jira Internal Users

Flag removed

Fix error reported @Douglas Coimbra Lopes

Douglas Coimbra Lopes 16 December 2023, 14:41 Jira Internal Users

@Geny Isam Hamud Herrera @Gabriel DAmore Marciano EVIDENCE FOLLOWS, BASED ON THE LAST COMMIT ID: https://github.com/totvslabs/mdm/commit/0c829809286030bec4d1ff3f50004c317009fc3e

SCENARIO 1: TEST FAILED FOR THE HAPPY PATH

INTAKE SENT TO A BATCH: HAPPY BATCH

SUMMARY ERROR 409

  • SCENARIO FOR THE DUPLICATED STATUS FIXED

TIMEOUT_PROCESSING/TIMEOUT_PROCESSED BATCHES ARE FIXED: THEY CAN NOW BE REPROCESSED

Douglas Coimbra Lopes 15 December 2023, 23:49 Jira Internal Users

SUMMARY WITH PAST DATE

SUMMARY WITH START DATE MORE THAN 7 DAYS IN THE PAST

SENDING SUMMARY + INTAKE TO A TABLE THAT IS NOT PART OF THE SQL PIPELINE

Automation for Jira 13 December 2023, 16:09 Jira Internal Users

@Renan Schroeder,

This issue is planned to be delivered by 2024-01-02. You can check this in the issue's Due Date field.

Dates already planned for this issue: 2024-01-02

If External Issue Link field is filled, customer was also informed on JIRA TOTVS.