[Observability] IntakeBatchSummary timeout control through creation time on platform
Description
CAPL - Bug default text according to the team DoR (Definition of Ready)
01 - PERSON OF CONTACT (PERSON THAT CAN ANSWER QUESTIONS ABOUT THE PROBLEM):
@Geny Isam Hamud Herrera @Renan Schroeder
02 - PROBLEM (WHAT'S THE ISSUE?):
Each batchId must have its timeout controlled by the intake time on the platform, not by the startTime
from payload received in {{/intake/batch/
/summary}} requests.
- It will enable batches whose summary has a startTime older than 30 minutes to receive new data and process the next pipelines.
The current approach does not consider the batch summary creation time on the platform when controlling the timeout. This affects cases in which SmartLink sends summaries to Carol for batches that are more than 30 minutes late and then, a few seconds or minutes later, continues sending intake requests for those batches.
Risk:
The observability event CarolPipelinesExecutionSummary with status NO_PIPELINE_EXECUTED is sent to SmartLink prematurely, making it impossible to process pipelines for the batch in question if new data is received later.
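The difference between the two timeout references can be illustrated with a minimal sketch. It is not the platform code: `is_timed_out`, the sample timestamps, and the `>=` comparison are assumptions made for illustration; only the 30-minute window comes from this ticket.

```python
from datetime import datetime, timedelta, timezone

# 30-minute window taken from the ticket text.
TIMEOUT = timedelta(minutes=30)


def is_timed_out(reference_time: datetime, now: datetime) -> bool:
    """Return True once the timeout window after reference_time has elapsed."""
    return now - reference_time >= TIMEOUT


# Hypothetical late batch: the startTime in the payload is 45 minutes old,
# but the summary record was created on the platform 10 seconds ago.
now = datetime(2024, 1, 2, 12, 0, 0, tzinfo=timezone.utc)
start_time = now - timedelta(minutes=45)
created_at = now - timedelta(seconds=10)

# Current behaviour: the batch is already expired when the summary arrives.
timed_out_by_start_time = is_timed_out(start_time, now)

# Requested behaviour: the batch still has almost 30 minutes to receive data.
timed_out_by_created_at = is_timed_out(created_at, now)
```

With the startTime as reference, the batch times out immediately; with the creation time as reference, later intake requests for the same batch can still be processed.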
03 - STEPS TO REPRODUCE (STEP (1...N), VIDEO, SCREENSHOTS, LOGS FOLDER, HEARTBEAT, ETC. – IF IS NOT POSSIBLE TO REPRODUCE EXPLAIN THE REASON):
04 - LINKS (ADD A LINK TO THE BUG OR TO THE TENANT):
05 - EXPECTED BEHAVIOR (LIST THE EXPECTED BEHAVIORS TO CONSIDER THIS BUG AS DONE):
- Batches whose summary has a startTime older than 30 minutes will only send the observability event CarolSummaryValidated with status TIMEOUT 30 minutes after the creation time of the respective record of the input_batch_summary entity. In other words, receiving a summary should put the batch in OPEN regardless of startTime and endTime.
- Only send the observability event CarolPipelinesExecutionSummary with status NO_PIPELINE_EXECUTED when no pipeline has been executed for a given batch and the previous status was READY.
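The two expected behaviors above can be sketched as follows. This is only an illustration under stated assumptions: the `BatchSummaryRecord` class, the function names, and the `pipelines_executed` counter are hypothetical and do not reflect the actual mdm code; the event names, status names, and the 30-minute window come from this ticket.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional, Tuple

TIMEOUT = timedelta(minutes=30)  # window taken from the ticket text

Event = Tuple[str, str]  # (event name, status)


@dataclass
class BatchSummaryRecord:
    """Hypothetical stand-in for a record of the input_batch_summary entity."""
    created_at: datetime
    status: str


def on_summary_received(now: datetime) -> BatchSummaryRecord:
    # The batch goes to OPEN regardless of the startTime/endTime in the payload.
    return BatchSummaryRecord(created_at=now, status="OPEN")


def timeout_event(record: BatchSummaryRecord, now: datetime) -> Optional[Event]:
    # TIMEOUT is measured from the record creation time, not from startTime.
    if now - record.created_at >= TIMEOUT:
        return ("CarolSummaryValidated", "TIMEOUT")
    return None


def pipelines_event(previous_status: str, pipelines_executed: int) -> Optional[Event]:
    # NO_PIPELINE_EXECUTED is only sent when the batch was previously READY
    # and no pipeline ran for it.
    if previous_status == "READY" and pipelines_executed == 0:
        return ("CarolPipelinesExecutionSummary", "NO_PIPELINE_EXECUTED")
    return None


received_at = datetime(2024, 1, 2, 12, 0, 0, tzinfo=timezone.utc)
record = on_summary_received(received_at)
early = timeout_event(record, received_at + timedelta(minutes=5))
late = timeout_event(record, received_at + timedelta(minutes=30))
```

Keeping the two decisions in separate functions mirrors the two bullets: one depends only on the record creation time, the other only on the previous status and on whether any pipeline ran.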
This issue was automatically transitioned to TESTED & MERGED, as its PR was just merged into develop branch in Github. PR Approved by Damore,douglascoimbra.
This issue was automatically transitioned to QA REVIEW, as its PR was just approved in Github.
Invalid table sending data + summary
DUPLICATED DELIVERED MESSAGES FIXED ALSO
Good morning, everyone. Card validated with the reported scenario
• :rotating_light: Update: observability events arriving duplicated/out of delivery order
Sent by Slack - platform-internal - Douglas Coimbra Lopes
@Gabriel DAmore Marciano @Geny Isam Hamud Herrera Retesting the fixed scenario
IT IS FIXED
SUMMARY SENT
@Geny Isam Hamud Herrera @Gabriel DAmore Marciano
TEST FAILED
1. When an intake is sent and the batch is submitted 30 minutes later, the batch is updated to READY and is then set back to TIMEOUT
2. Once a batch is submitted, after 30 minutes, it is not updated to TIMEOUT
@Gabriel DAmore Marciano TESTING FIXES
HAPPY PATH PASSED
TRYING TO SUBMIT AN OPEN/READY SUMMARY PASSED
TRYING TO SUBMIT A TIMEOUT_PROCESSED/TIMEOUT_PROCESSING BATCH WITHOUT SUMMARY (NO OPEN STATUS) PASSED
TRYING TO SUBMIT A TIMEOUT BATCH WITH SUMMARY SUBMITTED PASSED
Flag removed
Fix error reported @Douglas Coimbra Lopes
@Geny Isam Hamud Herrera @Gabriel DAmore Marciano THE EVIDENCE BELOW IS BASED ON THE LAST COMMIT ID: https://github.com/totvslabs/mdm/commit/0c829809286030bec4d1ff3f50004c317009fc3e
SCENARIO 1: TEST FAILED FOR THE HAPPY PATH
INTAKE SENT TO A BATCH: HAPPY BATCH
SUMMARY ERROR 409
SCENARIO FOR THE DUPLICATED STATUS FIXED
TIMEOUT_PROCESSING/TIMEOUT_PROCESSED BATCHES ARE FIXED: THEY CAN NOW BE REPROCESSED
SUMMARY WITH PAST DATE
SUMMARY WITH START DATE PAST MORE THAN 7 DAYS
SENDING SUMMARY + INTAKE TO A TABLE THAT IS NOT PART OF THE SQL PIPELINE
@Renan Schroeder ,
This issue was planned to be delivered by 2024-01-02. You can check this in the Due Date field of the issue.
Dates already planned for this issue: 2024-01-02
If External Issue Link field is filled, customer was also informed on JIRA TOTVS.