When BQ provisioning fails, the same error message appears twice

Description

Issue Component(s) on Oct 27, 2023, 4:13:34 PM: CONNECTOR | DATA_INGESTION

Issue Component(s) on Oct 25, 2023, 12:13:10 AM: CONNECTOR | DATA_INGESTION

PRDE - Bug default text according to the team DoR (Definition of Ready)

01 - PERSON OF CONTACT (PERSON THAT CAN ANSWER QUESTIONS ABOUT THE PROBLEM):
02 - PROBLEM (WHAT'S THE ISSUE?):

Data loss was detected when data was sent to a tenant while that tenant was running the provisioning flow.

Query to get the missing records:

select createdReportDateTime, clockinsStruct.*
  from (select * from `labs-app-mdm-production.a_poffo.divergences_clockinrecords` where createdReportDateTime = "2023-10-24 23:34:57.112457 UTC") c
  left join `labs-app-mdm-production.clockin.tenants` t on t.tenantId = c.tenantId
  where timestamp(clockinsStruct.cloockinDateTime) < "2023-10-24" and clockinsStruct.cloockinDateTime > "2023-10-23"

Report: https://docs.google.com/spreadsheets/d/16V5cH3MxLxh2uEoXGnT9tt1CXqRuCwCZ-IX8bghZwEA/edit?usp=sharing

Task provisioning BQ: https://ksi.carol.ai/ksi/carol-ui/tasks/activity/d5d65a2da88b4216a6c58acf684d7634?p=3&ps=100&sort=dateUpdated&order=DESC&filters=%5B%7B%22hideInternal%22:%22true%22%7D%5D

The table ended up as INACTIVE, but it is still accepting ingestion of new data:

After the investigation, the missing data was sent to Carol again (bqInsertFlow):

03 - STEPS TO REPRODUCE (STEP (1...N), VIDEO, SCREENSHOTS, LOGS FOLDER, HEARTBEAT, ETC. – IF IS NOT POSSIBLE TO REPRODUCE EXPLAIN THE REASON):
04 - LINKS (ADD A LINK TO THE BUG OR TO THE TENANT):
05 - EXPECTED BEHAVIOR (LIST THE EXPECTED BEHAVIORS TO CONSIDER THIS BUG AS DONE):

  • Do not accept data ingestion during the BQ provisioning flow
    • Return an error message indicating that the tenant is in the BQ provisioning flow (HTTP 400).
  • @Geny Isam Hamud Herrera Regarding the AC above, I believe it came from Product. However, these are the hot topics we created for this card after the investigation:
  • Check why, when the Provision task fails, the steps that change Staging and Datamodel to FAILED_TO_SYNC are duplicated.
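The first expected behavior above (reject ingestion while the tenant is in the BQ provisioning flow and answer with a 400) could be sketched roughly as follows. This is only an illustration: `IngestionGate`, its method names, and the in-memory set of tenant ids are hypothetical and do not come from the Carol code base.

```python
from dataclasses import dataclass, field
from http import HTTPStatus


@dataclass
class IngestionGate:
    """Hypothetical helper that tracks tenants currently in the BQ provisioning flow."""

    provisioning: set = field(default_factory=set)

    def begin_provisioning(self, tenant_id: str) -> None:
        self.provisioning.add(tenant_id)

    def end_provisioning(self, tenant_id: str) -> None:
        self.provisioning.discard(tenant_id)

    def ingest(self, tenant_id: str, records: list) -> tuple:
        """Return an (HTTP status, message) pair instead of silently accepting data."""
        if tenant_id in self.provisioning:
            return (
                HTTPStatus.BAD_REQUEST.value,
                f"Tenant {tenant_id} is in the BQ provisioning flow; retry later.",
            )
        # Real ingestion would happen here; the sketch only acknowledges the batch.
        return (HTTPStatus.OK.value, f"Accepted {len(records)} record(s).")
```

With a gate like this, the sender gets an explicit error and can retry after provisioning ends, rather than having records dropped without notice.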

Findings

We were not able to find the code path that led to this condition by reading the code (there are too many possibilities). Hence we are adding logs and preventing the task from entering a final state (CANCELED or FAILED) twice. The logs will allow us to identify exactly which conditions led to the error.
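A minimal sketch of the guard described in these findings, assuming a task object with a mutable status. The names `ProvisionTask`, `finish`, and `final_steps_run` are hypothetical, and running the same follow-up steps for both CANCELED and FAILED is an assumption made only for illustration; the actual change is in the PR for CAPL-4936.

```python
import logging
from enum import Enum

logger = logging.getLogger("provision_task")


class TaskStatus(Enum):
    RUNNING = "RUNNING"
    CANCELED = "CANCELED"
    FAILED = "FAILED"


FINAL_STATES = {TaskStatus.CANCELED, TaskStatus.FAILED}


class ProvisionTask:
    """Hypothetical task that may enter a final state at most once."""

    def __init__(self, task_id: str) -> None:
        self.task_id = task_id
        self.status = TaskStatus.RUNNING
        self.final_steps_run = 0  # how many times the follow-up steps fired

    def finish(self, new_status: TaskStatus, reason: str) -> bool:
        """Move to CANCELED or FAILED once; log and ignore any repeated attempt."""
        if new_status not in FINAL_STATES:
            raise ValueError(f"{new_status} is not a final state")
        if self.status in FINAL_STATES:
            # The log records who tried to finish the task again and why.
            logger.warning(
                "Task %s already %s; ignoring transition to %s (%s)",
                self.task_id, self.status.value, new_status.value, reason,
            )
            return False
        logger.info("Task %s: %s -> %s (%s)",
                    self.task_id, self.status.value, new_status.value, reason)
        self.status = new_status
        self._run_final_steps()
        return True

    def _run_final_steps(self) -> None:
        # Placeholder for changing Staging and Datamodel to FAILED_TO_SYNC.
        self.final_steps_run += 1
```

The sketch is single-threaded. If two workers can finish the same task concurrently, the status check and the update would need to be atomic, for example through a lock or a conditional update in the task store.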

Activity

Automation for Jira 27 February 2024, 22:00 Jira Internal Users

This issue was automatically transitioned to REGRESSION, as its PR was just merged into qa branch in Github.

Automation for Jira 23 February 2024, 18:31 Jira Internal Users

This issue was automatically transitioned to TESTED & MERGED, as its PR was just merged into develop branch in Github. PR Approved by cindysoares,olivandre,douglascoimbra.

Automation for Jira 23 February 2024, 18:31 Jira Internal Users

Github user douglascoimbra has just approved a PR (added as Shard Assignee in this Jira issue).

chore: CAPL-4936 Done logs to see if ready task status is changed

Automation for Jira 23 February 2024, 18:27 Jira Internal Users

Github user olivandre has just approved a PR (added as Shard Assignee in this Jira issue).

chore: CAPL-4936 Done logs to see if ready task status is changed

Automation for Jira 23 February 2024, 18:08 Jira Internal Users

This issue was automatically transitioned to QA REVIEW, as its PR was just approved in Github.

Automation for Jira 23 February 2024, 17:23 Jira Internal Users

Github user lucasnoetzold has just committed and the issue was sent back to the REVIEW column.

Automation for Jira 23 February 2024, 15:50 Jira Internal Users

This issue was automatically transitioned to QA REVIEW, as its PR was just approved in Github.

Douglas Coimbra Lopes 22 February 2024, 12:48 Jira Internal Users

@Lucas Noetzold Regression is OK for this branch. Only the code review is pending.

Automation for Jira 21 February 2024, 15:34 Jira Internal Users

This issue was automatically transitioned to REVIEW, as its PR (not DRAFT and not WIP) was just created in Github.

chore: CAPL-4936 Done logs to see if ready task status is changed

Douglas Coimbra Lopes 21 February 2024, 13:09 Jira Internal Users

@Lucas Noetzold There are no scenarios for the QA team to test. cc @Gabriel DAmore Marciano @Jonathan Willian Moraes

Automation for Jira 20 February 2024, 23:15 Jira Internal Users

This issue was automatically transitioned to REVIEW, as its PR (not DRAFT and not WIP) was just created in Github.

chore: CAPL-4936 Done logs to see if ready task status is changed

Automation for Jira 12 February 2024, 19:44 Jira Internal Users

@Robson Thanael Poffo ,
@Gabriel DAmore Marciano , @Lucas Noetzold , @André Pereira de Oliveira , @Cindy de Araujo Soares Moore , @Lucas Noetzold

This issue was planned to be delivered by 2024-03-04. You can check this in the issue's Due Date field.

Dates already planned for this issue: 2024-02-12, 2024-03-04

If the External Issue Link field is filled, the customer was also informed on JIRA TOTVS.

Automation for Jira 22 January 2024, 20:33 Jira Internal Users

@Robson Thanael Poffo ,
@Jonathan Willian Moraes , @Cindy de Araujo Soares Moore

This issue was planned to be delivered by 2024-02-12. You can check this in the issue's Due Date field.

Dates already planned for this issue: 2024-02-12

If the External Issue Link field is filled, the customer was also informed on JIRA TOTVS.

Gabriel DAmore Marciano 13 December 2023, 16:33 Jira Internal Users

Trade off with card

Gabriel DAmore Marciano 27 November 2023, 20:27 Jira Internal Users

Removing from sprint [CAPL_3.86] because of red-phone trade offs.