heavy Elasticsearch config queries in tasks-sql - Part 1
Description
CAPL - Bug default text according to the team DoR (Definition of Ready)
01 - PERSON OF CONTACT (PERSON THAT CAN ANSWER QUESTIONS ABOUT THE PROBLEM):
@henrique.cavarsan
02 - PROBLEM (WHAT'S THE ISSUE?):
The Elasticsearch config queries need to be optimized; they are too heavy and are affecting ES latency.
These queries are executed by the worker-sql in task-sql, and here is the log:
https://cloudlogging.app.goo.gl/uZ3h9AUotiE7mUSJ8
If possible, these queries should avoid using config-*.
03 - STEPS TO REPRODUCE (STEP (1...N), VIDEO, SCREENSHOTS, LOGS FOLDER, HEARTBEAT, ETC. – IF IS NOT POSSIBLE TO REPRODUCE EXPLAIN THE REASON):
04 - LINKS (ADD A LINK TO THE BUG OR TO THE TENANT):
https://cloudlogging.app.goo.gl/uZ3h9AUotiE7mUSJ8
05 - EXPECTED BEHAVIOR (LIST THE EXPECTED BEHAVIORS TO CONSIDER THIS BUG AS DONE):
We need to do this for this card:
- When a SQL task starts in a unified tenant scenario, we search for all TenantApps that match the CarolAppName. Then, for each TenantApp found, we look up its Tenant by the TenantApp's TenantId, just to determine which of those TenantApps belong to the Unified Tenant, so that we can prepare the fanout and the subscriptions for the data processed by the pipeline.
// BigQueryProcessTaskProcessor.java:L428
if (shouldFanOut(fanOut, tenant)) {
  allowedCustomerTenantIDs =
      MdmSingletonManager.getInstance(TenantAppService.class).getAllowedCustomerTenantIDs(uad);
  if (CollectionUtils.isNotEmpty(customerTenantIds)
      && CollectionUtils.isNotEmpty(allowedCustomerTenantIDs)) {
    allowedCustomerTenantIdsIntersect =
        SetUtils.intersection(new HashSet<>(customerTenantIds), allowedCustomerTenantIDs);
  }
}
However, the tenant IDs that should receive the fanout/subscriptions are obtained by fetching all TenantApps whose mdmName matches the CarolApp name from the unified tenant.
// TenantAppServiceImpl.java:L3194
public Set<String> getAllowedCustomerTenantIDs(UserAccessDetails userAccessDetails) {
  var tenant = getTenantService().getTenant(userAccessDetails, userAccessDetails.getTenantId());
  if (Boolean.FALSE.equals(tenant.getMdmIsUnified())) {
    throw new ApplicationException(
        Status.PRECONDITION_FAILED, "List all customer tenants is able only to Tenant Unified");
  }
  TenantApp unifiedTenantApp = this.getUnifiedTenantApp(userAccessDetails);
  List<TenantApp> allTenantApps;
  try {
    allTenantApps = this.getAllByCarolAppName(unifiedTenantApp.getMdmName());
  } catch (RecordNotFoundException e) {
    throw new ApplicationException(
        Status.NOT_FOUND, e.getMessage(), unifiedTenantApp.getMdmName(), e);
  }
  var customerTenants =
      this.getCustomerTenants(unifiedTenantApp.getMdmTenantId(), new ArrayList<>(allTenantApps))
          .stream()
          .map(Tenant::getMdmId)
          .collect(Collectors.toSet());
  // check the tenants install carol app are allowed, when the strategy is HYBRID
  if (!CollectionUtils.containsAny(
      unifiedTenantApp.getMdmPipelineAllowedEnvironments(),
      TenantApp.ALL_TENANTS_WITH_INSTALLED_APP_ALLOWED)) {
    return SetUtils.intersection(
        customerTenants, unifiedTenantApp.getMdmPipelineAllowedEnvironments());
  }
  return customerTenants;
}
- Implement a cache that records which customer tenants are linked to a unified tenant, so that we do not have to search all TenantApps every time we run a pipeline.
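A minimal sketch of what such a cache could look like, assuming a time-based expiry is acceptable for this data. Everything here is hypothetical and not taken from the codebase: the class name CustomerTenantCache, the loader function (which in practice might wrap the existing getAllowedCustomerTenantIDs lookup), the TTL, and the injectable clock used to make expiry testable.

```java
import java.time.Duration;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

/**
 * Hypothetical TTL cache: unified tenant ID -> IDs of its customer tenants.
 * The loader stands in for the expensive Elasticsearch-backed lookup.
 */
final class CustomerTenantCache {

  /** Cached value plus the instant (epoch millis) at which it stops being valid. */
  private static final class Entry {
    final Set<String> tenantIds;
    final long expiresAtMillis;

    Entry(Set<String> tenantIds, long expiresAtMillis) {
      this.tenantIds = tenantIds;
      this.expiresAtMillis = expiresAtMillis;
    }
  }

  private final ConcurrentHashMap<String, Entry> entries = new ConcurrentHashMap<>();
  private final Function<String, Set<String>> loader;
  private final long ttlMillis;
  private final LongSupplier clock;

  CustomerTenantCache(Function<String, Set<String>> loader, Duration ttl, LongSupplier clock) {
    this.loader = loader;
    this.ttlMillis = ttl.toMillis();
    this.clock = clock;
  }

  /** Returns the cached IDs, calling the loader only when the entry is missing or expired. */
  Set<String> get(String unifiedTenantId) {
    long now = clock.getAsLong();
    // compute() runs atomically per key, so concurrent callers for the same
    // unified tenant trigger a single load. The loader must not touch this map.
    Entry entry = entries.compute(unifiedTenantId, (id, current) -> {
      if (current != null && now < current.expiresAtMillis) {
        return current;
      }
      return new Entry(Set.copyOf(loader.apply(id)), now + ttlMillis);
    });
    return entry.tenantIds;
  }

  /** Drops the entry, e.g. when a TenantApp is installed or removed (assumed trigger). */
  void invalidate(String unifiedTenantId) {
    entries.remove(unifiedTenantId);
  }
}
```

With something like this in place, a pipeline run would call get(unifiedTenantId) instead of searching all TenantApps, and only the first call within each TTL window would reach Elasticsearch. Whether a TTL alone is enough, or explicit invalidation is also needed when the allowed environments of the unified TenantApp change, is a design decision left open here.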
This issue was automatically transitioned to WAITING DEPLOY, as its PR was just merged into master branch in Github.
Github user wsneto has just approved a PR (added as Shard Assignee in this Jira issue).
fix: heavy Elasticsearch config queries in tasks-sql
This issue was automatically transitioned to QA REVIEW, as its PR was just approved in Github.
Github user Damore has just committed and the issue was sent back to the REVIEW column.
Github user wsneto has just approved a PR (added as Shard Assignee in this Jira issue).
fix: heavy Elasticsearch config queries in tasks-sql
This issue was automatically transitioned to QA REVIEW, as its PR was just approved in Github.
Github user rfschroeder has just committed and the issue was sent back to the REVIEW column.
Github user wsneto has just approved a PR (added as Shard Assignee in this Jira issue).
fix: heavy Elasticsearch config queries in tasks-sql
Regression validated.
Card validated by QA team.
Github user glaucioscheibel has just approved a PR (added as Shard Assignee in this Jira issue).
fix: heavy Elasticsearch config queries in tasks-sql
This issue was automatically transitioned to QA REVIEW, as its PR was just approved in Github.
This issue was automatically transitioned to REVIEW, as its PR (not DRAFT and not WIP) was just created in Github.
fix: heavy Elasticsearch config queries in tasks-sql
This issue was automatically transitioned to REVIEW, as its PR (not DRAFT and not WIP) was just created in Github.
fix: heavy Elasticsearch config queries in tasks-sql
@Renan Schroeder Let's follow the rule of only moving a card to code review when it is actually ready to be reviewed. Until then, mark the PR as not ready with either:
* WIP on PR title
* Label work-in-progress
This issue was automatically transitioned to REVIEW, as its PR (not DRAFT and not WIP) was just created in Github.
fix: heavy Elasticsearch config queries in tasks-sql
@henrique.cavarsan, @Gabriel DAmore Marciano, @Renan Schroeder
This issue was planned to be delivered by 2024-01-23. You can check this in the issue's Due Date field.
Dates already planned for this issue: 2024-01-23
If the External Issue Link field is filled in, the customer was also informed on JIRA TOTVS.
Message thread link on #red-phone channel:
https://totvscarol.slack.com/archives/C03NT4US9J9/p1704486339577779