EPA Audit - Apache NiFi SDWIS Extraction Tool
Description
|
To meet EPA’s standards for the data audit, the EPA requests that the DEP install and set up the ‘Apache NiFi SDWIS State Data Extraction Tool’. This tool streamlines and accelerates the transfer of a primacy agency's SDWIS State data to the EPA. Apache NiFi is an extract, transform, load (ETL) tool that agencies will use to migrate their SDWIS data to a centralized database hosted on the EPA's network. WSGS is confident the most efficient solution to this problem is to collaborate with GEC, a company with which we already have an established relationship. GEC has experience with this software and can assist DEP in its successful implementation. Additionally, GEC comprises water quality experts from across the country who have worked in various state roles, similar to WSGS's work in NJ. Their detailed understanding of EPA requirements is invaluable. Furthermore, we can potentially leverage an existing contract with GEC for IT services, allowing us to bypass lengthy procurement processes. Although the installation of this tool was not included in the original scope for the current WSGS/GEC contract, WSGS may consider reallocating tasks or utilizing any unused funds to facilitate this work immediately. |
Data was successfully sent as requested by DEP to EPA using this specialized tool. EPA now has the necessary dataset in the requested format for their contracted auditors to perform the data audit analysis.
NJOIT set up a server with the following specifications and granted Mike Matsko Administrator access rights to it:
Java 21: Required for Apache NiFi 2.x.
Windows: Windows Server (64-bit) is supported.
Memory (RAM): 8 GB of RAM is recommended for each node, though actual requirements depend on data throughput and data volume.
CPU: 4 CPU cores are recommended to handle data flow processing efficiently.
Disk Space: Sufficient disk space is needed for repositories (flowfile, content, and provenance). Fast disks (SSD) are recommended for high-performance throughput.
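The Java, CPU, and RAM items above could be spot-checked on the server before installation. The following is only a rough sketch, not part of the EPA tool or of NiFi: the function names are hypothetical, the RAM figure is entered by hand rather than detected, and `os.cpu_count()` reports logical processors, which may differ from physical cores.

```python
import os
import re
import subprocess

# Minimums taken from the specification list above.
MIN_JAVA_MAJOR = 21   # Java 21 is required for Apache NiFi 2.x
MIN_CPU_CORES = 4
MIN_RAM_GB = 8


def parse_java_major(version_output):
    """Return the Java major version from `java -version` output, or None."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    major = int(match.group(1))
    # Legacy releases report e.g. "1.8.0_292" for Java 8.
    if major == 1 and match.group(2) is not None:
        return int(match.group(2))
    return major


def check_prerequisites(java_major, cpu_cores, ram_gb):
    """Return a list of problems found; an empty list means every check passed."""
    problems = []
    if java_major is None or java_major < MIN_JAVA_MAJOR:
        problems.append(f"Java {MIN_JAVA_MAJOR}+ required, found {java_major}")
    if cpu_cores is None or cpu_cores < MIN_CPU_CORES:
        problems.append(f"{MIN_CPU_CORES}+ CPU cores recommended, found {cpu_cores}")
    if ram_gb < MIN_RAM_GB:
        problems.append(f"{MIN_RAM_GB}+ GB RAM recommended, found {ram_gb}")
    return problems


def detect_java_major():
    """Run `java -version` and parse its output; None if Java is not on PATH."""
    try:
        result = subprocess.run(["java", "-version"], capture_output=True, text=True)
    except FileNotFoundError:
        return None
    # The version banner is normally written to stderr; check both streams.
    return parse_java_major(result.stderr + result.stdout)


if __name__ == "__main__":
    installed_ram_gb = 8  # hypothetical value; replace with the server's actual RAM
    found = check_prerequisites(detect_java_major(), os.cpu_count(), installed_ram_gb)
    for problem in found:
        print("CHECK:", problem)
    if not found:
        print("All checked prerequisites met.")
```

Disk space is left out on purpose, since the specification gives no numeric minimum for the repositories.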
This is all Windows Server work. GEC will not be able to touch the DEP server, so a DEP staff member will do all the setup and configuration with GEC's guidance. This will end up being a permanent support assignment for that person. We previously tried having GEC lead the setup of the new SDWIS application and servers, and Rich Hyjack ended up spending weeks on it, which is not a viable option here. Mike Matsko, Mike Kusmiesz, and Jim Bridgewater will meet to discuss options.