EPA Audit - Apache NiFi SDWIS Extraction Tool

Description

To meet EPA’s standards for the data audit, the EPA requests that the DEP install and set up the ‘Apache NiFi SDWIS State Data Extraction Tool’. This tool streamlines and accelerates the transfer of a primacy agency's SDWIS State data to the EPA. Apache NiFi is an extract, transform, load (ETL) tool that agencies will use to migrate their SDWIS data to a centralized database hosted on the EPA's network.

WSGS is confident the most efficient solution to this problem is to collaborate with GEC, a company with which we already have an established relationship. GEC has experience with this software and can assist DEP in its successful implementation. Additionally, GEC comprises water quality experts from across the country who have worked in various state roles, similar to WSGS's work in NJ. Their detailed understanding of EPA requirements is invaluable.

Furthermore, we can potentially leverage an existing contract with GEC for IT services, allowing us to bypass lengthy procurement processes. Although the installation of this tool was not included in the original scope for the current WSGS/GEC contract, WSGS may consider reallocating tasks or utilizing any unused funds to facilitate this work immediately.

Problem Statement

The Environmental Protection Agency's Drinking Water and Ground Water Protection Section recently informed Water Supply & Geoscience (WSGS) that the Department of Environmental Protection (DEP) must start planning for a file review audit scheduled for April 2026. The EPA requests that the DEP begin preparations for this effort.

Project Justification

The project must be completed to ensure DEP complies with federal regulations, including cooperation with data audits.

Estimated Transactions

None

Created

22 January 2026, 08:09

Target Rollout Date

1 April 2026

Target Rollout Date Reason

EPA has requested that DEP begin conducting the data audit with them sometime in April.

Updated

25 March 2026, 13:32

Attachments

Activity

Mike.Kusmiesz 25 March 2026, 14:02

The data was successfully sent from DEP to EPA, as requested, using this specialized tool. EPA now has the necessary dataset in the requested format for their contracted auditors to perform the data audit analysis.

james bridgewater 3 February 2026, 12:58

NJOIT set up a server with the following specifications and granted Mike Matsko Administrator access rights to it:

  • Java 21: Required for Apache NiFi 2.x.

  • Windows: Windows Server (64-bit) is supported.

  • Memory (RAM): 8 GB of RAM is recommended for each node, though actual requirements depend on data throughput and data volume.

  • CPU: 4 CPU cores are recommended to handle data flow processing efficiently.  

  • Disk Space: Sufficient disk space is needed for repositories (flowfile, content, and provenance). Fast disks (SSD) are recommended for high-performance throughput. 
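The specifications above could be spot-checked before installing NiFi. The following is a minimal sketch only, not part of the EPA tool or of NiFi itself: the function names and constants are hypothetical, and the observed values (Java version output, core count, RAM) are assumed to be gathered by hand or by a separate script on the server.

```python
import re

# Thresholds copied from the specification list above.
# All names in this sketch are hypothetical, chosen for illustration.
MIN_JAVA_MAJOR = 21   # required for Apache NiFi 2.x
MIN_CPU_CORES = 4     # recommended
MIN_RAM_GB = 8        # recommended per node


def parse_java_major(version_output: str) -> int:
    """Extract the Java major version from `java -version` style output."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        raise ValueError("no Java version string found")
    major = int(match.group(1))
    # Legacy numbering: Java 8 and earlier report themselves as "1.8.0_x".
    if major == 1 and match.group(2) is not None:
        major = int(match.group(2))
    return major


def check_server(java_major: int, cpu_cores: int, ram_gb: float) -> list[str]:
    """Return a list of human-readable problems; an empty list means all checks pass."""
    problems = []
    if java_major < MIN_JAVA_MAJOR:
        problems.append(f"Java {java_major} found; Java {MIN_JAVA_MAJOR} is required")
    if cpu_cores < MIN_CPU_CORES:
        problems.append(f"{cpu_cores} CPU cores found; {MIN_CPU_CORES} are recommended")
    if ram_gb < MIN_RAM_GB:
        problems.append(f"{ram_gb} GB RAM found; {MIN_RAM_GB} GB is recommended")
    return problems


if __name__ == "__main__":
    # Example values only; replace with what the server actually reports.
    java_major = parse_java_major('openjdk version "21.0.2" 2024-01-16')
    for problem in check_server(java_major, cpu_cores=4, ram_gb=8):
        print(problem)
```

Disk space is left out of the check because the list above gives no specific minimum for it.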

james bridgewater 26 January 2026, 19:20

This is all Windows Server work. GEC will not be able to touch the DEP server, so a DEP person will do all the setup and configuration with GEC guidance. This will end up being a permanent support assignment for that DEP person. We tried to have GEC lead the setup for the new SDWIS application and servers, and it ended up with Rich Hyjack spending weeks on it, which is not a viable option here. Mike Matsko, Mike Kusmiesz, and Jim Bridgewater to meet to discuss options.