
Nonprofit Transforms Affiliate Financial Oversight with Automated Form 990 Data Lake in 4 Weeks
Nonprofit partners with Protagona to build AWS-powered data lake, delivering 5x performance improvements and self-service analytics for 1,000+ affiliates
Tech & Tools
AWS Fargate, Step Functions, Glue, S3, DynamoDB, Lake Formation, QuickSight, Amazon Q, Athena, Amazon Bedrock (Claude 3.5), ProPublica API
The Vision
A nonprofit set out to transform how financial oversight works across a network of 1,000+ affiliates—replacing hours of manual IRS Form 990 analysis with a centralized, automated system that delivers real-time performance visibility and data integrity at scale.
The Goal
This nonprofit needed to consolidate financial oversight for 1,000+ affiliates into a single automated platform—reducing hours of manual Form 990 analysis to zero, establishing systematic data verification against official IRS filings, and giving leadership an always-current view of affiliate financial health.
The Challenge
The nonprofit needed visibility into financial health across 1,000+ affiliate organizations. Staff spent hours manually downloading and analyzing IRS Form 990 tax documents, with no centralized view of affiliate performance and no systematic way to verify reported data against official filings.
Technical Complexity:
- Processing ~3 million IRS XML files monthly to find relevant records
- Unreliable IRS index files requiring full ZIP processing
- Form 990, 990-EZ, and 18+ schedule types with inconsistent XML schemas
- 400+ cryptic XML columns requiring transformation for usability
- Self-service analytics needed for non-technical users
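To illustrate why unreliable index files force full ZIP processing, here is a minimal Python sketch of scanning every XML member of an archive and keeping only filings whose filer EIN is on an affiliate list. It is not Protagona's implementation: the function names, the affiliate EIN set, and the assumption that the filer's EIN sits in an `EIN` element under a `Filer` element are all illustrative.

```python
import zipfile
import xml.etree.ElementTree as ET
from typing import Dict, Optional, Set


def local_name(tag: str) -> str:
    """Strip an XML namespace, e.g. '{http://www.irs.gov/efile}EIN' -> 'EIN'."""
    return tag.rsplit("}", 1)[-1]


def find_filer_ein(xml_bytes: bytes) -> Optional[str]:
    """Return the EIN found under the first Filer element, or None.

    Assumption: filings carry the filer's EIN as Filer/EIN. Unparseable
    files are skipped rather than raising.
    """
    try:
        root = ET.fromstring(xml_bytes)
    except ET.ParseError:
        return None
    for elem in root.iter():
        if local_name(elem.tag) != "Filer":
            continue
        for child in elem:
            if local_name(child.tag) == "EIN" and child.text:
                return child.text.strip()
    return None


def scan_zip(zip_file, affiliate_eins: Set[str]) -> Dict[str, str]:
    """Read every .xml member of the archive and map matching names to EINs.

    zip_file may be a path or a binary file-like object. No index file is
    consulted, so every member is opened and parsed.
    """
    matches: Dict[str, str] = {}
    with zipfile.ZipFile(zip_file) as archive:
        for name in archive.namelist():
            if not name.lower().endswith(".xml"):
                continue
            ein = find_filer_ein(archive.read(name))
            if ein is not None and ein in affiliate_eins:
                matches[name] = ein
    return matches
```

Calling `scan_zip("filings.zip", {"123456789"})` (a hypothetical path and EIN) would return the member names of any matching filings. Parsing every file is slower than trusting an index, but it avoids missing records that the index omits or mislabels.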
The Solution
Protagona built an AWS-powered medallion data lake that automatically ingests, cleans, and surfaces IRS Form 990 data across 1,000+ affiliates. AI-powered transformation via Amazon Bedrock converts 400+ cryptic XML fields into plain-language insights, while QuickSight dashboards and Amazon Q deliver self-service analytics with no SQL required. ETL performance improved 5x, from 30 minutes to just over 5 minutes.
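As a rough picture of the field-renaming step, the Python sketch below applies a mapping from raw XML field names to plain-language labels. In the project described here that mapping was generated with AI; in this sketch it is a small hand-written dictionary, and the field names, labels, and function name are illustrative assumptions rather than the actual schema or code.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical mapping from raw field names to readable labels.
FRIENDLY_NAMES: Dict[str, str] = {
    "CYTotalRevenueAmt": "Total revenue (current year)",
    "CYTotalExpensesAmt": "Total expenses (current year)",
    "TotalAssetsEOYAmt": "Total assets (end of year)",
}


def rename_fields(
    record: Dict[str, Any], mapping: Dict[str, str]
) -> Tuple[Dict[str, Any], List[str]]:
    """Rename the keys of one record and report any keys without a mapping.

    Unmapped keys are kept under their raw name so no data is dropped.
    """
    renamed: Dict[str, Any] = {}
    unmapped: List[str] = []
    for key, value in record.items():
        if key in mapping:
            renamed[mapping[key]] = value
        else:
            renamed[key] = value
            unmapped.append(key)
    return renamed, unmapped
```

Returning the unmapped keys alongside the renamed record makes it easy to see which fields still need a plain-language label before they reach a dashboard.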
