
Extraction and analysis of non-profit organization data
using big data engineering and data collection tools

About The Client

The project was implemented for a startup based in the US.


About this project

Project Description:

There are thousands of non-profit organizations in the USA, all of them tax-exempt. Twice per year, however, they fill out a number of tax-related information forms. The forms are produced by US government agencies and are open to the public. Once filled out, they are digitized and stored in an Amazon S3 bucket, so the data is publicly available. The project parses the XML versions of these forms and gathers data from them.

Our pipeline focuses on collecting useful information about the staff of these organizations, such as salary, working hours per week, titles, emails, and phone numbers. It can also filter the data at the organization level to extract information for each non-profit, such as the number of employees, their contact information, and the organization's revenue. The final aim of the project is to build a talent pool of non-profit staff members and provide this data to interested parties.
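To make the parsing step concrete, here is a minimal sketch of how one XML form might be turned into staff-level and organization-level records. It is an illustration under stated assumptions, not the project's actual code: the sample document, every tag name (Filing, Organization, Person, HoursPerWeek, and so on), and the helper functions text_of and parse_filing are hypothetical. Real forms follow their own schema and typically use XML namespaces.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified filing. Real forms use their own schema,
# namespaces, and tag names; every tag below is an assumption.
SAMPLE_XML = """
<Filing>
  <Organization>
    <Name>Example Foundation</Name>
    <TotalRevenue>250000</TotalRevenue>
  </Organization>
  <Staff>
    <Person>
      <Name>Jane Doe</Name>
      <Title>Executive Director</Title>
      <HoursPerWeek>40</HoursPerWeek>
      <Compensation>90000</Compensation>
    </Person>
    <Person>
      <Name>John Roe</Name>
      <Title>Treasurer</Title>
      <HoursPerWeek>5</HoursPerWeek>
    </Person>
  </Staff>
</Filing>
"""


def text_of(parent, tag, default=None):
    """Return the stripped text of a child element, or a default if absent."""
    child = parent.find(tag)
    if child is None or child.text is None:
        return default
    return child.text.strip()


def parse_filing(xml_string):
    """Parse one filing into an organization record plus a list of staff records."""
    root = ET.fromstring(xml_string)
    org = root.find("Organization")
    org_name = text_of(org, "Name")

    # Staff level: one dictionary per person listed in the filing.
    staff = []
    for person in root.iter("Person"):
        staff.append({
            "organization": org_name,
            "name": text_of(person, "Name"),
            "title": text_of(person, "Title"),
            # Missing numeric fields fall back to 0 in this sketch.
            "hours_per_week": float(text_of(person, "HoursPerWeek", 0)),
            "compensation": float(text_of(person, "Compensation", 0)),
        })

    # Organization level: summary fields for the non-profit itself.
    organization = {
        "name": org_name,
        "total_revenue": float(text_of(org, "TotalRevenue", 0)),
        "staff_count": len(staff),
    }
    return organization, staff


organization, staff = parse_filing(SAMPLE_XML)
print(organization)
for row in staff:
    print(row)
```

Records shaped like these, one dictionary per person and one per organization, could then be loaded into Pandas for filtering or written to Postgres tables. How the real pipeline structures its records is not described here.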


  • AWS, Python, Scrapy, Pandas, Postgres.
