ML - NLP

Business Challenge

For commercial transactions, taxes must be filed at several levels, such as state, county, and city. In addition, each state can have special tax jurisdictions that apply to entities such as fire, police, health, library, and transportation. The boundaries of special tax jurisdictions can be arbitrary and may not be related to any other geographic boundary, such as city limits or zip codes.

Since there is no standardized master data set for these jurisdictions across government agencies and the industry, extracting and mapping these special tax jurisdiction (STJ) elements is very error-prone and requires a lot of manual effort.

Our client, an enterprise financial services company, needed a way to extract the jurisdictions from various reports and invoices, and map them to its own target master set so it could apply taxes correctly, without extensive manual work.

EQengineered Approach

The EQengineered data team developed an approach to identify special jurisdictions quickly and accurately from various source systems and map them to a target set, with minimal manual intervention. Some of the highlights of the approach were:

• Use of NLP and fuzzy logic to automatically map from one set of jurisdiction names to another.

• Use of synonym dictionaries, weighting criteria, and state-specific business rules to improve the overall accuracy of the mapping process.

• Enabling a high degree of configurability to adapt the solution to different sources of data.
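
The highlights above can be illustrated with a minimal sketch. It is not the client solution: the synonym dictionary, the weights, the threshold, the sample master set, and every function name below are hypothetical, and the standard-library difflib module stands in for whatever fuzzy-matching technique was actually used. The sketch normalizes names through a synonym dictionary, blends a token-overlap score with a character-level similarity score, limits candidates to the source record's state, and leaves low-scoring names unmapped for manual review.

```python
import re
from difflib import SequenceMatcher

# Hypothetical synonym dictionary: abbreviations mapped to canonical wording.
SYNONYMS = {
    "dist": "district",
    "dst": "district",
    "fd": "fire district",
    "trans": "transportation",
    "lib": "library",
}

# Hypothetical weighting criteria; in practice these would be configurable.
TOKEN_WEIGHT = 0.6
CHAR_WEIGHT = 0.4

# Hypothetical target master set, keyed by state.
SAMPLE_MASTER = {
    "CO": [
        "Aurora Fire Protection District",
        "Regional Transportation District",
        "Arapahoe Library District",
    ],
}


def normalize(name):
    """Lowercase, drop punctuation, and expand synonyms token by token."""
    tokens = re.findall(r"[a-z0-9]+", name.lower())
    expanded = " ".join(SYNONYMS.get(token, token) for token in tokens)
    return expanded.split()


def score(source, target):
    """Weighted blend of token overlap (Jaccard) and character similarity."""
    s_tokens, t_tokens = normalize(source), normalize(target)
    union = set(s_tokens) | set(t_tokens)
    token_sim = len(set(s_tokens) & set(t_tokens)) / len(union) if union else 0.0
    char_sim = SequenceMatcher(None, " ".join(s_tokens), " ".join(t_tokens)).ratio()
    return TOKEN_WEIGHT * token_sim + CHAR_WEIGHT * char_sim


def map_jurisdiction(source, state, master, threshold=0.75):
    """Return (best_target, score); best_target is None below the threshold."""
    best, best_score = None, 0.0
    # Simple state-specific rule: only compare against the same state's names.
    for target in master.get(state, []):
        candidate_score = score(source, target)
        if candidate_score > best_score:
            best, best_score = target, candidate_score
    if best_score < threshold:
        return None, best_score  # leave for manual review
    return best, best_score


if __name__ == "__main__":
    for raw in ["Regional Trans. Dist", "Arapahoe Lib Dst", "Mosquito Control Board"]:
        print(raw, "->", map_jurisdiction(raw, "CO", SAMPLE_MASTER))
```

Keeping the synonyms, weights, and threshold as plain data rather than logic is one way to get the configurability described above, since each new data source could supply its own values without code changes.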

Business Results/Outcomes

The solution achieved a mapping accuracy greater than 90% for most of the states that appeared in the source data sets.

This work provided a foundation for building a production-level solution, including integration with multiple internal and external tax systems, periodic refresh of the special jurisdiction dictionary and configurations, and the mapping of business rules.