Data Integrity (Model Validation)

It’s all about the data

Over the last 15 to 20 years, automated transaction monitoring systems have spread from the realm of the megabanks to all but the smallest institutions. They are a key resource for your institution’s AML/Anti-Fraud/Sanctions compliance program. As these systems have improved and become more commonplace, one element has remained unchanged: the key to their effectiveness is the quality and completeness of the data they receive, and how well the monitoring system uses that data. When assessing effectiveness, start by looking at what information your AML system can utilize to maximize its detection and reporting capabilities; it’s all about the data.

These monitoring systems all have something in common: they apply their “magic” to transaction and customer account data. For them to be as successful as possible, it is imperative to provide them with all of the information they can use. A review of your institution’s AML monitoring system implementation project plan will provide lists of the data elements your AML system can utilize. The review will also identify the specific banking application systems from which the data is extracted (e.g. host, teller, ACH, ATM, wire).

Your financial institution’s information technology or operations department, core service provider (Fiserv, FIS, Jack Henry, etc.), or AML automation vendor facilitates the Extract, Transform, and Load (ETL) process that moves transactional and customer information to your AML system. This information is extracted from any or all of your numerous application systems: core/host, trust, loan, teller, wire, SWIFT, ACH, ATM, monetary instruments, remote deposit capture, mobile banking, safe deposit, customer relationship management, and online banking. As this list suggests, the data management and integration implications can be daunting.

Considering how the data in all of these application systems interacts, along with any manual processes in your institution, how do you ensure that your AML system is receiving all of the data it can use, and that both the integration processes and the data itself have integrity?

From a data integrity and completeness perspective it is critical to perform a gap analysis to assess the following areas for potential disparities:

  • Do your institution’s banking application systems capture all the data your AML monitoring system could utilize? Not all banking applications (core, teller, wire, etc.) are created equal. Older and less expensive systems often lack functionality and consequently lack AML-related data elements. (See examples 1, 2, 3, 4, 5, and 7.)
  • Does your AML system’s automated or manual data extract process capture all of the needed information that is available from your banking application systems? (See examples 2, 3, 4, 5, 6, and 7.)
  • Does your operations department verify that all of the nightly extract files have been successfully uploaded, or can the AML monitoring system itself detect missing source data extract files? (See example 9.)
  • Does your institution utilize any manual/paper processes containing AML/Anti-fraud/Sanctions information that is not being uploaded to (keyed into) the AML monitoring system? (See examples 1, 2, and 9.)
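
The first two questions above can be sketched as simple set comparisons. The following is a minimal illustration only, assuming you can list the data elements your AML system can use, the elements each source application captures, and the elements the extract process actually delivers. All element names, source names, and the `find_data_gaps` helper are hypothetical, not taken from any vendor’s product.

```python
from typing import Dict, List, Set, Tuple


def find_data_gaps(
    usable: Set[str],
    captured_by_source: Dict[str, Set[str]],
    extracted: Set[str],
) -> Tuple[List[str], List[str]]:
    """Compare what the AML system can use against what it actually receives.

    Returns two sorted lists:
      1. elements no source application captures at all
      2. elements captured by some source but missing from the extract
    """
    # Everything captured by at least one banking application.
    captured = set().union(*captured_by_source.values())

    not_captured = usable - captured
    not_extracted = (usable & captured) - extracted
    return sorted(not_captured), sorted(not_extracted)


# Hypothetical inventory, for illustration only.
usable = {"sec_code", "atm_location", "rdc_ip_address", "wire_originator"}
captured_by_source = {
    "ach": {"sec_code"},
    "wire": {"wire_originator"},
    "rdc": {"rdc_ip_address"},
}
extracted = {"wire_originator", "rdc_ip_address"}

not_captured, not_extracted = find_data_gaps(usable, captured_by_source, extracted)
print("Not captured by any source:", not_captured)
print("Captured but not extracted:", not_extracted)
```

In this made-up inventory, the ATM location is not captured anywhere (a source application gap), while the SEC code is captured by the ACH system but never reaches the monitoring system (an extract gap).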

The majority of financial institutions rely upon the integration capabilities of their AML platform vendor.  Some institutions have built data hubs or data warehouses to assist the AML vendors with the ETL process to the monitoring system.

When gaps are identified, you need to determine whether the missing data is available elsewhere in the institution and, if so, where. If the missing data is not all available internally, is it available from an outside source? Once you have gathered the missing information, processes should be developed to key it in or upload it to the monitoring system, if practical.

Let’s look at some examples and the ramifications of common situations in the financial institution industry where data is:

  • lacking entirely from the source application
  • in the source application, but is not imported to the AML monitoring system
  • not importing due to an interruption in a manual process or standard automated operating procedure (e.g. a power outage at a branch teller location, or an end-of-day Fedline wire file not being created)

  1. Monetary instruments - not all teller systems can create a monetary instrument. Some monetary instruments are still created manually and the record is kept on a spreadsheet. In this case your AML system would be completely blind to a customer’s purchase of large quantities of monetary instruments at high dollar values.
  2. CTRs - not all teller systems can create a currency transaction report. CTRs are often generated by keying the information into a template that is not an imported data source. This process creates a challenge for tracking the number of CTRs filed for a specific customer, as well as a lack of notation on the customer record that they were a party or a conductor on one or multiple CTRs.
  3. ACH Origination Files with Addenda records - this data may be generated internally, or received from originating customer(s). However, this data may not be readily available to your institution if ACH origination services are being provided by a third party (e.g. your outsourced core vendor). If this information is not imported into the AML/fraud detection platform, your institution loses the opportunity to detect ACH payroll fraud (these are big dollar thefts and huge reputational risk liabilities).
  4. Standard Entry Class Codes (SEC codes) within ACH transactions - this information is always available when the institution is the receiving depository financial institution (RDFI). When the institution is the originating depository financial institution (ODFI) (as in example 3), the information may or may not be available. It is quite common that the SEC codes are not included in source data file imports because the information is generally not included on the customer’s statement. Without the SEC codes, the AML/fraud monitoring system cannot identify high-risk ACH transactions (internet, international, and telephone originated transactions). Additionally, the lack of SEC codes creates a blind spot for identifying Social Security and tax refund fraud suspects.
  5. ATM locations - this information is not always available in the application systems. Without this information, your institution cannot detect cases of foreign ATM abuse (e.g. local cash deposits followed by foreign ATM withdrawals).
  6. General ledger transaction information - without this information a financial institution may not be able to detect multiple loan payments from inbound wires, as well as some internal employee transgressions.
  7. Location information of RDC terminals - this information is in the IP address for these e-banking transactions and may also be contained within the X9.37 file. Without this information, your institution will not be able to detect if a customer has moved their RDC terminal to a foreign location, creating an electronic pouch account (this fact pattern was a component of the 2010 Wachovia enforcement action resulting in a $160 million civil monetary penalty[1]).
  8. ITINs for W-8 Customers (NRAs) - it is not uncommon for a CSR (Customer Service Representative) or teller to put 999-99-9999 in the TIN field when opening a new account, because many new account platforms require information in the TIN field in order to continue the process. This practice of defaulting the TIN of Non-Resident Alien (NRA) customers to 999-99-9999 generates inaccurate alerts within your AML monitoring system, because the system aggregates every customer sharing that placeholder as one entity. The issue is compounded when an examiner identifies that there is inaccurate information related to high-risk customer types, raising the question, “where else do you have inaccurate information?” There are several methods for remedying this situation.
  9. Date or unique identifier in source extract file header - this information is usually generated by the middleware (Report Writer/Scheduler) that automatically generates the end-of-day file in a batch processing environment. The AML monitoring system uses this information to detect a missing source extract file and alert the user. Suppose a power outage occurs at several of your branch locations and power is not restored prior to the scheduled extract process from the teller system. No new teller file is created, so the monitoring system will upload the previous day’s teller files in error, resulting in duplication, unless it can detect that the unique file header is the same as the day prior.
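
Two of the examples above lend themselves to simple automated checks. The following is a minimal sketch, not a description of how any particular monitoring system works. It assumes the extract file header carries a business date in `YYYY-MM-DD` form (example 9) and that placeholder TINs appear as 999-99-9999, with or without dashes (example 8). The function names, field names, and record layout are all hypothetical.

```python
from typing import Dict, Iterable, List, Optional

# Assumed placeholder values keyed in when no real TIN is available.
PLACEHOLDER_TINS = {"999-99-9999", "999999999"}


def check_extract_header(
    source: str,
    header_id: str,
    expected_date: str,
    last_seen: Dict[str, str],
) -> Optional[str]:
    """Return a description of the problem, or None if the file looks current.

    `last_seen` maps each source system to the header of the last file loaded.
    """
    if header_id == last_seen.get(source):
        # Same header as the prior load: yesterday's file was resubmitted.
        return f"{source}: duplicate of previous file ({header_id})"
    if header_id != expected_date:
        return f"{source}: expected {expected_date}, got {header_id}"
    last_seen[source] = header_id  # accept the file and remember its header
    return None


def find_placeholder_tins(customers: Iterable[dict]) -> List[str]:
    """Return the IDs of customers whose TIN is an assumed placeholder."""
    return [
        c["customer_id"]
        for c in customers
        if (c.get("tin") or "").strip() in PLACEHOLDER_TINS
    ]


# Hypothetical usage: the teller extract for 2024-03-05 was never created,
# so the scheduler picked up the 2024-03-04 file again.
last_seen = {"teller": "2024-03-04"}
problem = check_extract_header("teller", "2024-03-04", "2024-03-05", last_seen)
print(problem)

customers = [
    {"customer_id": "C001", "tin": "123-45-6789"},
    {"customer_id": "C002", "tin": "999-99-9999"},
    {"customer_id": "C003", "tin": None},
]
print(find_placeholder_tins(customers))
```

A check like the first one would typically run before the load step, so that a stale file is rejected rather than loaded and later backed out. The second only surfaces the affected records; how to remedy them is a separate policy decision.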

These are just a few simple examples from a long list of well-documented data issues of which many BSA officers are not aware. As you can see, data issues can create holes in your institution’s financial crime detection regime. The number of blind spots within any financial institution’s automated financial crime detection system is usually proportionate to its size and complexity. The bigger the bank, the more numerous the blind spots!


[1] Exhibit A Factual Statement #22