The regulatory case for repeat Thematic Reviews: Evidence from a decade of independent sanctions testing

Executive summary
Regulators around the world are under increasing pressure to demonstrate that their supervised entities are not just monitored, but are measurably improving their screening system performance. In a complex and evolving sanctions landscape shaped by geopolitical uncertainty and accelerating regulatory expectations, the consequences of weak sanctions controls have never been more serious.
This article shows why repeat Thematic Reviews are one of the most cost-effective tools available to a regulator. Drawing on data from multiple jurisdictions across more than a decade of independent testing conducted by AML Analytics, it demonstrates that where regulators have committed to structured, repeated review programmes, jurisdiction-wide sanctions performance has greatly improved.
Jurisdictions that have implemented Thematic Review programmes in partnership with AML Analytics have made demonstrable contributions to National Risk Assessments (NRAs). In one case, a jurisdiction was removed from the FATF Grey List, an outcome to which the structured programme of repeat testing is considered to have been a significant contributing factor. Other jurisdictions saw manipulated score performance improve by as much as 26% in three years across their entire market.
A single Thematic Review provides a snapshot. A programme of reviews produces momentum. In 2026, it is the regulatory trajectory that is key.
Why Thematic Reviews matter more than ever in 2026
Regulators around the world continue to impose large fines on financial institutions (“FIs”) that fail to detect sanctioned individuals or entities that have been allowed access to their financial systems. Sanctions screening systems that don’t work as expected represent a jurisdiction-wide vulnerability with real reputational and geopolitical consequences.
Beyond financial penalties, firms face a range of serious civil and legal consequences for sanctions screening failures. These can include criminal charges against the firm or its senior individuals, Deferred Prosecution Agreements (DPAs) that place organisations under binding obligations for many years, the imposition of independent compliance monitors with oversight powers, and licence restrictions that can threaten the firm’s ability to operate.
Three key areas are making independent supervisory Thematic Review testing more critical than ever:
Sanctions Complexity
The current geopolitical climate has accelerated the pace at which sanctions regimes evolve, dramatically increasing the operational burden on FIs. New designations arrive continuously, requiring firms to monitor sanctioned records and complex ownership structures across multiple regimes in real time.
Independent Thematic Review testing allows regulators to establish a performance benchmark for sanctions screening systems. This gives supervised entities a clear standard to attain and helps regulators identify which entities within their jurisdiction require the most urgent supervisory attention.
Regulatory Acceleration
The FATF Recommendations continue to raise the bar for AML/CFT supervision around the globe. The establishment of the EU’s Anti-Money Laundering Authority (AMLA), alongside increasing regulatory enforcement across Europe, Asia and the US, has created a landscape of jurisdictional competitiveness on AML/CFT standards with genuine geopolitical consequences.
Regulators are now being assessed not only on whether oversight processes exist, but on whether those processes are facilitating measurable improvements in sanctions screening performance across their supervised entities.
The AI and Technology Inflection Point
Firms are increasingly adopting AI-assisted screening tools to identify potential alerts faster and reduce false positives. However, this presents a growing challenge for regulators who must be able to understand and interrogate the reasoning behind AI-driven decisions.
Without adequate explainability, there is a real risk of FIs relying on “black box” systems where the underlying logic is hidden and unverifiable. Regulators require demonstrable AI explainability to assess whether a system is genuinely effective at identifying alerts.
Vendor claims alone are insufficient to prove that AI-assisted screening tools perform as intended. Assurance must be gathered through independent, objective testing and validation, and Thematic Reviews are well placed to provide this. This position is reinforced by guidance issued by the Hong Kong Monetary Authority in March 2026, building on guidance issued in 2018, which recommends that entities consider appointing independent external firms to test sanctions screening systems and benchmark their results.
At the heart of these three areas lies a persistent accountability gap frequently exposed by enforcement actions: controls existed on paper but were never proven to work effectively in practice.
AI-assisted screening tools risk widening this gap by distancing financial crime professionals from direct responsibility for screening outcomes. As these tools become the industry standard, Thematic Reviews will play a critical role in verifying their reliability and ensuring that compliance teams retain full accountability for sanctions screening across their operations.
What is a Thematic Review?
A sanctions screening Thematic Review is a supervisory tool in which a regulator commissions independent testing of sanctions screening systems across a defined population of regulated entities. Rather than relying on self-reported compliance or internal audit findings, every participating entity is put through the same test under the same conditions, producing results that are objective and comparable.
Each entity receives a test dataset, runs it through its screening system, and uploads the results. Entities have no control over the test design, as this is mandated by the regulator. Each entity then receives feedback and is expected to carry out remediation activities where necessary. This is categorically different from an internal audit, a vendor assessment, or a self-reported compliance audit, all of which are subject to entity bias.
At its core, the Thematic Review is built around three fundamental questions that regulators need answered: the “Three Es”.
First, effectiveness: is the screening system actually identifying sanctioned entities as it should?
Second, efficiency: is the system calibrated appropriately, or is it generating excessive false positives?
Third, explainability: can the entity demonstrate to the regulator why its system produced the results it did?
Together, these three areas give regulators a complete picture of how a sanctions screening system is performing.
AML Analytics is the only specialist RegTech/SupTech company in the world working with regulators, governments, supervisors, and central banks in this field. Having conducted the world’s first sanctions screening Thematic Review in 2014, AML Analytics has since tested over 1,700 financial systems for more than 1,000 regulated entities across the globe as part of Thematic Reviews. No internal resource is required from the regulator: AML Analytics provides the expertise, technology, and methodology.
The methodology is vendor neutral. Results reflect how a system is configured and used, not which vendor supplies it. This is a critical distinction in 2026, as regulators are increasingly confronted with firms citing AI tools as justification for reduced scrutiny.
Test data is drawn from major global sanctions lists, and any local lists required by the regulator. This data is then used to create the test file for the regulator according to their specifications. The test file is made up of control, manipulated, and non-sanctioned (or “Clean Id”) data.
Control data tests how a screening system responds to records exactly as they appear on sanctions lists. Manipulated data tests the fuzzy matching capabilities of a system using over 70 algorithms. Clean Ids are used to measure false positive rates, evaluating a system’s efficiency.
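The difference between control and manipulated data can be illustrated with a minimal sketch. The two manipulations below (an adjacent-character transposition and a dropped character), the fictitious listed name, and the 0.85 similarity threshold are all assumptions chosen for illustration; they are not the algorithms used in Thematic Review test files, and Python's `difflib` stands in for whatever matching logic a real screening system applies.

```python
from difflib import SequenceMatcher


def transpose_adjacent(name: str, index: int) -> str:
    """Swap the characters at index and index + 1 (hypothetical manipulation)."""
    chars = list(name)
    chars[index], chars[index + 1] = chars[index + 1], chars[index]
    return "".join(chars)


def drop_character(name: str, index: int) -> str:
    """Remove the character at index (hypothetical manipulation)."""
    return name[:index] + name[index + 1:]


def exact_match(candidate: str, listed: str) -> bool:
    """A system that only matches records exactly as they appear on a list."""
    return candidate.casefold() == listed.casefold()


def fuzzy_match(candidate: str, listed: str, threshold: float = 0.85) -> bool:
    """A system that tolerates small differences; the threshold is an assumption."""
    ratio = SequenceMatcher(None, candidate.casefold(), listed.casefold()).ratio()
    return ratio >= threshold


listed_name = "Example Trading Company"  # fictitious, not a real list entry
manipulated_names = [
    transpose_adjacent(listed_name, 2),
    drop_character(listed_name, 10),
]

exact_results = [exact_match(name, listed_name) for name in manipulated_names]
fuzzy_results = [fuzzy_match(name, listed_name) for name in manipulated_names]
```

In this sketch the exact-match system identifies the control record but misses both manipulated variants, while the fuzzy matcher returns both. That gap is what a manipulated score is designed to expose.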
The output is binary: a screening system either returns a sanctioned record as a match, or it does not. There is no scope for interpretation, which removes subjectivity from the supervisory process entirely. The binary nature of the data means results are directly comparable across entities and across jurisdictions. It is the objective nature of Thematic Review data that makes it credible for regulatory authorities.
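Because each result is a simple match or no match, scores reduce to straightforward proportions. The sketch below is one way such scores could be derived, assuming each test record carries a category label and a boolean outcome. The class name, field names, and sample results are hypothetical, and the calculation is an illustration rather than the AML Analytics scoring method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScreeningResult:
    record_id: str
    category: str  # "control", "manipulated", or "clean" (assumed labels)
    matched: bool  # True if the screening system returned a match


def match_rate(results: list[ScreeningResult], category: str) -> float:
    """Proportion of records in the given category that produced a match."""
    subset = [r for r in results if r.category == category]
    if not subset:
        raise ValueError(f"no records in category {category!r}")
    return sum(r.matched for r in subset) / len(subset)


# Hypothetical results for a single entity.
results = [
    ScreeningResult("C1", "control", True),
    ScreeningResult("C2", "control", True),
    ScreeningResult("C3", "control", True),
    ScreeningResult("C4", "control", False),
    ScreeningResult("M1", "manipulated", True),
    ScreeningResult("M2", "manipulated", False),
    ScreeningResult("M3", "manipulated", False),
    ScreeningResult("M4", "manipulated", True),
    ScreeningResult("M5", "manipulated", True),
    ScreeningResult("N1", "clean", False),
    ScreeningResult("N2", "clean", True),
    ScreeningResult("N3", "clean", False),
    ScreeningResult("N4", "clean", False),
]

control_score = match_rate(results, "control")          # effectiveness on exact records
manipulated_score = match_rate(results, "manipulated")  # effectiveness on altered records
false_positive_rate = match_rate(results, "clean")      # matches on non-sanctioned records
```

Since every entity's file is scored by the same rule, the resulting figures can be placed side by side without adjustment.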
The Global Benchmark
Global Benchmark is a solution developed by AML Analytics. Each month, AML Analytics publishes benchmark scores for both customer and transaction screening systems at control and manipulated level.
These monthly scores are compiled from test results submitted by Global Benchmark customers around the world and represent the average performance levels of sanctions screening systems across the global market.
During a Thematic Review, AML Analytics will present the Global Benchmark scores in comparison to the regulator’s jurisdiction data. This allows the regulator to see, in precise terms, where their market stands relative to the international standard.
This equips the regulator with a credible reference point, an independent marker of what good system performance looks like. A jurisdiction that falls below the Global Benchmark has objective evidence of the performance gap, and a jurisdiction that meets or exceeds it has objective evidence of that achievement.
In both cases, the regulator has a clear, defensible basis for the supervisory actions that follow. For FATF Mutual Evaluations, independently benchmarked performance data of this kind carries particular weight.
What happens when regulators repeat a Thematic Review?
Data from repeat Thematic Reviews across multiple jurisdictions tells a consistent story. When regulators repeat a Thematic Review and retest the same entities, sanctions screening effectiveness and efficiency improve.
The jurisdiction-level picture
The world’s first sanctions Thematic Review was conducted in a country in Africa in 2014 with 30 regulated entities. The average effectiveness control score was 83%, with 20% of entities still relying on manual screening processes. By the 2017 retest, following remediation, control scores had risen to 94%, an improvement of 11 percentage points in three years.
A European regulator conducted reviews in 2022 and 2025. Effectiveness scores on control data improved from 81% to 90%, while manipulated scores rose dramatically from 64% to 90%.
In the Caribbean, one regulator saw control scores improve by 25% and manipulated scores improve by 32% between 2021 and 2022 for transaction screening. Over a five-year period from 2020 to 2025, average scores across all institutions in the jurisdiction moved above the Global Benchmark. This jurisdiction was subsequently removed from the FATF Grey List. The jurisdiction’s structured programme of repeat Thematic Reviews is considered to have been a significant contributing factor to this outcome.
The same Caribbean regulator also recorded substantial gains in efficiency over the review period. For customer screening, the average number of returns per hit fell significantly, representing a 28.6% improvement in efficiency on control data and a 41.9% improvement on manipulated data.
For transaction screening, efficiency gains were even more pronounced, with improvements of 69.5% on control data and 59.9% on manipulated data.
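The arithmetic behind an efficiency figure of this kind can be sketched as follows. The article does not define how returns per hit is calculated, so the sketch assumes it is the total number of records returned divided by the number of true hits, and that the efficiency improvement is the relative reduction in that ratio between two reviews. The function names and all input figures are hypothetical.

```python
def returns_per_hit(total_returns: int, true_hits: int) -> float:
    """Average number of records returned for each true hit (assumed definition)."""
    if true_hits <= 0:
        raise ValueError("true_hits must be positive")
    return total_returns / true_hits


def efficiency_improvement(before: float, after: float) -> float:
    """Relative reduction in returns per hit, expressed as a percentage."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before * 100


# Hypothetical figures for one screening system across two reviews.
first_review = returns_per_hit(total_returns=1200, true_hits=100)
second_review = returns_per_hit(total_returns=900, true_hits=100)
improvement = efficiency_improvement(first_review, second_review)
```

Under these assumptions, a fall from 12 to 9 returns per hit is a 25% efficiency improvement: the system surfaces the same true hits with a quarter fewer returns to review.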
The Institution-Level Picture
Jurisdiction-wide averages tell one part of the story, but entity-level data tells the regulator where attention needs to be directed.
The following compares results for six individual FIs that were tested in both 2022 and 2025.
Control scores (customer screening and transaction screening combined):

| Institution | 2022 | 2025 | Change (percentage points) |
|---|---|---|---|
| Bank 1 | 91% | 99% | +8 |
| Bank 2 | 56% | 72% | +17 |
| Bank 3 | 91% | 98% | +7 |
| Bank 4 | 100% | 99% | -1 |
| Bank 5 | 77% | 98% | +22 |
| Bank 6 | 78% | 98% | +21 |

Manipulated scores (customer screening and transaction screening combined):

| Institution | 2022 | 2025 | Change (percentage points) |
|---|---|---|---|
| Bank 1 | 82% | 98% | +16 |
| Bank 2 | 24% | 52% | +28 |
| Bank 3 | 82% | 92% | +10 |
| Bank 4 | 55% | 88% | +33 |
| Bank 5 | 61% | 91% | +30 |
| Bank 6 | 58% | 81% | +24 |

This data highlights that improvement is not uniform. While five of the six entities improved substantially on control scores, the manipulated scores reveal a wide range of results and rates of change.
This level of granularity is where Thematic Reviews deliver their greatest supervisory value. Regulators can clearly distinguish between financial institutions that are improving, those that are stagnating, and those operating at or near best practice.
Regulators with this data can present entities under their supervision with their performance in clear market context. For example, a regulator can demonstrate that a financial institution’s manipulated screening score places it in the bottom quartile, that it has failed to improve between review cycles while comparable institutions have progressed, and that its performance falls below the Global Benchmark by a defined margin.
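A minimal sketch of how these three observations could be derived from review data is given below. It uses the manipulated scores reported above for the six institutions. The benchmark value of 85, the five-point threshold for "failed to improve", and the function and field names are all assumptions made for illustration; they are not published Global Benchmark figures or supervisory criteria.

```python
from statistics import quantiles

# Manipulated scores (2022, 2025) for the six institutions reported above.
manipulated_scores = {
    "Bank 1": (82, 98),
    "Bank 2": (24, 52),
    "Bank 3": (82, 92),
    "Bank 4": (55, 88),
    "Bank 5": (61, 91),
    "Bank 6": (58, 81),
}

HYPOTHETICAL_BENCHMARK = 85.0  # assumed value, not a published benchmark score


def supervisory_flags(scores, benchmark, min_gain=5.0):
    """Flag each entity on quartile position, progress, and benchmark gap."""
    latest = [after for _, after in scores.values()]
    first_quartile = quantiles(latest, n=4)[0]  # lower quartile cut point
    flags = {}
    for name, (before, after) in scores.items():
        flags[name] = {
            "bottom_quartile": after <= first_quartile,
            "stagnating": (after - before) < min_gain,  # assumed threshold
            "gap_to_benchmark": after - benchmark,      # percentage points
        }
    return flags


flags = supervisory_flags(manipulated_scores, HYPOTHETICAL_BENCHMARK)
needs_attention = [
    name
    for name, f in flags.items()
    if f["bottom_quartile"] or f["stagnating"] or f["gap_to_benchmark"] < 0
]
```

With these inputs, Bank 2 falls in the bottom quartile despite a large gain since 2022. This shows why a regulator needs both the latest score and the trajectory: an entity can improve substantially and still be the one most in need of attention.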
In an environment of finite supervisory resources, this ability to target intervention is as valuable as the overall market-wide improvement that Thematic Reviews generate.
The Emerging Market Picture
The Caribbean data highlights that Thematic Reviews work across different jurisdiction types. Regardless of the size or complexity of a regulator’s market, Thematic Reviews have been demonstrably effective in raising jurisdiction-wide performance across both effectiveness and efficiency, with tangible outcomes, including FATF list changes.
Why improvement happens: root causes and actions taken
The data shows that improvement happens consistently. Understanding why it happens is what allows regulators to design programmes that maximise their effect.
Configuration: FIs frequently deploy screening systems with vendor default settings and do not revisit them as the sanctions landscape evolves or as their risk appetite changes. The knowledge that the regulator will return to retest creates a sustained compliance incentive that a one-off Thematic Review does not generate.
AI Tool Explainability: Regulators require entities to demonstrate why a system alerts. As AI-assisted tools become more prevalent, unexplainable decisions are becoming more common. Thematic Reviews benchmark explainability against peers across the market and highlight firms that rely on the same vendor or model.
List Management Failure: The gap between designation and deployment is a primary cause of enforcement actions. Thematic Reviews identify this gap at jurisdiction level, enabling regulators to issue sector-wide guidance.
Senior Management Disengagement: Limited senior leadership involvement in screening system governance is a consistent driver of poor performance. The external accountability created by a Thematic Review changes this dynamic in a way that an internal audit rarely achieves. When senior management know the regulator will repeatedly test their system, screening governance moves up the agenda.
Thematic Reviews as a supervisory strategy
Thematic Reviews are not just a compliance tool. Deployed as part of a structured programme, they can sit at the centre of a regulatory strategy with measurable, demonstrable outcomes.
A single Thematic Review produces a clear view of the market. Subsequent reviews produce a trajectory, and a trajectory is what regulators need in order to demonstrate that their supervised entities are actively improving.
For FATF Mutual Evaluations, the difference between a jurisdiction that can evidence a structured supervisory programme with quantifiable, independently generated results and one that relies on self-reported compliance is significant.
The Caribbean jurisdiction’s removal from the FATF Grey List illustrates the kind of outcome that a structured programme of repeat testing can contribute to. Thematic Reviews were a significant part of that jurisdiction’s supervisory evidence base. This is precisely the kind of independently generated, data-driven evidence that FATF assessors are looking for.
Thematic Reviews: Integration with ORBS
Thematic Review data now feeds directly into ORBS, a risk analytics tool from AML Analytics. This makes the entire Thematic Review process simple and straightforward. When an FI tests its screening system as part of a Thematic Review, those results are shared automatically with the regulator through ORBS. This creates total transparency and gives regulators complete confidence in the data they are acting on. Routine test results outside of a Thematic Review can also be shared by an FI with their regulator.
Remedial action taken by FIs following a Thematic Review is also fed back into the ORBS Risk Matrix, allowing regulators to observe quantified improvements to the risk profile of entities and sectors in real time. The Risk Matrix continues to support detailed results interrogation, now significantly enhanced with drill-through capabilities.
By embedding Thematic Reviews directly into ORBS, regulators gain real-time visibility of sanctions screening performance across their entire market.
Looking ahead: Building a Thematic Review programme for 2026 and beyond
As AI screening tools become more prevalent, the ability to test them independently, rather than accepting vendor assurance, is becoming critical.
The evidence suggests that a two-to-three-year review cycle is an effective supervisory strategy: sufficient time for meaningful remediation, but not so long that risk accumulates between reviews. This cycle should become part of the fabric of supervisory planning, rather than a response to emerging concern.
No internal resource from the regulator is required to run a Thematic Review. AML Analytics provides the expertise, the technology, the testing methodology and the analytical framework. Every regulator that has completed the process has confirmed its cost-effectiveness. The barrier to building a structured supervisory programme is lower than many regulators assume.
Conclusion
The data presented in this article demonstrates clearly that when regulators commit to a structured cycle of Thematic Reviews and retesting, the sanctions screening system performance of their regulated entities improves.
Evidence from sanctions screening Thematic Reviews in Africa, Europe, and the Caribbean all points in the same direction. Thematic Review programmes have made a significant contribution to a number of jurisdictions’ National Risk Assessments. One jurisdiction was subsequently removed from the FATF Grey List, with the structured programme of repeat testing considered a significant contributing factor. Other jurisdictions have seen scores improve by as much as 26% across their entire market.
In 2026, with the sanctions landscape more complex than at any previous point in time, a structured Thematic Review programme is one of the most powerful and cost-effective tools available to a regulator, one that raises jurisdiction-wide performance and provides demonstrable evidence of what has been achieved.