Compliance and Data Quality

Poor-quality master data increases the probability of missed hits. For instance, when customer names are matched against the names of politically exposed persons (PEP), relevant hits may be missed because of character encoding problems.

In banking systems, names are represented as sequences of binary digits (bits). The first name “Jim”, for instance, can be represented by three so-called ASCII characters of eight bits each:

"01001010" "01101001" "01101101".
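The bit patterns above can be reproduced with a few lines of Python. This is only an illustrative sketch; the helper name `to_bits` is an assumption made for this example and does not come from any banking system.

```python
def to_bits(text, encoding):
    """Return one 8-bit binary string per byte of the encoded text."""
    return [format(byte, "08b") for byte in text.encode(encoding)]


# "Jim" consists only of ASCII characters, so each letter becomes one byte.
jim_bits = to_bits("Jim", "ascii")
print(jim_bits)
```

Each letter maps to exactly one byte here because all three characters are within the ASCII range.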

ASCII cannot represent ä, ö, ü and many other characters. Character encodings such as ISO-8859-1 and UTF-8 can, but they represent these characters differently.

For example, the name “Jürg Näf” is encoded differently in ISO-8859-1 and UTF-8:

Jürg (ISO-8859-1): "01001010" "11111100" "01110010" "01100111"

Jürg (UTF-8): "01001010" "11000011 10111100" "01110010" "01100111"

Näf (ISO-8859-1): "01001110" "11100100" "01100110"

Näf (UTF-8): "01001110" "11000011 10100100" "01100110"
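The difference between the two encodings can be checked with a short Python sketch. The helper `to_bits` is a hypothetical name used only for illustration; the encoding names are the ones Python's standard codecs accept.

```python
def to_bits(text, encoding):
    """Return one 8-bit binary string per byte of the encoded text."""
    return [format(byte, "08b") for byte in text.encode(encoding)]


# Escapes are used so the example does not depend on the file's own encoding:
# "\u00fc" is ü and "\u00e4" is ä.
parts = ("J\u00fcrg", "N\u00e4f")

encoded = {}
for part in parts:
    for encoding in ("iso-8859-1", "utf-8"):
        encoded[(part, encoding)] = to_bits(part, encoding)
        print(part, encoding, " ".join(encoded[(part, encoding)]))
```

ISO-8859-1 stores ü and ä in a single byte each, whereas UTF-8 needs two bytes for each of them, so the UTF-8 form of each name part is one byte longer.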

Character encoding becomes compliance-relevant when a banking system uses different encodings simultaneously for legacy reasons. In this case, one encoding of “Jürg Näf” may match perfectly. However, if the UTF-8 encoding is mistakenly assumed to be ISO-8859-1, the name is read as “JÃ¼rg NÃ¤f”, and 44% of its characters (4 out of 9, not counting the space) will not match.
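The mismatch can be reproduced with a minimal Python sketch. It assumes that the 4-out-of-9 figure counts the characters of the misread name, excluding the space, that do not occur in the original name; this counting rule is an assumption chosen for illustration, not a documented matching algorithm.

```python
name = "J\u00fcrg N\u00e4f"  # "Jürg Näf"

# Bytes as written by a system that stores names in UTF-8 ...
stored = name.encode("utf-8")
# ... read by a component that wrongly assumes ISO-8859-1.
misread = stored.decode("iso-8859-1")

# Count the characters of the misread name (space excluded)
# that do not appear anywhere in the original name.
letters = misread.replace(" ", "")
garbled = [ch for ch in letters if ch not in name]
mismatch = len(garbled) / len(letters)

print(misread)
print(f"{len(garbled)} of {len(letters)} characters differ ({mismatch:.0%})")
```

Each two-byte UTF-8 sequence is turned into two unrelated ISO-8859-1 characters, which is why a single misread umlaut produces two non-matching characters.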

Generating verifications for all hits that differ by 40% or more, however, would produce a significant number of false positives.


Complete Revision of the Federal Data Protection Act

“Since 15 September 2017, the draft and report for a completely revised Federal Data Protection Act have been public. In a first step, parliament and the people agreed to the adaptations needed for compliance with EU law. The second part of the revision has been debated by parliament since September 2019. Data protection is to be strengthened by giving people more control over their private data and by reinforcing transparency in the handling of confidential data.”

Links: datenrecht.ch
