Subject: Masking username in Spark with regexp_replace and reverse functions


Thanks guys.

All the analysis with windowing functions is done using the authentic
names. I only randomize the names for reporting purposes, so the figures
should still be correct.
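
As a rough illustration of that ordering (aggregate on the real names
first, mask only at reporting time), here is a minimal sketch in plain
Python rather than Spark. The sample rows and the mask_name helper are
hypothetical. Reversing the name and replacing vowels simply mirrors the
reverse and regexp_replace idea from the subject line; it is not the exact
expression used in the case study.

```python
import re
from collections import defaultdict

# Hypothetical sample rows: (name, transaction_date, amount_paid)
rows = [
    ("Alice Smith", "2019-03-01", 120.50),
    ("Bob Jones", "2019-03-02", 75.00),
    ("Alice Smith", "2019-03-05", 30.25),
]

def mask_name(name: str) -> str:
    """Reverse the name, then replace every vowel with 'x' (illustrative only)."""
    return re.sub(r"[aeiouAEIOU]", "x", name[::-1])

# Step 1: aggregate on the authentic names so the totals are unaffected.
totals = defaultdict(float)
for name, _date, amount in rows:
    totals[name] += amount

# Step 2: mask the names only when building the report.
report = [(mask_name(name), total) for name, total in sorted(totals.items())]

for masked, total in report:
    print(masked, total)
```

Because the masking happens after the aggregation, two customers whose
masked names happen to collide would still be counted separately.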

I agree with you, Jörn, that masking the name alone is not enough: one can
still identify a row through the transaction date and the amount paid.
Also, most tools these days tokenize names and account numbers without
realising that other fields, such as mobile numbers, are unique IDs in
their own right.
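
To make that point concrete, here is a small sketch in plain Python with
made-up data. It assumes the report keeps the transaction date and amount
next to the masked name, and checks which rows are still singled out by
that pair alone. The rows and masked values are hypothetical.

```python
from collections import Counter

# Hypothetical masked report rows: (masked_name, transaction_date, amount_paid)
masked_rows = [
    ("xxxx1", "2019-03-01", 120.50),
    ("xxxx2", "2019-03-02", 75.00),
    ("xxxx3", "2019-03-02", 75.00),
    ("xxxx1", "2019-03-05", 30.25),
]

# Count how many rows share each (date, amount) pair.
pair_counts = Counter((date, amount) for _name, date, amount in masked_rows)

# A pair that occurs exactly once points at a single row,
# regardless of how well the name itself is masked.
unique_rows = [row for row in masked_rows if pair_counts[(row[1], row[2])] == 1]

for row in unique_rows:
    print(row)
```

Anyone who knows the date and amount of a payment from another source could
match such a row back to a person.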

This is just a case study; real-world use would require professional tools
and a proper approach.

Thanks again.

Dr Mich Talebzadeh

LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw

http://talebzadehmich.wordpress.com
*Disclaimer:* Use it at your own risk. Any and all responsibility for any
loss, damage or destruction of data or any other property which may arise
from relying on this email's technical content is explicitly disclaimed.
The author will in no case be liable for any monetary damages arising from
such loss, damage or destruction.

On Sun, 17 Mar 2019 at 09:50, Jörn Franke <[EMAIL PROTECTED]> wrote: