Clearing, Scrubbing, or Masking a column via Oracle Data Pump Export

I was recently asked to have a particular columns cleared out for exports into a test environment. It turns out with Data Pump this is a fairly simple task.

You first need a function in a package to perform the masking. The column I needed to clear was a unique key field so putting null in there wouldn’t work well. As such I decided to just mask the values using a hashing algorithm. The function simply needs to accept a parameter of the appropriate data type (of your column) and then return the same data type. I decided to use SH-1 to give the greatest possibility of the values definitely staying unique. Be aware that the hashed value could be bigger than the original, in which case you might have some extra work to do. Also, be aware that if it is a foreign key you might have to sync the data across tables.

Here is the package I created (with a few tweaks to hide the innocent):

create or replace package remap_data as
  function replace_me_name (input_me_name varchar2) return varchar2;
end remap_data;
create or replace package body remap_data as
  function replace_me_name
  ( input_me_name varchar2)
  return varchar2
    -- unique column, need a value, let's hash it
    return SYS.DBMS_CRYPTO.HASH(UTL_RAW.CAST_TO_RAW(input_me_name),dbms_crypto.hash_sh1);
end remap_data;

Once this was created, exporting the data was simple with a quick call to remap_data, in the format:
where schema1 is the owner of the table, and schema2 is the owner of the masking function.

expdp dumpfile=myuser.dmp logfile=myuser.dmp.log schemas=myuser

Here is an example of the data from the original system:

---------- ----------------------------------------
    100000 849444020233644070709145558
    100001 849440300147702080409224654
    100002 849444730631285200109213127
    100003 849444730808339031109001029
    100004 849444040102357200109213127
    100006 849440012155449200109213127
    100007 849444010298284200109213127
    100008 849444670050033280709100847
    100009 849444030002443200109213127
    100010 849440012373745200109213127

Here is the data from the system it was loaded to:

     ME_ID ME_NAME                    <-- larger data length
---------- --------------------------------------------------
    100000 4FF2E04D404BB8B624EAC3389F835402
    100001 081E63C6F5142049CE8C7DF46D24EFC0
    100002 7B3DF42B1EE30609A6D20CE8224D1337
    100003 E5B22EE711DEA64A40A0E3E689B06990
    100004 BC5CEFF452AC352BC6D953B7C20E23E4
    100006 BA716AA0A9C0B0ED977BA4CE3DB57171
    100007 8D61A436A6C288B7D012AA24F88FF2ED
    100008 4CAE72B024B55AACE5BEB1CD9275CDF0
    100009 CE47D15720D61099A701A803EE6B3372
    100010 B5901CBE8302DE1302033CD3A384A9F6

So, there you have it, pretty simple (if you don’t have too many columns to mask.

One response to “Clearing, Scrubbing, or Masking a column via Oracle Data Pump Export

  1. Pingback: Clearing or Masking a column via Data Pump Export | Jed's

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s