Create a DLP Exact Match Hash from a Virtual Appliance
This process requires you to create a CSV file containing the exact match data and upload it to the Netskope cloud using a Virtual Appliance. When you upload the CSV file using the request dlp-pdd upload command, the Virtual Appliance encrypts the file before uploading it to the Netskope cloud.
To create a hash of your structured content:
Prepare the file in CSV format, structured in rows and columns. We recommend that the CSV file contain no more than 50 million records and include a header row that names the columns. These names appear in the DLP rule under File Column for Exact Match validation. Ensure the data in the columns is normalized. There are two ways to normalize the data, depending on the data type.
Normalize columns that contain numbers: Ensure that values such as credit card numbers are contiguous digits with no special characters such as dashes, commas, quotes, or spaces.
Normalize columns that contain strings: Ensure that values such as first and last names are in sentence case, with the first letter in uppercase and the remainder in lowercase.
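The two normalization rules above can be sketched in a short pre-processing script. This is a hypothetical example, not part of the Netskope product; the column names and sample values are assumptions. It also writes a tilde-delimited file with a header row, which matches the csv_delim ~ and column_name_present true options used later in this procedure.

```python
import csv
import io

def normalize_number(value: str) -> str:
    # Keep digits only: drop dashes, commas, quotes, and spaces.
    return "".join(ch for ch in value if ch.isdigit())

def normalize_string(value: str) -> str:
    # Sentence case: first letter uppercase, the remainder lowercase.
    value = value.strip()
    return value[:1].upper() + value[1:].lower()

# Hypothetical columns: card number, first name, last name.
rows = [["4111-1111 1111,1111", "aLICE", "sMITH"]]
normalized = [
    [normalize_number(r[0]), normalize_string(r[1]), normalize_string(r[2])]
    for r in rows
]

# Write a tilde-delimited CSV with a header row naming the columns.
buf = io.StringIO()
writer = csv.writer(buf, delimiter="~")
writer.writerow(["card_number", "first_name", "last_name"])
writer.writerows(normalized)
print(buf.getvalue())
```

For example, normalize_number("4111-1111 1111,1111") returns "4111111111111111", and normalize_string(" sMITH ") returns "Smith".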
Using the nstransfer account, transfer the CSV file to the pdd_data directory on the Virtual Appliance:
scp <CSV file> nstransfer@<virtual_appliance_host>:/home/nstransfer/pdd_data
The location of the pdd_data directory varies between the nstransfer and nsadmin user accounts. When using the nstransfer account to copy the file to the appliance, the pdd_data directory is located at /home/nstransfer/pdd_data. When you log in to the appliance using the nsadmin account, the pdd_data directory is located at /var/ns/docker/mounts/lclw/mountpoint/nslogs/user/pdd_data.
After the data is successfully transferred, log in to the appliance using the nsadmin account.
Run the following command at the Netskope shell prompt to hash the data and upload it to the Netskope cloud:
request dlp-pdd upload column_name_present true csv_delim ~ norm_str 2,3 file /var/ns/docker/mounts/lclw/mountpoint/nslogs/user/pdd_data/upload/sensitivedata.csv
Tip
column_name_present true specifies that there is a header row in the file.
csv_delim ~ specifies that the CSV file is tilde-delimited.
dict_cins 1,2,4 creates a case-insensitive dictionary from columns 1, 2, and 4.
dict_cs 3,5 creates a case-sensitive dictionary from columns 3 and 5.
norm_str 2,3 specifies that columns 2 and 3 are to be treated as strings.
file <CSV_file> specifies the file to be hashed and uploaded.
The command returns:
PDD uploader pid 9501 started. Monitor the status with >request dlp-pdd status.
Check the status of the upload:
request dlp-pdd status
The command returns:
Uploading data ...... 100% completed out of [####] of bytes
When the upload is complete, the request dlp-pdd status command returns:
Successfully uploaded the data from /var/ns/docker/mounts/lclw/mountpoint/nslogs/user/pdd_data/upload/sensitivedata.csv to Netskope cloud
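If you drive the upload from a script, the status check can be polled until it reports success. The following is a minimal sketch only, assuming SSH access to the appliance as nsadmin and that the status output matches the messages shown in this procedure; the helper functions, host placeholder, and polling interval are all hypothetical.

```python
import subprocess
import time

def is_upload_complete(status_output: str) -> bool:
    # The upload is done when the status output contains the success message.
    return "Successfully uploaded the data" in status_output

def poll_status(host: str, interval: int = 30) -> None:
    # Repeatedly run the status command on the appliance over SSH
    # (host is a placeholder, e.g. "nsadmin@<virtual_appliance_host>").
    while True:
        result = subprocess.run(
            ["ssh", host, "request dlp-pdd status"],
            capture_output=True, text=True,
        )
        if is_upload_complete(result.stdout):
            print("Exact match hash upload complete.")
            return
        time.sleep(interval)
```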
When the data is successfully uploaded, the sensitivedata.csv file and its corresponding column names appear in the Exact Match tab of the DLP rules.