
Create a DLP Exact Match Hash from a Virtual Appliance

This process requires you to create a CSV file containing the exact match data and then upload it to the Netskope cloud from a virtual appliance. When you upload the CSV file with the request dlp-pdd upload command, the Virtual Appliance encrypts the file before sending it to the Netskope cloud.

To create a hash of your structured content:

  1. Prepare the file in CSV format, structured in rows and columns. We recommend that the CSV file have no more than 50 million records and include a header row that names the columns. These names appear in the DLP rule under File Column for Exact Match validation. Ensure the data in the columns is normalized. There are two ways to normalize the data, depending on the data type; a scripted example follows the two cases below.

    Normalize columns that contain numbers: Ensure that numeric data, such as credit card numbers, consists of consecutive digits with no special characters such as dashes, commas, quotes, or spaces.

    Normalize columns that contain strings: Ensure that string data, such as first and last names, is in sentence case, with the first letter in uppercase and the remainder in lowercase.
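
    Normalization can be scripted before the transfer. The following is a minimal Python sketch, not a Netskope tool; the input file raw.csv, the output file sensitivedata.csv, and the column layout (card numbers in column 1, first and last names in columns 2 and 3) are hypothetical.

    import csv

    def normalize_number(value: str) -> str:
        # Keep digits only: drop dashes, commas, quotes, and spaces.
        return "".join(ch for ch in value if ch.isdigit())

    def normalize_string(value: str) -> str:
        # Sentence case: first letter uppercase, the remainder lowercase.
        value = value.strip()
        return value[:1].upper() + value[1:].lower()

    with open("raw.csv", newline="") as src, \
         open("sensitivedata.csv", "w", newline="") as dst:
        reader = csv.reader(src)
        writer = csv.writer(dst)
        writer.writerow(next(reader))  # copy the header row unchanged
        for row in reader:
            row[0] = normalize_number(row[0])   # e.g., credit card numbers
            row[1] = normalize_string(row[1])   # e.g., first names
            row[2] = normalize_string(row[2])   # e.g., last names
            writer.writerow(row)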

  2. Using the nstransfer account, transfer the CSV file to the pdd_data directory on the Virtual Appliance:

    scp <CSV file> nstransfer@<virtual_appliance_host>:/home/nstransfer/pdd_data

    The location of the pdd_data directory differs between the nstransfer and nsadmin user accounts. When you copy the file to the appliance with the nstransfer account, the directory is /home/nstransfer/pdd_data. When you log in to the appliance with the nsadmin account, the same directory is located at /var/ns/docker/mounts/lclw/mountpoint/nslogs/user/pdd_data.
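
    For example, assuming the appliance is reachable at the hypothetical hostname va01.example.com:

    scp sensitivedata.csv nstransfer@va01.example.com:/home/nstransfer/pdd_data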

  3. After the data is successfully transferred, log in to the appliance using the nsadmin account.

  4. Run the following command at the Netskope shell prompt to hash the data and upload it to the Netskope cloud:

    request dlp-pdd upload column_name_present true csv_delim ~ norm_str 2,3 file /var/ns/docker/mounts/lclw/mountpoint/nslogs/user/pdd_data/upload/sensitivedata.csv

    Tip

    column_name_present true specifies that there is a header row in the file.

    csv_delim ~ specifies that the CSV file is tilde-delimited.

    dict_cins 1,2,4 creates a case-insensitive dictionary from columns 1, 2, and 4.

    dict_cs 3,5 creates a case-sensitive dictionary from columns 3 and 5.

    norm_str 2,3 specifies that columns 2 and 3 are to be treated as strings.

    file <CSV_file> specifies the file that needs to be hashed and uploaded.
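
    If the dictionary options can be combined with the others in a single upload (an assumption for illustration, not taken from this article), the full command might look like:

    request dlp-pdd upload column_name_present true csv_delim ~ norm_str 2,3 dict_cins 1,2,4 dict_cs 3,5 file /var/ns/docker/mounts/lclw/mountpoint/nslogs/user/pdd_data/upload/sensitivedata.csv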

    The command returns:

    PDD uploader pid 9501 started. Monitor the status with >request dlp-pdd status.

  5. Check the status of the upload:

    request dlp-pdd status

    The command returns:

    Uploading data ...... 100% completed out of [####] of bytes

    When the upload is complete, the command request dlp-pdd status returns:

    Successfully uploaded the data from /var/ns/docker/mounts/lclw/mountpoint/nslogs/user/pdd_data/upload/sensitivedata.csv to Netskope cloud

  6. When the data is successfully uploaded, the sensitivedata.csv file and its corresponding column names will appear in the Exact Match tab of the DLP rules.