technical August 09, 2016

The Documents Data Type

Document records are forms that contain confidential communications and information related to an organization.

Documents may include emails, contracts, invoices, and other internally produced forms that deal with the ongoing operations of a company. These forms contain critical information relating to a company’s intellectual property and internal processes, and could cause significant financial and reputational damage if stolen. A paragraph of text constitutes a single Document record in Matchlight. These records are fingerprinted and compared against the billions of fingerprints that Matchlight’s crawler indexes every day. In the event that Matchlight locates a match for a customer’s record, the customer instantly receives an alert. Each alert contains a relevance score, a date and timestamp, and the URL where the information was found.
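To make the one-paragraph-per-record rule concrete, here is a minimal Python sketch of how a document might be split into records before upload, along with one possible shape for an alert. The names `split_into_records` and `Alert`, the blank-line paragraph rule, and the field names are all assumptions made for illustration; they are not taken from Matchlight's API.

```python
import re
from dataclasses import dataclass
from datetime import datetime
from typing import List


def split_into_records(text: str) -> List[str]:
    """Split a document into paragraph-sized records.

    Assumption: paragraphs are separated by one or more blank lines.
    """
    paragraphs = re.split(r"\n\s*\n", text)
    # Drop empty fragments left by leading or trailing blank lines.
    return [p.strip() for p in paragraphs if p.strip()]


@dataclass
class Alert:
    """Hypothetical alert shape holding the three fields named above."""
    relevance_score: float
    found_at: datetime  # date and timestamp of the match
    url: str            # where the information was found


contract = "Payment is due within 30 days.\n\nThe supplier grants exclusive rights.\n"
records = split_into_records(contract)
print(len(records))  # each paragraph becomes one Document record
```

Splitting on blank lines is only one reasonable reading of "paragraph"; documents exported from email or PDF may need a different rule.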

Use Cases:

  • Monitor private communications between executive leadership.
  • Monitor trade secrets, including recipes and formulas.
  • Monitor contracts written with external parties, which may contain payment and exclusive industry information.

How We Treat This Record Type:

Normalization: To ensure that Matchlight’s web crawler finds clients’ information, Document records are normalized before they are fingerprinted. Normalization accounts for changes in syntax, white space, and formatting, and converts the record to upper case. This process allows Matchlight to detect the appearance of clients’ records even if they have been altered from their original form.
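The normalization step can be illustrated with a short Python sketch. It is not Matchlight's implementation: the exact rules are not published here, so treating punctuation as formatting is an assumption, and the SHA-256 hash in the hypothetical `fingerprint` helper is only a stand-in for whatever fingerprinting scheme Matchlight actually uses.

```python
import hashlib
import re


def normalize(record: str) -> str:
    """Reduce a record to a canonical form before fingerprinting."""
    text = record.upper()                 # case differences no longer matter
    text = re.sub(r"[^\w\s]", " ", text)  # assumption: punctuation counts as formatting
    text = re.sub(r"\s+", " ", text)      # collapse tabs, newlines, repeated spaces
    return text.strip()


def fingerprint(record: str) -> str:
    """Stand-in fingerprint: SHA-256 of the normalized text (an assumption)."""
    return hashlib.sha256(normalize(record).encode("utf-8")).hexdigest()


original = "The secret formula: two parts sugar."
altered = "  the SECRET\tformula\n two parts sugar "
print(fingerprint(original) == fingerprint(altered))  # True
```

A whole-record hash like this only matches when the normalized text is identical, and it cannot produce a relevance score, so a real system would need a more tolerant fingerprint.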