Enterprise Vault™ Classification using the Microsoft File Classification Infrastructure

Product(s): Enterprise Vault (12.5)

Supported configuration parameters for rules that use the Veritas Information Classifier method

When you create a rule that uses the Veritas Information Classifier method, you must specify one or more additional configuration parameters. These parameters define the text strings or regular expressions for which you want to search in items. Each parameter consists of a name and a corresponding value.

You can specify multiple configuration parameters for the same rule. For example, you may want to create a rule that searches the subject lines of items for one word and their message bodies for a second word. In that case, an item must match every parameter for the rule to match: the Veritas Information Classifier links the parameters together with Boolean AND operators rather than OR operators.

Note:

To simulate the effect of linking multiple parameters with Boolean OR operators, create multiple rules that assign the same value to the same classification property. For example, you might create two rules that assign the same value to the evtag.category property: one rule that searches the subject lines of items for a word and a second rule that searches their message bodies for a different word.
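
The AND and OR behavior described above can be sketched in a few lines of Python. This is only an illustrative model, not the Veritas Information Classifier itself: the dictionary-based item, the rule tuples, and the Confidential value are hypothetical, and only the subj and cont property names come from this guide.

```python
import re

# Hypothetical model: an item is a dict of property name -> text, and a rule
# is a (value_to_assign, [(name, value), ...]) tuple.

def rule_matches(item, parameters):
    """Return True only if every parameter matches (Boolean AND)."""
    return all(
        re.search(value, item.get(name, ""), re.IGNORECASE) is not None
        for name, value in parameters
    )

def classify(item, rules):
    """Collect the values assigned by every matching rule.

    Several rules that assign the same value behave like a Boolean OR.
    """
    return {tag for tag, parameters in rules if rule_matches(item, parameters)}

# One rule with two parameters: both must match.
and_rule = [("Confidential", [("subj", "merger"), ("cont", "fraud")])]

# Two rules that assign the same value: either one is enough.
or_rules = [
    ("Confidential", [("subj", "merger")]),
    ("Confidential", [("cont", "fraud")]),
]

item = {"subj": "Merger update", "cont": "Quarterly figures attached."}
```

With this sample item, the single two-parameter rule assigns nothing because the message body does not contain the second word, whereas the pair of one-parameter rules assigns the value.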

Supported values for Name

The values that you type in the Name column of the Classification Parameters dialog box set the scope of the configuration parameter: they specify the properties of an item that you want to search.

You can search an individual property by typing its name in the Name column. For example, you might type cont to search the message body of an item or rbea to search the email addresses of its recipients. Indexed items can have a large number of properties, but only a subset is of interest for classification purposes. These are the properties and associated values that Enterprise Vault stores in the plain-text files in the classification cache folder.

If you want to classify the items in one archive only, the archiveid property lets you specify the unique identifier of this archive. For example, by specifying an archiveid property value in one configuration parameter and a cont property value in a second configuration parameter, you can limit classification to the items in the nominated archive that have particular words in their message bodies.

A number of composite properties are also available with which you can search multiple properties of items at once. Table: Composite properties describes these values.

Table: Composite properties

Name         Description

Attachment   Searches all the attachment-related properties: content, file name, size, type, and dates.
Author       Searches the author properties.
Content      Searches both the subject line and content of items and their attachments.
Item         Searches the item in its entirety: subject line, content, and all the classifiable properties of items and their attachments.
Recipient    Searches the recipient list properties.
Subject      Searches the subject lines of items and their attachments.

You can combine multiple properties in a single Name value by separating them with a pipe symbol (|). For example, the following Name value is equivalent to the composite value Subject because it lets you search the subject lines of an item (subj) and its attachments (a_subj).

subj|a_subj

The next example searches the subject lines of an item and its attachments (Subject) and the content of those attachments (a_cont).

Subject|a_cont
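
As a rough sketch of how a pipe-separated Name value resolves to individual properties, the following Python function expands composite names and removes duplicates. The COMPOSITES table and the expand_name function are hypothetical; only the Subject expansion is listed because it is the only one that this guide spells out.

```python
# Hypothetical lookup table. Only the Subject expansion is stated in the
# guide, so the other composite names are deliberately left out.
COMPOSITES = {"Subject": ["subj", "a_subj"]}

def expand_name(name):
    """Split a Name value on "|" and expand any composite it contains."""
    properties = []
    for part in name.split("|"):
        for prop in COMPOSITES.get(part, [part]):
            if prop not in properties:  # keep first occurrence only
                properties.append(prop)
    return properties

print(expand_name("subj|a_subj"))     # ['subj', 'a_subj']
print(expand_name("Subject|a_cont"))  # ['subj', 'a_subj', 'a_cont']
```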

Supported values for Value

In the Value column of the Classification Parameters dialog box, you specify what to search for: a word or phrase, for example, or a regular expression.

By default, the values that you enter are case-insensitive. So, the value Fraud matches not just Fraud but fraud and FRAUD as well. However, you can make a value case-sensitive by preceding it with (?-i). For example, (?-i)Fraud matches Fraud only.
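
The default case handling can be imitated in Python as follows. This is a sketch under stated assumptions: Python's re module handles inline flags differently from the .NET Framework, so the hypothetical compile_value helper strips a leading (?-i) and compiles the rest without the IGNORECASE flag instead of passing the prefix through.

```python
import re

def compile_value(value):
    """Compile a Value string, case-insensitive unless it starts with (?-i)."""
    prefix = "(?-i)"
    if value.startswith(prefix):
        # Case-sensitive: drop the prefix and compile without IGNORECASE.
        return re.compile(value[len(prefix):])
    return re.compile(value, re.IGNORECASE)

default = compile_value("Fraud")
strict = compile_value("(?-i)Fraud")

print(bool(default.search("FRAUD")))  # True
print(bool(strict.search("FRAUD")))   # False
print(bool(strict.search("Fraud")))   # True
```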

Specify date and time values as Coordinated Universal Time (UTC) values in the ISO 8601 format. According to ISO 8601, a combined date and time value has the following format:

yyyy-mm-ddThh:mm:ssZ

For example, 2016-07-12T13:00:00Z.
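
If you start from a local time, you need to convert it to UTC before you type the value. A minimal Python sketch of that conversion follows; the to_utc_iso8601 helper and the UTC-4 sample offset are illustrative assumptions, not part of Enterprise Vault.

```python
from datetime import datetime, timedelta, timezone

def to_utc_iso8601(moment):
    """Format a timezone-aware datetime as yyyy-mm-ddThh:mm:ssZ in UTC."""
    if moment.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")

# 09:00 in a zone that is four hours behind UTC is 13:00 UTC.
local = datetime(2016, 7, 12, 9, 0, 0, tzinfo=timezone(timedelta(hours=-4)))
print(to_utc_iso8601(local))  # prints 2016-07-12T13:00:00Z
```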

Table: Supported values in the Value column describes the types of values that the Veritas Information Classifier supports.

Table: Supported values in the Value column

Value

Description

A string

Searches for the specified word or phrase, such as fraud or cover up.

A regular expression

Searches for the specified regular expression. A regular expression is a pattern of text that consists of ordinary characters (for example, letters a through z) and special characters, called metacharacters. The pattern describes one or more strings to match when searching text. For example, the following regular expression matches the digit pattern of Visa card numbers, that is, a 13-digit or 16-digit number that begins with 4:

\b4[0-9]{12}(?:[0-9]{3})?\b

The regular expression docx? matches both doc and docx, so it is useful if you want to search for Microsoft Word documents.

Your regular expressions must conform to the .NET Framework regular expression syntax. For more information on this syntax, see the following articles on the Microsoft website:

https://msdn.microsoft.com/library/az24scfc.aspx

http://go.microsoft.com/fwlink/?LinkId=180327

For many illustrations of regular expression syntax, see the example classification rules.
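
Before you put a regular expression into a rule, it can help to try it against sample text. The sketch below uses Python's re module as a stand-in, on the assumption that these two particular patterns behave the same way there as in the .NET Framework; the sample card numbers are well-known test values, not real accounts. Note that the pattern checks only the shape of the number, not its checksum.

```python
import re

# The two patterns quoted in the text above.
VISA = re.compile(r"\b4[0-9]{12}(?:[0-9]{3})?\b")
WORD_DOC = re.compile(r"docx?")

samples = [
    "Card 4111111111111111 on file",  # 16 digits, starts with 4
    "Old card 4222222222222",         # 13 digits, starts with 4
    "Card 5111111111111111 on file",  # does not start with 4
    "Ref 41111111111111",             # 14 digits: wrong length
]
for text in samples:
    print(text, "->", bool(VISA.search(text)))

print(WORD_DOC.search("report.docx").group())  # docx
print(WORD_DOC.search("report.doc").group())   # doc
```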

A proximity search

Searches for words or regular expressions that are within the specified number of characters of each other. Punctuation and space characters count as normal characters. The syntax is as follows:

NEAR[proximity,regular_expression,regular_expression]

For example, type the following to find fraud and cover up within 100 characters of each other:

NEAR[100,fraud,cover up]

Type the following to find fraud and either cover up or write off within 150 characters of each other:

NEAR[150,fraud,(cover up|write off)]
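
The idea behind a proximity search can be sketched in Python as shown below. This is not the classifier's implementation: the near function is hypothetical, and it assumes that the distance is the number of characters between the end of the earlier match and the start of the later one, in either order. The product may measure the distance differently.

```python
import re

def near(text, proximity, first, second):
    """Return True if matches of the two expressions lie within
    `proximity` characters of each other, in either order.

    Assumption: distance is the gap between the end of the earlier
    match and the start of the later one; overlapping matches count as 0.
    """
    firsts = [m.span() for m in re.finditer(first, text, re.IGNORECASE)]
    seconds = [m.span() for m in re.finditer(second, text, re.IGNORECASE)]
    for a_start, a_end in firsts:
        for b_start, b_end in seconds:
            gap = max(b_start - a_end, a_start - b_end, 0)
            if gap <= proximity:
                return True
    return False

text = "The fraud was followed by a cover up."
print(near(text, 100, "fraud", "cover up"))  # True
print(near(text, 10, "fraud", "cover up"))   # False
```

Spaces and punctuation count toward the gap here, in line with the description above.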

A list of strings or regular expressions

Searches for multiple words, phrases, or regular expressions. The syntax is as follows:

LIST[string_or_regular_expression|string_or_regular_expression|...]

For example, to find cost of sales, earnings per share, or financial expenses, type the following:

LIST[cost of sales|earnings per share|financial expenses]

If you want to enter a list that contains many hundreds of words or phrases, you may be able to maximize performance with the following, alternative syntax:

LARGELIST[string1|string2|string3|...]

LARGELIST uses a different method for evaluating the list against the item properties. You can further enhance performance by placing the words or phrases that are most likely to find a match at the start of the list.

Note:

Unlike LIST, LARGELIST does not support regular expressions.
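
The practical difference between the two forms is that LIST entries are interpreted as regular expressions, whereas LARGELIST entries are plain strings. The Python sketch below illustrates only that difference. The compile_list helper is hypothetical, and it says nothing about how LARGELIST evaluates its entries internally, which this guide does not describe.

```python
import re

def compile_list(value):
    """Turn a LIST[...] or LARGELIST[...] value into a compiled pattern."""
    if value.startswith("LARGELIST[") and value.endswith("]"):
        # Plain strings only: escape every entry so that metacharacters
        # such as ? are matched literally.
        entries = value[len("LARGELIST["):-1].split("|")
        pattern = "|".join(re.escape(entry) for entry in entries)
    elif value.startswith("LIST[") and value.endswith("]"):
        # The body is already a valid alternation of regular expressions.
        pattern = value[len("LIST["):-1]
    else:
        raise ValueError("not a LIST or LARGELIST value")
    return re.compile(pattern, re.IGNORECASE)

finance = compile_list("LIST[cost of sales|earnings per share|financial expenses]")
print(bool(finance.search("Earnings per share rose")))  # True

print(bool(compile_list("LIST[docx?|pdf]").search("a doc")))       # True
print(bool(compile_list("LARGELIST[docx?|pdf]").search("a doc")))  # False
```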

A date range

For use with date-type properties only, such as adat, date, and mdat. Searches for items with a date property value that falls within the specified date range. Ranges can be open-ended. The syntax is as follows:

  • YYYY-MM-DD..YYYY-MM-DD

    For example, 2016-01-20..2016-06-19 finds items between these two dates.

  • YYYY-MM..YYYY-MM

    For example, 2015-01..2016-07 finds items between these two months.

  • YYYY..YYYY

    For example, 2015..2016 finds items between these two years.

  • YYYY-MM-DD..

    For example, 2016-01-20.. finds items after this date.

  • ..YYYY-MM-DD

    For example, ..2016-01-20 finds items before this date.

The dates are in the current time zone on the Enterprise Vault storage server.
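
The range forms listed above can be modeled with a short Python sketch. The parse_range and in_range helpers are hypothetical, and they make two assumptions that this guide does not confirm: both ends of a range are inclusive, and a month or year bound widens to the first or last day of that period.

```python
import calendar
from datetime import date

def _bound(text, is_end):
    """Convert YYYY, YYYY-MM, or YYYY-MM-DD to the first or last day it covers."""
    parts = [int(piece) for piece in text.split("-")]
    year = parts[0]
    month = parts[1] if len(parts) > 1 else (12 if is_end else 1)
    if len(parts) > 2:
        day = parts[2]
    else:
        day = calendar.monthrange(year, month)[1] if is_end else 1
    return date(year, month, day)

def parse_range(value):
    """Return (start, end); None marks an open-ended side."""
    start_text, separator, end_text = value.partition("..")
    if not separator:
        raise ValueError("a date range must contain '..'")
    start = _bound(start_text, False) if start_text else None
    end = _bound(end_text, True) if end_text else None
    return start, end

def in_range(day, value):
    """Assumption: both bounds are inclusive."""
    start, end = parse_range(value)
    return (start is None or day >= start) and (end is None or day <= end)

print(parse_range("2015-01..2016-07"))
print(in_range(date(2016, 3, 1), "2016-01-20..2016-06-19"))
```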