Enterprise Vault™ Classification using the Veritas Information Classifier

Product(s): Enterprise Vault (14.5)

Troubleshooting language detection

By default, Veritas Information Classifier detects the language of a message only if the message contains at least 80 characters. With Veritas Information Classifier 2.4.0, an administrator can configure the minimum number of characters and raise or lower the confidence level required for language detection. When small files contain multiple languages, the administrator can also reduce the size of the chunks on which language detection is performed.

Perform the following steps:

  1. Navigate to the C:\Program Files (x86)\Enterprise Vault\Services\vic\Engine directory and open the .vic-overrides-config.yml file with a text editor.

    You use this file to override the Veritas Information Classifier configuration settings and customize its behavior.

  2. Ensure that the property languageDetectionEnabled under the classifier section is set to true.
  3. To override any values for language detection, set the values for the following properties under the classifier section.

    | Property | Description |
    | --- | --- |
    | minimumTextRequiredForLanguageDetection | Specify the minimum length of text required for language detection. Any text shorter than this value is designated as language "unknown". The default value is 80 Unicode characters. |
    | chunkSizeForLanguageDetection | Specify the size of each chunk that language detection is performed on. The default value is 300. For example, if a document is 500 Unicode characters long, Veritas Information Classifier detects the language of the first 300 characters and then of the last 200 characters. The language that occurs most often is designated as the primary language. When a document contains fewer than 300 Unicode characters and multiple languages are present, use this property to reduce the chunk size for language detection. |
    | minimumConfidenceForLanguageDetection | Specify the minimum confidence level required to detect a language. A higher confidence level gives greater accuracy but makes it more likely that the language is determined as "unknown". The value must be between 1 and 100. The default value is 90. |

    An example of the override entries:

    classifier:
      minimumTextRequiredForLanguageDetection: 200
      chunkSizeForLanguageDetection: 400
      minimumConfidenceForLanguageDetection: 90
    
  4. Save the .vic-overrides-config.yml file.
  5. Recycle the EnterpriseVaultVIC application pool.

    The changes are reflected in the .vic-merged-config.yml file in the C:\Program Files (x86)\Enterprise Vault\Services\vic\Engine directory.
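
To help reason about how the three properties interact, the following Python sketch models the behavior described above: text shorter than the minimum is "unknown", longer text is split into fixed-size chunks, and the language that occurs most often becomes the primary language. It is an illustration only, not the Veritas Information Classifier implementation. The `detect` callback, the per-chunk use of the confidence threshold, and the tie-breaking rule are assumptions made for this sketch.

```python
from collections import Counter
from typing import Callable, List, Tuple

# Default values as documented for the classifier section.
MIN_TEXT = 80        # minimumTextRequiredForLanguageDetection
CHUNK_SIZE = 300     # chunkSizeForLanguageDetection
MIN_CONFIDENCE = 90  # minimumConfidenceForLanguageDetection

# Hypothetical detector: takes a chunk, returns (language, confidence 1-100).
Detector = Callable[[str], Tuple[str, float]]


def split_into_chunks(text: str, chunk_size: int = CHUNK_SIZE) -> List[str]:
    """Split text into consecutive chunks of at most chunk_size characters."""
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]


def primary_language(
    text: str,
    detect: Detector,
    min_text: int = MIN_TEXT,
    chunk_size: int = CHUNK_SIZE,
    min_confidence: float = MIN_CONFIDENCE,
) -> str:
    """Return the most frequent chunk language, or "unknown"."""
    # Text below the minimum length is never analyzed.
    if len(text) < min_text:
        return "unknown"

    votes: Counter = Counter()
    for chunk in split_into_chunks(text, chunk_size):
        language, confidence = detect(chunk)
        # Assumption: a chunk below the confidence threshold casts no vote.
        if confidence >= min_confidence:
            votes[language] += 1

    if not votes:
        return "unknown"
    # Assumption: on a tie, the language seen first wins.
    return votes.most_common(1)[0][0]


# A 500-character document yields chunks of 300 and 200 characters.
print([len(c) for c in split_into_chunks("x" * 500)])  # [300, 200]
```

In this model, lowering `chunk_size` gives a short multi-language document more chunks and therefore more chances to register each language, while raising `min_confidence` discards more chunks and makes an "unknown" result more likely.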