Description
Index volume until Enterprise Vault 14.1
Until Veritas Enterprise Vault 14.1, each index volume mapped one-to-one to an index (collection) managed by the Velocity or AltaVista index engine. An index volume can therefore be physically located in one of the available non-Elasticsearch index locations, in a folder named after its volume ID.
Index volume from Enterprise Vault 14.2 onwards
Enterprise Vault 14.2 introduces the Elasticsearch index engine, which internally distributes the data ingested into an index across multiple shards to optimize read and write performance. As a result, the mapping between an Elasticsearch index and Enterprise Vault index volumes is one-to-many: multiple index volumes can be associated with one physical Elasticsearch index, and the data from multiple index volumes can reside in a single index. This association is maintained through aliases mapped to the index.
Each content source or archive type has a separate Elasticsearch index: Exchange Mailbox, Exchange Journal, Domino Mailbox, Domino Journal, SMTP Journal, and so on. The naming format of an Elasticsearch index is {ComputerName}_{ArchiveType}_{Number}.
Example:
An Exchange Mailbox archive triggering the first index creation on the index server 'EVINDX1' will have the name 'EVINDX1_mb_1'. As soon as the thresholds are reached for the existing index for Exchange Mailboxes, a new index will be created with the name 'EVINDX1_mb_2'.
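The naming format can be illustrated with a minimal sketch. The function name index_name is hypothetical and is not part of Enterprise Vault. The only archive type abbreviation taken from this article is 'mb' for Exchange Mailbox; abbreviations for other archive types are not listed here and should not be inferred from this example.

```python
def index_name(computer_name: str, archive_type: str, number: int) -> str:
    """Build a name in the {ComputerName}_{ArchiveType}_{Number} format."""
    return f"{computer_name}_{archive_type}_{number}"


# First Exchange Mailbox index on index server EVINDX1, then the index
# created once the thresholds of the first one are reached.
first = index_name("EVINDX1", "mb", 1)
second = index_name("EVINDX1", "mb", 2)
print(first, second)
```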
Only a limited number of archives can be indexed in a single Elasticsearch index. All index volumes associated with an archive are indexed into the same index unless size thresholds trigger the creation of a new index, in which case the archive can span multiple indices of the same type.
When an archive is associated with an Elasticsearch index, an alias named after each of its index volumes is associated with that index.
Example:
An archive "EXArchive1" has the index volumes "EXArchive1IV1" and "EXArchive1IV2". Items from both index volumes go into the same index, provided space is available. The Elasticsearch index then has the aliases EXArchive1IV1 and EXArchive1IV2.
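The relationship between one index and its aliases can be modelled as a simple lookup table. This is only a conceptual sketch and does not call Elasticsearch. The index name EVINDX1_mb_1 is an assumption borrowed from the earlier naming example, and the helpers associate and resolve are hypothetical.

```python
from collections import defaultdict

# One Elasticsearch index name maps to many index volume aliases.
aliases_by_index = defaultdict(set)


def associate(index: str, index_volume: str) -> None:
    """Record an index volume as an alias of the given index."""
    aliases_by_index[index].add(index_volume)


def resolve(index_volume: str):
    """Return the index that holds the given index volume, or None."""
    for index, aliases in aliases_by_index.items():
        if index_volume in aliases:
            return index
    return None


# Both index volumes of EXArchive1 land in the same (assumed) index.
associate("EVINDX1_mb_1", "EXArchive1IV1")
associate("EVINDX1_mb_1", "EXArchive1IV2")
print(resolve("EXArchive1IV2"))
```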
The archive type is taken into account when an Elasticsearch index is created. Archive types fall into two broad groups: journal archive types (Exchange Journal, Domino Journal, SMTP Journal) and all other archive types.
As a general pattern, journal archives are fewer in number but can hold a large amount of data per archive, whereas non-journal archives are more numerous but hold less data per archive.
The following threshold parameters/keys can be defined in EVIndexVolumeProcessor.exe.config in the app settings section to override default limits for the size, the number of shards, and maximum archives per index:
- MaximumArchivesPerIndexForLargeData: Number of archives that can be associated per Elasticsearch index. The default value is 7. Large data refers to Exchange, Domino, and SMTP journal archives.
- MaximumArchivesPerIndexForSmallerData: Number of archives that can be associated per Elasticsearch index. The default value is 600. Smaller data refers to all archives except Exchange, Domino, and SMTP journal archives.
- ShardsPerIndexForLargeData: Number of shards allocated per index for the Exchange Journal, Domino Journal, and SMTP Journal archive types. The default value is 10.
- ShardsPerIndexForSmallerData: Number of shards allocated per index for all archive types except the Exchange Journal, Domino Journal, and SMTP Journal archive types. The default value is 5.
- CommaSeparatedLargeDataArchiveTypes: Comma-separated string of archive types that need to be classified as large data. Default classification for large data archive types can be overridden by using this setting. Default value is DV_DS_VT_JN_ARCHIVE, DV_DS_VT_LOTUS_JN_ARCHIVE, DV_DS_VT_SMTP_ARCHIVE. MaximumArchivesPerIndexForLargeData and ShardsPerIndexForLargeData rely on this for large data archive types.
- MaximumShardSizeMB: Maximum storage capacity of one shard inside an index. The value is always defined in MB. The default value is 50000 MB, which is equal to 50 GB per shard. The total index size can be derived as Number of Shards Per Index × MaximumShardSizeMB.
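The following sketch shows how overrides and defaults could combine, and how the total index size is derived from them. It assumes the standard .NET appSettings layout of add elements with key and value attributes; the sample override value of 12 is purely illustrative, and the helper names are hypothetical. Only the numeric keys are modelled here.

```python
import xml.etree.ElementTree as ET

# Default values as listed above (numeric keys only).
DEFAULTS = {
    "MaximumArchivesPerIndexForLargeData": 7,
    "MaximumArchivesPerIndexForSmallerData": 600,
    "ShardsPerIndexForLargeData": 10,
    "ShardsPerIndexForSmallerData": 5,
    "MaximumShardSizeMB": 50000,
}

# Illustrative config fragment; the value 12 is an assumed override.
SAMPLE_CONFIG = """
<configuration>
  <appSettings>
    <add key="ShardsPerIndexForLargeData" value="12" />
  </appSettings>
</configuration>
"""


def effective_settings(config_xml: str) -> dict:
    """Overlay any appSettings overrides on top of the defaults."""
    settings = dict(DEFAULTS)
    root = ET.fromstring(config_xml.strip())
    for add in root.findall("./appSettings/add"):
        key = add.get("key")
        if key in settings:
            settings[key] = int(add.get("value"))
    return settings


def total_index_size_mb(settings: dict, large_data: bool) -> int:
    """Number of shards per index multiplied by MaximumShardSizeMB."""
    key = "ShardsPerIndexForLargeData" if large_data else "ShardsPerIndexForSmallerData"
    return settings[key] * settings["MaximumShardSizeMB"]


print(total_index_size_mb(DEFAULTS, large_data=True))
print(total_index_size_mb(DEFAULTS, large_data=False))
```

With the defaults, this gives 500000 MB (500 GB) per journal index and 250000 MB (250 GB) per non-journal index.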
Creation of a new Elasticsearch index is triggered when either the total index size (Number of Shards Per Index × MaximumShardSizeMB) or the maximum archives per index limit is reached. New archives and their index volumes are then associated with the newly created index.
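The trigger condition can be expressed as a small predicate. This is a sketch of the rule as described here, not the product's implementation: the function name is hypothetical, and treating "reached" as greater than or equal to the limit is an assumption.

```python
def needs_new_index(current_size_mb: int, archive_count: int,
                    shards_per_index: int, max_shard_size_mb: int,
                    max_archives_per_index: int) -> bool:
    """True when either the size limit or the archive limit is reached."""
    total_index_size_mb = shards_per_index * max_shard_size_mb
    size_reached = current_size_mb >= total_index_size_mb
    archives_reached = archive_count >= max_archives_per_index
    return size_reached or archives_reached


# Journal defaults: 10 shards, 50000 MB per shard, 7 archives per index.
print(needs_new_index(100000, 3, 10, 50000, 7))   # neither limit reached
print(needs_new_index(100000, 7, 10, 50000, 7))   # archive limit reached
print(needs_new_index(500000, 3, 10, 50000, 7))   # size limit reached
```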
By default, a maximum of 600 shards is allowed per index server, after which creation of new Elasticsearch indices is blocked. In that case, only new items going into existing index volumes that are mapped to existing Elasticsearch indices are indexed, and only until the total index size is reached.
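As a rough capacity check, the shard limit per index server can be combined with the shards-per-index defaults. The helper names below are hypothetical, and the assumption that a new index is allowed only if its shards still fit under the limit is an interpretation of the rule above rather than documented behavior.

```python
MAX_SHARDS_PER_INDEX_SERVER = 600  # default limit stated above


def max_indices(shards_per_index: int,
                cap: int = MAX_SHARDS_PER_INDEX_SERVER) -> int:
    """Upper bound on indices if every index used the same shard count."""
    return cap // shards_per_index


def can_create_index(shards_in_use: int, shards_for_new_index: int,
                     cap: int = MAX_SHARDS_PER_INDEX_SERVER) -> bool:
    """Assumed check: the new index must fit within the shard limit."""
    return shards_in_use + shards_for_new_index <= cap


print(max_indices(10))  # journal indices only, default 10 shards each
print(max_indices(5))   # non-journal indices only, default 5 shards each
```

With the defaults, a server holding only journal indices would reach the limit at 60 indices, and one holding only non-journal indices at 120. A real server usually holds a mix of both.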