How to configure EV to optimally detect Japanese ANSI code pages

Article: 100047788
Last Published: 2020-06-02
Product(s): Enterprise Vault

Description

Enterprise Vault converts non-Unicode text files (also known as ANSI text files) into Unicode before indexing them and generating previews. Because text files have no encoding property, Enterprise Vault needs to "guess" the encoding from the file contents. For this purpose, Enterprise Vault uses Microsoft's DetectInputCodepage() function.
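
As a minimal illustration of why the guessed code page matters, the Python sketch below decodes the same bytes with two different code pages. It uses Python's built-in cp932 and cp1252 codecs rather than any Enterprise Vault code, and the sample string is an assumption chosen only for illustration.

```python
# Sample text is hypothetical; any Japanese string would do.
original = "日本語のテキスト"

# A Japanese "ANSI" text file stores this text as code page 932 bytes.
raw = original.encode("cp932")

# Decoding with the correct code page recovers the text.
correct = raw.decode("cp932")

# Decoding with a wrong guess (Western European, code page 1252) does not.
# errors="replace" substitutes U+FFFD for bytes that cp1252 cannot map.
wrong = raw.decode("cp1252", errors="replace")

print(correct == original)  # True
print(wrong == original)    # False
```

Text converted with the wrong code page is indexed as unrelated characters, so it cannot be found by searching for the original Japanese words.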

IMultiLanguage2::DetectInputCodepage method
https://docs.microsoft.com/en-us/previous-versions/windows/internet-explorer/ie-developer/platform-apis/aa740986(v%3Dvs.85)

DetectInputCodepage() usually returns multiple guesses, each with related factors such as a confidence level. The calling application (Enterprise Vault in this case) needs to select one encoding from these results. The encoding with the highest confidence is not always correct, especially when the text is short. (The longer the text file, the more likely DetectInputCodepage() is to guess the correct encoding.)
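
The selection step described above might be sketched as follows. This is a hypothetical rule written for illustration, not Enterprise Vault's actual logic: the Guess class is modeled on the nCodePage, nDocPercent and nConfidence fields of Microsoft's DetectEncodingInfo structure, and the function name, its parameters, and the way the thresholds and fallback are applied are all assumptions.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Guess:
    """One detection result, modeled on DetectEncodingInfo (assumed mapping)."""
    code_page: int    # nCodePage
    doc_percent: int  # nDocPercent
    confidence: int   # nConfidence


def choose_code_page(guesses: Iterable[Guess], min_confidence: int,
                     min_doc_percent: int, fallback: int) -> int:
    """Hypothetical rule: the most confident guess that passes both thresholds,
    otherwise the fallback code page."""
    eligible = [g for g in guesses
                if g.confidence >= min_confidence
                and g.doc_percent >= min_doc_percent]
    if not eligible:
        return fallback
    return max(eligible, key=lambda g: g.confidence).code_page


# Short text: no guess reaches the confidence threshold, so the fallback is used.
weak = [Guess(1252, 40, 25), Guess(932, 60, 20)]
print(choose_code_page(weak, min_confidence=30, min_doc_percent=0, fallback=932))

# Longer text: one guess is clearly stronger and is selected.
strong = [Guess(1252, 20, 35), Guess(932, 80, 90)]
print(choose_code_page(strong, min_confidence=30, min_doc_percent=0, fallback=1252))
```

In this sketch a fallback code page protects short files, where no guess is reliable, from being converted with an arbitrary low-confidence result.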

Enterprise Vault provides several configuration parameters to fine-tune this selection process. They are implemented as registry values, which are explained in a separate Knowledge Base article (100017458).

When using Enterprise Vault (EV), it is not possible to search for some messages or files.
https://www.veritas.com/docs/100017458

This article provides the recommended configuration values for detecting Japanese ANSI code pages correctly.

Warning: Incorrect use of the Windows registry editor may prevent the operating system from functioning properly. Great care should be taken when making changes to the Windows registry. Registry modifications should only be carried out by persons experienced in the use of the registry editor application. It is recommended that a complete backup of the registry and workstation be made prior to making any registry changes.

Recommended Codepage Detection configuration values for Japanese

Registry                  Value                     Remarks
CodepageOverride          DWORD 0x000003a4 (932)
DecisionType              DWORD 0x00000001 (1)
FallbackCodepage          DWORD 0x000003a4 (932)
LogConversions            DWORD 0x00000000 (0)      *1
MinimumConfidenceLevel    DWORD 0x0000001e (30)
MinimumDocumentPercent    DWORD 0x00000000 (0)

*1: Setting this value to "1" is recommended only immediately after the configuration has been changed, to confirm how these parameters work.
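
Because the table gives each value in both hexadecimal and decimal, a quick cross-check before entering them can help catch transcription mistakes. The Python sketch below is only such a check; the dictionary name is an assumption, and the script does not read or write the registry.

```python
# Recommended values from the table, as (hexadecimal literal, decimal) pairs.
RECOMMENDED = {
    "CodepageOverride": (0x000003A4, 932),
    "DecisionType": (0x00000001, 1),
    "FallbackCodepage": (0x000003A4, 932),
    "LogConversions": (0x00000000, 0),
    "MinimumConfidenceLevel": (0x0000001E, 30),
    "MinimumDocumentPercent": (0x00000000, 0),
}

for name, (hex_value, decimal_value) in RECOMMENDED.items():
    # Both notations must describe the same DWORD.
    assert hex_value == decimal_value, name
    print(f"{name}: DWORD 0x{hex_value:08x} ({decimal_value})")
```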
