Veritas Data Insight Administrator's Guide
- Section I. Getting started
- Introduction to Veritas Data Insight administration
- Configuring Data Insight global settings
- Overview of Data Insight licensing
- About scanning and event monitoring
- About filtering certain accounts, IP addresses, and paths
- About archiving data
- About Data Insight integration with Symantec Data Loss Prevention (DLP)
- Configuring advanced analytics
- About open shares
- About bulk assignment of custodians
- Section II. Configuring Data Insight
- Configuring Data Insight product users
- Configuring Data Insight product servers
- About node templates
- About automated alerts for patches and upgrades
- Configuring saved credentials
- Configuring directory service domains
- Configuring containers
- Section III. Configuring native file systems in Data Insight
- Configuring NetApp file server monitoring
- Configuring clustered NetApp file server monitoring
- About configuring secure communication between Data Insight and cluster-mode NetApp devices
- Configuring EMC Celerra or VNX monitoring
- Configuring EMC Isilon monitoring
- Configuring EMC Unity VSA file servers
- Configuring Hitachi NAS file server monitoring
- Configuring Windows File Server monitoring
- Configuring Veritas File System (VxFS) file server monitoring
- Configuring monitoring of a generic device
- Managing file servers
- Adding filers
- Adding shares
- Renaming storage devices
- Section IV. Configuring SharePoint data sources
- Configuring monitoring of SharePoint web applications
- About the Data Insight web service for SharePoint
- Adding web applications
- Adding site collections
- Configuring monitoring of SharePoint Online accounts
- About SharePoint Online account monitoring
- Adding SharePoint Online accounts
- Adding site collections to SharePoint Online accounts
- Section V. Configuring cloud data sources
- Section VI. Configuring ECM data sources
- Section VII. Health and monitoring
- Section VIII. Alerts and policies
- Section IX. Remediation
- Section X. Reference
- Appendix A. Backing up and restoring data
- Appendix B. Data Insight health checks
- Appendix C. Command File Reference
- Appendix D. Data Insight jobs
- Appendix E. Troubleshooting
- Troubleshooting FPolicy issues on NetApp devices
Scheduled Data Insight jobs
Each Data Insight service performs several actions on a scheduled basis. These scheduled actions are called jobs. This section explains the function of the important jobs that run in the various services. The schedule for a few of these jobs can be changed from the Advanced Settings tab of the Server details page.
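As a mental model for the tables that follow, the recurring-schedule behavior described above can be sketched with Python's standard library. This is purely illustrative; the product does not expose such an API, and the job names and intervals below are placeholders, not the actual schedules.

```python
import sched
import time

class JobScheduler:
    """Minimal recurring-job runner, loosely modeled on how a service
    might fire its jobs at fixed intervals (illustrative only)."""

    def __init__(self):
        self.sched = sched.scheduler(time.time, time.sleep)
        self.run_counts = {}

    def add_job(self, name, interval, repeats):
        """Schedule `name` to run every `interval` seconds, `repeats` times."""
        def tick(remaining):
            # Record one execution, then re-arm the timer if runs remain.
            self.run_counts[name] = self.run_counts.get(name, 0) + 1
            if remaining > 1:
                self.sched.enter(interval, 1, tick, (remaining - 1,))
        self.sched.enter(interval, 1, tick, (repeats,))

    def run(self):
        self.sched.run()

scheduler = JobScheduler()
# Hypothetical jobs; intervals shrunk to fractions of a second for the demo.
scheduler.add_job("PingHeartBeatJob", 0.01, 3)
scheduler.add_job("SystemMonitorJob", 0.02, 2)
scheduler.run()
print(scheduler.run_counts)
```

Changing a job's schedule from the Advanced Settings tab corresponds, in this sketch, to re-registering the job with a different interval.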
Table: Communication service jobs
Job | Description |
---|---|
ADScanJob | Initiates the adcli process on the Management Server to scan the directory servers. |
CollectorJob | Initiates the collector process to pre-process raw audit events received from storage devices. The job applies exclude rules and heuristics to generate audit files to be sent to the Indexers. It also generates change-logs that are used for incremental scanning. |
ChangeLogJob | Consumes the change-log files that the CollectorJob generates for incremental scanning. |
ScannerJob | Initiates the scanner process to scan the shares and site collections added to Data Insight, and creates the scan database for each share that it scans. |
IScannerJob | Initiates the incremental scan process for shares or site collections, covering the paths that have changed on those devices since the last scan. |
CreateWorkflowDBJob | Runs only on the Management Server. It creates the database containing the data for DLP Incident Management, Entitlement Review, and Ownership Confirmation workflows based on the input provided by users. |
DlpSensitiveFilesJob | Retrieves policies and sensitive file information from Data Loss Prevention (DLP). |
FileTransferJob | Transfers files from one Data Insight node to another. |
FileTransferJob_content | Runs every 10 seconds on the Windows File Server. Routes the content and CSQLite files to the assigned Classification Server. |
FileTransferJob_Evt | Sends Data Insight events database from the worker node to the Management Server. |
FileTransferJob_WF | Transfers workflow files from Management Server to the Portal service. |
FileTransferJob_classify | Runs on all Data Insight nodes once every minute. It distributes the classification events between Data Insight nodes. |
IndexWriterJob | Runs on the Indexer node; initiates the idxwriter process to update the Indexer database with scan (incremental and full), tags, and audit data. After this process runs, you can view newly added or deleted folders and recent access events on shares on the Management Console. |
ActivityIndexJob | Runs on the Indexer node; updates the activity index every time the index for a share or site collection is updated. The activity index is used to speed up the computation of data ownership. |
IndexCheckJob | Verifies the integrity of the index databases on an Indexer node. |
PingHeartBeatJob | Sends the heartbeat every minute from the worker node to the Data Insight Management Server. |
PingMonitorJob | Runs on the Management Server. It monitors the heartbeat from the worker nodes; sends notifications in case it does not get a heartbeat from the worker node. |
SystemMonitorJob | Runs on the worker nodes and on the Management Server. Monitors the CPU, memory, and disk space utilization at a scheduled interval. The process sends notifications to the user when the utilization exceeds a certain threshold value. |
DiscoverSharesJob | Discovers shares, site collections, or equivalent on the devices for which you have selected the Automatically discover and monitor shares on this filer check box when configuring the device in Data Insight. |
ScanPauseResumeJob | Checks the changes to the pause and resume settings on the Data Insight servers, and accordingly pauses or resumes scans. |
DataRetentionJob | Enforces the data retention policies, which include archiving old index segments and deleting old segments, indexes for deleted objects, old system events, and old alerts. |
IndexVoldbJob | Runs on the Management Server and executes the command voldb.exe --index, which consumes the device volume utilization information received from the various Collector nodes. |
SendNodeInfoJob | Sends the node information, such as the operating system, and the Data Insight version running on the node to the Management Server. You can view this information on the Data Insight Server > Overview page of the Management Console. |
EmailAlertsJob | Runs on the Management Server and sends email notifications as configured in Data Insight. The notifications pertain to events happening in the product, for example, a directory scan failure. You can view them on the Settings > System Overview page of the Management Console. |
LocalUsersScanJob | Runs on the Collector node that monitors configured file servers and SharePoint servers. In case of a Windows File Server that uses an agent to monitor access events, it runs on the node on which the agent is installed. It scans the local users and groups on the storage devices. |
UpdateCustodiansJob | Runs on the Indexer node and updates the custodian information in the Data Insight configuration. |
CompactJob | Compresses stored data folders. The job also deletes stale data that is no longer being used. |
Compact_Job_Report | Compresses the folders that store report output. |
StatsJob | On the Indexer node, records index size statistics. |
MergeStatsJob | Rolls up the published statistics into hourly, daily, and weekly periods. On the Collector nodes for Windows File Server, the job consolidates statistics from the filer nodes. |
StatsJob_Index_Size | Publishes statistics related to the size of the index. |
StatsJob_Latency | On the Collector node, it records the filer latency statistics for NetApp filers. |
SyncScansJob | Gets current scan status from all Collector nodes. The scan status is displayed on the Settings > Scanning Dashboard > In-progress Scans tab of the Management Console. |
SPEnableAuditJob | Enables auditing for site collections (within the web application), which have been added to Data Insight for monitoring. By default, the job runs every 10 minutes. |
SPAuditJob | Collects the audit logs from the SQL Server database for a SharePoint web application and generates SharePoint audit databases in Data Insight. |
SPScannerJob | Scans the site collections at the scheduled time and fetches data about the document and picture libraries within each site collection and within the sites in the site collection. |
NFSUserMappingJob | Maps every user and group ID in the raw audit files received from NFS and VxFS devices to an ID generated for use in Data Insight. |
MsuAuditJob | Collects statistics for all indexes on the Indexer node. |
MsuMigrationJob | Checks whether a filer migration is in progress and carries it out. |
ProcessEventsJob | Processes all the Data Insight events received from worker nodes and adds them to the yyyy-mm-dd_events.db file on the Management Server. |
ProcessEventsJob_SE | Processes scan error files. |
SpoolEventsJob | Spools events on worker nodes to be sent to Management Server. |
WFStatusMergeJob | Merges the workflow and action status updates for remediation workflows (DLP Incident Remediation, Entitlement Reviews, Ownership Confirmation), Enterprise Vault archiving, and custom actions, and updates the master workflow database with the details so that users can monitor the progress of workflows and actions from the Management Console. |
UpdateConfigJob | Reconfigures jobs based on the configuration changes made on the Management Server. |
DeviceAuditJob | Fetches the audit records from the Hitachi NAS EVS devices that are configured with Data Insight. By default, this job runs every 5 seconds. |
HNasEnableAuditJob | Enables the Security Access Control Lists (SACLs) for the shares when a Hitachi NAS filer is added. By default, this job runs every 10 minutes. |
WorkflowActionExecutionJob | This job reads the request file created on the Management Server when a Records Classification workflow is submitted from the Portal. The request file contains the paths on which an Enterprise Vault action is submitted. When the action on the paths is complete, the job updates the request file with the status of the action. By default, this job runs every 1 hour. |
UserRiskJob | Runs on each Indexer. The job updates the hashes used to compute the user risk score. By default, the job runs at 2:00 A.M. every day. |
UpdateWFCentralAuditDBJob | Runs only on the Management Server; updates the central workflow audit information. By default, this job runs every minute. |
TagsConsumerJob | Parses the CSV file containing tags for paths. Imports the attributes into Data Insight and creates a Tags database for each filesystem object. By default, this job runs once every day. |
KeyRotationJob | Run this job on demand to change the encryption keys; it is not an automatically scheduled job. It is recommended that you run this job after all Data Insight servers, including the Windows File Server agent servers, are upgraded to 5.2. If you run the KeyRotationJob without upgrading all the servers, restart all services on the servers that have not been upgraded after the KeyRotationJob is executed and the configuration database is replicated to these servers. |
RiskDossierJob | Runs on each Indexer and computes the number of files and the number of sensitive files accessible to each user on each share. By default, this job runs every day at 11:00 P.M. |
ClassifyInputJob | Runs every 10 seconds on the Management Server. The job processes the classification requests from the Data Insight console and from reports for consumption by the bookkeeping database. |
ClassifyBatchJob | Runs every minute on the Indexer. The job splits the classification batch input databases for the scanner's consumption, which are later pushed to the Collector. |
ClassifyIndexJob | Runs once every minute on the Indexer node. Updates the index with classification tags and also updates the status in the bookkeeping database. |
ClassifyMergeStatusJob | Runs once every minute on the Management Server. The job processes the files with the classification update status that are received from each Indexer; these files are automatically created on the Indexer whenever updates are available. It also updates the global bookkeeping database that is used to show high-level classification status on the Console. |
CloudDeviceAuditJob_sponline | Runs once every 70 seconds on the Collector. Collects the audit data for site collections (within the SharePoint Online account), which have been added to Data Insight for monitoring. |
CloudDeviceAuditJob_onedrive | Runs once every 70 seconds on the Collector. Fetches the audit records for the OneDrive accounts that are configured with Data Insight. |
RTWBJob | Runs once every minute on the Indexer to evaluate the configured Real-time Data Activity User Whitelist-based and Data Activity User Blacklist-based policies, and generates alerts. |
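The PingHeartBeatJob and PingMonitorJob rows above describe a classic heartbeat protocol: each worker reports in once a minute, and the Management Server raises a notification for any node whose last heartbeat is too old. A minimal sketch of that handshake, with an invented timeout value and node names:

```python
HEARTBEAT_TIMEOUT = 180  # seconds of silence before alerting (illustrative value)

class HeartbeatMonitor:
    """Sketch of the PingHeartBeatJob / PingMonitorJob pattern:
    workers record heartbeats; the monitor reports nodes gone quiet."""

    def __init__(self, timeout=HEARTBEAT_TIMEOUT):
        self.timeout = timeout
        self.last_seen = {}

    def heartbeat(self, node, now):
        # Worker side: PingHeartBeatJob sends one beat per minute.
        self.last_seen[node] = now

    def silent_nodes(self, now):
        # Management Server side: PingMonitorJob flags missing beats.
        return sorted(n for n, t in self.last_seen.items()
                      if now - t > self.timeout)

monitor = HeartbeatMonitor()
monitor.heartbeat("indexer1", now=0)
monitor.heartbeat("collector1", now=0)
monitor.heartbeat("indexer1", now=120)   # indexer1 keeps beating
print(monitor.silent_nodes(now=240))     # collector1 has gone quiet
```

Real timestamps would come from the system clock; explicit `now` values keep the sketch deterministic.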
The following processes run in the Data Insight WatchDog service.
Table: WatchDog service jobs
Job | Description |
---|---|
SyncPerformanceStatsJob | Runs only on the Management Server. Fetches performance-related statistics from all other servers. |
SystemMonitorJob | Gathers statistics such as disk usage, CPU usage, and memory usage. |
SystemMonitorJob_backlog | Gathers statistics for unprocessed backlog files. |
UpdateConfigJob | Reconfigures its own jobs based on configuration updates from the Management Server. |
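The SystemMonitorJob rows (here and in the Communication service table) describe threshold-based monitoring: sample utilization, compare against configured limits, and notify on anything over the line. A sketch of that rule, using only the standard library; the threshold values and metric names are illustrative, not the product's configured defaults:

```python
import shutil

# Illustrative thresholds; the actual values are configured in the product.
THRESHOLDS = {"disk_pct": 90.0, "cpu_pct": 95.0, "memory_pct": 90.0}

def over_threshold(samples, thresholds=THRESHOLDS):
    """Return the metrics whose utilization exceeds the threshold,
    mimicking the notification rule described for SystemMonitorJob."""
    return sorted(m for m, v in samples.items()
                  if m in thresholds and v > thresholds[m])

def disk_pct_used(path="/"):
    # One real sample: percent of disk space in use under `path`.
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total

print(over_threshold({"disk_pct": 97.2, "cpu_pct": 40.0, "memory_pct": 91.5}))
```

Each metric that comes back from `over_threshold` would trigger one notification to the configured users.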
The following processes run in the Data Insight Workflow service.
Table: Workflow service jobs
Job | Description |
---|---|
WFStepExecutorJob | Processes actions for Enterprise Vault archiving, requests for permission remediation, and custom actions configured in Data Insight. |
WFStepExecutorJob_im | Processes workflows of type Entitlement Reviews, DLP Incident Remediation, and Ownership confirmation. It also sends email reminders containing links to the remediation portal to the custodians at a specified interval. |
UpdateConfigJob | Updates its schedules based on the configuration changes made on the Management Server. |
WFSpoolStatusJob | Reads the workflow data every minute and, if there are any new updates in the last minute, creates a status database with the new updates. |
FileTransferJob_WF | Transfers workflow status databases from the Self-Service Portal nodes to the Management Server. |
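WFSpoolStatusJob illustrates an incremental-spooling pattern: each run picks up only the updates that arrived since the previous run and writes them out as a new status batch. A sketch of that logic, with invented record fields and timestamps:

```python
# Sketch of the WFSpoolStatusJob pattern: pick up only the workflow
# updates newer than the last run, and advance the high-water mark.

def spool_new_updates(updates, last_run_ts):
    """Return (batch, new_last_run_ts); batch is empty when nothing changed."""
    batch = [u for u in updates if u["ts"] > last_run_ts]
    newest = max((u["ts"] for u in batch), default=last_run_ts)
    return batch, newest

updates = [
    {"ts": 100, "workflow": "EntitlementReview-7", "status": "in_progress"},
    {"ts": 160, "workflow": "DLPRemediation-3", "status": "done"},
]
batch, mark = spool_new_updates(updates, last_run_ts=120)
print(len(batch), mark)
```

Because the high-water mark only advances when a batch is produced, a run that finds nothing new creates no status database, matching the "if there are any new updates" condition in the table.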
The following processes run in the Data Insight Webserver service.
Table: Webserver service jobs
Job | Description |
---|---|
CustodianSummaryReportJob | Periodically runs the custodian summary report, which is used to determine the custodians assigned in Data Insight for various resources. The output produced by this report is used in DLP Incident Remediation, Entitlement Review, and Ownership Confirmation workflows. |
HealthAuditReportJob | Periodically creates a report summarizing the health of the entire deployment and stores the output. |
PolicyJob | Evaluates configured policies in the system and raises alerts. |
PurgeReportsJob | Deletes older report outputs. |
UpdateConfigJob | Updates configuration database on the worker nodes based on the configuration changes made on the Management Server. |
UserIndexJob_merge | Consolidates user activity and permission map from all indexers. |
UserIndexJob_split | Requests each Indexer for user activity and permission map. |
UserRiskMergeJob | Runs on the Management Server; its default schedule is 6:00 A.M. every day. The job combines data from all MSUs into a single risk score value for each user. |
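UserRiskMergeJob (together with RiskDossierJob in the Communication service table) describes a fan-in computation: each Indexer reports per-user counts, and the Management Server folds them into one score per user. The sketch below shows the merge step; the scoring formula, field names, and user names are invented for illustration and are not the product's actual algorithm:

```python
from collections import defaultdict

def merge_user_risk(per_indexer_reports):
    """Fold per-Indexer, per-user counts into one score per user (sketch)."""
    totals = defaultdict(lambda: {"accessible": 0, "sensitive": 0})
    for report in per_indexer_reports:
        for user, counts in report.items():
            totals[user]["accessible"] += counts["accessible"]
            totals[user]["sensitive"] += counts["sensitive"]
    # Toy score: fraction of accessible files that are sensitive.
    return {u: round(t["sensitive"] / max(t["accessible"], 1), 3)
            for u, t in totals.items()}

reports = [
    {"alice": {"accessible": 100, "sensitive": 10}},
    {"alice": {"accessible": 50, "sensitive": 20},
     "bob": {"accessible": 200, "sensitive": 2}},
]
print(merge_user_risk(reports))
```

The key property mirrored here is that a user's figures from different Indexers are summed before scoring, so the result is independent of how shares are distributed across Indexers.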
The following processes run in the Data Insight Classification service.
Table: Classification service jobs
Job | Description |
---|---|
ClassifyFetchJob | Runs every minute on the server that is assigned the role of a Classification Server; picks up requests for content fetching. Errors encountered during the fetch are logged. |
ClassifyFetchPauseJob | Runs once every minute on any node that acts as the Classification Server. Refreshes the pause or resume status of fetch jobs as per the duration configured for content fetching. |
CancelClassifyRequestJob | Runs every 20 seconds in the Communication Service and the Classification Service. Fetches the list of classification requests that are canceled and distributes the list among the Data Insight nodes. Before classifying files, all the classification jobs consult this list to identify the requests that are marked for cancellation; any newly submitted classification request that appears on the list is deleted. |
ClassifyJob | Runs once every minute on any node that acts as a Classification Server. Checks for classification batches to be processed. |
UpdateVICPolicyMapJob | Runs every 10 seconds on the Management Server. It ensures that the Data Insight configuration database is in sync with the Classification Policy Manager. |
UpdateConfigJob | Reconfigures jobs based on the configuration changes made on the Management Server. |
CreateFeaturesJob | Runs once every week on Sunday at 12:01 A.M. on the Indexer. Checks whether sufficient classified data is available for the supervised learning algorithm to create predictions (training sets). The job has a multi-threaded execution framework that executes actions in parallel. The default thread count is 2. You can set the value using the matrix.classification.sl.features.threads property at the global or node level. Note: The node-level property always takes precedence over the global-level property. |
PredictJob | Runs once every week on Sunday at 5:00 A.M. on the Indexer. Copies the prediction files from the temporary output directory to the classification outbox. |
SLCreateBatchesJob | Runs every 2 hours on the Indexer. The job creates batches of files for the consumption of Veritas Information Classifier. These files are classified with high priority. |
ClassifyManageWorkloadJob | Runs every minute on the server that is assigned the role of a Classification Server; it is enabled only on the master Classification Server. Checks the classification workload folder on the master Classification Server and counts batches based on their priority. If the workload needs to be distributed, the job fetches the list of servers in its pool and the number of batches of each priority, and distributes the batches accordingly. |
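The ClassifyManageWorkloadJob row describes priority-aware load balancing: the master Classification Server counts pending batches by priority and spreads them across the servers in its pool. The sketch below shows one simple policy consistent with that description (higher-priority batches assigned first, round-robin across the pool); the actual distribution algorithm is not documented here, and all names are illustrative.

```python
from collections import deque

def distribute(batches, servers):
    """Assign batches to servers, highest priority first, round-robin.
    batches: list of (priority, batch_id); a lower number means higher priority."""
    queue = deque(sorted(batches, key=lambda b: b[0]))
    assignment = {s: [] for s in servers}
    i = 0
    while queue:
        _, batch_id = queue.popleft()
        assignment[servers[i % len(servers)]].append(batch_id)
        i += 1
    return assignment

# Hypothetical pending workload on the master Classification Server.
batches = [(2, "b1"), (1, "b2"), (1, "b3"), (3, "b4")]
print(distribute(batches, ["cls1", "cls2"]))
```

Round-robin over a priority-sorted queue keeps the per-server counts balanced while ensuring no server receives only low-priority work.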