Troubleshooting Veritas NetBackup KM

This section describes the basic troubleshooting steps to follow before contacting Customer Support and lists the most common issues.

First Troubleshooting Steps

If you encounter a problem while installing or running Veritas NetBackup KM:

  • Look for error messages in the PATROL Console System Output Window (SOW) or in the log file NBU_<port>.log. Most error messages are self-explanatory.
  • Run the KM Status Report by selecting the menu KM Status from the server instance or the NetBackup Setup icon. This report lists most KM problems.
  • For the most severe problems, look for PEM events. They include Expert Advice, which provides details about the problem and suggestions for resolving it.

Most Common Issues

Install/Upgrade Issues

KM Behavior is Unchanged After Upgrade

Check the version of the KM from the main Infobox. If it has not changed, the installation is not complete. Make sure the KM has been correctly uninstalled and reinstalled on both the PATROL Console and the PATROL Agent during the KM upgrade.

NetBackup Icon Missing After Loading

  • Check that the KM is loaded using NBU_LOAD.kml and that NBU_MAIN is loaded.
  • Check that there is no KM version mismatch between the PATROL Console and the PATROL Agent. Check the messages in the SOW to verify this.
  • Check whether the PATROL Agent tuning variable /AgentSetup/AgentTuning/pslInstructionMax has been increased as suggested in the Changes Required section. Check the messages in the SOW.
  • Check whether the PATROL Agent user has the necessary privileges in the Agent’s Access Control List (/AgentSetup/accessControlList) to read and write to the Agent Configuration Database.

Unable to Find NBU_LOAD.kml

  • Check that the Load KM browser is looking for *.kml files in the PATROL_HOME/lib/knowledge folder.
  • Check that the Veritas NetBackup KM files have been installed correctly under the PATROL installation directory on the PATROL Console.

Display/Refresh Issues

KM Application Instances Do Not Appear

  • Check that the KM instance limits have not been exceeded. Look for error messages in the SOW, and increase the instance limits for affected objects using the menu Configuration > Instance Limits….
  • Check whether the server icon is in offline state. None of the data collectors will be executed until the server is enabled and online.
  • If the KM is configured for multi-node mode monitoring, some components are not monitored on the passive cluster node.

KM Configuration Menus are Disabled

Veritas NetBackup KM can be configured either from a BMC PATROL Console (Classic Mode) or from BMC TrueSight Operations Management. When the KM is installed on a PATROL Agent that is managed by Central Monitoring Administration (CMA), all the KM configuration menus are disabled in the PATROL Console. To configure Veritas NetBackup KM from a PATROL Console, you need to force the KM to run in Classic Mode.

When set to run in Classic Mode, Veritas NetBackup KM stops receiving configuration from CMA. Any monitoring set in CMA and used by the PATROL Agent is removed and replaced by the configuration made from the PATROL Console. Although policies created in CMA are not deleted, any configuration set in Central Monitoring Administration will be ignored.

To force the KM to run in Classic Mode:

  1. In the PATROL Console, right-click the Veritas NetBackup icon > KM commands > Configuration > Force Classic Configuration Mode…

    Forcing the KM to run in Classic Mode

  2. Check Force the KM to run in Classic mode and click OK.

NetBackup KM will now start running in Classic Mode, enabling you to use the KM Configuration menus.

To configure the KM in TrueSight OM, follow the above procedure and uncheck Force the KM to run in Classic mode. All configurations made through the PATROL Console will then be ignored.

KM Objects Disappear from the Console

  • Check whether there is a KM version mismatch between the PATROL Console and the PATROL Agent, possibly after an improper upgrade of the KM. Check the messages in the SOW to verify this.
  • Check that the Veritas NetBackup KM login details are still valid. Has the password changed on the system? Look for error messages in the SOW, and check for additional information in the last annotation point for parameter NBULoginStatus.

Old Active Jobs are not Removed

By default, all active jobs are monitored and are exempted from ageing. To change this behavior, use Including Active Jobs in the menu Configuration > Jobs from the Jobs container.

Old Acknowledged Jobs Kept in pconfig

By default, the KM stores all acknowledged jobs. Run the following PSL through the PATROL Console to keep only the last <number> jobs on <node-id>:

%PSL pconfig("REPLACE", "/Runtime/NBU/<node-id>/NBU_JOB/jobAcknowledgementCapacity", <number>);

Replace <node-id> with the appropriate node ID of the NetBackup server and restart the PATROL Agent. Leave the <node-id> empty for localhost monitoring.

Performance Issues

CPU and Memory Usage is too High

CPU and memory usage depend on the size and complexity of your environment and on your Veritas NetBackup KM configuration. As you increase the data collection frequency or the number of servers and components monitored by the KM, CPU and memory usage will increase.

When monitoring a standard local installation of Veritas NetBackup using Veritas NetBackup KM, the PATROL Agent will consume between 5 MB and 10 MB of additional system memory. An enterprise installation of Veritas NetBackup Master Server with multiple Media Servers, Clients, Storage Servers and Storage Units can consume more memory (as with other KMs used by the PATROL Agent). The memory usage of Veritas NetBackup KM can be reduced by:

  • disabling the monitoring of unnecessary component instances
  • disabling unwanted components by setting their instance limits to 0 (zero)
  • disabling unwanted collectors by using the PATROL Configuration Manager
  • increasing the collector scheduling interval by using the PATROL Configuration Manager
  • decreasing the instance limits to limit the number of instances created by the collectors

The data collectors in Veritas NetBackup KM use the Veritas NetBackup command line interface to obtain Veritas NetBackup information. Most of the performance degradation is associated with these command executions and the amount of data they return. Regular housekeeping on all Veritas NetBackup systems may improve overall performance.

Network Traffic

When monitoring a NetBackup server through a local PATROL Agent, Veritas NetBackup KM generates minimal network traffic. Most of the data is kept on the managed node. The amount of network traffic that it generates depends on the number of PATROL Consoles that are connected to the PATROL Agent and the frequency of data collection.

When monitoring remote NetBackup servers, some network traffic will be observed as command results are transferred over the network. The traffic depends on the amount of data polled during each command execution. When commands are expected to return large output, the KM is designed to use file transfers through SFTP (on UNIX/Linux) and Windows file shares (on Windows).

Parameters and Application Classes Refresh Takes too Long

Data collectors run according to their scheduling interval (polling cycle) defined in the KM. These intervals are defined for a standard environment with minimal resource impact. These intervals can be customized from the PATROL Developer Console or PCM to suit your environment. Refer to the PATROL Console User Guide for more details.

Poor Performance of the Master/Media Server

The performance of the Master/Media Server may change after installing Veritas NetBackup KM on a heavily used system. Depending on the complexity of your Veritas NetBackup environment, the Veritas NetBackup KM may consume more resources to interrogate the application and process the data. In such a complex environment, the Veritas NetBackup KM may require some fine tuning to optimize the available resources. Consider the following options:

  • Disable the monitoring of unnecessary application instances. Refer to the section Filtering Elements to Monitor for more details.

  • Increase the scheduling interval (polling cycle) for data collectors.

  • Deactivate the monitoring of non-critical components by setting the Instance Limits to 0 (zero).

  • Deactivate unnecessary data collectors during selected time intervals when there is no Veritas NetBackup activity. For example, if job monitoring can be disabled between 9 am and 4 pm every day except weekends, disable the job data collector (NBUJobCollector) during this period by running the following PSL through the PATROL Console:

    %PSL pconfig("REPLACE","/Runtime/NBU/<node-id>/NBUJobCollectorMode", "0|32400|57600|0|0|||||||||||0|0");

    Here the pconfig variable is named <collector name>Mode. Replace <node-id> with the appropriate node ID of the NetBackup server. Leave the <node-id> empty for localhost monitoring. The value contains the following fields, delimited by a pipe (|): a flag that enables (1) or disables (0) data collection, the default start and end times in seconds since midnight, and the start and end times for non-default days, from Sunday through Saturday. Restart the PATROL Agent after the change.

  • The JOB_TEXT command, which sets the display-only text parameter NBUJobText, can be disabled to improve performance using the PSL below. Replace <node-id> with the appropriate node ID of the NetBackup server and restart the PATROL Agent. Leave the <node-id> empty for localhost monitoring.

    %PSL pconfig("REPLACE", "/Runtime/NBU/<node-id>/NBU_JOB/jobCollectText", 0);

  • As part of collection, the collector compares each job against the previous similar backup to calculate the progress data. In addition, the last backup information is shared under NBU_POLICY parameters (NBUPolicy*Backup*) to monitor success at the policy level. This functionality can also be disabled to speed up the collector, using the PSL below. Replace <node-id> with the appropriate node ID of the NetBackup server and restart the PATROL Agent. Leave the <node-id> empty for localhost monitoring.

    %PSL pconfig("REPLACE", "/Runtime/NBU/<node-id>/NBU_JOB/jobCollectLastBackupStats", 0);

  • Similarly, the policy collector (NBUPolicyCollector) collects policy schedule data to determine the next backup details. This is used to monitor the successful backup start-up as per the schedule through NBUPolicyState (3 = Not Started). This functionality can also be disabled to speed up the collector, using the PSL below. Replace <node-id> with the appropriate node ID of the NetBackup server and restart the PATROL Agent. Leave the <node-id> empty for localhost monitoring.

    %PSL pconfig("REPLACE", "/Runtime/NBU/<node-id>/NBU_POLICY/policyCollectNextBackupInfo", 0);

  • Defining a “no command execution window” for all collectors will pause running commands at peak times or during NetBackup maintenance windows. This can be set using the PSL below. Replace <node-id> with the appropriate node ID of the NetBackup server and restart the PATROL Agent. Leave the <node-id> empty for localhost monitoring.

    %PSL pconfig("REPLACE", "/Runtime/NBU/<node-id>/noExecuteWindow", "23:59:00|120");

    The value of this configuration variable has the format <start time in HH:MM:SS 24-hour clock>|<duration in seconds>. The above 23:59:00|120 sets all collectors to sleep between 23:59:00 and 00:01:00 (2 minutes) every day before executing commands. The noExecuteWindow variable also supports multiple time windows:

    %PSL pconfig("REPLACE", "/Runtime/NBU/rt-netbck-vtl/noExecuteWindow",["23:59:00|120","11:59:00|120"]);

  • Purge unnecessary information in Veritas NetBackup catalog databases and log files.

  • If there are too many clients configured in Veritas NetBackup, the NBUClientCollector and NBUPolicyCollector may affect the overall performance. In such an environment, disable the NBUClientCollector, or set the corresponding instance limits to 0 (zero), using the menu Configuration > Instance Limits.

  • Refer to the Infinite Loop Errors section below for a possible PATROL internal scheduling delay which may impact the performance of the KM.
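The two time-based values used in the list above (the <collector name>Mode value and the noExecuteWindow value) are easy to mistype by hand. The following Python sketch is not part of the KM; it only builds the strings to paste into the pconfig calls. The function names are hypothetical, and the 17-field layout (flag, default start/end, then one start/end pair per day from Sunday to Saturday, empty when a day uses the default) is an interpretation of the example value shown above.

```python
DAYS = ("sun", "mon", "tue", "wed", "thu", "fri", "sat")  # Sunday first, as in the KM description


def to_seconds(clock):
    """Convert 'HH:MM' or 'HH:MM:SS' (24-hour clock) to seconds since midnight."""
    parts = [int(p) for p in clock.split(":")]
    while len(parts) < 3:
        parts.append(0)
    hours, minutes, seconds = parts
    return hours * 3600 + minutes * 60 + seconds


def collector_mode(enabled, start, end, overrides=None):
    """Build a pipe-delimited <collector name>Mode value (assumed 17 fields).

    overrides maps a day name from DAYS to a (start_seconds, end_seconds)
    pair; days without an entry are left empty.
    """
    overrides = overrides or {}
    fields = [str(int(enabled)), str(to_seconds(start)), str(to_seconds(end))]
    for day in DAYS:
        if day in overrides:
            day_start, day_end = overrides[day]
            fields += [str(day_start), str(day_end)]
        else:
            fields += ["", ""]
    return "|".join(fields)


def no_execute_window(start, end):
    """Build '<HH:MM:SS>|<duration in seconds>' for a window that may cross midnight."""
    begin = to_seconds(start)
    duration = (to_seconds(end) - begin) % 86400
    return f"{begin // 3600:02d}:{begin % 3600 // 60:02d}:{begin % 60:02d}|{duration}"


if __name__ == "__main__":
    # Disabled from 9 am to 4 pm by default, with 0/0 entries for Sunday and Saturday.
    print(collector_mode(False, "09:00", "16:00", {"sun": (0, 0), "sat": (0, 0)}))
    # A two-minute window starting one minute before midnight.
    print(no_execute_window("23:59:00", "00:01:00"))
```

The printed strings are only the value part; they still have to be set with the pconfig commands shown above, followed by a PATROL Agent restart.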

Others

Infinite Loop Errors

If error messages in the SOW report that some Veritas NetBackup KM data collectors may be in an infinite loop, check the setting of the tuning variable /AgentSetup/AgentTuning/pslInstructionMax.

PATROL Agent uses the pre-configured tuning variable (/AgentSetup/AgentTuning/pslInstructionMax) to stop running PSL functions in an infinite loop. When a PSL function reaches this maximum threshold, it reports this error, and puts the execution of this function to the back of the process queue. This will not only delay the data collector, it will also impact the performance of the system.

To resolve this situation, the maximum number of instructions should be increased to an optimum value, which depends on the complexity of your environment. In a standard Veritas NetBackup environment, the default value of 500,000 should be increased to at least 5,000,000 to enable the Veritas NetBackup KM data collectors to execute without impacting your system.

If this still does not resolve the problem, you can disable this functionality by setting the value of the tuning variable to 0 (zero).

Modifying the Job Instance Label

By default, job instances are labeled as:

<policy>:<policy client> @ <date/time>

This label can be changed to include the job ID, using one of the following formats:

  • Format 1: <job id>
  • Format 2: <job id>: <policy client> @ <date/time>
  • Format 3: <job id>: <policy>:<policy client> @ <date/time>

Use the following PSL through PATROL Console to modify the job instance label:

%PSL pconfig("REPLACE","/Runtime/NBU/<node-id>/labelByJobID", <format-number>);

Where:

  • <node-id> is the appropriate node ID of the NetBackup server
  • <format-number> is either 1, 2, or 3 as described above.

Use the menu Force Full Collection from the Jobs container instance to recreate the job instances.
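To preview what each label format looks like before changing labelByJobID, the formats listed above can be mimicked with a short Python sketch. This is an illustration only: the function name, the sample job values, and the use of 0 to stand for the default label are assumptions, and the date/time string is passed in as-is because the KM's exact date format is not specified here.

```python
# Label templates copied from the formats described above.
# Key 0 is a hypothetical stand-in for the default label.
LABEL_TEMPLATES = {
    0: "{policy}:{client} @ {when}",
    1: "{job_id}",
    2: "{job_id}: {client} @ {when}",
    3: "{job_id}: {policy}:{client} @ {when}",
}


def job_label(format_number, job_id, policy, client, when):
    """Return the job instance label for the given format number."""
    template = LABEL_TEMPLATES[format_number]
    return template.format(job_id=job_id, policy=policy, client=client, when=when)


if __name__ == "__main__":
    # Sample values are made up for illustration.
    for number in sorted(LABEL_TEMPLATES):
        print(number, job_label(number, 1042, "Daily_FS", "client01", "2024-01-15 22:00"))
```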

NBULoginStatus in Suspicious (Warning) Status

This parameter will show a “suspicious” state if any command executed by Veritas NetBackup KM fails.

  • Check the annotation point on the first state change data point of this parameter to look for failing commands. If an annotation point cannot be found, or if it is not up to date, check the KM Status Report, which can be viewed by selecting the menu KM Status from the server icon. These errors are produced by the Veritas NetBackup commands executed by the Veritas NetBackup KM.
  • Check that the operating system user configured in the menu Configuration > Login can execute all NetBackup commands and access the NetBackup files.