Fatal agent error: Target Interaction Manager failed at Startup

Symptoms
========

 $ /AGENT_INST/bin/emctl start agent

Oracle Enterprise Manager Cloud Control 12c Release 3
Copyright (c) 1996, 2013 Oracle Corporation.  All rights reserved.
Starting agent ......................................................................... failed.
Fatal agent error: Target Interaction Manager failed at Startup
Fatal agent error: Target Interaction Manager failed at Startup
Fatal agent error: Target Interaction Manager failed at Startup
EMAgent is Thrashing. Exiting watchdog

/AGENT_INST/sysman/log/emagent.nohup reports the following error:

2013-11-01 17:29:38,596 [1:main] WARN - Missing filename for log handler 'opsscfg'
Agent is going down due to an OutOfMemoryError

/AGENT_INST/sysman/log/gcagent.log reports the following error:

2013-11-01 17:29:19,204 [1:3305B9] INFO - Invoking STARTUP_P1 (1) on Target Interaction Manager
2013-11-01 17:29:23,255 [1:main] FATAL - Fatal error: Target Interaction Manager failed at Startup
java.lang.OutOfMemoryError: Java heap space
        at java.util.Arrays.copyOf(Arrays.java:2882)


OR

$ /AGENT_INST/bin/emctl start agent

fails with the following error:

Oracle Enterprise Manager Cloud Control 12c Release 2
Copyright (c) 1996, 2012 Oracle Corporation.  All rights reserved.
Starting agent ........ failed.
Ping Manager failed at Startup
Consult emctl.log and emagent.nohup in: /data/oracle/agent/product/agent12c/agent_inst/sysman/log

/AGENT_INST/sysman/log/gcagent.log reports a file I/O exception:

2014-11-07 21:22:44,286 [1930450:GC.Executor.778231 (oracle_database:sgpssp01_prd:log_full) (oracle_database:sgpssp01_prd:log_full:dumpFull)] ERROR - File not found exception
java.io.FileNotFoundException: /data/oracle/agent/product/agent12c/agent_inst/sysman/emd/state/statemgmt/oracle_database/sgpssp01_prd/SevDiff/1.tmp (Invalid file system control data detected.)

Deleting a file under the agent instance directory also fails with a file system error:

$ rm /AGENT_INST/sysman/emd/agntstmp.txt

rm: agntstmp.txt not removed.
Invalid file system control data detected
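
To see quickly which of the two symptoms applies, the agent logs can be searched for the two error signatures quoted above. This is a minimal sketch only: the function name scan_agent_logs is hypothetical, and the log directory (the agent's sysman/log directory) has to be passed in explicitly.

```shell
# Hypothetical helper: search the agent logs for the two error signatures
# quoted in this note. Pass the agent's sysman/log directory as $1.
scan_agent_logs() {
    log_dir=$1
    # -s keeps grep quiet about missing files; with two file operands
    # grep prefixes every match with the name of the file it came from.
    grep -s -e 'OutOfMemoryError' \
            -e 'Invalid file system control data detected' \
        "$log_dir/gcagent.log" "$log_dir/emagent.nohup"
}
```

For example, scan_agent_logs /AGENT_INST/sysman/log prints each matching line with its file name. No output means neither signature was found in those two files.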


Cause
=====

Bug 17329098 12C AGENT STARTUP FAILS WITH, FATAL AGENT ERROR: STATE MANAGER FAILED AT STARTUP
It is highly likely that some target threshold declares too many keys, so the agent exhausts its Java heap at startup. The user can either increase the amount of heap the agent can use or fix the target.
Bug 17389753 AGENT WILL NOT START - JAVAHEAP OUT OF MEMORY
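
Before increasing the heap it helps to know the current limit. The sketch below assumes that the agent's Java options are kept on an agentJavaDefines line in the agent's emd.properties file. Both the property name and the file location are assumptions here and should be confirmed against Doc ID 1952593.1 for your release. The function name show_agent_heap is hypothetical, and it only reads the file.

```shell
# Hypothetical helper: print the -Xmx option found on the agentJavaDefines
# line of the properties file given as $1.
# ASSUMPTION: the heap limit is set via agentJavaDefines; verify this for
# your agent release before relying on it.
show_agent_heap() {
    props_file=$1
    grep '^agentJavaDefines=' "$props_file" |
        tr ' =' '\n\n' |    # put the property name and each option on its own line
        grep '^-Xmx'        # keep only the maximum heap option
}
```

If the function prints nothing, the file does not set -Xmx on that line.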


Solution
========

1. Stop agent

$ /AGENT_INST/bin/emctl stop agent

If the agent does not shut down gracefully, kill the agent background processes. First grep for the java and perl processes and pick out only those that belong to the agent:

$ ps -ef | grep java | grep '<agent_home>'
$ ps -ef | grep perl
$ kill -9 <Process id>
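
The two grep commands above can be combined into one filter, so that only java and perl processes whose command line mentions the agent home are listed. This is a sketch under the assumption that the agent's processes carry the agent home path in their command line. The function name agent_pids is hypothetical.

```shell
# Hypothetical helper: read `ps -ef` output on stdin and print the PID
# (second column) of every java or perl process whose command line
# contains the agent home passed as $1.
agent_pids() {
    agent_home=$1
    awk -v home="$agent_home" '
        # Skip the awk process itself: its own arguments contain the path.
        index($0, home) && ($0 ~ /java/ || $0 ~ /perl/) && $0 !~ /awk/ { print $2 }
    '
}
```

For example, ps -ef | agent_pids /u02/agent_inst prints one PID per line. Review the list before passing any of the PIDs to kill.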

2. Move the old state files from /AGENT_INST/sysman/emd/state/* to a new directory

Example:

$ mv /u02/agent_inst/sysman/emd/state/* /u02/tmp/
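
To keep the moved files apart from anything else in the target directory, the move can go into a fresh, timestamped directory. This is a minimal sketch: the function name backup_agent_state and the agent_state_<timestamp> naming are hypothetical, and both paths are supplied by the caller.

```shell
# Hypothetical helper: move everything under <agent_inst>/sysman/emd/state
# into a new timestamped directory below $2 and print that directory.
#   $1 = agent instance home, $2 = parent directory for the backup
backup_agent_state() {
    state_dir=$1/sysman/emd/state
    backup_dir=$2/agent_state_$(date +%Y%m%d_%H%M%S)
    [ -d "$state_dir" ] || { echo "no such directory: $state_dir" >&2; return 1; }
    mkdir -p "$backup_dir" || return 1
    # Moves the top-level entries; like the mv example, the glob skips dot-files.
    mv "$state_dir"/* "$backup_dir"/ || return 1
    echo "$backup_dir"
}
```

For example, backup_agent_state /u02/agent_inst /u02/tmp corresponds to the mv command above. Run it only while the agent is stopped.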

3. Run emctl clearstate agent

$ /AGENT_INST/bin/emctl clearstate agent
Note: This command should be run only when Oracle Support recommends it.
clearstate is an option of the EM Agent emctl utility. The agent maintains state information about the severities and statuses of the components that it monitors. To prevent unnecessary network traffic, the agent does not resend this information to the Management Service (OMS) unless a change to a monitored target warrants an upload. The emctl clearstate command forces the agent to take a new reading of the severity and status of each component and resend this information to the OMS.

Ref : Internal Note 285375.1 : Understanding the 'emctl clearstate' Command in EM

4. For an I/O exception such as 'Invalid file system control data detected'

Verify that the file systems holding the agent home and /temp are mounted read-write (i.e. not read-only). Consult the system administrator.
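
A quick way to confirm that a directory accepts writes is to create and remove a small probe file in it. This is a sketch only: the function name fs_is_writable and the probe file name are hypothetical. A successful write does not rule out file system damage elsewhere, so it does not replace a check by the system administrator.

```shell
# Hypothetical helper: succeed if a file can be created and removed in
# the directory given as $1, otherwise report the failure and return 1.
fs_is_writable() {
    probe=$1/.emagent_write_probe.$$
    # The subshell lets 2>/dev/null also hide the shell's own redirection error.
    if ( echo probe > "$probe" ) 2>/dev/null && rm -f "$probe" 2>/dev/null; then
        echo "writable: $1"
    else
        echo "NOT writable: $1" >&2
        return 1
    fi
}
```

For example, run fs_is_writable /AGENT_INST/sysman/emd as the user that owns the agent.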

5. Restart agent

$ /AGENT_INST/bin/emctl start agent


For further investigation, collect the <AGENT_STATE_DIR>/sysman/emd folder before doing the cleanup.
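
One way to collect the folder is to archive it with tar before any files are moved or cleared. This is a minimal sketch: the function name collect_emd and the emd_<timestamp>.tar naming are hypothetical. The archive is left uncompressed to avoid depending on any particular tar compression option.

```shell
# Hypothetical helper: archive <agent_state_dir>/sysman/emd into a
# timestamped tar file below $2 and print the archive path.
#   $1 = agent state directory, $2 = output directory
collect_emd() {
    emd_parent=$1/sysman
    archive=$2/emd_$(date +%Y%m%d_%H%M%S).tar
    [ -d "$emd_parent/emd" ] || { echo "no such directory: $emd_parent/emd" >&2; return 1; }
    # -C changes into sysman first, so archive paths start with emd/
    tar -cf "$archive" -C "$emd_parent" emd || return 1
    echo "$archive"
}
```

The output directory should be on a different file system from the agent home if that file system is the one reporting errors.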


Ref:

EM 12c, EM 13c: Enterprise Manager Cloud Control emctl start agent Fails with OutOfMemoryError Message or: Fatal agent error: Target Interaction Manager failed at Startup (Doc ID 1952593.1)
