Problem Definition
IMA Service shuts down unexpectedly with Resource Manager Alerts after installing Service Pack 4 for MetaFrame XP.
Environment
• Windows 2000 Service Pack 4
• MetaFrame XP Enterprise Edition Feature Release 3 with Service Pack 4
Troubleshooting Methodology
The first thing we needed to do was find out why IMA was stopping.
1. We checked the Event Viewer for any events that indicated how the IMA Service had stopped, but found no errors. The IMA Service stopped as if it had been stopped manually, and it could be started again manually from the Services applet.
2. Next, we verified the integrity of the data store using CTX107800 – DSCHECK Version 5.9. DSCHECK did not find any errors in the data store.
The customer noted that the issue started happening after applying Service Pack 4 for MetaFrame.
We started monitoring the event logs and set up some CTX tracing. This is what we found:
[LMS_Subsystem, Error] LMS_Subsystem::DS_UpdateSrvLoad(), updating to DS with server load = 20000
[Admin Subsystem, Error] IMA_AdminTool::NotifySubsystemAgain
[IMA_License, Error] IMA_License Error: result = 0x80000007 (X:\nt\private\ima\subsystems\ss\license\groupcache.cpp:84)
[IMA_License, Error] IMA_License Error: result = 0x80000007 (X:\nt\private\ima\subsystems\ss\license\acquire.cpp:708)
[NDSDrvSS, Error] NDSDrvSS: NotifySubsystem
[NDSDrvSS, Error] NDSDrvSS::NotifySubsystem(IMASERVICE_STOP)
We also found this event in the event viewer:
Event Type: Error
Event Source: MetaframeEvents
Event Category: Citrix XML Service
Event ID: 1203
Date: 2/17/2005
Time: 12:18:47 AM
User: N/A
Computer: COMPUTERNAME
Description: The Citrix XML Service on this server requested a ticket for user username\domainname from server 142.242.5.19. Server 142.242.5.19 never responded and the request for the ticket timed out.
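When many of these events are logged, it can help to tally which server is failing to answer ticket requests. The following is a minimal Python sketch, not a Citrix tool: it assumes the event descriptions have already been exported as plain text and that they follow the wording of the event above. The function name `unresponsive_server` and the regular expression are illustrative.

```python
import re

# Matches the description wording of Event ID 1203 as quoted above.
# Assumption: the server is reported as a dotted IPv4 address.
TICKET_TIMEOUT = re.compile(
    r"requested a ticket for user (?P<user>\S+) "
    r"from server (?P<server>\d{1,3}(?:\.\d{1,3}){3})\."
)

def unresponsive_server(description):
    """Return (user, server) from a 1203 description, or None if it does not match."""
    match = TICKET_TIMEOUT.search(description)
    if match is None:
        return None
    return match.group("user"), match.group("server")

sample = (
    "The Citrix XML Service on this server requested a ticket for user "
    "username\\domainname from server 142.242.5.19. Server 142.242.5.19 never "
    "responded and the request for the ticket timed out."
)
print(unresponsive_server(sample))
```

Counting the returned server addresses across all exported events shows whether one server or several are timing out.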
It appeared that the servers were timing out on a ticket request and the IMA Service was subsequently shut down. This sounded similar to an issue resolved in Service Pack 4.
From the Service Pack 4 readme:
216. An unresponsive server (IMA, XML, or Termsrv) could cause a farm outage. This occurred when the unresponsive server was still capable of sending load level updates to the data collector. Because the server was not accepting connections, it remained as the least loaded server in the farm and all new connection attempts were routed to it.
When a least loaded server was unresponsive, the data collector continued to resolve connections to the server. The data collector sent all new connection attempts and/or ticket requests to that server even though that server would not respond to these requests.
If the server does not respond to a ticket request (times out), the data collector is notified to update the server's load to maximum. This prevents any new ticket request to that server. The server stops its IMA Service.
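The behavior described in the last readme paragraph can be illustrated with a toy model. This is a sketch for reasoning about the symptom only, not Citrix code: the class `DataCollector`, the server names, and the `responds` callback are hypothetical, and treating 20000 (the value seen in the LMS_Subsystem trace line) as the "maximum" load is an assumption.

```python
MAX_LOAD = 20000  # value from the LMS_Subsystem trace; assumed to mean "maximum load"

class DataCollector:
    """Toy model of least-loaded routing with the Service Pack 4 timeout handling."""

    def __init__(self, loads):
        self.loads = dict(loads)   # server name -> last reported load
        self.stopped = set()       # servers whose IMA Service has been stopped

    def least_loaded(self):
        return min(self.loads, key=self.loads.get)

    def request_ticket(self, responds):
        """Route a ticket request to the least loaded server.

        `responds` is a callable returning False when the server times out.
        On a timeout the server's load is set to maximum and its IMA Service
        is recorded as stopped, as the readme describes.
        """
        server = self.least_loaded()
        if responds(server):
            return server
        self.loads[server] = MAX_LOAD
        self.stopped.add(server)
        return None

dc = DataCollector({"SERVER-A": 100, "SERVER-B": 3000})
responds = lambda server: server != "SERVER-A"  # SERVER-A never answers

first = dc.request_ticket(responds)   # routed to SERVER-A, times out
second = dc.request_ticket(responds)  # SERVER-A is now at maximum load
print(first, second, sorted(dc.stopped))
```

In this model a single timed-out ticket request is enough to take a server out of rotation, which matches the symptom seen here: a slow but otherwise healthy server has its IMA Service stopped.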
The next troubleshooting step was to disable ticketing and see whether the issue still occurred.
To disable ticketing we did the following:
In every template.ica file on every Web Interface server that contains the following entry, do the following:
Remove:
AutologonAllowed=ON
[NFuse_Ticket]
Add in its place:
[NFuse_IFSESSIONFIELD sessionfield="NFUSE_ENCRYPTIONLEVEL" value="basic"]
Username=[NFuse_User]
Domain=[NFuse_Domain]
Password=[NFuse_PasswordScrambled]
[/NFuse_IFSESSIONFIELD]
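If there are many template.ica files to change, the substitution can be scripted. The following Python sketch is an illustration under stated assumptions rather than a supported procedure: it assumes the two entries to remove sit on consecutive lines, the function name `disable_ticketing` is hypothetical, and the sample text is made up for the example. Back up each file before rewriting it.

```python
OLD_LINES = ["AutologonAllowed=ON", "[NFuse_Ticket]"]
NEW_LINES = [
    '[NFuse_IFSESSIONFIELD sessionfield="NFUSE_ENCRYPTIONLEVEL" value="basic"]',
    "Username=[NFuse_User]",
    "Domain=[NFuse_Domain]",
    "Password=[NFuse_PasswordScrambled]",
    "[/NFuse_IFSESSIONFIELD]",
]

def disable_ticketing(template_text):
    """Replace the ticketing entries with the explicit credential block.

    Text that does not contain the two entries on consecutive lines is
    returned with its lines unchanged. Line endings are normalized to "\n".
    """
    lines = template_text.splitlines()
    out = []
    i = 0
    while i < len(lines):
        if [line.strip() for line in lines[i:i + 2]] == OLD_LINES:
            out.extend(NEW_LINES)
            i += 2
        else:
            out.append(lines[i])
            i += 1
    return "\n".join(out) + "\n"

# Hypothetical fragment of a template.ica file, for illustration only.
sample = "SomeSetting=1\nAutologonAllowed=ON\n[NFuse_Ticket]\nOtherSetting=2\n"
print(disable_ticketing(sample))
```

Because the function works on text rather than on files, the result can be reviewed before anything is written back to the Web Interface server.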
We discovered that after disabling ticketing, the IMA Service was no longer being shut down.
Resolution
To resolve this issue, increase the timeout value that the XML Broker uses when it issues a ticket.
To set the ticket timeout, you need to modify the registry:
Caution! Using Registry Editor incorrectly can cause serious problems that may necessitate reinstalling your operating system. Citrix cannot guarantee that problems resulting from the incorrect use of Registry Editor can be solved. Use Registry Editor at your own risk. Make sure you back up the registry before you edit it.
HKEY_LOCAL_MACHINE\SYSTEM\Services\CtxHttp
Name: SocketTimeout
Type: REG_DWORD
Data: Time-out number value in milliseconds
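One way to apply the same value consistently across servers is to generate a .reg file and import it. The Python sketch below only builds the file text. The key path is passed in as written in this article, the function name `socket_timeout_reg` is hypothetical, and the 60000 millisecond value in the usage line is purely an example, since the article does not prescribe a specific number.

```python
def socket_timeout_reg(timeout_ms, key=r"HKEY_LOCAL_MACHINE\SYSTEM\Services\CtxHttp"):
    """Return .reg file text that sets SocketTimeout (REG_DWORD) under `key`.

    A REG_DWORD is written in a .reg file as "dword:" followed by eight
    hexadecimal digits, so the millisecond value is converted to hex here.
    """
    if not 0 <= timeout_ms <= 0xFFFFFFFF:
        raise ValueError("timeout_ms must fit in an unsigned 32-bit value")
    return (
        "Windows Registry Editor Version 5.00\r\n"
        "\r\n"
        f"[{key}]\r\n"
        f'"SocketTimeout"=dword:{timeout_ms:08x}\r\n'
    )

# Example value only: 60000 ms = 60 seconds.
reg_text = socket_timeout_reg(60000)
print(reg_text)
```

Review the generated text and confirm the key path on a test server before importing it, and back up the registry first as noted in the caution above.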