Vsanobserver Sh Vsantraced Is Not Started Can T Start Vsanobserver
Services Failing on ESXi 5.5 Host - Can't Restart Services
- Thread starter KapsZ28
- #1
- Joined
- May 29, 2009
- Messages
- 2,114
When I run "./sbin/services.sh restart", below is what I am getting.
Running vmware-fdm stop
Stopping vmware-fdm:success
Running xorg stop
Running wsman stop
Stopping openwsmand
Openwsmand is not running.
Running sfcbd stop
This operation is not supported. Please use /etc/init.d/sfcbd-watchdog stop
Running snmpd stop
root: snmpd is not running.
Running sfcbd-watchdog stop
sh: bad number
sh: you need to specify whom to kill
pkill: failed to kill /sbin/sfcbd (8739627): No such process
pkill: Failure
pkill: failed to kill /sbin/sfcbd (8739627): No such process
pkill: Failure
pkill: failed to kill /sbin/sfcbd (8739627): No such process
pkill: Failure
Connect to localhost failed: Connection failure
Connect to localhost failed: Connection failure
Running vpxa stop
vpxa is not running
Connect to localhost failed: Connection failure
Running vobd stop
vobd is not running
Running lacp stop
LACP daemon is not running
Running memscrubd stop
memscrubd is not running
Running nscd stop
nscd is not running
Running smartd stop
smartd is not running
Running dcbd stop
dcbd is not running
Running cdp stop
cdp is not running
Running slpd stop
Stopping slpd
Running rhttpproxy stop
rhttpproxy is not running.
Running vsantraced stop
watchdog-vsantraced: PID file /var/run/vmware/watchdog-vsantraced.PID does not exist
watchdog-vsantraced: Unable to terminate watchdog: No running watchdog process for vsantraced
vsantraced is not running
Failed to clear vsantraced memory reservation
Running swapobjd stop
swapobjd is not running
Running vmfstraced stop
watchdog-vmfstracegd: PID file /var/run/vmware/watchdog-vmfstracegd.PID does not exist
watchdog-vmfstracegd: Unable to terminate watchdog: No running watchdog process for vmfstracegd
vmfstracegd is not running
Failed to clear vmfstracegd memory reservation
Running sensord stop
sensord is not running
Running lbtd stop
net-lbt is not running
Running hostd stop
hostd is not running.
Running storageRM stop
storageRM is not running
Running sdrsInjector stop
sdrsInjector is not running
Running DCUI stop
Disabling DCUI logins
VobUserLib_Init failed with -1
Running SSH stop
SSH login disabled
VobUserLib_Init failed with -1
Connect to localhost failed: Connection failure
Errors:
Invalid operation requested: This ruleset is required and connot be disabled
Running ntpd stop
Stopping ntpd
watchdog-ntpd: Terminating watchdog process with PID 13010437
Connect to localhost failed: Connection failure
Running iomemory-vsl stop
Running iomemory-vsl restart
Running ntpd restart
Connect to localhost failed: Connection failure
Starting ntpd
Running SSH restart
Connect to localhost failed: Connection failure
SSH login enabled
VobUserLib_Init failed with -1
Running DCUI restart
Enabling DCUI login: runlevel =
VobUserLib_Init failed with -1
Running sdrsInjector restart
sdrsInjector started
Running storageRM restart
storageRM started
Running hostd restart
Ramdisk 'hostd' with estimated size of 1053MB already exists
hostd started.
Running lbtd restart
net-lbt started
Running sensord restart
sensord started
Running vmfstraced restart
VMFS Global Tracing is not enabled.
Running swapobjd restart
swapobjd started
Running vsantraced restart
Storing traces to /scratch/vsantraces
mkdir: can't create directory '/scratch/vsantraces': Connection timed out
Failed to mkdir: /scratch/vsantraces
Running rhttpproxy restart
rhttpproxy started.
Running slpd restart
Starting slpd
Running cdp restart
cdp started
Running dcbd restart
dcbd started
Running smartd restart
smartd started
Running nscd restart
nscd started
Running memscrubd restart
The checkPages boot option is FALSE, hence memscrubd could not be started.
Running lacp restart
LACP daemon started
Running vobd restart
vobd started
Running vpxa restart
Connect to localhost failed: Connection failure
Running sfcbd-watchdog restart
Connect to localhost failed: Connection failure
Connect to localhost failed: Connection failure
sfcbd is running.
Running snmpd restart
root: snmpd opening firewall port(s) for notifications.
Running sfcbd restart
This operation is not supported. Please use /etc/init.d/sfcbd-watchdog start
Running wsman restart
Starting openwsmand
Running xorg restart
Running vmware-fdm restart
Starting vmware-fdm:success
- #2
- Joined
- Oct 11, 2001
- Messages
- 31,956
Looks normalish to me. Hostd and vpxa both started. Go see what's in /var/log/vmkernel.log, /var/log/hostd.log, and /var/log/vpxa.log
See if they're throwing errors (especially vmkernel).
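For anyone following along, here is a rough sketch of narrowing one of those logs down to lines that look like trouble. The helper name `show_log_errors` and the list of match words are assumptions made up for illustration, not anything that ships with ESXi; widen or narrow the pattern as needed.

```shell
# Hypothetical helper: print the last N lines of a log file that look like errors.
# Usage: show_log_errors /var/log/vmkernel.log 20
show_log_errors() {
    log_file="$1"
    max_lines="${2:-20}"   # default to the last 20 matches
    # Case-insensitive match on common trouble words; the word list is a guess.
    grep -iE 'warning|failed|error|timed out' "$log_file" | tail -n "$max_lines"
}
```

Running it against /var/log/vmkernel.log first matches the advice above, since storage problems tend to surface there before hostd or vpxa complain.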
- #3
- Joined
- May 29, 2009
- Messages
- 2,114
Looks normalish to me. Hostd and vpxa both started. Go see what's in /var/log/vmkernel.log, /var/log/hostd.log, and /var/log/vpxa.log
See if they're throwing errors (especially vmkernel).
Ha, I used the word "stableish" this morning to describe the server.
The services restart took much longer than normal. It kept getting stuck on some processes, and some show as failed, although I never fully looked at the list before since it normally goes through pretty quickly. It was also getting stuck at "Running usbarbitrator restart", so I had to run "chkconfig usbarbitrator off" just to get it to complete. I will check the logs shortly.
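Since the restart output is long and easy to skim past, a small sketch for pulling out only the failure-looking messages, each tagged with the service that was being handled at the time. It assumes the output was saved to a file with one message per line (for example by redirecting the restart command to a file), and the name `services_failures` and the match words are hypothetical choices for illustration.

```shell
# Hypothetical helper: read saved services.sh output and print
# "<service> <action>: <message>" for every line that looks like a failure.
# Usage: services_failures /tmp/services-restart.log
services_failures() {
    awk '
        # Remember which service/action the following messages belong to.
        /^Running [^ ]+ (stop|start|restart)$/ { current = $2 " " $3; next }
        # The match words are a guess at what counts as a failure.
        tolower($0) ~ /fail|can.t |timed out/ { print current ": " $0 }
    ' "$1"
}
```

On the output posted in this thread, that would surface entries such as the vsantraced mkdir timeout on /scratch and the repeated "Connect to localhost failed" messages without the "started" noise.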
- #4
- Joined
- Oct 11, 2001
- Messages
- 31,956
- #5
- Joined
- May 29, 2009
- Messages
- 2,114
If I go to the Windows server, browse to C:\ProgramData\VMware\VMware Syslog Collector\Data\10.20.10.50, and open the syslog file, there is a ton of information, but I don't know what exactly I should be looking for.
- #6
- Joined
- Oct 11, 2001
- Messages
- 31,956
- #7
- Joined
- May 29, 2009
- Messages
- 2,114
<182>2014-08-12T15:54:49.908Z vps24.corp.domain.net vmkernel: cpu39:32860)ScsiDeviceIO: 2337: Cmd(0x41300b6bba40) 0x1a, CmdSN 0x26977 from world 0 to dev "naa.600605b0070ec9c01b0613ea1266322e" failed H:0x5 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.
<180>2014-08-12T15:54:49.908Z vps24.corp.domain.net vmkwarning: cpu26:13028837)WARNING: ScsiDeviceIO: 7793: READ CAPACITY on device "naa.600605b0070ec9c01b0613ea1266322e" from Plugin "NMP" failed. I/O error
<182>2014-08-12T15:54:49.908Z vps24.corp.domain.net vmkernel: cpu26:13028837)WARNING: ScsiDeviceIO: 7793: READ CAPACITY on device "naa.600605b0070ec9c01b0613ea1266322e" from Plugin "NMP" failed. I/O error
<182>2014-08-12T15:54:49.908Z vps24.corp.domain.net vmkernel: cpu26:13028837)Vol3: 2174: Could not open device 'naa.600605b0070ec9c01b0613ea1266322e:5' for probing: I/O error
<182>2014-08-12T15:54:49.908Z vps24.corp.domain.net vmkernel: cpu35:9109179)ScsiDeviceIO: 2324: Cmd(0x41300b6bba40) 0x9e, CmdSN 0x26978 from world 0 to dev "naa.600605b0070ec9c01b0613ea1266322e" failed H:0x5 D:0x0 P:0x0 Possible sense data: 0x0 0x0 0x0.
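The `H:`, `D:` and `P:` fields in the ScsiDeviceIO lines are the host, device and plugin status codes for the failed command. A minimal sketch of decoding them follows, assuming one log entry per line on stdin. The function names are made up for illustration, the tables cover only a handful of common codes, and the host-status names follow my understanding of VMware's published code list, so treat them as an assumption and check the KB before relying on them. The device-status names are the standard SCSI ones.

```shell
# Hypothetical lookup: host (H:) status code to a short name. Partial table.
host_status_name() {
    case "$1" in
        0x0) echo "OK" ;;
        0x1) echo "NO_CONNECT" ;;
        0x2) echo "BUS_BUSY" ;;
        0x3) echo "TIMEOUT" ;;
        0x5) echo "ABORT" ;;
        0x7) echo "ERROR" ;;
        0x8) echo "RESET" ;;
        *)   echo "UNKNOWN" ;;
    esac
}

# Hypothetical lookup: device (D:) status code, standard SCSI status values.
device_status_name() {
    case "$1" in
        0x0)  echo "GOOD" ;;
        0x2)  echo "CHECK_CONDITION" ;;
        0x8)  echo "BUSY" ;;
        0x18) echo "RESERVATION_CONFLICT" ;;
        0x28) echo "TASK_SET_FULL" ;;
        *)    echo "UNKNOWN" ;;
    esac
}

# Read log lines on stdin, extract "H:.. D:.. P:.." and print decoded names.
# Lines without those fields are skipped.
decode_scsi_status() {
    sed -n 's/.*H:\(0x[0-9a-fA-F]*\) D:\(0x[0-9a-fA-F]*\) P:\(0x[0-9a-fA-F]*\).*/\1 \2 \3/p' |
    while read -r host dev plugin; do
        echo "host=$host ($(host_status_name "$host")) device=$dev ($(device_status_name "$dev")) plugin=$plugin"
    done
}
```

Usage would be something like `grep ScsiDeviceIO syslog.log | decode_scsi_status`. A non-zero host status with a clean device status, as in the first line above, points at the path or controller side rather than at the disk reporting an error itself.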
- #8
- Joined
- Oct 11, 2001
- Messages
- 31,956
You have a stuck reservation or something similar. From the locked up host:
vmkfstools --lock lunreset /vmfs/devices/disks/naa.600605b0070ec9c01b0613ea1266322e
However, that's partition 5 on that NAA. Local disk? Or are you all using extents?
- #9
- Joined
- May 29, 2009
- Messages
- 2,114
<166>2014-08-12T15:56:20.801Z vps24.corp.domain.net Rhttpproxy: [FFE1ED70 warning 'Proxy Req 00897'] Connection to localhost : 8307 failed with error N7Vmacore15SystemExceptionE(Connection refused).
<166>2014-08-12T15:56:31.007Z vps24.corp.domain.net Vpxa: [FFD7A1A0 verbose 'commonvpxXml'] [VpxXml] Error fetching /sdk/vimServiceVersions.xml: 503 (Service Unavailable)
<166>2014-08-12T15:56:31.015Z vps24.corp.domain.net Vpxa: [FFD7A1A0 verbose 'commonvpxXml'] [VpxXml] Error fetching /definitions/import/@namespace from /sdk/vimService?wsdl: 503 (Service Unavailable)
- #10
- Joined
- May 29, 2009
- Messages
- 2,114
You have a stuck reservation or something similar. From the locked up host:
vmkfstools --lock lunreset /vmfs/devices/disks/naa.600605b0070ec9c01b0613ea1266322e
However, that's partition 5 on that NAA. Local disk? Or are you all using extents?
Local disks in RAID 1. So it is safe to run that command while VMs are running?
- #11
- Joined
- May 29, 2009
- Messages
- 2,114
# vmkfstools --lock lunreset /vmfs/devices/disks/naa.600605b0070ec9c01b0613ea1266322e
Command lunreset failed
Error: Unable to access device, please check your connection to the device.
Possibly a RAID card issue?
- #12
- Joined
- Oct 11, 2001
- Messages
- 31,956
When you do esxcfg-mpath -b | grep -i naa.600605b0070ec9c01b0613ea1266322e -A4
what do you get back?
- #13
- Joined
- May 29, 2009
- Messages
- 2,114
naa.600605b0070ec9c01b0613ea1266322e : Local LSI Disk (naa.600605b0070ec9c01b0613ea1266322e)
vmhba0:C2:T0:L0 LUN:0 state:active Local HBA vmhba0 channel 2 target 0
- #14
- Joined
- Oct 11, 2001
- Messages
- 31,956
- #15
- Joined
- May 29, 2009
- Messages
- 2,114
- #16
- Joined
- Oct 11, 2001
- Messages
- 31,956
- #17
- Joined
- May 29, 2009
- Messages
- 2,114
- #18
- Joined
- May 29, 2009
- Messages
- 2,114
A while back on this same server we had an issue with the vSphere Flash Read Cache. The latency on the SSD literally went to 40,000 or higher. Since we had other cache issues with Veeam, and this was a new technology for VMware, I just removed the Flash Cache from all servers. That SSD is connected to the same RAID controller that has the two hard drives in RAID 1 with ESXi installed.
Even after this server was rebooted, I still had some issues and decided to delete the RAID volume, re-create it, and reinstall ESXi. Again, started to see some issues.
I had someone insert a USB thumb drive into the back of the server, installed the exact same version of ESXi 5.5 on it, and haven't seen any problems since. What is annoying is that, as far as diagnostics, alerts, etc. go, nothing shows that there is a problem with the RAID controller.
Source: https://hardforum.com/threads/services-failing-on-esxi-5-5-host-cant-restart-services.1829769/