General host info and storage issues
Host Info:
echo -e "Host Info \n ==========" && hostname -f && vmware -vl && date && uptime
SCSI sense codes (ignoring noise as per KB1031221):
cd /var/run/log;SENSE=$(grep -vE ' 0x85| 0x4d| 0x1a| 0x12' vmkernel.log|egrep -oi 'H:+.*+sense data+.*' |sort |uniq -c |sort -nr |head -20) ; echo -e "Host & Plug-in\n====================" ; echo "$SENSE" |grep "D:0x0" ; echo ;echo -e "Device\n====================" ; echo "$SENSE" |grep -v "D:0x0" ;
Common issues:
echo -e "vmkernel common errors\n====================" ; egrep -i 'snapshot|doubt|medium|apd|perm|non-responsive|offline|marked|corrupt|abort|timeout|frame|lock|splitter|zdriver|heap|admission|Rank violation' vmkernel.log|cut -d" " -f3- |sort|uniq -c |sort -nr |head -30
Hardware issues:
echo -e "ipmi-sel Messages:\n====================";localcli hardware ipmi sel list -p -i -n all |grep Message
NOTE: If you cannot read the logs and get something like:
[root@Server-dr1:~] less /var/log/vmkernel.log
/var/log/vmkernel.log: Input/output error
# less /var/run/log/vmkernel.log
Input/output error
..... try:
watch "dmesg | tail -20"
Driver logs
Driver logs:
echo -e "HBA drivers logs\n====================" ; egrep $( localcli storage core adapter list |awk 'FNR >2 {print $2}' |sort |uniq |sed ':a;N;$!ba;s/\n/|/g') vmkernel.log| egrep -v 'INFO|PCI' |cut -d " " -f2- | cut -d ")" -f2- |sort |uniq -c |sort -nr |less
Network Health
Ethernet issues:
echo -e "Ethernet Issues:\n====================";/usr/lib/vmware/vm-support/bin/nicinfo.sh |egrep 'errors|dropped' |grep -v ": 0"
To find if there is a network congestion by checking packets retransmission:
one=$(vsish -e cat /net/tcpip/instances/defaultTcpipStack/stats/tcp |grep sndrexmitpack |cut -d : -f2) ;sleep 10;two=$(vsish -e cat /net/tcpip/instances/defaultTcpipStack/stats/tcp |grep sndrexmitpack |cut -d : -f2);let "dif = two - one";echo "$dif packets retransmitted during 10 seconds"
Adapter and connectivity
To get all the information for vmhbas:
for name in `vmkchdev -l | grep vmhba | awk '{print$5}'`;do echo $name ; echo "VID :DID SVID:SDID"; vmkchdev -l | grep -w $name | awk '{print $2 , $3}';printf "Driver: ";echo `esxcfg-scsidevs -a | grep -w $name |awk '{print $2}'`;vmkload_mod -s `esxcfg-scsidevs -a | grep -w $name|awk '{print $2}'` |grep -i version;echo `lspci -vvv | grep -w $name | awk '{print $1=$NF="",$0}'`;printf "\n";done
NIC/HBA info:
lspci
esxcli storage core adapter list
vmkchdev -l| egrep 'nic|hba'
vmkchdev -l| egrep 'nic|hba' | awk {'print $2 " " $3'} |sort | uniq
/usr/lib/vmware/vm-support/bin/nicinfo.sh
esxcfg-info -a
Devices/Adapter stats:
localcli storage core device stats get
localcli storage core adapter stats get
OR vsish:
/> get /storage/scsifw/devices/naa.600605b00e02a5a02290472e030b89c5/stat
/> get /storage/scsifw/adapters/vmhba64/stats
Get used vmhbas and number of paths associated with each one
esxcfg-mpath -L | grep -i vmhba | awk -F ":" '{print $1}'| sort | uniq -c
Paths status:
esxcli storage core path list |grep "State: " |sort | uniq -c
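To spot devices with fewer paths than expected, count paths per device (a sketch that parses the Device: field of the path list output):
esxcli storage core path list | grep "Device: " | sort | uniq -c | sort -n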
FC
FC status:
echo -e "FC status:\n====================";localcli storage san fc list;localcli storage san fc stats get |egrep 'Error|Failure|Loss|Invalid' |grep -v ": 0";echo;echo -e "FC Events\n====================" localcli storage san fc events get
FCoE status:
echo -e "FCOE Status \n====================";localcli storage san fcoe list && localcli storage san fcoe stats get
iSCSI
iSCSI status:
echo -e "ISCSI Status \n====================";localcli storage san iscsi list && localcli storage san iscsi stats get
iSCSI connection tests:
nc -z -s [host's port IP address] [iSCSI server's IP address] [Port ID]
vmkping -I vmk1 -d -s 8972 x.x.x.x
Note: If you have more than one vmkernel port on the same network (such as a heartbeat vmkernel port for iSCSI), all vmkernel ports on that network must also be configured with jumbo frames (MTU 9000). If any vmkernel port on the same network has a lower MTU, the vmkping command will fail with the -s 8972 option. The -d option sets the DF (Don't Fragment) bit on the IPv4 packet. A per-vmk test sketch follows the option list below.
To test 1500 MTU, run the command: vmkping -I vmkX x.x.x.x -d -s 1472.
You can specify which vmkernel port to use for outgoing ICMP traffic with the -I option
-d option sets DF (Don't Fragment) bit on the IPv4 packet.
-s 8972 in case you are using jumbo frames MTU 9000
-s 1472 in case you are using normal MTU 1500
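To run the same jumbo-frame test from every vmkernel interface in one pass (a sketch; replace x.x.x.x with the target, and note that vmks on unrelated networks will simply fail the ping):
for v in $(esxcfg-vmknic -l | grep -o 'vmk[0-9]*' | sort -u); do echo "== $v"; vmkping -I $v -d -s 8972 -c 2 x.x.x.x; done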
iSCSI sessions/connections/targets/more:
esxcli iscsi session list
esxcli iscsi session connection list
esxcli iscsi adapter target portal list
esxcli iscsi networkportal list
esxcli iscsi adapter param get -A vmhba34
Script to set MTU to 1500 for the VMKs used for iSCSI port binding:
for x in `localcli iscsi networkportal list |grep vmk |cut -d ":" -f2 | cut -d " " -f2`;do localcli network ip interface set -m 1500 -i $x;done
/etc/init.d/hostd restart
esxcfg-rescan --all
SAS status:
echo -e "SAS Status \n====================";localcli storage san sas list && localcli storage san sas stats get
RDM, Devices, Partitions, Datastores
Print the partition table:
partedUtil getptbl /vmfs/devices/disks/naa.xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Check the partition data:
hexdump -c /vmfs/devices/disks/naa.xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx:1 |less
Search for VMFS signature:
offset="128 2048"; for dev in `esxcfg-scsidevs -l | grep "Console Device:" | awk {'print $3'}`; do disk=$dev; echo $disk; partedUtil getptbl $disk; { for i in `echo $offset`; do echo "Checking offset found at $i:"; hexdump -n4 -s $((0x100000+(512*$i))) $disk; hexdump -n4 -s $((0x1300000+(512*$i))) $disk; hexdump -C -n 128 -s $((0x130001d + (512*$i))) $disk; done; } | grep -B 1 -A 5 d00d; echo "---------------------"; done
Finding & creating VMFS Partition table link
- Download the attached find_vmfs_partition_boundaries.sh script and upload to ESXi host's /tmp directory
- Change permission: chmod 777 /tmp/find_vmfs_partition_boundaries.sh
- Change directory to /sbin: cd /sbin
- Run the script against the device (naa.60060160729025007628b54969f4e211 in this example; pass the device name, not the /vmfs/devices/disks path): ../tmp/find_vmfs_partition_boundaries.sh naa.60060160729025007628b54969f4e211
Using naa.60060160729025007628b54969f4e211 ...
Starting offset is 1048576 and LVM majorVersion is 05. Assuming VMFS5 and GPT.
Done. Check the /tmp/partitioncmds.txt file for partition-creation syntax.
"less /tmp/partitioncmds.txt" provides the command to run: - partedUtil setptbl /vmfs/devices/disks/naa.60060160729025007628b54969f4e211 gpt "1 2048 1048562549 AA31E02A400F11DB9590000C2911D1B8 0"
Browse datastore/filesystems:
localcli storage vmfs extent list
localcli storage filesystem list
esxcli storage vmfs snapshot list
Find the detailed information about the file system:
vmkfstools -v10 -Ph /vmfs/volumes/datastore1/
Get the UUID of the ESXi installation partition:
esxcfg-info -b
Force mount:
esxcfg-volume -M xxxxx
Get the Perennially Reserved flag value for NON-VMFS LUNs :
esxcli storage vmfs extent list| grep -iEoh "naa.*|eui.*|t10.*" | awk '{print $1}' > /tmp/vmfs.txt;esxcli storage core device list | awk '{print $1}' | grep -iE "naa|eui|t10" | grep -v -f /tmp/vmfs.txt > /tmp/nonvmfs.txt ;printf "%10s %s\n" "UID " "Perennially Reserved";printf "%10s %s\n" "--- " "--------------------";for x in `grep -iE "naa|eui|t10" /tmp/nonvmfs.txt`;do printf "%10s %s\n" $x `esxcli storage core device list -d $x |grep -i "perennially" | awk '{print $NF}' `; done
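If the non-VMFS LUNs listed in /tmp/nonvmfs.txt are RDMs that should be flagged (for example MSCS RDMs), a sketch that re-uses that file to set the flag; review the list first and only flag LUNs you really intend to mark:
for x in `grep -iE "naa|eui|t10" /tmp/nonvmfs.txt`; do esxcli storage core device setconfig -d $x --perennially-reserved=true; done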
List RDMs:
vim-cmd vmsvc/getallvms|grep 'vmnamehere'| awk '{print $1}'|grep [0-9]|while read a; do vim-cmd vmsvc/device.getdevices $a|grep -v parent|grep -A8 RawDisk;done
OR
find /vmfs/volumes/ -type f -name '*.vmdk' -size -1024k -exec grep -l '^createType=.*RawDeviceMap' {} \; > /tmp/rdmsluns.txt
for i in `cat /tmp/rdmsluns.txt`; do vmkfstools -q $i; done
OR
List RDM vml & vmdk
for i in `vm-support -V | awk '{print $1}' | cut -d "/" -f 1-5`; do find $i -name '*.vmdk' ;done | grep -v "\-flat.vmdk" | grep -v "rdm" > /tmp/vmdk.txt ; for i in `grep vmdk /tmp/vmdk.txt`; do echo $i ; grep -i RawDeviceMap $i; done > /tmp/tempfile.txt; grep -iB 1 "RawDeviceMap" /tmp/tempfile.txt | grep vmdk > /tmp/rdmvmdk.txt; for i in `grep vmdk /tmp/rdmvmdk.txt`; do echo $i; esxcfg-scsidevs -u |grep `vmkfstools -q $i | grep vml | awk '{print $3}'`; done | grep naa | awk '{print $1}' > /tmp/rdmnaa.txt; for i in `grep naa /tmp/rdmnaa.txt`; do esxcli storage core device setconfig -d $i --perennially-reserved=true ; done
Manually grow datastore:
partedUtil getptbl /vmfs/devices/disks/naa.xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
partedUtil getUsableSectors /vmfs/devices/disks/naa.xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
partedUtil resize /vmfs/devices/disks/naa.xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx 1 2048 xxxxxxxxxx
vmkfstools --growfs /vmfs/devices/disks/naa.xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx:1 /vmfs/devices/disks/naa.xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx:1
vmkfstools -V
Unmount DS:
esxcli storage filesystem unmount -u 55a670c0-474d3873-c2c6-ac162dbd3bc8
Detach device:
esxcli storage core device detached list --> shows all manually detached devices
esxcli storage core device set --state=off -d <naa.id> --> manually detach a device
esxcli storage core device detached remove -d <naa.id> --> permanently remove the device configuration
SCSI reservation conflicts:
esxcfg-info | egrep -B5 "s Reserved|Pending"
|----Console Device....................../dev/sda
|----DevfsPath........................../vmfs/devices/disks/vml.
02000000006001c230d8abfe000ff76c198ddbc13e504552432035
|----SCSI Level..........................6
|----Queue Depth.........................128
|----Is Pseudo...........................false
|----Is Reserved.........................false
|----Pending Reservations................ 1
Note: The host that has Pending Reserves with a value that is larger than 0 is holding the lock.
Remove the reservation from anywhere :
vmkfstools --lock lunreset /vmfs/devices/disks/vml.02000000006001c230d8abfe000ff76c198ddbc13e504552432035
Manually delete datastore from the DB:
Rescan HBA/FS:
esxcfg-rescan --all
vmkfstools -V
VAAI primitives status:
esxcli storage core device vaai status get -d naa.600601603aa029002cedc7f8b356e311
verify unmap info :
esxcli storage vmfs reclaim config get -u <Datastore_UUID>
# vsish
> cd /vmkModules/vmfs3/auto_unmap/volumes/
> ls
This should list all the volumes that the system is watching for auto_unmap.
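To dump the unmap state of each watched volume (a sketch; it assumes each volume node exposes a properties leaf, and relies on vsish -e ls returning each volume name with a trailing slash):
for v in $(vsish -e ls /vmkModules/vmfs3/auto_unmap/volumes/); do echo "== $v"; vsish -e cat /vmkModules/vmfs3/auto_unmap/volumes/${v}properties; done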
Mask LUN By Path KB1009449:
esxcli storage core claimrule add -r 300 -t location -A vmhba33 -C 0 -T 3 -L 0 -P MASK_PATH
esxcli storage core claimrule load
esxcli storage core claiming reclaim -d t10.F405E46494C4542596D477279794D24364A7A4D233F69713
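To undo the masking later, the reverse sequence (a sketch using the same rule number and path location as the example above):
esxcli storage core claimrule remove -r 300
esxcli storage core claimrule load
esxcli storage core claiming unclaim -t location -A vmhba33 -C 0 -T 3 -L 0
esxcli storage core claimrule run
esxcli storage core adapter rescan --all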
NFS
Connection test KB1003967:
vmkping -I vmkN -s nnnn xxx.xxx.xxx.xxx
vmkN is vmk0, vmk1, etc, depending on which vmknic is assigned to NFS.
Note: The -I option to select the vmkernel interface is available only in ESXi 5.1. Without this option in 4.x/5.0, the host uses the vmkernel associated with the destination network being pinged in the host routing table. The host routing table can be viewed by running the esxcfg-route -l command.
nnnn is the MTU size minus 28 bytes for overhead. For example, for an MTU size of 9000, use 8972.
xxx.xxx.xxx.xxx is the IP address of the target NFS storage.
nc -z array-IP 2049
remove and mount NFS3:
esxcli storage nfs remove -v nfs_datastore
esxcli storage nfs add --host=dir42.eng.vmware.com --share=/<mount_dir> --volume-name=nfsstore-dir42
esxcli storage nfs add -H 192.168.5.2 -s /ctnr-async-metro-oltp-1-siteA -v ctnr-async-metro-oltp-1-siteA
Note: for NFS 4.1 replace nfs with nfs41
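A hedged NFS 4.1 equivalent of the mount above (server addresses and share are placeholders; NFS 4.1 accepts a comma-separated list of server IPs for multipathing):
esxcli storage nfs41 add -H 192.168.5.2,192.168.5.3 -s /ctnr-async-metro-oltp-1-siteA -v ctnr-async-metro-oltp-1-siteA
esxcli storage nfs41 list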
Note: The configuration is stored in /etc/vmware/esx.conf
NFS Heap
memstats -r heap-stats -s name:pctFreeOfMax | grep nfs
NFS DAVG
Capturing network dump
tcpdump-uw host array-ip -w /vmfs/volumes/datastorex/capture.pcap
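To keep the capture focused, it can be restricted to the vmkernel interface carrying the NFS traffic and to port 2049 (a sketch; vmk1 and the datastore path are placeholders):
tcpdump-uw -i vmk1 -s 0 port 2049 -w /vmfs/volumes/datastorex/capture.pcap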
Corruptions
VMDK
vmkfstools -x check /vmfs/volumes/iSCSI-T1/VCSA1/VCSA1_11.vmdk
vmkfstools -x repair /vmfs/volumes/iSCSI-T1/VCSA1/VCSA1_11.vmdk
LUN Partition table checker
voma -m ptbl -d /dev/disks/mpx.vmhba0:C0:T0:L0
LUN LVM Checker
voma -m lvm -d /dev/disks/mpx.vmhba0:C0:T0:L0:3
LUN VMFS
voma -m vmfs -f check -d /vmfs/devices/disks/naa.600a0980006c1123000001cf58413cdc:1
voma -m vmfs -f fix -d /vmfs/devices/disks/naa.600a0980006c1123000001cf58413cdc:1
For VMFS 6 (ESXi 6.7 U2 and later):
voma -m vmfs -f advfix -d /vmfs/devices/disks/naa.xxxxxx0000b8:1 -p /tmp/voma.txt
(Note: you have to unmount the DS from all hosts and run voma from one host only)
Locks
VMFS:
ls | while read x; do vmfsfilelockinfo -p $x| grep -i "is locked"; done
OR
vmkfstools -D xxx.vmdk
NFS3:
ls -la | grep .lck
#Find the Host that created the lock
hexdump -C .lck-e003090001000000
00000010 01 00 00 00 64 68 69 6e 67 2d 65 73 78 2e 76 6d |..........esx.vm|
00000020 77 61 72 65 2e 63 6f 6d 00 00 00 00 00 00 00 00 |ware.com........|
#Find the file that is being locked
stat * | grep -B2 `v2=$(v1=.lck-e003090001000000;echo ${v1:13:2}${v1:11:2}${v1:9:2}${v1:7:2}${v1:5:2});printf "%d\n" 0x$v2` | grep File
File: Win2k8-flat.vmdk
VSAN:
ls *.vmdk | xargs grep "vsan://" | awk -F'//' '{print $2}' | awk -F'"' '{print $1}' | while read x; do vmfsfilelockinfo -p .$x.lck ; done
Find out which files are open by which process:
lsof | grep -i "VM name"
Which VM is using a given process (xxxx = process/world ID):
esxcli vm process list | grep -iC4 xxxxx
VVOL
VVOL status:
echo -e "VVOL Status \n====================";echo -e "VVOL VASA providers";esxcli storage vvol vasaprovider list;echo -e "VVOL protocolendpoint "; esxcli storage vvol protocolendpoint list;echo -e "VVOL storagecontainer ";esxcli storage vvol storagecontainer list
Test VASA SSL (adjust the provider hostname/port as needed):
openssl s_client -connect vasa.local:443
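To also pull the certificate validity dates and subject from the same connection (standard openssl, available in the ESXi shell):
echo | openssl s_client -connect vasa.local:443 2>/dev/null | openssl x509 -noout -dates -subject -issuer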
Refresh VASA cert:
1. Browse to vCenter Server in the vSphere Web Client navigator.
2. Click the Configure tab, and click Storage Providers.
3. From the list, select the storage provider and click the Refresh the certificate icon.
Refresh ESXi certificate:
Browse to the host in the vSphere Web Client inventory.
Click the Manage tab and click Settings.
Select System, and click Certificate. You can view detailed information about the selected host's certificate.
Click Renew or Refresh CA Certificates.
OR
cd /etc/vmware/ssl
mv rui.crt orig.rui.crt
mv rui.key orig.rui.key
/sbin/generate-certificates
/etc/init.d/hostd restart
/etc/init.d/vpxa restart
Reconnect ESXi host to vCenter server.
Run the following commands on the ESXi host:
/etc/init.d/vvold ssl_reset
/etc/init.d/vvold restart
Publish certificate (run on vCenter):
/usr/lib/vmware-vmafd/bin/dir-cli trustedcert publish --chain --cert /tmp/vasa.crt
Trust certificate from ESXi
/etc/vmware/ssl/castore.pem
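One way to trust a provider CA manually (an assumption, not an official procedure; verify against the provider/VMware documentation before relying on it) is to append the CA certificate to castore.pem and restart vvold:
# /tmp/vasa-ca.crt is a hypothetical path to the provider CA certificate
cat /tmp/vasa-ca.crt >> /etc/vmware/ssl/castore.pem
/etc/init.d/vvold restart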
Find VVOL Objects [details]
localcli --plugin-dir /usr/lib/vmware/esxcli/int/ storage internal vvol virtualvolume get --container-id 8954ad916c2b4520-9bd5519866e14fcf --uuid naa.60060160a9c9b824cf9bdb4e38fb4dfe
localcli --plugin-dir /usr/lib/vmware/esxcli/int/ storage internal vvol virtualvolume metadata list --container-id 8954ad916c2b4520-9bd5519866e14fcf --uuid naa.60060160a9c9b824cf9bdb4e38fb4dfe
localcli --plugin-dir /usr/lib/vmware/esxcli/int/ storage internal vvol daemon set --dump-objects ; less /var/log/vvold.log
Add VASA from ESXi without vCenter (shared by Lav Tiwari)
localcli --plugin-dir /usr/lib/vmware/esxcli/int/ storage internal vvol vasaprovider add --vp-name YYYYY --vp-url https://x.x.x.x:xxxx/vasa/version.xml
/etc/init.d/hostd restart
Trivia logs for vvold and sps:
For the host vvold, there is a log level in /etc/vmware/vvold/config.xml which overrides the local vvold verbose setting:
<config>
<log>
<!-- default log level -->
<level>verbose</level>
You have to change this level to trivia and restart vvold before it is possible to change the vvold log level to trivia using esxcli.
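A minimal sketch of that change (back up config.xml first; busybox sed on ESXi supports -i):
cp /etc/vmware/vvold/config.xml /etc/vmware/vvold/config.xml.bak
sed -i 's#<level>verbose</level>#<level>trivia</level>#' /etc/vmware/vvold/config.xml
/etc/init.d/vvold restart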
"BTW, the is a log level in /etc/vmware/vvold/config.xml which overrides the local vvold verbose setting:
config>
<log>
<!-- default log level -->
<level>verbose</level>
you have to change this log to trivia, and restart the vvold before it is possible to change the vvold log-level to trivia using esxcli."
For vCenter SPS (sometimes other components also need to be set to TRACE, depending on the logs/case):
- Edit file /usr/lib/vmware-vpx/sps/conf/log4j.properties, and change the following lines
log4j.appender.file.Threshold=TRACE
log4j.logger.com.vmware.vim.storage.common.vc=TRACE
- Restart sps with the command "vmon-cli -r sps"
Example "1. The error observed is that SPS is not able to invoke a Property Collector call on vpxd to find the datastore on which the virtual disk is placed.
2019-11-18T12:28:01.551Z [pool-17-thread-4] INFO opId=k2xnped9-138464-auto-2yuc-h5:70026758-3-01 com.vmware.vim.storage.common.vc.impl.VcInventoryImpl - No entities of type VirtualMachine found in VC inventory
2019-11-18T12:28:01.551Z [pool-17-thread-4] WARN opId=k2xnped9-138464-auto-2yuc-h5:70026758-3-01 com.vmware.vim.storage.common.vc.impl.VcQueryImpl - No virtual devices found for given vm : vm-346390
2019-11-18T12:28:01.551Z [pool-17-thread-4] ERROR opId=k2xnped9-138464-auto-2yuc-h5:70026758-3-01 com.vmware.pbm.prov.impl.PreProvisionServiceImpl - Not able to find current placement hub of entity: (pbm.ServerObjectRef) {
dynamicType = null,
dynamicProperty = null,
objectType = virtualDiskId,
key = vm-346390:2000,
serverUuid = ED65BB82-CDFD-4B35-8986-FB9C0B5EC307
}
2019-11-18T12:28:01.551Z [pool-17-thread-4] WARN opId=k2xnped9-138464-auto-2yuc-h5:70026758-3-01 com.vmware.pbm.prov.impl.PreProvisionServiceImpl - Error when trying to run pre-provision validation
java.lang.RuntimeException: Not able to find current placement hub of entity: (pbm.ServerObjectRef) {
dynamicType = null,
dynamicProperty = null,
objectType = virtualDiskId,
key = vm-346390:2000,
serverUuid = ED65BB82-CDFD-4B35-8986-FB9C0B5EC307
}
at com.vmware.pbm.prov.impl.PreProvisionServiceImpl.findCurrentPlacementHub(PreProvisionServiceImpl.java:599)
at com.vmware.pbm.prov.impl.PreProvisionServiceImpl.fillExistingPolicyAssociations(PreProvisionServiceImpl.java:512)
at com.vmware.pbm.prov.impl.PreProvisionServiceImpl.preProvisionValidate(PreProvisionServiceImpl.java:437)
at com.vmware.pbm.profile.impl.ProfileManagerImpl.preProvisionProcess(ProfileManagerImpl.java:4119)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at com.vmware.vim.vmomi.server.impl.InvocationTask.run(InvocationTask.java:65)
at com.vmware.vim.vmomi.server.common.impl.RunnableWrapper$1.run(RunnableWrapper.java:47)
at com.vmware.vim.storage.common.task.opctx.RunnableOpCtxDecorator.run(RunnableOpCtxDecorator.java:38)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
2019-11-18T12:28:01.552Z [pool-17-thread-4] INFO opId=k2xnped9-138464-auto-2yuc-h5:70026758-3-01 com.vmware.pbm.profile.impl.ProfileManagerImpl - Timer stopped: preProvisionProcess, Time taken: 60 ms.
2. However, just a few milliseconds before this, SPS was able to fetch this information from vpxd.
2019-11-18T12:28:01.536Z [pool-6-thread-10] DEBUG opId=k2xnped9-138464-auto-2yuc-h5:70026758-3-01 com.vmware.spbm.domain.policy.Entity - Datastore for vm-346390:2000 : ManagedObjectReference{type = Datastore, value = datastore-323358}
2019-11-18T12:28:01.536Z [pool-6-thread-10] DEBUG opId=k2xnped9-138464-auto-2yuc-h5:70026758-3-01 com.vmware.spbm.domain.policy.Entity - Datastore type of vm-346390:2000 : VVOL
2019-11-18T12:28:01.540Z [pool-6-thread-10] DEBUG opId=k2xnped9-138464-auto-2yuc-h5:70026758-3-01 com.vmware.spbm.domain.policy.Entity - Backing object id for vm-346390:2000 : naa.60002AC00000000000007B430001A419
2019-11-18T12:28:01.544Z [pool-6-thread-10] DEBUG opId=k2xnped9-138464-auto-2yuc-h5:70026758-3-01 com.vmware.spbm.domain.policy.Entity - Storage id for vm-346390:2000 : [vvol:482a38f0f78a48a8-a937d1f8ae4b64d3]
3. There is no error in the vpxd logs corresponding to the property collector query.
4. The vc support bundle has TRACE level logging for parts of SPBM. Thanks for attempting to reproduce the issue with this turned on. However, this did not enable TRACE level logging in the part of code that connects to VC for the property collector query. So at this point, it is not clear why the property collector call from SPS to vpxd failed to fetch the datastore information for the virtual disk.
Is this a one-off failure or is it easily reproducible? If reproducible, could you increase the logging level to TRACE for the property collector calls alone? This will unfortunately log a lot of lines and the log files will churn faster, so you would have to turn this on, reproduce the issue, collect the logs, and then turn it off quickly. Here is how it can be done:
- Edit file /usr/lib/vmware-vpx/sps/conf/log4j.properties, and change the following lines
log4j.appender.file.Threshold=TRACE
log4j.logger.com.vmware.vim.storage.common.vc=TRACE
- Restart sps with the command "vmon-cli -r sps"
- Also turn on TRIVIA logging in vpxd, and restart with the command "vmon-cli -r vpxd"
- Repro issue, collect vc support, revert these log lines in sps and restart sps once again, revert vpxd to INFO level logging and restart vpxd again."
Network commands
#List NICs
esxcfg-nics -l
#List vmknics
esxcfg-vmknic -l
#List routes
esxcfg-route -l
#List neighbor-list
esxcfg-route -n
#List vSwitches
esxcfg-vswitch -l
#List all ports
net-stats -l
#Errors and statistics for a network adapter:
esxcli network nic stats get -n <vmnicX>
#Get the vmk tag
esxcli network ip interface tag get -i vmk1
============================
iSCSI:
esxcli iscsi adapter target portal list --> shows all iscsi target portals
esxcli iscsi adapter list
esxcli iscsi networkportal list --> lists the details for iscsi adapters (excluding vmnics)
esxcli iscsi adapter param get -A vmhba34 --> lists parameters of iscsi adapters
----------------
Nmp commands:
esxcli storage nmp satp list --> lists the default PSP SATP combinations
esxcli storage nmp psp list --> lists PSPs available in the host
esxcli storage nmp path list -d <NAA ID> --> lists the paths to this LUN
esxcli storage nmp device list -d <NAA ID> | grep PSP --> shows the PSP of this LUN.
esxcli storage nmp device set -d <NAA ID> --psp=VMW_PSP_RR --> changes the PSP of this LUN to be RR.
esxcli storage core claimrule list
esxcli storage core claimrule load --> load the new claim rule you added
esxcli storage core claimrule run
esxcli storage core claiming reclaim -d <NAA ID> --> unclaim and then re-claim the LUN
esxcli storage nmp satp rule list
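Two common follow-ups, for reference (device, vendor and model values are placeholders): switch Round Robin to change paths every I/O, and add an SATP rule so new LUNs from that array default to RR.
esxcli storage nmp psp roundrobin deviceconfig set -d <NAA ID> --type=iops --iops=1
esxcli storage nmp satp rule add -s VMW_SATP_ALUA -V "VendorName" -M "ModelName" -P VMW_PSP_RR -e "RR for VendorName"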
========================
VAAI
# esxcli storage core device vaai status get
# esxcli storage core device vaai status get -d naa.600601603aa029002cedc7f8b356e311 --> shows which VAAI primitives are supported; helpful in UNMAP issues