
Troubleshooting VMFS Datastore Mount Issues

1. In order to mount a datastore, you must have:


  • Connectivity to the LUN 
    • You can check using:
      • esxcli storage core device list #list all devices 
      • esxcli storage core device list -d naa.60060160a69598294457e678ed964ba4 #list one LUN

      • ls -alh /dev/disks/naa.6589cfc000000d57b6aa334a36a85f19

    • If it is a new host, make sure that:
      • You have a compatible HBA and the matching drivers installed

      • You have added the iSCSI targets and allowed the host IQN on the storage array

      • For FC, you have created the relevant zoning and masking, and the HBA can log in to the fabric and the array

      • Troubleshoot further using Storage troubleshooting and the relevant KBs

  • A partition table
    • partedUtil getptbl /dev/disks/naa.6589cfc000000d57b6aa334a36a85f19 #show the partition table
      gpt
      7179 255 63 115343360
      1 2048 115343326 AA31E02A400F11DB9590000C2911D1B8 vmfs 0

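    • If the datastore was deleted, you can try to recreate the partition, provided the VMFS signature can still be found on disk. The loop below checks every device at start sectors 128 and 2048 and prints the hits that contain the d00d signature: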

      • offset="128 2048"; for dev in `esxcfg-scsidevs -l | grep "Console Device:" | awk {'print $3'}`; do disk=$dev; echo $disk; partedUtil getptbl $disk; { for i in `echo $offset`; do echo "Checking offset found at $i:"; hexdump -n4 -s $((0x100000+(512*$i))) $disk; hexdump -n4 -s $((0x1300000+(512*$i))) $disk; hexdump -C -n 128 -s $((0x130001d + (512*$i))) $disk; done; } | grep -B 1 -A 5 d00d; echo "---------------------"; done

          • Example of recoverable datastore output:

            /vmfs/devices/disks/naa.600a0980383137546c5d503247614132
            gpt
            294110 255 63 4724883456
            1 2048 264191 E3C9E3160B5C4DB8817DF92DF00215AE microsoftRsvd 0
            Checking offset found at 2048:
            0200000 d00d c001
            0200004
            1400000 f15e 2fab
            1400004
            0140001d 52 43 56 44 49 5f 56 44 49 49 6d 61 67 65 5f 30 |RCVDI_VDIImage_0|
            0140002d 32 4e 5f 76 6f 6c 00 00 00 00 00 00 00 00 00 00 |2N_vol..........|

        • Example of partition recreation:
        • partedUtil getUsableSectors /vmfs/devices/disks/naa.600a0980383137546c5d503247614132
          34 4724883422

        • partedUtil setptbl /vmfs/devices/disks/naa.600a0980383137546c5d503247614132 gpt "1 2048 4724883422 AA31E02A400F11DB9590000C2911D1B8 0"
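The numbers in the scan and in the recreate command follow from simple arithmetic, sketched below in shell. The function names are hypothetical, and the sketch assumes 512-byte sectors and the same fixed offsets the one-liner uses (0x100000 for the d00d c001 signature, 0x1300000 for the f15e 2fab signature). The partition argument is built from the candidate start sector plus the last usable sector reported by partedUtil getUsableSectors.

```shell
# Byte offset where the scan looks for the d00d c001 signature,
# for a partition that starts at sector $1 (512-byte sectors assumed).
lvm_magic_offset() {
    printf '0x%x\n' $((0x100000 + 512 * $1))
}

# Byte offset where the scan looks for the f15e 2fab signature.
vmfs_magic_offset() {
    printf '0x%x\n' $((0x1300000 + 512 * $1))
}

# Build the partition argument for "partedUtil setptbl <disk> gpt".
# $1 = partition start sector
# $2 = output of "partedUtil getUsableSectors" ("first last")
vmfs_part_spec() {
    start=$1
    last=${2##* }    # keep only the second field (last usable sector)
    echo "1 $start $last AA31E02A400F11DB9590000C2911D1B8 0"
}

lvm_magic_offset 2048     # 0x200000, the "0200000" line in the example output
vmfs_magic_offset 2048    # 0x1400000, the "1400000" line in the example output
vmfs_part_spec 2048 "34 4724883422"
```

The last call prints the same quoted argument that the setptbl example passes, which is a quick sanity check before writing a partition table to a real LUN.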



  • LVM and VMFS metadata
    • Make sure the LUN is not detected as a snapshot LUN
      • Logs example:
        • LVM: 8445: Device eui.0017380012020364:1 detected to be a snapshot:
          LVM: 8452: queried disk ID: <type 1, len 17, lun 36, devType 0, scsi 0, h(id) 7683208289187576905>
          LVM: 8459: on-disk disk ID: <type 1, len 17, lun 17, devType 0, scsi 0, h(id) 7683208289187576905>
        • esxcli storage vmfs snapshot list #list the LUNs detected as snapshots
        • KB: vSphere handling of LUNs detected as snapshot LUNs (1011387)
      • hexdump -C /vmfs/devices/disks/naa.6589cfc0000001cfa6997d5c9ea677b8:1 |head -35 #read the partition header as raw data
        00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
        *
        00100000  0d d0 01 c0 06 00 00 00  15 00 00 00 01 16 03 00  |................|
        00100010  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
        00100020  00 00 00 00 00 00 00 00  00 00 00 00 00 00 31 30  |..............10|
        00100030  30 30 30 30 30 31 00 00  00 00 00 00 00 00 69 53  |000001........iS|
        00100040  43 53 49 20 00 00 00 00  00 00 00 00 00 00 00 00  |CSI ............|
        00100050  00 00 00 00 00 00 00 00  00 00 00 02 00 00 00 fe  |................|
        00100060  ef ff 4a 00 00 00 01 00  00 00 01 00 00 00 00 00  |..J.............|
        00100070  00 00 03 00 00 00 00 00  00 00 00 00 10 01 00 00  |................|
        00100080  00 00 3f 18 e3 5f a4 bd  13 5f f0 9d 00 50 56 01  |..?.._..._...PV.|
        00100090  81 cb 63 cf ba f0 1e b7  05 00 f8 4c bb f0 1e b7  |..c........L....|
        001000a0  05 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
        001000b0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
        001000c0  00 00 00 00 00 00 00 00  00 00 00 10 00 00 af 04  |................|
        001000d0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
        *
        00101000  00 00 00 f0 4a 00 00 00  02 00 00 00 00 00 00 00  |....J...........|
        00101010  01 00 00 00 35 66 65 33  31 38 33 65 2d 66 34 37  |....5fe3183e-f47|
        00101020  65 64 63 63 36 2d 37 32  36 61 2d 30 30 35 30 35  |edcc6-726a-00505|
        00101030  36 30 31 38 31 63 62 00  00 00 00 00 00 00 00 00  |60181cb.........|
        00101040  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
        00101050  00 00 00 00 3e 18 e3 5f  c6 dc 7e f4 6a 72 00 50  |....>.._..~.jr.P|
        00101060  56 01 81 cb 01 00 00 00  05 34 bb f0 1e b7 05 00  |V........4......|
        00101070  00 00 00 00 af 04 00 00  00 00 00 00 00 00 00 00  |................|
        00101080  ae 04 00 00 00 00 00 00  83 47 bb f0 1e b7 05 00  |.........G......|
        00101090  01 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
        001010a0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
        *
        00101110  01 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
        00101120  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
        *
        0017f000  6e 61 61 2e 36 35 38 39  63 66 63 30 30 30 30 30  |naa.6589cfc00000|
        0017f010  30 31 63 66 61 36 39 39  37 64 35 63 39 65 61 36  |01cfa6997d5c9ea6|
        0017f020  37 37 62 38 3a 31 00 00  00 00 00 00 00 00 00 00  |77b8:1..........|
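In the snapshot log example earlier in this item, the queried and on-disk disk IDs share the same h(id) but carry different lun values (36 versus 17). A minimal shell sketch for pulling those values out of the two log lines and comparing them follows. The helper name disk_id_field is hypothetical, and the sketch assumes the log lines keep the layout shown in that example.

```shell
# The two disk ID lines, copied from the log example in this section.
log='LVM: 8452: queried disk ID: <type 1, len 17, lun 36, devType 0, scsi 0, h(id) 7683208289187576905>
LVM: 8459: on-disk disk ID: <type 1, len 17, lun 17, devType 0, scsi 0, h(id) 7683208289187576905>'

# Print one comma-terminated numeric field (for example "lun").
# $1 = "queried" or "on-disk", $2 = field name
disk_id_field() {
    printf '%s\n' "$log" | grep "$1 disk ID" | sed -n "s/.*, $2 \([0-9]*\),.*/\1/p"
}

queried=$(disk_id_field queried lun)
ondisk=$(disk_id_field on-disk lun)

if [ "$queried" != "$ondisk" ]; then
    echo "LUN number differs: on-disk $ondisk, presented as $queried"
fi
```

On a host, the log variable could be filled from the vmkernel log instead of the pasted sample.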
  • Read and write access to the metadata
    • Check the vmkernel logs for I/O errors, read-only messages, and SCSI read or write failures
    • You can run VOMA, after making sure that the datastore is not mounted on any host

    • voma -f check -d /vmfs/devices/disks/mpx.vmhba0:C0:T1:L0:1

      Module name is missing. Using "vmfs" as default
      Running VMFS Checker version 2.1 in check mode
      Initializing LVM metadata, Basic Checks will be done


      Checking for filesystem activity
      Performing filesystem liveness check..\Scanning for VMFS-6 host activity (4096 bytes/HB, 1024 HBs).
      Scsi 2 reservation successful
      Phase 1: Checking VMFS header and resource files
      Detected VMFS-6 file system (labeled:'esxi1_local') with UUID:5f843dd3-0a319ad2-00cd-0050560181cb, Version 6:82
      Phase 2: Checking VMFS heartbeat region
      Marking Journal addr (0, 3) in use
      Marking Journal addr (0, 2) in use
      Phase 3: Checking all file descriptors.
      Phase 4: Checking pathname and connectivity.
      Phase 5: Checking resource reference counts.
      ON-DISK ERROR: JBC inconsistency found: (0,0) allocated in bitmap, but never used
      ON-DISK ERROR: JBC inconsistency found: (0,1) allocated in bitmap, but never used
      ON-DISK ERROR: JBC inconsistency found: (0,4) allocated in bitmap, but never used
      ON-DISK ERROR: JBC inconsistency found: (0,6) allocated in bitmap, but never used


      Total Errors Found: 4
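When VOMA is run against several datastores, it can help to reduce each saved report to its error total. The sketch below does that with sed. The helper name voma_error_count is hypothetical, and it assumes the report ends with a "Total Errors Found:" line as in the output above.

```shell
# Read saved VOMA output on stdin and print the number from the
# "Total Errors Found:" line (prints nothing if the line is absent).
voma_error_count() {
    sed -n 's/^ *Total Errors Found: *\([0-9][0-9]*\).*/\1/p'
}

# Tail of the report shown above, used as sample input.
count=$(voma_error_count <<'EOF'
      Phase 5: Checking resource reference counts.
      ON-DISK ERROR: JBC inconsistency found: (0,6) allocated in bitmap, but never used

      Total Errors Found: 4
EOF
)

if [ "${count:-0}" -gt 0 ]; then
    echo "VOMA reported $count error(s)"
fi
```

On a host, the output of the voma command could be piped straight into the helper instead of using a saved sample.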


  • The ability to reserve/lock the datastore
    • Check the vmkernel logs for messages like:
      • 2017-01-31T20:49:15.992Z cpu13:34177 opID=e922c397)WARNING: HBX: 2227: ATS-Only VMFS volume 'SYN01' is not mounted. This host does not support ATS, or ATS initialization failed.

      • 2017-01-31T20:49:15.992Z cpu13:34177 opID=e922c397)WARNING: Fil3: 2469: Failed to reserve volume f530 28 1 56f52615 c7d247e4 b190e688 6588261c 0 0 0 0 0 0 0

      • 2020-12-04T11:33:23.475Z cpu4:2098225)NMP: nmp_ResetDeviceLogThrottling:3776: Error status H:0x0 D:0x18 P:0x0 Sense Data: 0x0 0x0 0x0 from dev "naa.xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx" occurred 2000 times(of 2003 commands)
    • You may want to:
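One thing worth doing with the NMP line in the log example above is decoding its status fields. The D: value is the SCSI device status byte, and 0x18 is the standard code for RESERVATION CONFLICT, which fits a host that cannot reserve or lock the volume. The sketch below extracts the D: value and maps a few common status codes to names. The helper name scsi_status_name is hypothetical, and the table is deliberately partial.

```shell
# Map a SCSI device status byte (the D: field) to its standard name.
scsi_status_name() {
    case "$1" in
        0x0)  echo "GOOD" ;;
        0x2)  echo "CHECK CONDITION" ;;
        0x8)  echo "BUSY" ;;
        0x18) echo "RESERVATION CONFLICT" ;;
        0x28) echo "TASK SET FULL" ;;
        *)    echo "unknown ($1)" ;;
    esac
}

# Relevant part of the NMP log line from the example above.
line='Error status H:0x0 D:0x18 P:0x0 Sense Data: 0x0 0x0 0x0 from dev "naa.xxx" occurred 2000 times(of 2003 commands)'

# Extract the value that follows " D:".
dstatus=$(printf '%s\n' "$line" | sed -n 's/.* D:\(0x[0-9a-f]*\) .*/\1/p')
scsi_status_name "$dstatus"
```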

Tips: 

  • Rescan storage for LUNs and VMFS volumes:
    esxcli storage core adapter rescan --all
    vmkfstools -V
  • List all detected VMFS datastores, mounted or not:
    esxcfg-scsidevs -m
    naa.6589cfc000000d57b6aa334a36a85f19:1 /vmfs/devices/disks/naa.6589cfc000000d57b6aa334a36a85f19:1 5f8452f7-3db2f584-829e-0050560181cb 0 iSCSI-2
    naa.6589cfc0000001cfa6997d5c9ea677b8:1 /vmfs/devices/disks/naa.6589cfc0000001cfa6997d5c9ea677b8:1 5fe3183f-6b492edc-33e6-0050560181cb 0 Datastore_freeNAS
    naa.6589cfc000000797d1402030f7936812:1 /vmfs/devices/disks/naa.6589cfc000000797d1402030f7936812:1 5f8452cb-821eba08-5cf8-0050560181cb 0 iSCSI-1
    mpx.vmhba0:C0:T0:L0:7 /vmfs/devices/disks/mpx.vmhba0:C0:T0:L0:7 5f7f0aa6-5ed5e268-3227-0050560181cb 0 OSDATA-5f7f0aa6-5ed5e268-3227-0050560181cb
    mpx.vmhba0:C0:T1:L0:1 /vmfs/devices/disks/mpx.vmhba0:C0:T1:L0:1 5f843dd3-0a319ad2-00cd-0050560181cb 0 esxi1_local

  • The grep that greps it all: volume name, UUID, and device name
    egrep -i 'esxi1_local|5f843dd3-0a319ad2-00cd-0050560181cb|naa.6589cfc0000001cfa6997d5c9ea677b8' /var/log/vmkernel.log
  • Mass force mount of snapshot volumes
    for i in `esxcfg-volume -l | grep VMFS | cut -d ' ' -f 3 | cut -d '/' -f 2 `; do esxcfg-volume -M $i ; done
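The grep pattern in the tips above can be assembled from the esxcfg-scsidevs -m listing instead of being typed by hand. The sketch below is one way to do it. The helper name ds_grep_pattern is hypothetical, and it assumes the five-column layout shown in the listing (device:partition, path, UUID, index, label) and a label without spaces.

```shell
# One line of "esxcfg-scsidevs -m" output, copied from the listing above.
sample='naa.6589cfc0000001cfa6997d5c9ea677b8:1 /vmfs/devices/disks/naa.6589cfc0000001cfa6997d5c9ea677b8:1 5fe3183f-6b492edc-33e6-0050560181cb 0 Datastore_freeNAS'

# Read "esxcfg-scsidevs -m" output on stdin and print an egrep pattern
# (label|UUID|device) for the datastore whose label is given as $1.
ds_grep_pattern() {
    awk -v want="$1" '$5 == want {
        sub(/:[0-9]+$/, "", $1)        # drop the partition suffix, e.g. ":1"
        print $5 "|" $3 "|" $1
    }'
}

printf '%s\n' "$sample" | ds_grep_pattern Datastore_freeNAS
```

On a host, the pattern could be produced with esxcfg-scsidevs -m piped into ds_grep_pattern and then passed to egrep -i against /var/log/vmkernel.log.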
