
Recreating a missing VMFS datastore partition in VMware vSphere 5.x and 6.x

 


 Symptoms
  • A datastore has become inaccessible.
  • A VMFS partition table is missing.
 Purpose
The partition table is required only during a rescan. This means that the datastore may become inaccessible on a host during a rescan if the VMFS partition was deleted after the last rescan. The partition table is physically located on the LUN, so all vSphere hosts that have access to this LUN can see the change has taken place. However, only the hosts that do a rescan will be affected.
 
This article provides information on:
  • Determining whether this is the same problem
  • Resolving the problem
 Cause
This issue occurs when the VMFS partition is deleted. Deleting the datastore from the vSphere Client deletes the partition, although the software prevents this while the datastore is in use. The partition can also be lost if, for example, a physical server that has access to the LUN on the SAN runs an installation against it.
 Resolution
To resolve this issue:

Run the partedUtil getptbl command on the host that has the issue:
 
partedUtil getptbl /vmfs/devices/disks/naa.6006016045502500c20a2b3ccecfe011


Verify if the output of the command is similar to:

gpt
52216 255 63 838860800
1 2048 838850039 AA31E02A400F11DB9590000C2911D1B8 vmfs 0



If your output appears similar to the following, it indicates the partition is missing:

gpt
52216 255 63 838860800
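
The comparison above can be sketched as a small check on saved partedUtil getptbl output. This is only an illustration: has_partitions is a hypothetical helper name, not part of partedUtil, and it assumes the output layout shown in this article (label on the first line, geometry on the second, one line per partition after that).

```shell
# Sketch only: has_partitions is a hypothetical helper, not a partedUtil feature.
# Reads "partedUtil getptbl" output on stdin and succeeds if any partition is listed.
has_partitions() {
    # Line 1 is the label (gpt/msdos), line 2 is the geometry;
    # any further non-empty line describes a partition.
    awk 'NR > 2 && NF > 0 { found = 1 } END { exit !found }'
}

# Sample text copied from the missing-partition output above.
sample='gpt
52216 255 63 838860800'

if printf '%s\n' "$sample" | has_partitions; then
    echo "partition entries present"
else
    echo "no partition entries: the partition is missing"
fi
```

On a host, the same function could be fed directly from the command, for example by piping partedUtil getptbl for the device into has_partitions.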



In this case, you must recreate the partition. To recreate the partition: 
  1. Find the beginning and end blocks of the VMFS partition. To find the beginning of the partition, run this command (one line script) on the host:


    # offset="128 2048"; for dev in `esxcfg-scsidevs -l | grep "Console Device:" | awk {'print $3'}`; do disk=$dev; echo $disk; partedUtil getptbl $disk; { for i in `echo $offset`; do echo "Checking offset found at $i:"; hexdump -n4 -s $((0x100000+(512*$i))) $disk; hexdump -n4 -s $((0x1300000+(512*$i))) $disk; hexdump -C -n 128 -s $((0x130001d + (512*$i))) $disk; done; } | grep -B 1 -A 5 d00d; echo "---------------------"; done


    Note: The preceding script checks all of the storage devices and the list may be lengthy. This script is not applicable for local disks. 

    You see output similar to:

    /vmfs/devices/disks/naa.60060160455025009839a9ed4cfee011
    msdos
    78325 255 63 1258291200
    1 128 1258291124 251 0
    Checking offset found at 128:
    0110000 d00d c001 
    0110004
    1310000 f15e 2fab
    1310004
    0131001d 46 43 5f 53 68 61 72 65 64 00 45 76 65 72 5f 47 |old_VMFS3.......|
    0131002d 65 74 74 69 6e 67 5f 55 70 00 00 00 00 00 00 00 |................|
    ---------------------
    /vmfs/devices/disks/naa.6006016045502500c20a2b3ccecfe011
    gpt
    52216 255 63 838860800
    Checking offset found at 2048:
    0200000 d00d c001 
    0200004
    1400000 f15e 2fab 
    1400004

    0140001d 4a 55 50 48 41 4d 5f 53 52 4d 35 00 00 00 00 00 |new_VMFS5.......|
    0140002d 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
    ---------------------


    The preceding output has two example storage devices. The first example was created on an ESXi host prior to version 5 and it reports: 

    Checking offset found at 128.

    Where 128 is the beginning block.

    The second storage device was created on vSphere 5 or later and reports: 

    Checking offset found at 2048. 

    Note: In this example, you are using the second device, so the beginning of the partition is 2048.
     
  2. To get the end block for the partition, run this command:

    # partedUtil getUsableSectors /vmfs/devices/disks/naa.6006016045502500c20a2b3ccecfe011 

    You see this output:

    34 838860766

    Notes
     
    • If you do not see this output and you get an Unknown partition table on disk error, run this command to label the table as a GPT partition table:

      # partedUtil mklabel /vmfs/devices/disks/naa.6006016045502500c20a2b3ccecfe011 gpt

      Rerun the partedUtil getUsableSectors command. If you do not get the expected output of 2 numbers, run the partition type identification commands in the next bullet also.
       
    • If you do not see the specified output and instead receive an error such as "partition table invalid", "unable to satisfy all constraints on the partition", or similar, run this command:

      partedUtil setptbl /vmfs/devices/disks/naa.6006016045502500c20a2b3ccecfe011 gpt "1 2048 4123456 AA31E02A400F11DB9590000C2911D1B8 0"

      This creates a temporary partition so that the disk information can be read. Rerun the partedUtil getUsableSectors command. You should now see the expected output and be able to determine the correct last usable block.

      The partition type identifies the purpose of a partition, and may be represented by either a decimal identifier (for example, 251) or a GUID (for example, AA31E02A400F11DB9590000C2911D1B8).  Partitions created on ESXi 5.x and higher with the gpt disklabel must be specified using the GUID.
       
  3. Run this command to temporarily turn off Storage IO Control:

    # /etc/init.d/storageRM stop
     
  4. Run this command to set the correct values for the partition table:

    Note: Ensure that you use values appropriate to your environment in this command.

    # partedUtil setptbl /vmfs/devices/disks/naa.6006016045502500c20a2b3ccecfe011 gpt "1 2048 838860766 AA31E02A400F11DB9590000C2911D1B8 0"

    The end value in this command (838860766) is the last usable block reported by partedUtil getUsableSectors, so the end of the partition cannot be any higher. It is unknown whether this was the number used when the datastore was created, so you can try it and adjust if necessary. 
     
  5. Run this command to attempt to mount the VMFS datastore:

    vmkfstools -V


    Note: If the datastore mounts, the numbers are correct and you need not adjust the value.
     
  6. If the datastore does not mount, you may see a message in /var/log/vmkernel.log similar to:

    ... cpu0:44828)LVM: 2891: [naa.6006016045502500c20a2b3ccecfe011:1] Device expanded (actual size 838858719 blocks, stored size 838847992 blocks)


    In this case, add the offset value, minus one, to the stored size to get the actual end block.

    For example:

    838847992 + 2047 = 838850039

    Run the command with the new end value:

    # partedUtil setptbl /vmfs/devices/disks/naa.6006016045502500c20a2b3ccecfe011 gpt "1 2048 838850039 AA31E02A400F11DB9590000C2911D1B8 0"


    Now you have the correct partition. Run the VMFS rescan again:

    # vmkfstools -V

     
  7. Run this command to turn Storage IO Control back on:

    # /etc/init.d/storageRM start

After the datastore is successfully mounted on one host, you can expect that the same VMFS rescan command will mount the VMFS datastore when run on other hosts that have access to this LUN. 

Alternatively, you can run a full cluster rescan from the vCenter Server using the vSphere Client.
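
The end-block arithmetic from step 6 (stored size plus the starting offset, minus one) can be sketched as follows. This is an illustration only: end_block_from_log is a hypothetical helper, and it assumes the log line has the "stored size N blocks" wording shown in the example message.

```shell
# Sketch only: end_block_from_log is a hypothetical helper.
# $1 = vmkernel.log line containing "stored size N blocks"
# $2 = starting sector of the partition (128 or 2048 in this article)
end_block_from_log() {
    stored=$(printf '%s\n' "$1" | sed -n 's/.*stored size \([0-9][0-9]*\) blocks.*/\1/p')
    # Give up if the line does not carry a stored size.
    [ -n "$stored" ] || return 1
    # The partition covers sectors start .. start + stored - 1.
    echo $(( $stored + $2 - 1 ))
}

# Log line copied from the example in step 6.
log='... cpu0:44828)LVM: 2891: [naa.6006016045502500c20a2b3ccecfe011:1] Device expanded (actual size 838858719 blocks, stored size 838847992 blocks)'
end_block_from_log "$log" 2048    # prints 838850039
```

The same relationship explains the "actual size" in the message: with the first attempt ending at 838860766 and starting at 2048, the partition spans 838860766 - 2048 + 1 = 838858719 blocks.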
 Related Information
In VMware vSphere 5.x and later, newly created VMFS datastores use GPT partition tables instead of MBR partition tables.
 
The benefit of using GPT partition tables is that more than one copy of the partition table is kept on the LUN. If a physical Windows host has access to the LUN on the SAN, by default it automatically assigns a drive letter to the LUN, which destroys an MBR partition table. This type of problem does not occur with GPT, because vSphere uses the backup partition table.
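
The space reserved for the two GPT copies also accounts for the getUsableSectors values seen earlier (34 and 838860766 on a device of 838860800 sectors). The following is a rough sketch under stated assumptions: 512-byte sectors and the standard GPT layout of 128 partition entries of 128 bytes each. gpt_usable_range is a hypothetical helper name.

```shell
# Sketch only: gpt_usable_range is a hypothetical helper.
# Assumes 512-byte sectors and 128 GPT entries of 128 bytes each.
# $1 = total number of sectors on the device
gpt_usable_range() {
    entry_sectors=$(( 128 * 128 / 512 ))   # 32 sectors for the entry array
    # Sector 0 holds the protective MBR, sector 1 the primary GPT header,
    # and the primary entry array follows.
    first=$(( 2 + $entry_sectors ))
    # The backup header sits on the last sector (total - 1),
    # with the backup entry array immediately before it.
    last=$(( $1 - 2 - $entry_sectors ))
    echo "$first $last"
}

gpt_usable_range 838860800    # prints: 34 838860766
```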



To avoid lengthy delays, use this additional script to interrogate a single device:
 
disk="/vmfs/devices/disks/naa.....xxxxx"; offset="128 2048"; echo $disk; partedUtil getptbl $disk; { for i in `echo $offset`; do echo "Checking offset found at $i:"; hexdump -n4 -s $((0x100000+(512*$i))) $disk; hexdump -n4 -s $((0x1300000+(512*$i))) $disk; hexdump -C -n 128 -s $((0x130001d + (512*$i))) $disk; done; } | grep -B 1 -A 5 d00d; echo "---------------------"
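
The byte offsets that the script passes to hexdump can be reproduced with plain shell arithmetic, which makes the addresses in the sample output easier to read. This is an illustrative sketch only. vmfs_probe_offsets is a hypothetical helper name, and the constants are taken directly from the script above.

```shell
# Sketch only: vmfs_probe_offsets is a hypothetical helper.
# $1 = candidate starting sector (128 or 2048), at 512 bytes per sector.
# Prints the three byte offsets the script reads, in the hex form hexdump shows.
vmfs_probe_offsets() {
    start_bytes=$(( 512 * $1 ))
    printf '%07x %07x %08x\n' \
        $(( 0x100000 + $start_bytes )) \
        $(( 0x1300000 + $start_bytes )) \
        $(( 0x130001d + $start_bytes ))
}

vmfs_probe_offsets 128     # prints: 0110000 1310000 0131001d
vmfs_probe_offsets 2048    # prints: 0200000 1400000 0140001d
```

These values match the addresses in the sample output for the two example devices, so an address of 0200000 next to the d00d marker corresponds to a partition that begins at sector 2048.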
