Adams Bros Blog

30 May 2009

Recover LVM Volume Groups and Logical Volumes WITHOUT Backups

I recently had a misfortune, in that somehow my volume group meta-data got corrupted, and LVM would not enable the volume group. Essentially, I lost my LVM volume disk. This happened after I resized a volume, and had done a file system check before and after. So, I knew my data was still there.

I did an lvextend on my primary logical volume. Normally this is a routine task, but for some reason, things went very badly for me this time around. I did an "fsck -f" before and after extending the volume and the filesystem (with resize2fs). Everything checked out just fine, so I thought everything was done, and ready to reboot.

I proceeded to issue the three finger salute, and my system began the reboot process. Upon trying to boot up, I got errors about my root logical volume not being found. So, I booted up with a Gentoo live CD again, and got the following errors...

root@Microknoppix:~# pvscan
/dev/hda: open failed: Read-only file system
Attempt to close device '/dev/hda' which is not open.
Incorrect metadata area header checksum
Incorrect metadata area header checksum
Incorrect metadata area header checksum
WARNING: Volume Group s is not consistent
PV /dev/sdb5   VG bak             lvm2 [32.00 GB / 0    free]
PV /dev/sdb6   VG bak             lvm2 [266.09 GB / 0    free]
PV /dev/sda4   VG s               lvm2 [207.58 GB / 88.70 GB free]
PV /dev/sdf2                      lvm2 [59.88 GB]
PV /dev/sdf4                      lvm2 [88.89 GB]
Total: 5 [654.42 GB] / in use: 3 [505.66 GB] / in no VG: 2 [148.76 GB]

root@Microknoppix:~# vgchange -ay
Incorrect metadata area header checksum
Incorrect metadata area header checksum
1 logical volume(s) in volume group "bak" now active
Incorrect metadata area header checksum
Incorrect metadata area header checksum
Volume group "s" inconsistent
Incorrect metadata area header checksum
Incorrect metadata area header checksum
WARNING: Inconsistent metadata found for VG s - updating to use version 19
Incorrect metadata area header checksum
Automatic metadata correction failed

So, what to do? I didn't have a system-level backup (just data), because I hadn't gotten around to it yet after converting from Mac OS X. With Gentoo Linux, this is a bit devastating, because you have to recompile everything from source. That can take days, and sometimes it takes months to figure out all the settings you had, if you don't have a backup. To boot, I didn't even have a backup of '/etc/', where the LVM backups are stored, DOH!!! As a result, all the LVM backups were unavailable. This is all really sad, seeing as I'm an automatic backup freak, and can't stand it when I don't have system backups.

Now, one could go searching for the beginning of their logical volume, if they wanted to, and recover just that. But, that could be very painful, especially if your LV is not contiguous; which can easily happen if you have multiple drives in your volume group, and you have been resizing your volumes a few times.

Well, as it goes, my brother suggested I boot up with a Knoppix CD, to see if I could figure out how to fix the problem. I began running a few different commands. I started with pvs, which displays physical volume information.

root@Microknoppix:~# pvs -v

Scanning for physical volume names
Incorrect metadata area header checksum
Incorrect metadata area header checksum
WARNING: Volume Group s is not consistent
Incorrect metadata area header checksum
Incorrect metadata area header checksum
PV         VG   Fmt  Attr PSize   PFree  DevSize PV UUID
/dev/sda4  s    lvm2 a-   207.58G 88.70G 207.58G 8cYXSr-l35B-2HBg-V7YS-TWsb-rZ8L-C5EC7J
/dev/sdb5  bak  lvm2 a-    32.00G     0   32.00G Vx3gVW-YNoq-xLHt-rOaJ-2HHW-qBcj-nJSfmx
/dev/sdb6  bak  lvm2 a-   266.09G     0  266.09G o7Mi6k-lEsH-ndqb-QMxe-3t3Z-jTWS-qw9KKv

Next I did a "vgdisplay -v" dump, and then I ran into pvck by accident. It displays your physical volume information, and it happens to display the offsets of all of your LVM metadata backups (GREAT). I've heard that these are stored in a cycling manner, so the order in which they appear may not be worth paying attention to.

root@Microknoppix:/mnt/safe/trenta# pvck -d -v /dev/sda4
Scanning /dev/sda4
Incorrect metadata area header checksum
Found label on /dev/sda4, sector 1, type=LVM2 001
Found text metadata area: offset=4096, size=192512
Found LVM2 metadata record at offset=26624, size=2048, offset2=0 size2=0
Found LVM2 metadata record at offset=24576, size=2048, offset2=0 size2=0
Found LVM2 metadata record at offset=22528, size=2048, offset2=0 size2=0
Found LVM2 metadata record at offset=20480, size=2048, offset2=0 size2=0
Found LVM2 metadata record at offset=17920, size=2560, offset2=0 size2=0
Found LVM2 metadata record at offset=15360, size=2560, offset2=0 size2=0
Found LVM2 metadata record at offset=12800, size=2560, offset2=0 size2=0
Found LVM2 metadata record at offset=10752, size=2048, offset2=0 size2=0
Found LVM2 metadata record at offset=8704, size=2048, offset2=0 size2=0
Found LVM2 metadata record at offset=6656, size=2048, offset2=0 size2=0
Found LVM2 metadata record at offset=4608, size=2048, offset2=0 size2=0
Found LVM2 metadata record at offset=30720, size=165888, offset2=0 size2=0
Found text metadata area: offset=96646856704, size=183296
Incorrect metadata area header checksum
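A hex editor wants these offsets in hexadecimal, so it can help to convert them all in one go. The following is only a minimal sketch: the function name list_records is made up for illustration, and it assumes pvck prints its record lines in exactly the format shown above.

```shell
#!/bin/sh
# Hypothetical helper: read `pvck -d -v` output on stdin and print
# "decimal-offset hex-offset size" for every metadata record line.
list_records() {
    sed -n 's/.*metadata record at offset=\([0-9]*\), size=\([0-9]*\),.*/\1 \2/p' |
    while read -r offset size; do
        printf '%s 0x%X %s\n' "$offset" "$offset" "$size"
    done
}

# Example using two lines copied from the output above:
list_records <<'EOF'
Found LVM2 metadata record at offset=26624, size=2048, offset2=0 size2=0
Found LVM2 metadata record at offset=30720, size=165888, offset2=0 size2=0
EOF
# prints:
# 26624 0x6800 2048
# 30720 0x7800 165888
```

On a live system the same function could be fed directly, for example with "pvck -d -v /dev/sda4 2>&1 | list_records".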

At this point, I was becoming a little more cheery; I FINALLY had a chance at recovery, with no recompiling of my entire system. So, I checked whether Knoppix has a hex editor, and sure enough, there is one called hexedit. I ran "hexedit /dev/sda4", converted the offsets from the pvck output to hex, and went to those offsets by paging down in hexedit. The last record is at 30720, which equates to 0x7800. I started highlighting at that offset (Ctrl-Space), and selected the entire record up to the 0x0A0A newline bytes at the end of it. I copied (Esc-W), then pasted to a file (Esc-Y), and called it /path/lvm-metadata-30720.txt. I did this for every record that I thought might contain information I needed.

In order to know whether it's relevant data or not, you have to know what the last few LVM changes you made were. Things like the sizes of logical volumes, the size of the volume group, what physical volumes existed in the volume group, and things of that nature will all be helpful in recovery. For example, did you double the size of a logical volume, or reduce it, or whatever? Then, it's just a matter of diffing each version, to see when the important changes were made. My diff shows...
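If no hex editor is at hand, the same copy can be done with dd, as a couple of readers describe in the comments. This is a sketch under assumptions, not what I actually ran: extract_record is a hypothetical name, and the offset and size are whatever pvck reported for the record you want.

```shell
#!/bin/sh
# Hypothetical helper: copy SIZE bytes starting at byte OFFSET of DEVICE
# into OUTFILE. With bs=1, skip= and count= are plain byte counts; that is
# slow, but fine for records of a few KB. It only reads from the device.
extract_record() {
    device=$1 offset=$2 size=$3 outfile=$4
    dd if="$device" of="$outfile" bs=1 skip="$offset" count="$size" 2>/dev/null
}

# Real use, with an offset and size taken from the pvck output:
#   extract_record /dev/sda4 26624 2048 lvm-metadata-26624.txt

# Self-contained demonstration on a scratch file instead of a real disk:
scratch=$(mktemp)
printf 'XXXXXXXXseqno = 19\nYYYY' > "$scratch"
extract_record "$scratch" 8 11 "$scratch.out"
cat "$scratch.out"    # prints: seqno = 19
```

The size pvck reports can be larger than the text itself, so a dumped file may carry trailing bytes after the final newline that need trimming in an editor.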

root@Microknoppix:/mnt/safe/trenta# diff -u lvm-metadata-26624.txt lvm-metadata-28672.txt
--- lvm-metadata-26624.txt      2009-05-25 23:45:11.000000000 +0000
+++ lvm-metadata-28672.txt      2009-05-25 23:41:22.000000000 +0000
@@ -1,6 +1,6 @@
s {
id = "HNHKOr-RpyA-uMdz-tqhN-Z673-L2ej-qXikhF"
-seqno = 18
+seqno = 19
status = ["RESIZEABLE", "READ", "WRITE"]
extent_size = 8192
max_lv = 0
@@ -64,7 +64,7 @@

segment1 {
start_extent = 0
-extent_count = 7680
+extent_count = 15360

type = "striped"
stripe_count = 1       # linear
@@ -94,7 +94,7 @@
}
}
}
-# Generated by LVM2 version 2.02.36 (2008-04-29): Mon May 25 02:42:45 2009
+# Generated by LVM2 version 2.02.36 (2008-04-29): Mon May 25 02:44:25 2009

contents = "Text Format Volume Group"
version = 1
@@ -102,5 +102,5 @@
description = ""

creation_host = "tdamac"       # Linux tdamac 2.6.30-rc3-dirty #23 SMP Mon May 25 01:39:04 MDT 2009 x86_64
-creation_time = 1243240965     # Mon May 25 02:42:45 2009
+creation_time = 1243241065     # Mon May 25 02:44:25 2009

If you look carefully at my diff, you will see that the seqno changed from 18 to 19, and the size of the "segment" doubled (extent_count went from 7680 to 15360); that is what I had just done as part of my logical volume resize. The "vgchange -ay" command had previously tried to restore seqno=19, and failed. As it turned out, v19 of the meta-data backups was the version I needed. So, now it's just a matter of taking all of the data we have dumped, and recreating the LVM information. It may be worth doing a RAW dd dump of your partition before messing with it, but I didn't bother. I was fairly certain that my data was still there, and I knew that restoring a configuration would not wipe out the data, just the LVM meta-data.
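With a dozen dumped files, diffing them pair by pair gets tedious. A quicker first pass is to list the seqno and creation_time of each one. The sketch below assumes the dumps are plain LVM text metadata like the excerpts in the diff; summarize_dumps is a made-up name.

```shell
#!/bin/sh
# Hypothetical helper: print "seqno creation_time filename" for each dumped
# metadata file, sorted so the lowest seqno comes first.
summarize_dumps() {
    for f in "$@"; do
        seqno=$(sed -n 's/^[ \t]*seqno = \([0-9]*\).*/\1/p' "$f" | head -n 1)
        ctime=$(sed -n 's/^[ \t]*creation_time = \([0-9]*\).*/\1/p' "$f" | head -n 1)
        # A "?" marks a file where the field was not found (probably not metadata)
        printf '%s %s %s\n' "${seqno:-?}" "${ctime:-?}" "$f"
    done | sort -n
}

# Usage: summarize_dumps lvm-metadata-*.txt
```

The file with the highest seqno is the newest state LVM wrote, which is not necessarily the one you want back; the diff still has to confirm which change each version contains.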

So, finally I completely recreated everything by doing the following.  The UUID for the physical volume comes from the meta data file.  It's important that you choose the correct ID, or it won't match the LVM data for the vgcfgrestore.

root@Microknoppix:~# pvcreate -ff -u 8cYXSr-l35B-2HBg-V7YS-TWsb-rZ8L-C5EC7J \
  --restorefile /mnt/safe/trenta/lvm-metadata-28672.txt /dev/sda4
root@Microknoppix:~# vgcfgrestore -f /mnt/safe/trenta/lvm-metadata-28672.txt -v s
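To avoid mistyping the UUID passed to "pvcreate -u", it can be read straight out of the dumped file. This is a sketch, assuming the usual LVM text layout in which each entry under physical_volumes has an id line followed by a device line; pv_ids is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: print "uuid device" for every physical volume listed
# in the physical_volumes { ... } section of a dumped metadata file.
pv_ids() {
    awk '
        /physical_volumes[ \t]*[{]/ { in_pvs = 1; depth = 0 }
        in_pvs {
            if ($1 == "id")     { gsub(/"/, "", $3); id = $3 }
            if ($1 == "device") { gsub(/"/, "", $3); print id, $3 }
            # Track brace nesting so we stop at the end of the section
            depth += gsub(/[{]/, "{") - gsub(/[}]/, "}")
            if (depth <= 0) in_pvs = 0
        }
    ' "$1"
}

# Usage: pv_ids /mnt/safe/trenta/lvm-metadata-28672.txt
```

The device name in the file is only a hint recorded when the metadata was written, so compare the UUID with what "pvs -v" reports for the disk. As far as I know, vgcfgrestore also accepts -t (--test) for a dry run before the real restore.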

I am pretty certain that only the second command was needed, because the physical volume meta-data was fine, it was the volume group that was messed up. But, I was hacking my way through, so I issued both commands.

WARNING, WARNING, WARNING, if you do this wrong, and use the wrong meta-data information, you may overwrite some file data somewhere else on the disk.  This can happen when the extents and sizes are off, so the vgcfgrestore command restores the meta-data to the wrong parts of the disk.  Be sure you are able to pick the correct backup data to restore, or you very well may lose data.
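Given that risk, a raw image of the partition taken beforehand gives you a way back. I did not do this myself, so treat the following as a sketch only: image_device is a made-up name, the target path in the usage comment is hypothetical, and the target needs as much free space as the partition is large.

```shell
#!/bin/sh
# Hypothetical helper: copy DEVICE byte for byte into IMAGE, then verify
# the copy with cmp. Returns non-zero if the copy or the comparison fails.
image_device() {
    device=$1 image=$2
    dd if="$device" of="$image" bs=4194304 2>/dev/null || return 1
    cmp "$device" "$image"
}

# Usage, before running pvcreate or vgcfgrestore (path is an example only):
#   image_device /dev/sda4 /mnt/safe/sda4.img
# Writing the image back would be the same dd with if= and of= swapped.
```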

Now, after all of this, I think it may have been as easy as doing the following, but I am unsure.  I think what would happen with the following command is that it would ignore the locking failure, and restore the configuration regardless.

vgchange -ay --ignorelockingfailure

Either way, this information is useful in the event that you lose meta-data due to bad blocks or whatever.

Filed under: Linux, LVM
Comments (25)
  1. This happened to me just now…thanks for the blog post – it was very helpful.

    I found I *did* have to re-create the pv with pvcreate -ff – in fact the pv’s vg metadata was the problem, it seems.

    your “vgchange” command suggestion at the very end was tried but didn’t help.

    Thanks again for a very timely post.

  2. > your “vgchange” command suggestion at the very end
    > was tried but didn’t help.

    Oh, thanks for that, I wasn’t sure if that vgchange thing would work or not. I just happened to notice that in the man page after the fact.

    And you’re very welcome. After 2-4 hours of being very upset, I figured I should try and spare others the “pain”, LOL.

    Oh, and I noticed my pvcreate and vgcfgrestore commands were referring to different files, hehe. That would be bad if someone tried that. I will correct it.

  3. in my case the offset of the last record does not really point to metadata, and the offset of the previous record points to a nonexistent place on the disk. When I try with sdb, the size of the last record does not match. I just lost more than 600GB because something in hackintosh beta messed up my arrays. I feel like I will kill somebody for that. It’s so frustrating that words can’t describe this experience.

  4. (if you ever try hackintosh, PHYSICALLY UNPLUG ALL YOUR HARD DRIVES WITH DATA, or it can silently cross you and ruin them right after it boots up to the installer)

  5. 600G, ouch, that is very painful 🙁

    Don’t give up though, open the disk with a hexeditor, and see if you can find the configurations near the beginning of the disk.

    • Thanks a lot for this post. It helped a lot. I was able to recover my lost LVM meta.

      Just to add, by the way, you can do cat to your disk like cat /dev/sdb3 and search for your text meta data.

  6. You, sir, are my hero. Thanks to your great blog I’ve managed to get back 3.6TB of precious data. Thank you. It worked like a charm.

  7. Thanks, you saved me too 🙂

  8. Hi I need some help to restore data from hardware based clones/snapshots. I am using the following command in RHEL to change the UUID of the clone: pvchange --uuid /dev/mapper/mpath2 --config 'global{activation=0}' . I want to know if there is something similar in SLES 9 onwards for deactivating device-mapper interaction and changing the LVM metadata? Thanks, Salah.

  9. Salah,

    I have no idea actually. But, you could get out a disk hex editor, and do it that way. 😀

  10. Great article, Adams… helped me a lot in recovering a screwed-up PV here, apparently by the same cause (resizing the PV with pvresize and/or resizing the VG with “vgchange -s”). I didn’t have hexedit, so I dumped the whole PV metadata using dd with bs=1 skip=N count=M into lvm.cfg (where N and M are respectively the offset and the size shown by pvck), and then edited the resulting lvm.cfg to select the valid metadata to restore. Also, I had to use BOTH commands in the end (pvcreate AND vgcfgrestore), as vgcfgrestore alone complained about a metadata checksum error and refused to run. Hope this additional info is useful for someone else.

  11. YOU SAVED MY LIFE!
    well you know when you’re trying a better distro for your girlfriend’s notebook and the pclinuxos’ installer screws up your lvm configuration ,where there are all those pics, songs, movies, files she wants to keep, with just 1, ONE, click?
    and what if you don’t know how to fix the mess?
    now that is fixed, thanks to you, i’ve to talk to those brilliant minds who made that wonderful installer…
    good nite…

  12. Brilliant article! Saved a lot of sweat (even after sweating a ton). What I did was
    1. Get Ubuntu Live CD .iso with lvm support (the alternative version)
    2. Boot VMWare player with it (mount to cd/dvd) and f2 to bios and make it boot in fix mode
    3. pvck -v -d /dev/sda5 to get the offset to metadatas
    4. dd if=/dev/sda5 of=/metadata*.txt bs=1 skip= count=
    5. repeat step 4. for each metadata
    6. explore metadata with nano to see what went wrong and which was last working one
    7. vgcfgrestore -f /metadata.txt -v

    I couldn’t use hexdump so dd worked for me.
    The pvcreate didn’t work for me but apparently it wasn’t necessary.

    Big kiitos from Finland!

  13. Thanks a bunch. Really saved the day.

  14. This was exactly what I needed. Like the post above — pvcreate didn’t work for me – it said it couldn’t lock /dev/sda1 exclusively — after using hexedit to find metadata — vgcfgrestore -f file -v vgname —– pvs -v now shows the proper information and after reboot I’m back in business.

    thank you very much.

  15. Hi and thanks for your post.
    I’ve followed the instructions, using a Knoppix live CD to recover my 1 TB HDD. It went well, I could read the files, but Knoppix got very slow, and I wasn’t able
    I then rebooted Knoppix, to try and copy one file to a USB drive, but now I CAN’T SEE my disk, even using “fdisk -l”.

    More Details:
    Indeed, my HDD is a 1 TB HDD from an Iomega ix2 StorCenter NAS, configured in RAID 1 with another 1 TB HDD. One of the disks went down and we were not able to access the files over the network interface.
    1-I took the working HDD and connected it to a laptop using SATA-to-USB
    2-I first tried to access the HDD in degraded RAID 1, but it kept telling me “unkown file system linux_RAID_member”
    3-The “file -s” command allowed me to notice that it was a Linux LVM device.
    4- I then changed the device ID from Linux (ID 83) to Linux LVM (ID 8e)
    5- Then I followed your instructions to restore the metadata and VG configuration

    It all worked fine until the problem mentioned above.

    ANY SUGGESTIONS? Thoughts?

  16. Well, really nice (and genius) post. I saw it too late, but anyway I wouldn’t have had the time to dump all my existing data to other storage (and didn’t have enough spare storage to copy the data to), but I’ll remember it for next time.
    XenServer doesn’t make backup configs (/etc/lvm/archive) of your LVM, and if you make a mistake… Due to a handful of mistakes and crashed backups we lost 14 days of data (happily over Christmas, when not many people were working here). Neither the night nor the days after were very funny.

    Regards

  17. Another system’s data saved! Thank you!

  18. Accidently deleted a Logical Volume in the Debian Installer. Chose storage/datastore instead of vgroot/datastore, no questions asked, simply deleted. Live System. No Backups, no Archives.

    Well Sir, thank you very much. You might not know how many tears you have saved on the internet with this post.

  19. YES! Thank you – I just had the same problem. I think it’s an issue with me multibooting multiple distributions and inconsistent use of lvmetad. I’d seen a few invalid “missing pv” warnings lately but things worked fine. Suddenly during a reboot, I see “removing these pesky missing PVs – you won’t be needing them right?” Funny as it included PVs I’d safely removed about a month ago. And a blank screen as it couldn’t figure out my vol group. Unfortunately I had a lot more PVs and developed a quick script to sort out all my backups. No hex editor or binary calculators are necessary with head -c/tail -c:

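    # pvck -d -v /dev/sda2 > backups.txt

    # Turn every "Found LVM2 metadata record at offset=N, size=M, ..." line
    # into a tail/head command that writes that record to ./N.txt.
    grep 'metadata record at offset=' backups.txt |
    while IFS=',' read -r offset size ignored
    do
    offset=${offset##*offset=}   # keep only the number after "offset="
    size=${size##*size=}         # keep only the number after "size="
    # tail -c +K starts at byte K counting from 1, hence offset + 1
    echo "tail -c +$((offset + 1)) /dev/sda2 | head -c $size > $offset.txt"
    done > save_backups.sh

    # Should have a script to save all backups as ./$offset.txt. Inspect it and run.

    bash ./save_backups.sh

    # END
    # This should extract all backups from your raw device using tail/head.
    # Then it's easier to dig for the correct one.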

  20. Thanks, just thanks. I found a disk from 2009 with data I’d long given up hope of recovering, I got the lot back!

  21. Thanks for this discussion – reading this let me know that recovery was a possibility, thus saving my weekend. Much appreciated.

  22. Hi, I am using an Iomega StorCenter IX2 with 500GB RAID1. One day, suddenly, when the NAS box booted up, there was no data, only the default folders. I can access the HDD through the NAS but the previous data is not there. I don’t know how it happened. I took out the HDD, connected it to an Ubuntu PC and ran the commands. Please find the results below. If anyone can advise me how to get back the data.

    500gb DISK2

    root@basudev-UBUNTU:~# pvs -v
    PV VG Fmt Attr PSize PFree DevSize PV UUID
    /dev/sda5 ubuntu-vg lvm2 a-- 297.61g 0 297.61g qzhOzn-YgK6-CFCH-d2oG-2dBi-PBZp-eSoOfY
    /dev/sdb2 md1_vg lvm2 a-- 464.79g 0 464.79g kKyZKa-h4dJ-zm6T-t8vi-93c5-Cl9Q-BmO092

    root@basudev-UBUNTU:~# vgdisplay -v

    --- Volume group ---
    VG Name md1_vg
    System ID
    Format lvm2
    Metadata Areas 1
    Metadata Sequence No 2
    VG Access read/write
    VG Status resizable
    MAX LV 0
    Cur LV 1
    Open LV 1
    Max PV 0
    Cur PV 1
    Act PV 1
    VG Size 464.79 GiB
    PE Size 2.00 MiB
    Total PE 237971
    Alloc PE / Size 237971 / 464.79 GiB
    Free PE / Size 0 / 0
    VG UUID niI7Ih-8J1c-DUNj-ujxu-BURc-EzLG-zw3Ncj

    root@basudev-UBUNTU:~# lvdisplay

    --- Logical volume ---
    LV Path /dev/md1_vg/md1vol1
    LV Name md1vol1
    VG Name md1_vg
    LV UUID AhwrYy-Uutm-haZn-MZtE-g1OI-sTG8-XjeQvK
    LV Write Access read/write
    LV Creation host, time ,
    LV Status available
    # open 1
    LV Size 464.79 GiB
    Current LE 237971
    Segments 1
    Allocation inherit
    Read ahead sectors auto
    - currently set to 256
    Block device 252:1

    root@basudev-UBUNTU:~# pvck -d -v /dev/sdb2
    Scanning /dev/sdb2
    Found label on /dev/sdb2, sector 1, type=LVM2 001
    Found text metadata area: offset=4096, size=192512

    root@basudev-UBUNTU:~# hexedit /dev/sdb2

    00000200 4C 41 42 45 4C 4F 4E 45 01 00 00 00 00 00 00 00 LABELONE........
    00000210 3C C3 D7 9B 20 00 00 00 4C 56 4D 32 20 30 30 31 <... ...LVM2 001
    00000220 6B 4B 79 5A 4B 61 68 34 64 4A 7A 6D 36 54 74 38 kKyZKah4dJzm6Tt8
    00000230 76 69 39 33 63 35 43 6C 39 51 42 6D 4F 30 39 32 vi93c5Cl9QBmO092
    00000240 00 00 7B 32 74 00 00 00 00 00 03 00 00 00 00 00 ..{2t...........

    root@basudev-UBUNTU:~# vgchange -ay
    2 logical volume(s) in volume group "ubuntu-vg" now active
    1 logical volume(s) in volume group "md1_vg" now active
