Friday, November 28, 2008

Solaris 10: UFS to ZFS boot with Live Upgrade

Solaris 10 10/08 (Update 6) brings ZFS root file systems to production-grade Solaris systems.

OpenSolaris stole the thunder of ZFS booting a while ago now. But I think this is still a major milestone, as it's good old Solaris 10 that is used in production systems and supports over 10,000 applications that businesses rely on.

Sun has some excellent documentation on upgrading systems. The document I used for my upgrade was:
820-5238 - Solaris 10 10/08 Installation Guide: Solaris Live Upgrade and Upgrade Planning

This post is a record of my experience upgrading:
Solaris 10 Update 5 with UFS root
to
Solaris 10 Update 6 with ZFS root

There are two major steps involved here.
1. Upgrade the existing system to Solaris 10 Update 6.
2. Migrate the upgraded system to a zfs pool.

The first part is standard Solaris upgrade stuff, so it should work without problems.
The second part is well documented in the guide above, so it should also work without problems.
Let's see how it goes...

I'm using a VirtualBox environment:
Host OS: OpenSolaris 2008.11 RC2
Guest OS: Solaris 10 Update 5 (x86)
The guest system is configured with two 7GB disks.
c1t0d0 - First disk, this currently holds Solaris10u5 on UFS
c1t1d0 - Second disk, this is empty and ready to hold the secondary boot environment (BE).


Phase 1 - Upgrade to Solaris 10 Update 6

Live Upgrade is the key tool in making all this work. It requires a minimum set of patches.
You can obtain the patches from http://sunsolve.sun.com
For the Solaris 10 Update 6 release search for the 206844 document.

Once the patch requirements are met, you need to install the Live Upgrade packages from the Solaris release you are upgrading to.
The packages to upgrade/install are SUNWlucfg, SUNWlur and SUNWluu.
If you mount up the Solaris 10 Update 6 DVD you can install or upgrade them easily like so:
root@solaris:~# cd /cdrom/cdrom0/Solaris_10/Tools/Installers
root@solaris:~# ./liveupgrade20 -noconsole -nodisplay
Ok, we now have the latest patches and packages for Live Upgrade.
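If you want to double-check that all three packages really are registered before going further, here is a rough sketch. It assumes bash and relies on pkginfo -q, which prints nothing and only sets its exit status. The function name check_lu_packages is made up for this example.

```shell
# Hypothetical helper (not part of Live Upgrade): list any of the three
# Live Upgrade packages that pkginfo does not know about.
# Returns non-zero if at least one package is missing.
check_lu_packages() {
    missing=0
    for pkg in SUNWlucfg SUNWlur SUNWluu; do
        # pkginfo -q is silent; its exit status says whether the package is installed
        pkginfo -q "$pkg" || { echo "missing: $pkg"; missing=1; }
    done
    return "$missing"
}
```

Running check_lu_packages && echo "Live Upgrade packages present" gives a quick yes/no answer. Note that it only checks that the packages exist, not which release they came from.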

Create the secondary Boot Environment (BE):

First set up the partitions on the second disk to be identical to those on the current system disk.
root@solaris:~# prtvtoc /dev/rdsk/c1t0d0s2 | fmthard -s - /dev/rdsk/c1t1d0s2
fmthard: New volume table of contents now in place.
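To confirm that the copy worked, the two slice tables can be compared. This is only a sketch, assuming bash. It relies on prtvtoc prefixing its header and comment lines with '*' (the header contains the device name, so it would always differ). The function name same_vtoc is my own.

```shell
# Hypothetical helper: succeed only if two disks carry the same slice table.
# Lines starting with '*' are prtvtoc comments (device name, geometry notes)
# and are dropped before the comparison.
same_vtoc() {
    a=$(prtvtoc "$1" | grep -v '^\*')
    b=$(prtvtoc "$2" | grep -v '^\*')
    [ "$a" = "$b" ]
}
```

For the two disks used here that would be: same_vtoc /dev/rdsk/c1t0d0s2 /dev/rdsk/c1t1d0s2 && echo "slice tables match"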
Then create the BE on the second disk.
root@solaris:~# lucreate -c solarisA -n solarisB -m /:/dev/dsk/c1t1d0s0:ufs -m -:/dev/dsk/c1t1d0s1:swap
Great, now we have a primary BE (solarisA) and a secondary BE (solarisB).

Note: lucreate can fail on some systems because of a bug in /sbin/biosdev.
This happened to me, and I followed the workaround here to fix it.


We can now upgrade the secondary BE to Solaris 10 Update 6.
To perform the upgrade, mount a copy of the Solaris 10 Update 6 DVD, then:
root@solaris:~# luupgrade -u -n solarisB -s /cdrom/cdrom0
The first attempt stopped with this message:


This system contains only a single GRUB menu for all boot environments. To
enhance reliability and improve the user experience, live upgrade requires
you to run a one time conversion script to migrate the system to multiple
redundant GRUB menus. This is a one time procedure and you will not be
required to run this script on subsequent invocations of Live Upgrade
commands. To run this script invoke:

/usr/lib/lu/lux86menu_propagate /path/to/new/Solaris/install/image OR
/path/to/LiveUpgrade/patch

where /path/to/new/Solaris/install/image is an absolute
path to the Solaris media or netinstall image from which you installed the
Live Upgrade packages and /path/to/LiveUpgrade/patch is an absolute path
to the Live Upgrade patch from which this Live Upgrade script was patched
into the system.

root@solaris:~#
Sounds fair enough to me. Follow the above instructions to propagate the GRUB menu to all BEs.
root@solaris:~# /usr/lib/lu/lux86menu_propagate /cdrom/cdrom0/
Then try the upgrade again...
root@solaris:~# luupgrade -u -n solarisB -s /cdrom/cdrom0/
Success. We now have a Solaris 10 Update 6 BE available.
Check its status with lustatus:
root@solaris:~# lustatus
Boot Environment           Is       Active Active    Can    Copy
Name                       Complete Now    On Reboot Delete Status
-------------------------- -------- ------ --------- ------ ----------
solarisA                   yes      yes    yes       no     -
solarisB                   yes      no     no        yes    -
root@solaris:~#
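The column that luactivate changes is "Active On Reboot", the fourth field of each data row. Here is a small sketch for pulling out the BE that will come up after the next reboot. It assumes BE names contain no spaces. The function name be_on_reboot is just an illustration.

```shell
# Hypothetical helper: read lustatus output on stdin and print the name of
# the BE whose "Active On Reboot" column (4th field) is "yes".
# The first three lines are the two header rows and the dashed rule.
be_on_reboot() {
    awk 'NR > 3 && $4 == "yes" { print $1 }'
}
```

On the live system that would be lustatus | be_on_reboot. With the table above it prints solarisA, and after luactivate solarisB it should print solarisB instead.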
Now activate it and reboot.
root@solaris:~# luactivate solarisB
root@solaris:~# init 6
Ok, now we have rebooted and are in the Solaris 10 Update 6 BE.
To prove it:
root@solaris:~# cat /etc/release
Solaris 10 10/08 s10x_u6wos_07b X86
Copyright 2008 Sun Microsystems, Inc. All Rights Reserved.
Use is subject to license terms.
Assembled 27 October 2008
Great, this completes the first phase.

Phase 2 - Migrate to a ZFS root file system

Now we can delete the old Solaris 10 Update 5 BE to make room for our new ZFS root system.
root@solaris:~# lustatus
Boot Environment           Is       Active Active    Can    Copy
Name                       Complete Now    On Reboot Delete Status
-------------------------- -------- ------ --------- ------ ----------
solarisA                   yes      no     no        yes    -
solarisB                   yes      yes    yes       no     -
root@solaris:~#
root@solaris:~# ludelete solarisA
System has findroot enabled GRUB
Checking if last BE on any disk...
ERROR: Last BE on disk
ERROR: This boot environment is the last BE on the above disk.
ERROR: Deleting this BE may make it impossible to boot from this disk.
ERROR: However you may still boot solaris if you have BE(s) on other disks.
ERROR: You *may* have to change boot-device order in the BIOS to accomplish this.
ERROR: If you still want to delete this BE , please use the force option (-f).
Unable to delete boot environment.
root@solaris:~#
We do have another BE on another disk, so just proceed with a forced delete.
root@solaris:~# ludelete -f solarisA
Success. Now we have cleared out the old BE:
root@solaris:~# lustatus
Boot Environment           Is       Active Active    Can    Copy
Name                       Complete Now    On Reboot Delete Status
-------------------------- -------- ------ --------- ------ ----------
solarisB                   yes      yes    yes       no     -
root@solaris:~#
Note: If you reboot now without changing the boot disk order, the BIOS will still load GRUB from the first disk (ludelete does not trash the MBR). This GRUB will show menu entries for both the original (solarisA) and new (solarisB) BEs. Of course, only the solarisB BE will actually be bootable. You could instead change the BIOS to boot from the second disk, in which case you would boot from the GRUB installed there, which is up to date and only shows the solarisB BE.
If you choose to use Live Upgrade on spare slices on a single disk, you avoid this situation.

We now create the ZFS pool on the first disk to be used as our ZFS root. This must be done manually; Live Upgrade does not make the pool for you.

This zpool must be made from a slice, not a whole disk. A whole disk zpool would use the new EFI disk label but only SMI labelled disks can be booted from.
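A quick guard against handing zpool create a whole disk by accident could look like the sketch below. It only inspects the device name (c1t0d0s0 is a slice, c1t0d0 is a whole disk) and does not look at the disk label itself. The function name is_slice is invented for this example.

```shell
# Hypothetical helper: succeed if the device path names a slice,
# i.e. ends in s<digit(s)> as in c1t0d0s0, rather than a whole disk (c1t0d0).
# This is a name check only; it does not read the label on the disk.
is_slice() {
    case "$1" in
        *s[0-9]|*s[0-9][0-9]) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example: is_slice /dev/dsk/c1t0d0s0 || echo "refusing: not a slice"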

I wanted to use all the space on the disk for my ZFS root pool, so I used the format command to repartition the first disk into one big slice. It looked like this:

Part      Tag    Flag     Cylinders        Size            Blocks
  0       root    wm       1 - 3579        6.99GB    (3579/0/0) 14659584
  1 unassigned    wu       0               0         (0/0/0)           0
  2     backup    wm       0 - 3579        6.99GB    (3580/0/0) 14663680
  3 unassigned    wu       0               0         (0/0/0)           0
  4 unassigned    wu       0               0         (0/0/0)           0
  5 unassigned    wu       0               0         (0/0/0)           0
  6 unassigned    wu       0               0         (0/0/0)           0
  7 unassigned    wu       0               0         (0/0/0)           0
  8       boot    wu       0 -    0        2.00MB    (1/0/0)        4096
  9 unassigned    wu       0               0         (0/0/0)           0
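The numbers in this table can be sanity-checked with a little shell arithmetic. This is a sketch using values copied from the table. The 512-byte block size is my assumption (it is what makes one 4096-block cylinder come out at the 2.00MB shown for the boot slice).

```shell
# Values copied from the slice table above.
s0_cyls=3579; s0_blocks=14659584    # slice 0 (root)
s2_cyls=3580; s2_blocks=14663680    # slice 2 (backup, the whole disk)

# Blocks per cylinder implied by slice 0.
per_cyl=$(( s0_blocks / s0_cyls ))
echo "blocks per cylinder: $per_cyl"

# Slice 2 is one cylinder bigger than slice 0; that cylinder is the boot slice.
echo "slice 2 blocks: $(( s2_cyls * per_cyl )) (table says $s2_blocks)"

# Size of slice 0 in bytes, assuming 512-byte blocks.
echo "slice 0 bytes: $(( s0_blocks * 512 ))"
```

So slice 0 really is everything except cylinder 0, which slice 8 (boot) keeps for itself.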

Then create the zpool:
root@solaris:~# zpool create rpool /dev/dsk/c1t0d0s0
root@solaris:~#
root@solaris:~# zpool list
NAME    SIZE   USED  AVAIL    CAP  HEALTH  ALTROOT
rpool  6.94G    94K  6.94G     0%  ONLINE  -
root@solaris:~#
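If you script this step, it is worth checking the pool health before pointing lucreate at it. Here is a sketch, assuming your zpool supports the -H (no header) and -o (select columns) options of zpool list. The function name pool_online is made up.

```shell
# Hypothetical helper: succeed only if the named pool reports ONLINE.
# -H drops the header line; -o health prints just the HEALTH column.
pool_online() {
    [ "$(zpool list -H -o health "$1")" = "ONLINE" ]
}
```

For example: pool_online rpool || echo "rpool is not healthy, stop here"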
Now use live upgrade to migrate the current UFS root file system to the zpool:
root@solaris:~# lucreate -n new-zfsBE -p rpool
Excellent. We now have a ZFS BE available!
Check it:
root@solaris:~# lustatus
Boot Environment           Is       Active Active    Can    Copy
Name                       Complete Now    On Reboot Delete Status
-------------------------- -------- ------ --------- ------ ----------
solarisB                   yes      yes    yes       no     -
new-zfsBE                  yes      no     no        yes    -
root@solaris:~#
Now to activate it and reboot.
root@solaris:~# luactivate new-zfsBE
root@solaris:~# init 6
And that's it! We now have a Solaris 10 system running on ZFS.

What now?

Well, if never having to fsck your system again, never having to deal with the SVM meta commands, and being protected from silent data corruption are not tangible enough benefits for you... then here's a taste of ZFS snapshots.

Stand back, this is how easy it is:
root@solaris:~# lucreate -n monday-zfsBE
Five seconds later we have a bootable snapshot of the current system taking up almost zero disk space.
root@solaris:~# lustatus
Boot Environment           Is       Active Active    Can    Copy
Name                       Complete Now    On Reboot Delete Status
-------------------------- -------- ------ --------- ------ ----------
solarisB                   yes      no     no        yes    -
new-zfsBE                  yes      yes    yes       no     -
monday-zfsBE               yes      no     no        yes    -
root@solaris:~#
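Since a snapshot BE is this cheap, you could name one after the day it was taken, as monday-zfsBE hints. This is a sketch of generating such a name, assuming bash. The naming scheme and the function name be_name_for_today are my own and not anything Live Upgrade requires.

```shell
# Hypothetical helper: build a BE name such as "monday-zfsBE" from today's
# weekday. LC_ALL=C forces English day names regardless of system locale.
be_name_for_today() {
    day=$(LC_ALL=C date +%A | tr 'A-Z' 'a-z')
    echo "${day}-zfsBE"
}
```

It would be used as lucreate -n "$(be_name_for_today)". lucreate will refuse a name that already exists, so last week's BE of the same name has to be deleted first.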



7 Comments:

Anonymous Anonymous said...

We've had a core dump on 2 systems after running the live upgrade to ZFS. We upgraded from Solaris 8 to 10 with no problems. But then upgrading the Boot Environment from 10 to 10 ZFS core dumps when attempting to boot off of the ZFS environment.

We had 2 other test systems that worked fine, but those were clean installs from a Jumpserver.

Any ideas?

March 5, 2009 at 10:55 AM  
Anonymous Anonymous said...

Oh, to add/clarify, the 2 systems that worked fine were clean Solaris 8 jumpstart installs that were liveupgraded to 10 and then to 10 ZFS.

March 5, 2009 at 11:02 AM  
Blogger Aidan said...

I would first enable verbose boot logging to try to pinpoint where it panics. That should give you more to go on.

You could also try re-creating the BE and double checking all the output from the lucreate command, there's a bunch of messages that could be overlooked.

Also did you include a swap space on the new BE? If not specified it may be trying to share swap with the old UFS BE and not like it.

Any ancient settings in the /etc/system file that might not be relevant with Solaris 10?

Just some ideas, let me know how you get on.

March 5, 2009 at 3:32 PM  
Anonymous Anonymous said...

This is really a great post. It's so much easier to follow this than to dig through all of the documentation on Sun's site. You did a great job explaining each step, but it's also straightforward.

ZFS is a lifesaver if you run Solaris under VMWare! No more mucking around in failsafe after a VM host dies.

April 14, 2009 at 8:56 AM  
Blogger Aidan said...

Cool, glad it came in handy.

April 16, 2009 at 12:22 AM  
