I recently installed a new HP Microserver Gen8 with Debian Jessie and a software RAID-1 (because the integrated B120i controller is not a "real" RAID controller and is not supported by Linux).
The initial installation put grub2 only on the first disk, so I also wanted to install grub2 on the second drive (/dev/sdb), so that the server can boot from either disk in case one fails. So I tried grub-install on sdb:
# grub-install /dev/sdb
Then I shut down the server, removed the first drive and booted. But nothing: the BIOS skipped the boot from the local hard drive and moved on in the boot order to PXE (network boot). So no boot loader was found on the remaining drive.
I booted the server again with both drives active and ran dpkg-reconfigure grub-pc. Some warnings showed up:
# dpkg-reconfigure grub-pc
Replacing config file /etc/default/grub with new version
Installing for i386-pc platform.
grub-install: warning: Couldn't find physical volume `(null)'. Some modules may be missing from core image..
Installation finished. No error reported.
Installing for i386-pc platform.
grub-install: warning: Couldn't find physical volume `(null)'. Some modules may be missing from core image..
Installation finished. No error reported.
Generating grub configuration file ...
/usr/sbin/grub-probe: warning: Couldn't find physical volume `(null)'. Some modules may be missing from core image..
Found linux image: /boot/vmlinuz-3.16.0-4-amd64
Found initrd image: /boot/initrd.img-3.16.0-4-amd64
/usr/sbin/grub-probe: warning: Couldn't find physical volume `(null)'. Some modules may be missing from core image..
done
I was wondering about those warnings but carried on because of the message that no errors were reported and the installation finished. But my boot test with only the second drive attached failed again.
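As an aside: the list of devices that dpkg-reconfigure grub-pc (re-)installs to is stored in debconf. A minimal, hedged sketch of how to inspect and preseed that list non-interactively (I'm using the plain /dev/sdX names here for readability; Debian normally stores the longer /dev/disk/by-id paths):
# debconf-show grub-pc | grep install_devices
# echo "grub-pc grub-pc/install_devices multiselect /dev/sda, /dev/sdb" | debconf-set-selections
# dpkg-reconfigure -f noninteractive grub-pc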
Back into the system with both drives, I checked out the raid status and found this:
# cat /proc/mdstat
Personalities : [raid1]
md3 : active raid1 sda6[0]
470674432 blocks super 1.2 [2/1] [U_]
bitmap: 2/4 pages [8KB], 65536KB chunk
md2 : active raid1 sda5[0]
3903488 blocks super 1.2 [2/1] [U_]
md1 : active (auto-read-only) raid1 sda2[0] sdb2[1]
3904512 blocks super 1.2 [2/2] [UU]
md0 : active raid1 sda1[0]
9756672 blocks super 1.2 [2/1] [U_]
unused devices: <none>
Now it all made sense. The second drive was missing from almost all of the arrays (I assume that I rebooted the system too quickly after the installation finished, so mdadm didn't have enough time to finish the RAID build, which caused this problem). Hence the warning "Couldn't find physical volume".
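A quick way to confirm such a diagnosis for a single array is to ask mdadm directly; --detail shows which member slots are marked "removed" and --examine reads the md superblock of the partition itself (commands only, output omitted):
# mdadm --detail /dev/md0
# mdadm --examine /dev/sdb1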
I manually rebuilt the raid with mdadm commands and waited until the raid recovery finished:
# mdadm --add /dev/md0 /dev/sdb1
mdadm: added /dev/sdb1
# cat /proc/mdstat
Personalities : [raid1]
md3 : active raid1 sda6[0]
470674432 blocks super 1.2 [2/1] [U_]
bitmap: 2/4 pages [8KB], 65536KB chunk
md2 : active raid1 sda5[0]
3903488 blocks super 1.2 [2/1] [U_]
md1 : active (auto-read-only) raid1 sda2[0] sdb2[1]
3904512 blocks super 1.2 [2/2] [UU]
md0 : active raid1 sdb1[2] sda1[0]
9756672 blocks super 1.2 [2/1] [U_]
[===>.................] recovery = 15.5% (1521792/9756672) finish=0.7min speed=190224K/sec
unused devices: <none>
# mdadm --add /dev/md2 /dev/sdb5
mdadm: added /dev/sdb5
# mdadm --add /dev/md3 /dev/sdb6
mdadm: re-added /dev/sdb6
# cat /proc/mdstat
Personalities : [raid1]
md3 : active raid1 sdb6[1] sda6[0]
470674432 blocks super 1.2 [2/2] [UU]
bitmap: 0/4 pages [0KB], 65536KB chunk
md2 : active raid1 sdb5[2] sda5[0]
3903488 blocks super 1.2 [2/1] [U_]
[=============>.......] recovery = 67.8% (2649664/3903488) finish=0.1min speed=176644K/sec
md1 : active (auto-read-only) raid1 sda2[0] sdb2[1]
3904512 blocks super 1.2 [2/2] [UU]
md0 : active raid1 sdb1[2] sda1[0]
9756672 blocks super 1.2 [2/2] [UU]
unused devices: <none>
# cat /proc/mdstat
Personalities : [raid1]
md3 : active raid1 sdb6[1] sda6[0]
470674432 blocks super 1.2 [2/2] [UU]
bitmap: 0/4 pages [0KB], 65536KB chunk
md2 : active raid1 sdb5[2] sda5[0]
3903488 blocks super 1.2 [2/2] [UU]
md1 : active (auto-read-only) raid1 sda2[0] sdb2[1]
3904512 blocks super 1.2 [2/2] [UU]
md0 : active raid1 sdb1[2] sda1[0]
9756672 blocks super 1.2 [2/2] [UU]
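Instead of re-running cat /proc/mdstat by hand, mdadm can also block until the resync/recovery of an array is finished; a small sketch (the loop over the array names is just my shorthand):
# for md in md0 md2 md3 ; do mdadm --wait /dev/$md ; done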
Now that the RAID-1 arrays have recovered and all members are seen by the OS, I re-installed grub2 on both drives:
# for disk in sd{a,b} ; do grub-install --recheck /dev/$disk ; done
Installing for i386-pc platform.
Installation finished. No error reported.
Installing for i386-pc platform.
Installation finished. No error reported.
Looked much better this time!
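If you want a sanity check before pulling a drive: the first sector of a disk that grub-install has written to contains the string "GRUB" in its boot code, so something like the following should print GRUB for both disks (a rough check, not a full boot test):
# dd if=/dev/sda bs=512 count=1 2>/dev/null | strings | grep GRUB
# dd if=/dev/sdb bs=512 count=1 2>/dev/null | strings | grep GRUB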
After shutting down the server, I removed the first drive, booted the server and this time it worked: the BIOS found the grub2 boot loader on the second drive, booted from it and the OS came up fully working.
Buschmann wrote on Mar 10th, 2017:
Thank you very much for sharing this. Helped me a lot with a similar issue where one HDD was somehow removed from the SW RAID. Best greetings.