Author Topic: Creating RAID5 on LS-QVLB10 fails  (Read 2389 times)

peo

  • Calf
  • *
  • Posts: 10
Creating RAID5 on LS-QVLB10 fails
« on: December 31, 2020, 09:08:36 AM »
Because of previous failed attempts to create a RAID5 array on my LS-QVLB10, I decided to go for a full erase (from somewhere in the UI). It took about three weeks of 100% activity on the disks, but now I at least know these disks should be reliable for continued use (or have had their lives shortened by the 4x "write 0", "write 1", "write 0", "write 1" cycle).

I get the exact same problem when trying to create the array now as I did before erasing (and I even updated to the recently released firmware version before trying to create the array):

Step 0: Select RAID type and disks to include, confirm creation (OK + confirmation code)
http://tech.webit.nu/wp-content/uploads/2020/12/qvl-00-create-raid.png

Step 1: Progress 50%, "Failed to operate RAID array. Please resubmit after restart."
http://tech.webit.nu/wp-content/uploads/2020/12/qvl-01-failed-to-operate-raid.png

Step 2: Logged in again; "Array 1" is listed in the "Disks" section, but it is also listed under "RAID Array" as "Not configured".
I can (and did, once) reconfigure the RAID5 on the same devices as before.
http://tech.webit.nu/wp-content/uploads/2020/12/qvl-02-disk-raid-status-after.png

Step 3: NAS-Navigator shows both "unformatted" and "resyncing" (without indication of progress)
http://tech.webit.nu/wp-content/uploads/2020/12/qvl-04-nas-navigator.png

LED status: RED long short short short short
according to http://en.faq.buffalo-global.com/app/answers/detail/a_id/12769
"I14   The RAID array is being checked."

I cannot get into the NAS via Telnet or SSH to see what's going on (this is "unsupported", but it was possible to patch in before the death of buffalo.nas-central.org).

Should I just lean back and wait a week or two until the RAID check is done (which should be instant, since the four drives are empty apart from the NAS OS partitions)?

/PeO

1000001101000

  • Debian Wizard
  • Big Bull
  • *****
  • Posts: 1128
  • There's no problem so bad you cannot make it worse
Re: Creating RAID5 on LS-QVLB10 fails
« Reply #1 on: December 31, 2020, 09:31:22 AM »
You can get telnet access with ACP_Commander:
https://github.com/1000001101000/acp-commander
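For reference, the typical invocation looks something like this (the IP address and admin password are placeholders, and the "-o"/"-s" options are the ones discussed further down the thread):
Code: [Select]
# Open telnet on the NAS (assumes it answers on 192.168.0.10 with web admin password AdminPassword)
java -jar acp_commander.jar -t 192.168.0.10 -pw AdminPassword -o
# Or drop into the built-in "fake shell" instead of telnet
java -jar acp_commander.jar -t 192.168.0.10 -pw AdminPassword -s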

It might be failing at that step because one of the drives is having trouble (bad sectors, etc.). I'd connect them to a PC and check them out before continuing, if you haven't already.

peo

  • Calf
  • *
  • Posts: 10
Re: Creating RAID5 on LS-QVLB10 fails
« Reply #2 on: December 31, 2020, 10:12:27 AM »
I know about ACP Commander, but since the add-ons for this model are no longer available (I have archived the needed files for the LS220), it won't work...


peo

  • Calf
  • *
  • Posts: 10
Re: Creating RAID5 on LS-QVLB10 fails
« Reply #3 on: December 31, 2020, 10:37:28 AM »
"Fake shell" with ACP Commander worked:
Code: [Select]
/>cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid6] [raid5] [raid4]
md2 : active raid5 sdd6[3] sdc6[2] sdb6[1] sda6[0]
      23393060352 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/4] [UUUU]
      [>....................]  resync =  3.2% (250267008/7797686784) finish=6399.4min speed=19656K/sec

md1 : active raid1 sdd2[7] sdc2[6] sdb2[5] sda2[4]
      4999156 blocks super 1.2 [4/4] [UUUU]

md10 : active raid1 sdd5[7] sdc5[6] sdb5[5] sda5[4]
      1000436 blocks super 1.2 [4/4] [UUUU]

md0 : active raid1 sda1[0] sdc1[3] sdb1[2] sdd1[1]
      1000384 blocks [4/4] [UUUU]

unused devices: <none>

Could not get the -xfer (-xferto) option to ask for any file to send (using the telnet binary included in the GIT repo). Is there any way to make telnetd (or even ssh, as on the LS220) permanent on this device?

Looks like my only option here is to just lean back and wait for a week...
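(That finish estimate works out to roughly four and a half days:)
Code: [Select]
# finish=6399.4min from the mdstat output above, converted to days
echo "6399.4 / 60 / 24" | bc -l   # ~4.44 days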


1000001101000

  • Debian Wizard
  • Big Bull
  • *****
  • Posts: 1128
  • There's no problem so bad you cannot make it worse
Re: Creating RAID5 on LS-QVLB10 fails
« Reply #4 on: December 31, 2020, 10:43:41 AM »
If "-o" doesn't give you a working telnet  (it does for most but not all model/firmware combos), you have some options:

For most tasks the "-s" option provides a good enough shell to do what you need, and that will work for any configuration you can get ACP Commander to connect to.

You can transfer a telnet binary over using "-xfer" and then run it. I have that process documented here for a different model:
https://buffalonas.miraheze.org/wiki/Linkstation_LS520D#Telnet
**You would need to get the armel version of busybox for the LS-QVL

I believe this script works to enable SSH on the LS-QVL (using acp_commander):
https://github.com/rogers0/OpenLinkstation/blob/master/0_get-ssh/get-ssh.sh
**I haven't used it in years and don't really recommend messing with the built-in ssh like this, but I added it for completeness

***Note that all of this assumes you use the newest version of ACP Commander from the github link I posted earlier rather than any of the older versions available elsewhere.

If you've reached the point that you find yourself wanting to modify the firmware from the command line, you also have the option to replace the firmware with a Debian Linux installation. Details and installer files can be found here:
https://github.com/1000001101000/Debian_on_Buffalo



« Last Edit: December 31, 2020, 10:46:15 AM by 1000001101000 »

1000001101000

  • Debian Wizard
  • Big Bull
  • *****
  • Posts: 1128
  • There's no problem so bad you cannot make it worse
Re: Creating RAID5 on LS-QVLB10 fails
« Reply #5 on: December 31, 2020, 10:47:43 AM »
I would strongly recommend checking the drives for bad sectors before waiting two weeks to see if a raid sync will succeed or not.
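For example, with smartmontools on a Linux PC (the device name is a placeholder), something along these lines would do:
Code: [Select]
# Print SMART health and attributes (watch Reallocated_Sector_Ct / Current_Pending_Sector)
smartctl -a /dev/sdX
# Start an extended (long) self-test; check the result later with "smartctl -a" again
smartctl -t long /dev/sdX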

I personally don't use RAID5, partly because large drives can take too long to sync, etc.

peo

  • Calf
  • *
  • Posts: 10
Re: Creating RAID5 on LS-QVLB10 fails
« Reply #6 on: December 31, 2020, 11:54:11 AM »
Two of these drives were previously used in my Synology NAS and survived its quite brutal disk replacement/upgrade procedure (a rebuild and reshape at every disk change). That is not by itself a guarantee that they are still error free, but if they had failed at some point during the Synology upgrade, the resync would also have failed.
Both the quick and the extended SMART tests say these disks are OK, and I did not get any errors during the Buffalo "Erase" procedure (which is a full four-pass rewrite of the disks).

I will try your hints on making permanent telnet/ssh access to this NAS (as I have on the other Buffalos, as well as on the QNAP and the Synology).

peo

  • Calf
  • *
  • Posts: 10
Offtopic: SSH access for LS-QVLB10
« Reply #7 on: December 31, 2020, 12:08:12 PM »
Some findings
Code: [Select]
/>find / -type f -name busybox
/bin/busybox

Code: [Select]
/>find / -type f -name *ssh*
/mnt/ram/var/run/sshd.pid
/etc/ssh_host_dsa_key.pub
/etc/init.d/sshd.sh
/etc/ssh_host_rsa_key.pub
/etc/ssh_host_rsa_key
/etc/pam.d/sshd
/etc/ssh_config
/etc/ssh_host_key
/etc/sshd_config
/etc/ssh_host_dsa_key
/etc/ssh_host_key.pub
/usr/lib/apt/methods/ssh
/usr/lib/perl5/vendor_perl/5.8.8/URI/ssh.pm
/usr/local/libexec/ssh-keysign
/usr/local/var/dpkg/info/openssh.md5sum
/usr/local/var/dpkg/info/openssh.list
/usr/local/webaxs/lib/perl5/site_perl/5.10.0/URI/ssh.pm
/usr/local/sbin/sshd
/usr/local/bin/ssh-add
/usr/local/bin/ssh-keygen
/usr/local/bin/ssh-keyscan
/usr/local/bin/ssh
/usr/local/bin/ssh-agent
« Last Edit: December 31, 2020, 01:12:24 PM by peo »

peo

  • Calf
  • *
  • Posts: 10
Offtopic: Success (SSH access for LS-QVLB10)
« Reply #8 on: December 31, 2020, 01:10:14 PM »
All the needed files are in place; it just needs some tweaking... As with everything Buffalo, I don't know if it will survive a reboot.

Code: [Select]
java -jar acp_commander.jar -t 192.168.0.10 -pw AdminPassword -c "(echo newrootpass;echo newrootpass)|passwd"
java -jar acp_commander.jar -t 192.168.0.10 -pw AdminPassword -c "sed -i 's/PermitRootLogin/#PermitRootLogin/g' /etc/sshd_config"
java -jar acp_commander.jar -t 192.168.0.10 -pw AdminPassword -c "echo 'PermitRootLogin yes' >>/etc/sshd_config"
java -jar acp_commander.jar -t 192.168.0.10 -pw AdminPassword -c "sed -i 's/root/rooot/g' /etc/ftpusers"

or
Code: [Select]
java -jar acp_commander.jar -t 192.168.0.10 -pw AdminPassword -s

then execute the same commands in the shell:
Code: [Select]
(echo newrootpass;echo newrootpass)|passwd
sed -i 's/PermitRootLogin/#PermitRootLogin/g' /etc/sshd_config
echo "PermitRootLogin yes" >>/etc/sshd_config
sed -i 's/root/rooot/g' /etc/ftpusers
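I assume the already-running sshd will not pick up the edited /etc/sshd_config until it is restarted (untested), for example by re-running the init script found above, or by simply rebooting the NAS:
Code: [Select]
# Untested guess: restart sshd via the stock init script so it rereads /etc/sshd_config
/etc/init.d/sshd.sh restart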
« Last Edit: December 31, 2020, 01:22:49 PM by peo »

peo

  • Calf
  • *
  • Posts: 10
Re: Creating RAID5 on LS-QVLB10 fails
« Reply #9 on: December 31, 2020, 05:13:33 PM »
I've probably found the reason for the failure (and it's not the resyncing):

Code: [Select]
root@LS-QVLB10:~# mke2fs /dev/md2
mke2fs 1.40.5 (27-Jan-2008)
mke2fs: File too large while trying to determine filesystem size

The web UI failed to create the filesystem on md2, and then failed to mount it to /mnt/array1. The new volume should have been made available as soon as it was created, but here only the md2 creation succeeded, so the limit is not in mdraid but in the file system.

I was able to create a partition inside md2 and then format it, but when mounted it showed up as only 2TB.

peo

  • Calf
  • *
  • Posts: 10
Re: Creating RAID5 on LS-QVLB10 fails
« Reply #10 on: December 31, 2020, 06:45:41 PM »
Same problem with Debian (unable to mount the large partition).
Second try (with xfs instead of ext4):
http://tech.webit.nu/wp-content/uploads/2021/01/qvl-deb-00-24tb-xfs.png
http://tech.webit.nu/wp-content/uploads/2021/01/qvl-deb-01-24tb-xfs-fail.png

I just skipped the storage partition for now to resume installing Debian.

1000001101000

  • Debian Wizard
  • Big Bull
  • *****
  • Posts: 1128
  • There's no problem so bad you cannot make it worse
Re: Creating RAID5 on LS-QVLB10 fails
« Reply #11 on: December 31, 2020, 07:04:22 PM »
Oh, I didn’t see you were trying to do this with 24TB.

This architecture is limited to 16TB volumes (2^32 * 4k blocks).

Sadly, neither OS seems to “know” that; you just get errors when something tries to access blocks past the limit.
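A quick back-of-the-envelope check of that figure (plain shell arithmetic):
Code: [Select]
# 2^32 blocks * 4 KiB per block = 16 TiB, so a ~24 TB array is past the limit
echo $(( 2**32 * 4096 / 1024**4 ))   # prints 16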

peo

  • Calf
  • *
  • Posts: 10
Re: Creating RAID5 on LS-QVLB10 fails
« Reply #12 on: January 01, 2021, 08:01:36 AM »
So the solution is either to split the 24TB md2 into two mdraid arrays, or to use LVM on top of it to create two volumes?

Is there any advantage to using LVM instead of redoing md2 (and adding md3), other than that LVM is easier from the current state (just add the device, create a VG and create the volumes)?
The disadvantage of splitting into md2 and md3 is that I will lose the current resync progress (10%).
« Last Edit: January 01, 2021, 08:46:22 AM by peo »

1000001101000

  • Debian Wizard
  • Big Bull
  • *****
  • Posts: 1128
  • There's no problem so bad you cannot make it worse
Re: Creating RAID5 on LS-QVLB10 fails
« Reply #13 on: January 01, 2021, 09:41:47 AM »
I believe anything involving a block device (including an mdadm array) larger than 16TB will fail when it tries to access past the 16TB point. I would expect your resync to keep failing at that point, and I would expect LVM to have problems as well, since it would still depend on the underlying md device.

My solution was to just start using two mdadm arrays. I've found that even on systems that can handle larger volumes and have higher performance, there are still advantages to breaking arrays up into smaller ones that are faster to resync/rebuild/etc. Obviously this requires more drives/expense.

It's actually really frustrating how all these tools seem oblivious to the limitation. The first time I ran into it I did something similar, except that creating the RAID0 array actually succeeded, and so did creating the filesystem; I just started getting nasty errors once I had filled the array up past a certain point.


peo

  • Calf
  • *
  • Posts: 10
Re: Creating RAID5 on LS-QVLB10 fails
« Reply #14 on: January 01, 2021, 11:18:54 AM »
I'm relatively new to manipulating mdraid and LVM manually (I put some time into it when upgrading the Synology, to learn what happens when the disks are replaced and how their "Hybrid RAID" works at a lower level), but I think an md2+md3 setup (with ext4 directly on the arrays) is the best way to go.

Doing so on the same disks will not waste any space, and will probably even give slightly better performance:
Code: [Select]
md3 : active raid5 sdd7[4] sdc7[2] sdb7[1] sda7[0]
      11699606016 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/3] [UUU_]
        resync=DELAYED
      bitmap: 0/30 pages [0KB], 65536KB chunk

md2 : active raid5 sdd6[4] sdc6[2] sdb6[1] sda6[0]
      11699606016 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/3] [UUU_]
      [>....................]  recovery =  0.3% (11845376/3899868672) finish=2773.6min speed=23361K/sec
      bitmap: 0/30 pages [0KB], 65536KB chunk

Here each partition on the four disks is 3.6TB (the same size in both arrays), so I will spend a total of 7.2TB on parity. The same would have been spent with four partitions of 7.2TB, so there is no difference at all.
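For reference, roughly how such a split can be created and formatted with mdadm (a sketch only; it assumes the sd[a-d]6 and sd[a-d]7 partitions shown above already exist and skips the partitioning itself):
Code: [Select]
# Two RAID5 arrays, each well under the 16TiB limit, built from the 3.6TB partitions
mdadm --create /dev/md2 --level=5 --raid-devices=4 /dev/sda6 /dev/sdb6 /dev/sdc6 /dev/sdd6
mdadm --create /dev/md3 --level=5 --raid-devices=4 /dev/sda7 /dev/sdb7 /dev/sdc7 /dev/sdd7
# ext4 directly on each array (no LVM), then keep an eye on the resync
mkfs.ext4 /dev/md2
mkfs.ext4 /dev/md3
cat /proc/mdstat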

Btw, thanks for your guide on how to get Debian onto the NAS. It was really easy to follow once I was able to put the files on the device.