Support dual firmware (aka A/B) devices where rootfs_data are discrete partitions #9

Open
wants to merge 3 commits into
base: master
from

Conversation

Lanchon

@Lanchon Lanchon commented Sep 25, 2024

hi @hauke @dangowrt @robimarko

Many devices support dual firmware (kernel/kernel_1, rootfs/rootfs_1). However, OpenWrt only supports dual firmware on devices where the overlay is stored in the trailing part of the active rootfs partition. Devices with discrete overlay partitions (rootfs_data/rootfs_data_1) are not supported.

This change allows the bootloader to choose which rootfs_data partition to use, just like it could always choose which rootfs: via a kernel command line parameter.

For example, uboot can now specify fstools_overlay_name=rootfs_data or fstools_overlay_name=rootfs_data_1. This works for block devices, UBI and MTD, as the system delegates partition name resolution to the usual fstools driver stack.

For completeness, an fstools_overlay_fstype=ext4/f2fs/auto parameter is also included. This selects which filesystem to use when it is time to format the overlay anew.
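
As a rough illustration of the two parameters described above, the sketch below models how key=value pairs could be pulled out of a kernel command line. It is Python written for readability, not the C code in this PR. The fallback name rootfs_data and the handling of unknown fstype values are assumptions made for the example, and the helper names are hypothetical.

```python
def parse_cmdline(cmdline):
    """Collect key=value parameters from a kernel command line string."""
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if sep:  # skip bare flags such as "ro" or "quiet"
            params[key] = value
    return params


def overlay_settings(cmdline):
    """Return (overlay name, overlay fstype), falling back to defaults."""
    params = parse_cmdline(cmdline)
    # Assumption: without the parameter, the usual hard-coded name applies.
    name = params.get("fstools_overlay_name", "rootfs_data")
    fstype = params.get("fstools_overlay_fstype", "auto")
    if fstype not in ("ext4", "f2fs", "auto"):
        # Assumption: unrecognized values are treated like the default.
        fstype = "auto"
    return name, fstype


if __name__ == "__main__":
    # On a running system the string would come from /proc/cmdline.
    example = "rootwait fstools_overlay_name=rootfs_data_1 fstools_overlay_fstype=ext4"
    print(overlay_settings(example))
```

This simple split on whitespace does not handle quoted values that contain spaces, which is acceptable here because partition names and filesystem types normally have none.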

thank you!


this change has been tested and works properly on the spectrum sax1v1k.

Move get_var_from_file() to common library.

Signed-off-by: Rodrigo Balerdi <[email protected]>
The kernel parameter 'fstools_overlay_name' can now be used to specify
the overlay backing storage by name. The name is resolved by the regular
stack of fstools drivers.

Signed-off-by: Rodrigo Balerdi <[email protected]>
The kernel parameter 'fstools_overlay_fstype' can now be used to specify
the preferred overlay filesystem type. The supported values are 'ext4',
'f2fs' and 'auto'. Type 'auto' is the default and represents the usual
behavior of selecting the filesystem type based on available space.

Signed-off-by: Rodrigo Balerdi <[email protected]>
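
The selection rule in the last commit message can be sketched as follows. This is an illustrative Python model, not the PR's C code. The 100 MiB cutoff and the direction of the size rule (f2fs for larger volumes, ext4 for smaller ones) are assumptions about the existing 'auto' behavior; the function and constant names are hypothetical.

```python
MIB = 1024 * 1024

# Assumption: illustrative cutoff for the size-based choice; the real
# threshold is defined inside fstools and may differ.
F2FS_MIN_SIZE = 100 * MIB


def pick_overlay_fstype(requested, partition_size):
    """Choose the filesystem used when the overlay is formatted anew."""
    if requested in ("ext4", "f2fs"):
        return requested  # an explicit request wins
    # 'auto' (and anything unrecognized) falls back to the size-based rule.
    return "f2fs" if partition_size >= F2FS_MIN_SIZE else "ext4"


if __name__ == "__main__":
    for size in (64 * MIB, 512 * MIB):
        print(size // MIB, "MiB ->", pick_overlay_fstype("auto", size))
```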
@Lanchon Lanchon changed the title Support dual firmware (aka A/B) device where rootfs_data are discreete partitions Support dual firmware (aka A/B) devices where rootfs_data are discreete partitions Sep 25, 2024
@Lanchon
Author

Lanchon commented Sep 28, 2024

hi @hauke, sorry to bother you.

openwrt does not support dual firmware on devices with discrete rootfs_data partitions. this is a serious limitation but easy to fix. who should i bring in to review this change? thank you!

@Lanchon
Author

Lanchon commented Sep 28, 2024

tangential note:

i must admit that (in addition to these changes) i'd also like seeing a u-boot env var that takes precedence over fstools_overlay_fstype to select the overlay filesystem.

over time i've heard many people in the forums wanting ext4 for the overlay, and i see no reason why we shouldn't accommodate them. a simple uboot var could be set to solve the issue for those who care.

i could also do that change if desired. there is precedent for customizing sysupgrade with optional uboot vars, such as providing a limit for how big you want the overlay fs to be when it is a volume inside a UBI partition.

@Lanchon
Author

Lanchon commented Oct 24, 2024

@dangowrt so there's no interest in supporting dual firmware?

@dangowrt
Member

dangowrt commented Oct 24, 2024

I think there could be interest in supporting existing vendor bootloaders which offer A/B dual-boot feature (which is what this PR is about, ie. supporting the existing A/B on NBG7815, right?).
As OpenWrt typically uses "the remaining space" on the flash for a read/write overlay, when implemented from scratch it should be done in such a way that there still is only one rootfs_data partition which gets wiped when switching from A to B for any reason other than sysupgrade (i.e. failure-related). And that then obviously means that in case of failure the configuration would be lost.
For that reason I believe a recovery vs. production dualboot approach serves OpenWrt better.

@dangowrt dangowrt closed this Oct 24, 2024
@dangowrt dangowrt reopened this Oct 24, 2024
@Lanchon
Author

Lanchon commented Oct 24, 2024

@dangowrt

which is what this PR is about, ie. supporting the existing A/B on NBG7815, right?

no, i don't have that device, this is about spectrum sax1v1k. the relevant partitions of this device are:

  20           68130          330273   128.0 MiB   FFFF  rootfs
  21          330274          346657   8.0 MiB     FFFF  0:WIFIFW
  22          346658          608801   128.0 MiB   FFFF  rootfs_1
  23          608802          625185   8.0 MiB     FFFF  0:WIFIFW_1
  24          625186         1673761   512.0 MiB   FFFF  rootfs_data
  25         1673762         2722337   512.0 MiB   FFFF  rootfs_data_1

  32         5164578         5172769   4.0 MiB     FFFF  rsvd_1
  33         5172770         5180961   4.0 MiB     FFFF  rsvd_2
  34         5180962         5185057   2.0 MiB     FFFF  rsvd_3
  35         5185058         5217825   16.0 MiB    FFFF  rsvd_4
  36         5217826         5283361   32.0 MiB    FFFF  rsvd_5
  37         5283362         5414437   64.0 MiB    FFFF  rsvd_6

  38         5414438        15204321   4.7 GiB     FFFF  user_data
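
The sizes in the listing above can be cross-checked from the sector ranges. The sketch below assumes 512-byte logical sectors and inclusive start/end values (both are assumptions, though they match the sizes shown); the function name is hypothetical.

```python
SECTOR_SIZE = 512  # assumption: 512-byte logical sectors


def size_mib(start, end):
    """Size in MiB of a partition spanning sectors start..end inclusive."""
    return (end - start + 1) * SECTOR_SIZE / (1024 * 1024)


# Sector ranges copied from the partition listing above.
partitions = {
    "rootfs": (68130, 330273),
    "rootfs_1": (346658, 608801),
    "rootfs_data": (625186, 1673761),
    "rootfs_data_1": (1673762, 2722337),
    "user_data": (5414438, 15204321),
}

if __name__ == "__main__":
    for name, (start, end) in partitions.items():
        print(f"{name:14s} {size_mib(start, end):8.1f} MiB")
```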

this device has secure boot enabled and a locked-down bootloader. the bootloader cannot be interrupted. initial access is gained by hacking the firmware userland. then uboot vars are programmed that will patch uboot in memory before loading an unsigned kernel.

the situation was very precarious because a borked system would have become an unrecoverable brick. i coded a solution for this, allowing breaking the boot process if you have serial access, and also externally triggering a recovery (initramfs) OS at any time without serial access.

on top of that, i also wanted to add the possibility of completely rolling back a sysupgrade externally without serial access, but i'm not going to do that if both systems share rootfs_data. i think it just doesn't make sense to do that.

the two rootfs_data partitions are there, but i can't use them because openwrt is missing this capability. the partitions cannot be merged or renamed: in my experience, most qualcomm systems from the last decade with secure boot enabled sign the GPT, so any change results in a brick. (not that i'd want to merge anything: i think 512 MB rootfs_data is more than enough, and IMHO the best and safest way to leverage an 8 GB emmc like this one in openwrt is to use a big partition (eg: the existing user_data with 4.7 GiB) as extra storage that survives sysupgrades.)

actual status of official devices

As OpenWrt typically uses "the remaining space" on the flash for a read/write overlay, when implemented from scratch it should be done in such way that there still is only one rootfs_data partition

i respectfully disagree. this might have been the case in 128 MB devices, but there is no use in having 3.5 GB or 7.5 GB rootfs_data partitions these days.

more importantly, this is not how actual official openwrt devices are implemented. many of these devices have separate rootfs_data partitions for A and B (placing rootfs_data after rootfs via loop) in both nand and emmc storages.

example: whw03v2

512 MB NAND device where rootfs_data could be 450 MB but is only 120 MB. here are the partitions:

[    1.328823] 0x000000700000-0x00000a800000 : "kernel"     : 161 MiB (6 MiB kernel + rootfs)
[    1.490871] 0x000000d00000-0x00000a800000 : "rootfs"     : 155 MiB
[    1.647219] 0x00000a800000-0x000014900000 : "alt_kernel" : 161 MiB (6 MiB kernel + rootfs)
[    1.809863] 0x00000ae00000-0x000014900000 : "alt_rootfs" : 155 MiB
[    1.959552] 0x000014900000-0x000014b00000 : "sysdiag"    :   2 MiB (unused)
[    1.961963] 0x000014b00000-0x000020000000 : "syscfg"     : 181 MiB (stock state, ext4)
  • a 330 MB rootfs_data could have been made out of the last 3 partitions, and shared by A and B, but instead each system has its 120 MB trailing area of rootfs. reasons are compatibility with stock, ease of rolling back to stock, etc.
  • a 450 MB rootfs_data could have been made by moving alt_kernel (it is only referenced by a uboot var) or by using the mtdconcat driver to add rootfs to the 330 MB partition.

(btw i proposed solutions to this, but they are not relevant here.)

example: whw03v1

a 4 GB emmc GPT device where rootfs_data could be 3.5 GB but is less than 100 MB. this also has separate rootfs_data for A and B (as loop devices after rootfs). for this device i provided a solution: optional steps to repartition the device to use 2 x 512 MB rootfs_data (i don't see the point in using more) and the rest of the emmc goes to an extra partition that survives sysupgrades, which you can use if you happen to need it.

there are many more examples in the list linked above.

my dual boot experience

after using proper dual boot devices (with dual rootfs_data) i'm really not going back. situations in which a misconfiguration would have required pulling the device off its network and accessing it in failsafe mode in isolation happened many times. this is inconvenient enough, but sometimes the devices were remote, which would have meant somebody knowledgeable would have needed to commute. thanks to proper dual boot, these situations just required a special power-up sequence to roll back changes, and the rest was handled via VPN.

reasons for this change

  • some devices, such as the sax1v1k, just have the extra rootfs_data partition there. and touching the GPT at all very probably results in a brick, so it can't be merged. so there is no point in not allowing people to use it.
  • on nand devices, which tend to be smaller, repartitioning is an easy choice: just boot the new kernel and the repartition is effectively done. on GPT devices, repartitioning is much more work, and typically not done in standard installs. also, going back to stock from a repartitioned device is more challenging. many people using GPT devices end up using a very small portion of their storage because they don't want to repartition. a way out of this mess is using an existing partition not named rootfs_data for that purpose, which this change allows. so this change is not only for A/B systems, but also for single boot systems with difficult stock GPTs.
  • on GPT devices with secure boot enabled, this change becomes essential. you cannot modify the GPT (including names), so how do you handle the case of a device that has a good partition candidate for rootfs_data but is not named exactly that? (answer: you don't, and cram everything into a small rootfs.) this is more general than just the A/B case.
  • even if dual rootfs_data were not the officially preferred install method, there is no reason not to allow advanced users to customize their installs. a user could choose to split his "standard" big rootfs_data into 2 smaller partitions, modify the uboot env, and be happy with his dual boot, dual data system. GPT systems allow more end-user customization than nand systems; there is no reason not to support that.

conclusion: better support for GPT

so in summary, openwrt partition handling was designed for nand flash, where the device port author could freely name freely defined partitions, so names could be hard-coded as they are. in GPT devices names preexist, and changing them ranges from very cumbersome (repartition during install) to downright impossible (signed GPTs). in light of this, a way to configure names is needed to better support GPT systems.

configuration via the kernel command line is for sure needed. it might be desirable to have other avenues for that, idk, but this is a start.
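
To make the command line approach concrete, here is a minimal sketch of what the bootloader side amounts to: picking the overlay name that matches the active boot slot and appending it to the kernel arguments. A real bootloader would do this in its own scripting environment; this Python model only shows the mapping. The slot numbering (0 for A, 1 for B) and the function name are assumptions for illustration, while the _1 suffix follows the partition names quoted earlier in this thread.

```python
def overlay_bootargs(slot, fstype="auto"):
    """Build the overlay-related kernel parameters for boot slot 0 (A) or 1 (B)."""
    if slot not in (0, 1):
        raise ValueError("slot must be 0 or 1")
    # Slot A uses the unsuffixed names, slot B the "_1" names.
    suffix = "" if slot == 0 else "_1"
    args = [f"fstools_overlay_name=rootfs_data{suffix}"]
    if fstype != "auto":
        # 'auto' is the default, so it does not need to be passed explicitly.
        args.append(f"fstools_overlay_fstype={fstype}")
    return " ".join(args)


if __name__ == "__main__":
    print(overlay_bootargs(0))
    print(overlay_bootargs(1, "ext4"))
```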

@Lanchon
Author

Lanchon commented Nov 23, 2024

@dangowrt

could you maybe review this or suggest someone who should?

@Lanchon
Author

Lanchon commented Nov 27, 2024

hi @robimarko,

what would be the best way to have a discussion about this?

maybe someone in the core project is interested, but i don't know how to find that out


2 participants