This guide is for administrators who are managing a Morph distributed build network on a Calxeda Highbank ARM server. Note that Calxeda have gone out of business: for new deployments we recommend setting up a cluster of NVIDIA Jetson boards to do distributed builds (or something else, if you want).
This guide assumes you have a Baserock build environment set up. Follow the 'Preparation' section in HOW-TOs - Using distbuild for Baserock on ARM if you have not done this already.
Placeholders that are used in this guide:
- $trove -- hostname and trove-id of your Trove
- $infrastructure.git -- your infrastructure definitions repo. This can be a keyed Morph URL, e.g. baserock:baserock/infrastructure.git, or a full URL like git://$trove/$trove/site/definitions.
- $cxmanage -- hostname of a machine that contains `ipmitool` and has network access to your Calxeda server
- $version -- a version label. This can be anything you want that is valid as a directory name. I recommend using the date, e.g. `2015-04-20`.
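If you want to copy and paste the commands in this guide directly, you can set the placeholders as shell variables first. The values below are made-up examples; substitute the details of your own site.

```shell
# Hypothetical example values -- replace these with your own.
trove=git.example.com                                # hostname of your Trove
infrastructure=git://$trove/$trove/site/definitions  # your infrastructure repo
cxmanage=cxmanage.example.com                        # machine with ipmitool access
version=$(date +%Y-%m-%d)                            # version label, e.g. 2015-04-20
```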
You will need an 'infrastructure' definitions repository, forked from the Baserock reference definitions. You may have this hosted in your Trove.
This describes how to upgrade your distbuild network to an imaginary 20.00 release of Baserock.
First, clone your infrastructure definitions:

```
git clone $infrastructure.git
```
Merge in the latest tagged release of the Baserock reference systems from the reference systems Git repository.
```
cd infrastructure
git remote add upstream git://git.baserock.org/baserock/baserock/definitions.git
git remote update upstream
git merge --no-ff baserock-20.00
```
Build the new version of the system using your distbuild network. You need to push your branch first, so the distbuild network can see it.
```
git push origin HEAD
morph distbuild systems/build-system-armv7lhf-highbank.morph --local-changes=ignore
```
Deploy the cluster for your distbuild network. For the purpose of this guide, I'll assume the cluster morphology is called distbuild-cluster.morph and contains a system named `distbuild`.
This will create a new rootfs for each node; it will not overwrite the running rootfs, so you can keep using the distbuild network while the deployment completes.
```
morph upgrade distbuild-cluster.morph --local-changes=ignore \
    distbuild.VERSION_LABEL=$version
```
When you are ready, restart the distbuild nodes so that they boot into the updated system. This will stop any builds that are running, so it's a good idea to check first whether any builds are in progress.
Restart all of the nodes on your distbuild network using `ipmitool`. First, run `ssh root@$cxmanage` to log into the right machine. You then need to check and reset the power of each node. The example below assumes your nodes are on the subnet 172.17.1.0, with the first node's IPMI interface at 172.17.1.3, the second node's IPMI interface at 172.17.1.7, and so on. It also assumes you are using 8 nodes (0 to 7). As well as `chassis power reset`, you can use the `ipmitool` commands `chassis power status`, `chassis power off` and `chassis power on`, which may be useful if not all of the nodes on the server are currently powered on, or you don't know their status.
```
for i in `seq 0 7`; do
    ipmi_address=172.17.1.$(expr $i \* 4 + 3)
    ipmitool -U admin -P admin -H $ipmi_address chassis power reset
done
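Before sending any resets, you can sanity-check the address arithmetic (IPMI interfaces every 4 addresses, starting at .3) without touching the hardware. This sketch just prints the address each node's reset would be sent to, assuming the same subnet layout as above:

```shell
# Print the IPMI address for each node; sends no IPMI traffic.
# node0 -> 172.17.1.3, node1 -> 172.17.1.7, ... node7 -> 172.17.1.31
for i in $(seq 0 7); do
    echo "node$i -> 172.17.1.$((i * 4 + 3))"
done
```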
If you find that a new version does not work for some reason and you want to roll back to the previous version, you need to update the 'default' symlinks on the NFS server to point to the old version.
First, log in to the Trove (assuming you use Trove as your NFS server) with `ssh root@$trove`. Then, set up an environment variable listing the name of each node. The example below assumes your nodes are called node0, node1 and node2:

```
nodes="node0 node1 node2"
```
You can see the 'default' symlink for each node with this command. This tells you which directory each node will read its root filesystem from next time it boots.
```
cd /srv/nfsboot
for name in $nodes; do readlink $name/systems/default; done
```
To cause each node to boot from an older system version, update the 'default' symlink for each node. For example, to switch them to version with label '2015-01-01', do this:
```
cd /srv/nfsboot
for name in $nodes; do
    ln -sfn /srv/nfsboot/$name/systems/2015-01-01 $name/systems/default
done
```

(The `-n` flag stops `ln` from following the existing 'default' symlink and creating the new link inside the old systems directory.)
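A mistaken symlink leaves a node unable to boot, so it can be worth rehearsing the flip somewhere harmless first. This sketch does the same `ln`/`readlink` dance in a throwaway temporary directory, with made-up version labels, and needs no NFS server:

```shell
# Rehearse the rollback in a scratch directory.
root=$(mktemp -d)
mkdir -p "$root/node0/systems/2015-01-01" "$root/node0/systems/2015-04-20"

# Simulate the current state: 'default' points at the newer version.
ln -s "$root/node0/systems/2015-04-20" "$root/node0/systems/default"

# Roll back: -n replaces the symlink itself instead of descending into
# the directory it points at.
ln -sfn "$root/node0/systems/2015-01-01" "$root/node0/systems/default"
readlink "$root/node0/systems/default"
```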
You can then reboot the nodes as described above. When you reboot the nodes, any running builds will be cancelled.
If you find that your distbuild network doesn't work, start by logging into the controller node and checking:
```
systemctl status distbuild-setup.service
systemctl status morph-controller.service
systemctl status morph-controller-helper.service
systemctl status morph-worker.service
systemctl status morph-worker-helper.service
systemctl status morph-cache-server.service
```
You can find full logs for these services in the systemd journal; run `journalctl -u <service name>` to read them.