Jetson Distbuild Cluster

This documentation follows on from the generic distbuild introduction, and describes the process involved in creating a distbuild network of Jetson-TK1 boards.

Preconditions:

  1. You have set up a development environment on a Jetson board as per the instructions at http://wiki.baserock.org/guides/baserock-jetson/
  2. You have a trove that you have push/pull access to (in this documentation we will refer to this as TROVE_ID and TROVE_HOST)
  3. You have at least one other Jetson board (ideally several) to deploy the distbuild image to

Build the Jetson distbuild image

Now that you have a Jetson board set up to develop the image, you can build the distbuild image. Please make sure you have followed the instructions from http://wiki.baserock.org/quick-start/#index3h2 onwards to set up morph and your second hard drive correctly.

We'll also need to upgrade to the latest morph version; see http://wiki.baserock.org/using-latest-morph/

Now you are ready to build!

git clone git://git.baserock.org/baserock/baserock/definitions --branch master
cd definitions
morph --verbose build systems/build-system-armv7lhf-jetson.morph

Once this is done (it should be quick thanks to the git.baserock.org cache), you'll be ready to deploy.

Deploying a Jetson based distbuild network

You'll now want to create a distbuild cluster morph, which will deal with generating the controller and worker nodes needed.

Create some worker keys that will be used to grant access to the trove you will use:

mkdir ssh_keys && cd ssh_keys
ssh-keygen -t rsa -b 2048 -f worker.key \
    -C worker@JETSON_DISTBUILD_NETWORK -N ''
cd ../

Now add this to your trove:

ssh git@TROVE_HOST user add jetsondistbuild \
    your@emailaddress.com \
    "Jetson distbuild network"

And finally add the ssh key:

ssh git@TROVE_HOST as jetsondistbuild \
    sshkey add jetson-id < ssh_keys/worker.key.pub

Now we can create the deployment cluster. In this example we're very lucky, as we have five Jetson boards to use. We'll have one controller node (tyrell) and five workers (tyrell, batty, pris, zhora, leon); tyrell acts as both controller and worker. If you only have two boards, it's probably a better idea to use one as your build board and one as the testing board, so set

WORKERS: tyrell

instead. If you're lucky, here's our five-board cluster:

name: jetson-distbuild-cluster
kind: cluster
description: |
    Jetson distbuild cluster
systems:
- morph: systems/build-system-armv7lhf-jetson.morph
  deploy-defaults:
    TROVE_HOST: TROVE_HOST
    TROVE_ID: TROVE_ID
    CONTROLLERHOST: tyrell
    ROOT_DEVICE: "/dev/mmcblk0p1"
    DTB_PATH: "boot/tegra124-pm375.dtb"
    DISTBUILD_CONTROLLER: no
    DISTBUILD_WORKER: yes
    BOOTLOADER_CONFIG_FORMAT: "extlinux"
    BOOTLOADER_INSTALL: "none"
    KERNEL_ARGS: console=ttyS0,115200n8 no_console_suspend=1 lp0_vec=2064@0xf46ff000 video=tegrafb mem=1862M@2048M memtype=255 ddr_die=2048M@2048M section=256M pmuboard=0x0177:0x0000:0x02:0x43:0x00 vpr=151M@3945M tsec=32M@3913M otf_key=c75e5bb91eb3bd94560357b64422f85 usbcore.old_scheme_first=1 core_edp_mv=1150 core_edp_ma=4000 tegraid=40.1.1.0.0 debug_uartport=lsport,3 power_supply=Adapter audio_codec=rt5640 modem_id=0 android.kerneltype=normal usb_port_owner_info=0 fbcon=map:1 commchip_id=0 usb_port_owner_info=0 lane_owner_info=6 emc_max_dvfs=0 touch_id=0@0 tegra_fbmem=32899072@0xad012000 board_info=0x0177:0x0000:0x02:0x43:0x00 tegraboot=sdmmc gpt
    FSTAB_SRC: LABEL=src /srv/distbuild auto defaults,rw,noatime 0 2
    INSTALL_FILES: distbuild/manifest
    WORKER_SSH_KEY: ssh_keys/worker.key
  deploy:
    build-controller:
      type: extensions/rawdisk
      location: /src/jetson-distbuild-tyrell.img
      DISK_SIZE: 3G
      DISTBUILD_CONTROLLER: yes
      HOSTNAME: tyrell
      WORKERS: tyrell, batty, pris, zhora, leon
    batty:
      type: extensions/rawdisk
      location: /src/jetson-distbuild-batty.img
      DISK_SIZE: 3G
      HOSTNAME: batty
    pris:
      type: extensions/rawdisk
      location: /src/jetson-distbuild-pris.img
      DISK_SIZE: 3G
      HOSTNAME: pris
    zhora:
      type: extensions/rawdisk
      location: /src/jetson-distbuild-zhora.img
      DISK_SIZE: 3G
      HOSTNAME: zhora
    leon:
      type: extensions/rawdisk
      location: /src/jetson-distbuild-leon.img
      DISK_SIZE: 3G
      HOSTNAME: leon

Save this as jetson-distbuild-cluster.morph
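Since the worker entries differ only in their hostname, they could be generated rather than typed by hand. The following is a minimal sketch, assuming a hypothetical shell helper called worker_stanza (it is not part of morph). It only prints text in the same shape as the worker entries above, ready to paste under deploy: in the cluster morph:

```shell
# Hypothetical helper (not part of morph): print the deploy entry for
# one worker board. Field names and values mirror the hand-written
# worker entries in the cluster morph above.
worker_stanza() {
    name=$1
    cat <<EOF
    $name:
      type: extensions/rawdisk
      location: /src/jetson-distbuild-$name.img
      DISK_SIZE: 3G
      HOSTNAME: $name
EOF
}

# Print an entry for every worker except the controller board, which
# needs the extra DISTBUILD_CONTROLLER and WORKERS settings.
for board in batty pris zhora leon; do
    worker_stanza "$board"
done
```

The controller entry is left out on purpose, since it carries settings the workers do not have.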

To deploy, use morph deploy:

morph deploy jetson-distbuild-cluster.morph

This will generate, in /src, the root file system images you need to flash. Once this is done, you're ready to flash them to your boards.

Flashing the distbuild images

Now, on your laptop create a place for these files to be extracted:

mkdir ~/distbuild && cd ~/distbuild

Copy over the image files generated by the deploy step:

scp root@ip.of.devel.board:/src/jetson-distbuild-* .

You'll need to extract the pre-built u-boot image from one of these images:

export SDK_PATH=/path/to/Linux_for_Tegra &&
mkdir $SDK_PATH/baserock &&
sudo mount -t btrfs jetson-distbuild-batty.img /mnt &&
cp /mnt/systems/factory/run/boot/u-boot.bin $SDK_PATH/baserock &&
sudo umount /mnt

If your kernel doesn't support btrfs, you can download the image from http://download.baserock.org/baserock/tegra/u-boot.bin, or build it yourself using the baserock/arm/tegra-uboot-btrfs branch from http://git.baserock.org/cgit/delta/u-boot.git. Either way, make sure you copy the image to a baserock/ folder in your Linux_for_Tegra directory (if you built it yourself, use the device tree version).
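One way to find out whether the mount step is likely to work is to look for btrfs in /proc/filesystems, where the Linux kernel lists the filesystem types it currently has registered. This is a minimal sketch; has_btrfs is a hypothetical helper name, and a btrfs module that exists but has not been loaded yet will not be listed, so a negative result is only a hint:

```shell
# Hypothetical helper: succeed if the given filesystem list names btrfs.
# Defaults to /proc/filesystems; a different file can be passed instead.
has_btrfs() {
    grep -qw btrfs "${1:-/proc/filesystems}" 2>/dev/null
}

if has_btrfs; then
    echo "btrfs is available; the image can be mounted directly"
else
    echo "btrfs not listed; try 'modprobe btrfs' or download u-boot.bin"
fi
```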

We use a special build of u-boot with btrfs support. This allows you to flash a Baserock image that can be upgraded from your development machine or VM very easily, meaning you won't have to put your board into recovery mode ever again to upgrade things like the kernel! See the Baserock instructions for upgrading systems for more information on this.

Once you have done this, you can start flashing. Attach the already formatted external hard drives to the boards. Then power on each board and put it in recovery mode, attach it to the computer, change into your Linux_for_Tegra directory, and run the Baserock flashing script. It's a good idea to flash the workers before the controller, since each board reboots after the flash and the controller needs to connect to every worker. For example, to flash batty:

sudo ./baserock-jetson-flash.sh ~/distbuild/jetson-distbuild-batty.img

Do this for each board. It's a little slow, so (depending on the number of boards you need to flash) now would be a good time to make a cup of tea (or a pot!). However, once you have flashed Baserock, you won't need to do this again.

Assuming all the boards are connected to the network and can talk to each other, the cluster is now ready to go!
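Because each board has to be put into recovery mode by hand, it may help to print the flash commands in order first and work through them one at a time. A small sketch, assuming the image names from the example cluster; flash_plan is a hypothetical helper that only prints commands and flashes nothing itself:

```shell
# Hypothetical helper: print one flash command per board, workers first
# and the controller last. Arguments: controller name, then worker names.
flash_plan() {
    controller=$1
    shift
    for board in "$@" "$controller"; do
        echo "sudo ./baserock-jetson-flash.sh ~/distbuild/jetson-distbuild-$board.img"
    done
}

# Example cluster: tyrell is the controller and is flashed last.
flash_plan tyrell batty pris zhora leon
```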
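A quick reachability check from your own machine can catch a board that did not come back up after flashing. This is only a sketch: check_boards and the PING_CMD override are hypothetical names, and a successful ping from your machine does not prove that the boards can reach each other or the trove:

```shell
# Hypothetical helper: report which boards answer a single ping.
# PING_CMD can be overridden, for example to test without a network.
# Returns non-zero if any board is unreachable.
check_boards() {
    failed=0
    for board in "$@"; do
        if ${PING_CMD:-ping -c 1} "$board" >/dev/null 2>&1; then
            echo "$board: reachable"
        else
            echo "$board: NOT reachable"
            failed=1
        fi
    done
    return "$failed"
}
```

For the example cluster this would be run as check_boards tyrell batty pris zhora leon.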

Using the distbuild network

To run a build using your distbuild network:

morph distbuild --controller-initiator-address=tyrell <system>

Distbuild places build logs for each chunk in the current working directory. For example, the log for a build of pango-misc would be stored in build-step-pango-misc.log, so you could view build progress using:

tail -f build-step-pango-misc.log
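When many chunks are building, it is not always obvious which log is the active one. A minimal sketch that picks the most recently modified log, assuming the build-step-*.log naming described above; latest_build_log is a hypothetical helper name:

```shell
# Hypothetical helper: print the most recently modified build-step log
# in the given directory (default: the current directory). Prints
# nothing if no such log exists.
latest_build_log() {
    ls -t "${1:-.}"/build-step-*.log 2>/dev/null | head -n 1
}
```

You could then follow it with tail -f "$(latest_build_log)".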

Since the build has been distributed to ARM nodes, you can use an x86 VM to do your ARM system development using the distbuild command. Don't forget to set trove-host = TROVE_HOST and trove-id = TROVE_ID in your VM's morph.conf though! This potentially frees up your Jetson development board to be added to the cluster, as deckard.
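The two settings could be written out as follows. This is a sketch under assumptions: morph_conf_fragment is a hypothetical helper, and the [config] section header reflects the INI layout morph.conf is assumed to use, so check it against your existing morph.conf before copying anything in:

```shell
# Hypothetical helper: print a morph.conf fragment for the given trove.
# Arguments: trove host, trove id. The [config] header is an assumption
# about the file layout; merge the two settings into your own file.
morph_conf_fragment() {
    printf '[config]\ntrove-host = %s\ntrove-id = %s\n' "$1" "$2"
}

# Replace the placeholders with your own trove details.
morph_conf_fragment TROVE_HOST TROVE_ID
```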

If you only have the two boards, why not use the distbuild "network" of one for building, and then flash your image in the same way as described in "Flashing the distbuild images" to the other one?