Subversion Repositories DevTools

Rev 1048

Notes on the blat package transfer system
-----------------------------------------

Reason for its creation
-----------------------
Packages need to be transferred from dpkg_archive to remote sites in a timely
manner. Rsync was considered, but it has several problems:

1) It does not handle symlinks in a suitable manner
2) It works with all the files in the repository. Experience has
   shown that this can be very slow
3) It still requires significant scripting in order to be useful

Blat can make several assumptions about the package system.
Blat will:
    Support multiple transfer target destinations
    Allow for rapid detection of new packages that need to be transferred
    Allow for multiple Releases to be synchronized
    Allow for all (not-closed) releases in a Project to be synchronized
    Be easy to configure, including on the fly
    Transfer packages atomically
    Transfer a PackageList for future cleanup operations
    Provide logging and debug facilities

Overview of Blat
----------------
There are two main components in Blat:
    Daemon supervisor
        Responsible for starting and restarting the configured daemons
    Transfer Daemons
        Each is responsible for the package sync operations for one target
        Multiple Daemons ( targets ) are supported

Each Blat Daemon performs three main operations:
    1) Fast package transfer
    2) Repository synchronization
    3) PackageList creation

Fast package transfer
===============================
This is the mechanism whereby Blat detects the need to transfer a newly built
package to the target system.

It works by monitoring a directory of tags. It is the responsibility of Release
Manager to populate the directory.

The responsiveness of the detection can be configured, but a period of 5
seconds is suggested.
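
The polling described above could be sketched as follows. This is written in
Python for brevity (Blat itself is in Perl) and is not Blat's actual code. The
one-file-per-package tag convention and the names scan_tags and poll_tags are
assumptions made for illustration.

```python
import os
import time


def scan_tags(tag_dir, seen):
    """Return tag names in tag_dir that are not yet in `seen`, sorted by name."""
    current = set(os.listdir(tag_dir))
    return sorted(current - seen)


def poll_tags(tag_dir, handle, period=5.0, max_polls=None):
    """Call handle(tag) once per new tag, sleeping `period` seconds between scans.

    max_polls exists only so the sketch can be stopped; a daemon would
    leave it as None and loop forever.
    """
    seen = set()
    polls = 0
    while max_polls is None or polls < max_polls:
        for tag in scan_tags(tag_dir, seen):
            handle(tag)          # e.g. queue the package for transfer
            seen.add(tag)
        polls += 1
        if max_polls is None or polls < max_polls:
            time.sleep(period)
```

A listing of one directory is cheap, which is why a short period such as the
suggested 5 seconds is practical.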

Repository synchronization
===============================
The daemon will request a list of the packages that are present on the target
and determine the list of packages that should be on the target. Packages that
are missing or out of date will be transferred to the target. Excess packages
are left on the target.

Blat will request the target to create and transfer a list of packages.
This is done by invoking a small program on the target to perform the work.

Blat will interrogate the Release Manager database for Releases to be processed
and packages in those Releases.

A package will be transferred to the target if:
    * The package is required, but not present on the target
    * The time-stamps of the descpkg files differ

Package transfer may be delayed if the source package is writable, unless it
has been writable for longer than a configured time period.
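
The selection rules above can be sketched as below. This is an illustration
in Python, not Blat's code. It assumes each side is summarised as a mapping of
package-version name to descpkg time-stamp, and that the time at which a
package was first seen writable is tracked somewhere; the names needs_transfer
and should_defer are hypothetical.

```python
import time


def needs_transfer(required, on_target):
    """Return the package-versions to send, given two {name: descpkg_mtime} maps.

    A package is selected if it is required but absent from the target,
    or if the descpkg time-stamps differ. Excess target packages are ignored.
    """
    return sorted(
        name
        for name, mtime in required.items()
        if name not in on_target or on_target[name] != mtime
    )


def should_defer(is_writable, writable_since, max_wait, now=None):
    """Defer a writable source package until it has been writable for max_wait seconds."""
    if not is_writable:
        return False
    now = time.time() if now is None else now
    return (now - writable_since) <= max_wait
```

Comparing only the descpkg time-stamp, and not every file, is what keeps the
check fast; it relies on the design assumption that package contents do not
change once released.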

The frequency of the Repository synchronization can be configured. A time of
several hours is suggested.


PackageList creation
===============================
Blat will create and send to the target a list of the package-versions that
are in the current set. This list may be used to clean out the package archive,
but this functionality has not yet been implemented.


Host System Requirements
========================
1) Unix
   It has been designed for a Unix environment - not Windows
2) Perl
   Blat is written in Perl
3) Java
   Required for the Database interface
4) Shell
   Start and stop scripts are in shell
5) Utilities
    ssh
    gtar
    gzip

Target System Requirements
==========================
1) Unix
   It has been designed for a Unix environment - not Windows
2) Perl
3) Shell
   Blat will execute a number of scripts on the target in order
   to control the process. These are in shell and Perl
4) Utilities
    ssh
    gtar
    gunzip
5) User with write access to the dpkg_archive
6) Link from the user's home directory to the package archive
   This link is called 'dpkg_archive'

Shared requirements
===================
Blat uses ssh for the transfer process. It uses an 'identity' file to allow
passwordless authentication with the target. The public part of the identity
file must be appended to the target user's .ssh/authorized_keys file.

The private part of the identity file is held by the Blat Daemon.
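
As an illustration of passwordless use of the identity file, the sketch below
builds an ssh command line in Python. The -i option and the BatchMode setting
are standard OpenSSH features; the function name ssh_command and the user,
host and remote command shown in the usage note are hypothetical, and this is
not how Blat itself is necessarily coded.

```python
def ssh_command(user, host, identity_file, remote_command):
    """Build an argv list for a non-interactive ssh call using an identity file."""
    return [
        "ssh",
        "-i", identity_file,      # private part of the identity file
        "-o", "BatchMode=yes",    # fail instead of prompting for a password
        f"{user}@{host}",
        remote_command,
    ]
```

For example, ssh_command("pkg_admin", "target.example.com",
"ssh/id_rsa_pkg_admin", "bin/get_plist.pl") yields a list that can be passed
to subprocess.run. With BatchMode enabled, a missing authorized_keys entry
shows up as an immediate failure and not as a hung password prompt.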

Design assumptions
================================================================================
Blat is designed to transfer dpkg_archive packages in one direction.

Blat makes assumptions about the structure of a package:
    - Each package contains a descpkg file
    - Packages are read-only when fully released
    - The contents of a package do not change
    - It is not necessary to check every file in the package

The Blat master is designed to run in a single directory tree.
The config file should be in a 'config' directory under the location
of the blat master program.

Installation :: Target System
=============================
1) Create or acquire a user that has write access to the package archive

2) Create or acquire a passwordless identity file and the associated public
   key. One set is available in the 'ssh' subdirectory.

   Append the public part of the identity file (id_rsa_pkg_admin.pub) to
   ~/.ssh/authorized_keys

3) Create a link from the user's home directory to dpkg_archive
   The link must be called dpkg_archive

4) Transfer the blat receiver scripts to a directory accessible to the
   transfer user. eg: ~/bin
   The required receiver files are:
        get_plist.pl
        receive_file
        receive_package
        delete_package
   Ensure the programs are executable by the transfer user.
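
The steps above can be checked with a small script run as the transfer user.
The following Python sketch is not part of Blat; the function name
check_target is hypothetical, and the receiver directory is passed in because
~/bin is only an example location.

```python
import os

RECEIVER_FILES = ("get_plist.pl", "receive_file", "receive_package", "delete_package")


def check_target(home, bin_dir):
    """Return a list of problems found in the target user's setup (empty if none)."""
    problems = []
    link = os.path.join(home, "dpkg_archive")
    if not os.path.islink(link):
        problems.append("missing link: " + link)
    elif not os.access(link, os.W_OK):   # follows the link to the archive
        problems.append("archive not writable: " + link)
    for name in RECEIVER_FILES:
        path = os.path.join(bin_dir, name)
        if not os.path.isfile(path):
            problems.append("missing receiver file: " + path)
        elif not os.access(path, os.X_OK):
            problems.append("not executable: " + path)
    return problems
```

An empty list means the link and the four receiver files are in place; it
does not check the authorized_keys entry.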

Installation :: Host System
=============================
This section deals with the configuration of a new target.

1) Create a new config file in Blat's config directory - with a .conf
   suffix. This is best done by cloning an existing entry.

   Note: The blat master will automatically spawn a daemon as soon
   as a new config file is seen. It is best to create the file elsewhere
   and copy it to the directory when ready.

   Note: The Blat daemon will detect changes to its own config file and
   re-read it on the fly.
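
One way to avoid the master seeing a half-written file is to stage the copy
under a different name and rename it into place. The Python sketch below
illustrates this; it assumes the master only reacts to names ending in .conf,
which these notes imply but do not state, and the function name
install_config is hypothetical.

```python
import os
import shutil


def install_config(prepared_file, config_dir, name):
    """Copy a finished config into config_dir so it appears as `name` in one step."""
    if not name.endswith(".conf"):
        raise ValueError("config file name must end in .conf")
    final_path = os.path.join(config_dir, name)
    staging_path = final_path + ".tmp"    # assumed to be ignored by the master
    shutil.copyfile(prepared_file, staging_path)
    os.replace(staging_path, final_path)  # atomic rename within one filesystem
    return final_path
```

Because the staging file lives in the config directory itself, the final
rename never crosses a filesystem boundary.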

Useful Tricks
=============

kill -USR1 pid-of-daemon
    Will force the daemon to perform a repository sync check.

kill -HUP pid-of-daemon
    Will force the daemon to roll its own log files.

Debug verbosity is controlled via the 'verbose' config item

The pkg.xxxx config items are very special.
If the named package-version is a symlink, then both the
link and the package that the link addresses will be transferred.
The link MUST address another version of the same package.
This is intended to support the 'jats2_current' link.
When a new version of JATS is released, the new package
will be transferred, as will the new link.
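
The link rule can be illustrated with the Python sketch below. It assumes an
archive layout in which each package has its own directory holding one entry
per version, so that a valid link target is a plain name in the same
directory. That layout, and the function name link_and_target, are
assumptions for illustration and not taken from Blat's code.

```python
import os


def link_and_target(package_dir, version_name):
    """Return [link, target] to transfer if version_name is a symlink, else [version_name]."""
    path = os.path.join(package_dir, version_name)
    if not os.path.islink(path):
        return [version_name]
    target = os.readlink(path)
    # The link must address another version of the same package, so the
    # target must be a plain name inside the same package directory.
    if os.path.isabs(target) or os.path.dirname(target) or target in (".", ".."):
        raise ValueError(f"{version_name} does not address a sibling version: {target}")
    return [version_name, target]
```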

Config items that control a time period allow the following suffixes:
    s - Seconds. Same as no suffix
    m - Minutes
    h - Hours
    d - Days
Multiple suffixes may be combined. eg: 1h10m
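
A parser for these period values might look like the following Python sketch.
It is an illustration of the rules above, not Blat's own parser; the function
name parse_period is hypothetical, and suffixes are assumed to be lower case.

```python
import re

_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def parse_period(text):
    """Convert a period such as '90', '5m' or '1h10m' to seconds."""
    text = text.strip()
    parts = re.findall(r"(\d+)([smhd]?)", text)
    # Reject anything that is not purely number/suffix groups.
    if not parts or "".join(n + u for n, u in parts) != text:
        raise ValueError(f"bad period: {text!r}")
    # A group with no suffix counts as seconds.
    return sum(int(n) * _UNIT_SECONDS[u or "s"] for n, u in parts)
```

Under these rules '1h10m' is 4200 seconds and a bare '90' is 90 seconds.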

Config items that control a file size in blocks allow the following suffixes:
    k - Kilobytes (Same as no suffix)
    b - Blocks    (Same as no suffix)
    m - Megabytes
    g - Gigabytes
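
The size suffixes can be handled in the same way. The Python sketch below
assumes that one block is one kilobyte, since 'k' and 'b' are both described
as the same as no suffix, and that megabytes and gigabytes scale by 1024.
Those multipliers and the name parse_size_blocks are assumptions for
illustration.

```python
import re

# Multipliers in blocks; one block is taken to be one kilobyte.
_UNIT_BLOCKS = {"": 1, "k": 1, "b": 1, "m": 1024, "g": 1024 * 1024}


def parse_size_blocks(text):
    """Convert a size such as '500', '500k', '2m' or '1g' to a number of blocks."""
    match = re.fullmatch(r"(\d+)([kbmg]?)", text.strip())
    if match is None:
        raise ValueError(f"bad size: {text!r}")
    number, unit = match.groups()
    return int(number) * _UNIT_BLOCKS[unit]
```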


ToDo
======================

1) Purging packages on target
   Can set deletePackages to delete excess packages when the release sync
   is performed.

2) Better handling of soft-links for core_devl
   The current handling works, but it is prone to error.
   There is no test to ensure that the link exists. If the link
   is deleted, it will not be recreated.