Notes on the blat package transfer system
-----------------------------------------

Reason for its creation
-----------------------
There is a need to transfer packages from dpkg_archive to remote sites in a
timely manner. Rsync was considered, but it has several problems:

1) It does not handle symlinks in a suitable manner
2) It works with all the files in the repository. Experience has
   shown that this can be very slow
3) It still requires significant scripting in order to be useful

Blat can make several assumptions about the package system.
Blat will:
    Support multiple transfer target destinations
    Allow for rapid detection of new packages that need to be transferred
    Allow for multiple Releases to be synchronized
    Allow for all (not-closed) releases in a Project to be synchronized
    Be easily configured - and can be configured on the fly
    Atomically transfer packages
    Transfer a PackageList for future cleanup operations
    Provide logging and debug facilities
 
Overview of Blat
----------------
There are three main components in Blat:
    Daemon supervisor
        Responsible for starting and restarting configured daemons
    Transfer Daemons
        Responsible for the package sync operations for one target
        Multiple Daemons (targets) are supported
        Multiple Daemon types are supported
            dpkg_archive sync (original)
            s3Sync (AWS S3 bucket sync for CI/CD)
    On Target utilities
        A set of scripts that support Blat
        These are transferred to the target machine.

Each Blat Daemon performs the following main operations:
    1) Fast package transfer
    2) Repository synchronization
    3) PackageList creation
    4) Package aging (Optional)

Each Blat target can perform the following:
    1) Package aging
    2) dpkg_archive content indexing
 
Fast package transfer
===============================
This is the mechanism whereby Blat detects the need to transfer a newly built
package to the target system.

It works by monitoring a directory of tags. It is the responsibility of Release
Manager to populate the directory.

The responsiveness of the detection can be configured, but a period of 5
seconds is suggested.
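
The tag-directory polling described above could be sketched as follows. This is
an illustration in Python only (Blat itself is written in Perl). The function
names, the handle callback and the idea of one plain file per tag are
assumptions, not Blat's actual interface.

```python
import os
import time

def scan_tags(tag_dir, seen):
    """Return the tag names that have appeared since the previous scan."""
    # Assumption: each tag is a plain file inside tag_dir.
    with os.scandir(tag_dir) as entries:
        current = {entry.name for entry in entries if entry.is_file()}
    fresh = sorted(current - seen)
    seen.update(fresh)           # remember them so each tag is reported once
    return fresh

def watch(tag_dir, handle, period=5.0):
    """Poll tag_dir forever, passing each new tag name to handle()."""
    seen = set()
    while True:
        for tag in scan_tags(tag_dir, seen):
            handle(tag)          # hypothetical hook that starts the transfer
        time.sleep(period)       # 5 seconds is the period suggested above
```

A shorter period gives faster detection at the cost of more directory scans.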
 
Repository synchronization
===============================
The daemon will request a list of packages that are present on the target and
determine the list of packages that should be on the target. Packages that are
missing or out of date will be transferred to the target. Excess packages are
left on the target.

Blat will request the target to create and transfer a list of packages.
This is done by invoking a small program on the target to perform the work.

Blat will interrogate the Release Manager database for the Releases to be
processed and the packages in those Releases.

A package will be transferred to the target if:
    * The package is required, but not present on the target
    * The time-stamps of the descpkg files differ

Package transfer may be delayed if the source package is writable, unless it
has been writable for longer than a configured time period.
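
The transfer rules above, including the delay for writable packages, can be
expressed as a single decision per required package. The sketch below is in
Python for illustration. The SourcePackage type, its field names and the
max_writable parameter are assumptions. How Blat actually measures how long a
package has been writable is not stated in these notes.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class SourcePackage:
    stamp: int          # time-stamp of the descpkg file in the source archive
    writable: bool      # True while the package is not yet read-only
    writable_for: int   # seconds the package has been writable so far

def should_transfer(src: SourcePackage, target_stamp: Optional[int],
                    max_writable: int) -> bool:
    """Decide whether one required package should be sent to the target now.

    target_stamp is the descpkg time-stamp reported by the target, or None
    when the package is not present there.
    """
    if target_stamp is not None and target_stamp == src.stamp:
        return False    # present on the target and the time-stamps agree
    if src.writable and src.writable_for <= max_writable:
        return False    # possibly still being written: try again later
    return True
```

Packages that are on the target but not required never reach this function,
which matches the rule that excess packages are left alone.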
 
The frequency of the Repository synchronization can be configured. A time of
several hours is suggested.
 
PackageList creation
===============================
Blat will create and send to the target a list of the package-versions that
are in the current set. This list may be used to clean out the package archive,
but this functionality has not yet been implemented.
 
Package aging
=============
Blat can be configured to delete packages that are no longer a part of the
current package-version set. There are 4 methods:

1) None
   Packages will never be deleted by Blat on the target.
   The target file system will need to be managed to prevent it filling up.

2) Immediate
   Packages will be deleted as soon as they are not a part of the current
   package-version set.

3) Aged by blat master
   Packages will be marked for deletion and the blat master will delete
   the packages after a configured number of days.

4) Aged by blat target
   Packages will be marked for deletion and the blat target will delete
   the packages after a configured number of days. This operation requires
   that a cron job be configured on the target machine.
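
Methods 3 and 4 apply the same rule and differ only in which machine runs it.
A minimal sketch of that rule, in Python for illustration, is shown below. The
function name and the marked mapping are assumptions. These notes do not say
how Blat records when a package was marked for deletion.

```python
SECONDS_PER_DAY = 24 * 60 * 60

def expired_packages(marked, now, age_days):
    """Return the package-versions whose marking is at least age_days old.

    marked maps a package-version name to the time (seconds since the epoch)
    at which it was marked for deletion. now is the current time in the
    same units.
    """
    cutoff = now - age_days * SECONDS_PER_DAY
    return sorted(name for name, marked_at in marked.items()
                  if marked_at <= cutoff)
```

With age_days set to 0 every marked package is returned at once, which is the
behaviour of the 'Immediate' method.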
 
dpkg_archive content indexing
=============================
Blat provides a utility that can be run on the transfer target, as a cron job,
to maintain a list of the files and folders in the package archive.

This list greatly simplifies the process of locating a file in the archive.
The user simply greps the file list, rather than searching the directory tree.

The file list is held in .../dpkg_archive/.dpkg_archive/dpkg_archive_list.txt
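
For scripted use, the same search can be done without grep. The sketch below is
in Python for illustration and assumes the list holds one path per line, which
these notes do not confirm. The function name is hypothetical and the location
of the list file is passed in by the caller.

```python
import re

def find_in_list(list_path, pattern):
    """Return the lines of the file list that match a regular expression."""
    rx = re.compile(pattern)
    # Undecodable bytes are replaced rather than aborting the search.
    with open(list_path, encoding="utf-8", errors="replace") as handle:
        return [line.rstrip("\n") for line in handle if rx.search(line)]
```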
 
S3 Bucket Delivery
===============================
Blat has been extended to provide CI/CD support via an S3 bucket.
The s3Sync task will maintain a single S3 bucket with ZIP files of
packages from Releases that support S3Sync.
 
Host System Requirements
========================
1) Unix
   It has been designed for a Unix environment - not Windows
2) Perl
   Blat is written in Perl
3) Java
   Required for the Database interface
4) Shell
   Start and stop scripts are in shell
5) Utilities
    ssh
    gtar
    gzip
    aws cli (for s3Sync)
 
Target System Requirements - dpkg_archive sync
==============================================
1) Unix
   It has been designed for a Unix environment - not Windows
2) Perl
3) Shell
   Blat will execute a number of scripts on the target in order
   to control the process. These are in Shell and Perl
4) Utilities
    ssh
    gtar
    gunzip
5) User with write access to the dpkg_archive - (pkgadmin)
6) Link from the user's home directory to the package archive
   This link is called 'dpkg_archive'
 
Shared requirements
===================
Blat uses ssh for the transfer process. It uses an 'identity' file to allow
passwordless authentication with the target. The public part of the identity
file must be appended to the target user's .ssh/authorized_keys file.

The private part of the identity file is held by the Blat Daemon.
 
Design assumptions
================================================================================
Blat is designed to transfer dpkg_archive packages in one direction.

Blat makes assumptions about the structure of a package:
    - It contains a descpkg file
    - It is read-only when fully released
    - Its contents do not change
    - It is not necessary to check every file in the package

The Blat master is designed to run in a single directory tree.
The config file should be in a 'config' directory under the location
of the blat master program.
 
Installation :: Target System
=============================
1) Create or acquire a user that has write access to the package archive

2) Create or acquire a passwordless identity file and its associated public
   key. One set is available in the 'ssh' subdirectory.

   Append the public part of the identity file (id_rsa_pkg_admin.pub) to
   ~/.ssh/authorized_keys

   I suggest using 'ssh-copy-id'.

3) Create a link from the user's home directory to dpkg_archive
   The link must be called dpkg_archive

4) Transfer the blat receiver scripts to a directory accessible to the
   transfer user, e.g. ~/bin
   The required receiver files are:
        get_plist.pl
        receive_file
        receive_package
        delete_package
        pkg_mon.pl
        pkg_purge.pl
   Ensure the programs are executable by the transfer user.
   Only get_plist.pl is strictly needed. The others will be transferred
   when they are detected as missing.

5) Set up cron jobs (optional)
   These will be used to maintain package information
   Suggested crontab entry - may vary for each installation
 
Installation :: Host System
=============================
This section really deals with the configuration of a new target.

1) Create a new config file in Blat's config directory - with a .conf
   suffix. This is best done by cloning an existing entry.

   Note: The blat master will automatically spawn a daemon as soon
   as a new config file is seen. It's best to create the file elsewhere
   and copy it to the directory when ready.

   Note: The Blat daemon will detect changes to its own config file and
   re-read it on the fly.
 
Useful Tricks
=============

kill -usr1 pid-of-daemon
    Will force the daemon to perform a repository sync check.

kill -hup pid-of-daemon
    Will force the daemon to roll its own log files

kill pid-of-daemon
    Will force the daemon to exit. It will be restarted.

Remove the daemon's pid file
    Will force the daemon to exit. It will be restarted.
    Useful for debugging on a live system

kill -usr1 pid_of_master
    Will signal -usr1 to all daemons
    Will force all daemons to perform a repository sync check.

kill -hup pid_of_master
    Will signal -hup to all daemons
    Will force all daemons to roll their own log files

kill pid_of_master
    Will shut down the system gracefully by sending kill to all
    children.

ssh-to <name or ip address>
    Will ssh to the target machine as the pkgadmin user

ssh-copy-id -i ssh/id_rsa_pkg_admin pkgadmin@<name or ip address>
    Will copy the public part of the ssh identity file to the target machine
    You will need the password of the 'pkgadmin' user as configured on the target machine
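
The signal behaviour listed in the kill entries above can be pictured as a set
of handlers that only set flags for the daemon's main loop to act on. This is a
sketch in Python for illustration, not Blat's Perl code. The class, the
function names and the flag names are assumptions. A plain 'kill' sends
SIGTERM.

```python
import signal

class DaemonState:
    """Flags that the daemon's main loop would poll between operations."""
    def __init__(self):
        self.sync_now = False    # set by USR1: run a repository sync check
        self.roll_logs = False   # set by HUP: roll the log files
        self.running = True      # cleared by TERM: exit (and be restarted)

def make_handlers(state):
    """Build one handler per signal. Each handler only records the request."""
    def on_usr1(signum, frame):
        state.sync_now = True
    def on_hup(signum, frame):
        state.roll_logs = True
    def on_term(signum, frame):
        state.running = False
    return {signal.SIGUSR1: on_usr1,
            signal.SIGHUP: on_hup,
            signal.SIGTERM: on_term}

def install_handlers(state):
    """Register the handlers. This must be called from the main thread."""
    for signum, handler in make_handlers(state).items():
        signal.signal(signum, handler)
```

Keeping the handlers this small means the real work is done in the main loop,
at a point where it is safe to start a sync or reopen a log file.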
 
Debug verbosity is controlled via the 'verbose' config item

The pkg.xxxx config items are very special.
If the named package-version is a symlink, then both the
link and the package it addresses will be transferred.
The link MUST address another version of the same package.
This is intended to support the 'jats2_current' link.
When a new version of JATS is released, the new package
will be transferred, as well as the new link.
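
The symlink rule for pkg.xxxx items can be sketched as below, in Python for
illustration. The function name is hypothetical, and the archive layout of
<archive>/<package>/<version> is an assumption made for the example. The sketch
does not check that the addressed version actually exists.

```python
import os

def versions_to_transfer(archive, package, version):
    """Return the version names to send for one pkg.xxxx config item.

    For a symlink the addressed version comes first, followed by the link
    itself, so that the link never arrives before the package it addresses.
    """
    package_dir = os.path.realpath(os.path.join(archive, package))
    entry = os.path.join(package_dir, version)
    if not os.path.islink(entry):
        return [version]
    target = os.path.realpath(entry)
    if os.path.dirname(target) != package_dir:
        raise ValueError("link must address another version of the same package")
    return [os.path.basename(target), version]
```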
 
Config items that control a time period allow the following suffixes:
    s - Seconds. Same as no suffix
    m - Minutes
    h - Hours
    d - Days
Multiple suffixes may be combined, e.g. 1h10m
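
As an illustration of the time suffix rules, the sketch below converts a period
to seconds. It is in Python and is not Blat's own parser. The function name is
hypothetical, and accepting only lower-case suffixes is an assumption.

```python
import re

# Seconds per suffix. A number with no suffix is taken as seconds.
_UNIT_SECONDS = {"": 1, "s": 1, "m": 60, "h": 3600, "d": 86400}
_TOKEN = re.compile(r"(\d+)([smhd]?)")

def parse_period(text):
    """Convert a period such as '90', '5m' or '1h10m' to seconds."""
    text = text.strip()
    pos = 0
    total = 0
    while pos < len(text):
        match = _TOKEN.match(text, pos)
        if match is None:
            raise ValueError("bad time period: %r" % text)
        total += int(match.group(1)) * _UNIT_SECONDS[match.group(2)]
        pos = match.end()
    if pos == 0:
        raise ValueError("empty time period")
    return total
```

For example, parse_period("1h10m") adds 3600 and 600 to give 4200 seconds.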
 
Config items that control a file size in blocks allow the following suffixes:
    k - Kilobytes (Same as no suffix)
    b - Blocks    (Same as no suffix)
    m - Megabytes
    g - Gigabytes
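
The size suffixes can be handled in the same spirit. The sketch below, in
Python for illustration, returns a size in blocks. The function name is
hypothetical. Treating a megabyte as 1024 blocks and a gigabyte as 1024 * 1024
blocks is an assumption; the notes only say that a block and a kilobyte are the
same unit.

```python
import re

# Multipliers in blocks. The table above equates a block with a kilobyte.
_UNIT_BLOCKS = {"": 1, "k": 1, "b": 1, "m": 1024, "g": 1024 * 1024}

def parse_size_blocks(text):
    """Convert a size such as '500', '500k', '2m' or '1g' to blocks."""
    match = re.fullmatch(r"(\d+)([kbmg]?)", text.strip())
    if match is None:
        raise ValueError("bad size: %r" % text)
    return int(match.group(1)) * _UNIT_BLOCKS[match.group(2)]
```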
 
ToDo
======================
1) Better handling of soft-links for core_devl
   Works, but it's prone to error
   There is no test to ensure the link exists. If the link
   is deleted, then it won't be recreated.