Notes on the blat package transfer system
-----------------------------------------

Reason for its creation
-----------------------
Packages need to be transferred from dpkg_archive to remote sites in a timely manner.
Rsync was considered, but it has several problems:

    1) Does not handle symlinks in a suitable manner
    2) Works with all the files in the repository. Experience has
       shown that this can be very slow
    3) Still requires significant scripting in order to be useful

Blat can make several assumptions about the package system.
Blat will:
    Support multiple transfer target destinations
    Allow for rapid detection of new packages that need to be transferred
    Allow for multiple Releases to be synchronized
    Allow for all (not-closed) releases in a Project to be synchronized
    Be easily configured, including on the fly
    Transfer packages atomically
    Transfer a PackageList for future cleanup operations
    Provide logging and debug facilities

Overview of Blat
----------------
There are three main components in Blat:
    Daemon supervisor
        Responsible for starting and restarting configured daemons
    Transfer Daemons
        Responsible for the package sync operations for one target
        Multiple Daemons (targets) are supported
        Multiple Daemon types are supported
            dpkg_archive sync (original)
            s3Sync (AWS S3 bucket sync for CI/CD)
    On Target utilities
        A set of scripts that support Blat
        These are transferred to the target machine.

Each Blat Daemon performs three main operations, plus one optional operation:
    1) Fast package transfer
    2) Repository synchronization
    3) PackageList creation
    4) Package aging (Optional)

Each Blat target can perform the following:
    1) Package aging
    2) dpkg_archive content indexing

Fast package transfer
=====================
This is the mechanism whereby Blat detects the need to transfer a newly built
package to the target system.

It works by monitoring a directory of tags. It is the responsibility of Release
Manager to populate the directory.

The responsiveness of the detection can be configured, but a period of 5
seconds is suggested.
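
The tag monitoring described above can be pictured as a simple polling pass.
The following is only a sketch in Python (Blat itself is written in Perl); the
function names and the idea that each tag is an ordinary file in the directory
are assumptions made for illustration, not Blat's actual implementation.

```python
import os


def new_tags(tag_dir, seen):
    """Return names of tag files in tag_dir not yet in `seen`, oldest first."""
    entries = []
    with os.scandir(tag_dir) as it:
        for entry in it:
            if entry.is_file() and entry.name not in seen:
                entries.append((entry.stat().st_mtime, entry.name))
    # Sort by modification time, then by name to break ties.
    return [name for _mtime, name in sorted(entries)]


def poll_once(tag_dir, seen, transfer):
    """One polling pass: hand each unseen tag to `transfer` and remember it."""
    for name in new_tags(tag_dir, seen):
        transfer(name)   # hypothetical callback that starts the transfer
        seen.add(name)
```

A daemon loop would call poll_once() and then sleep for the configured period
(5 seconds in the suggestion above) before polling again.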

Repository synchronization
==========================
The daemon will request a list of packages that are present on the target and
determine the list of packages that should be on the target. Missing or
out-of-date packages will be transferred to the target. Excess packages are
left on the target.

Blat will request the target to create and transfer a list of packages.
This is done by invoking a small program on the target to perform the work.

Blat will interrogate the Release Manager database for Releases to be processed
and packages in those Releases.

A package will be transferred to the target if:
    * The package is required, but not present on the target
    * The time-stamps of the descpkg files differ

Package transfer may be delayed if the source package is writable, unless it
has been writable for longer than a configured time period.

The frequency of the Repository synchronization can be configured. A time of
several hours is suggested.
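
The transfer rules above can be summarised as a small decision function. This
is a sketch in Python under stated assumptions: the SourcePackage type, the
'writable_since' field and the 'max_writable_age' parameter are hypothetical
names, and how Blat really measures how long a package has been writable is
not described here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SourcePackage:
    descpkg_mtime: float                    # time-stamp of descpkg on the source
    writable_since: Optional[float] = None  # None: package is read-only


def transfer_decision(src, target_descpkg_mtime, now, max_writable_age):
    """Return 'transfer', 'delay' or 'skip' for one required package.

    target_descpkg_mtime is None when the package is absent on the target.
    """
    needed = (target_descpkg_mtime is None
              or target_descpkg_mtime != src.descpkg_mtime)
    if not needed:
        return "skip"
    # A writable package may still be under construction, so wait,
    # but only up to the configured period.
    if (src.writable_since is not None
            and now - src.writable_since <= max_writable_age):
        return "delay"
    return "transfer"
```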

PackageList creation
====================
Blat will create and send to the target a list of the package-versions that are
in the current set. This list may be used to clean out the package archive,
but this functionality has not yet been implemented.

Package aging
=============
Blat can be configured to delete packages that are no longer a part of the
current package-version set. There are 4 methods:

    1) None
       Packages will never be deleted by Blat on the target.
       The target file system will need to be managed to prevent it filling up.

    2) Immediate
       Packages will be deleted as soon as they are not a part of the current
       package-version set.

    3) Aged by blat master
       Packages will be marked for deletion and the blat master will delete
       the packages after a configured number of days.

    4) Aged by blat target
       Packages will be marked for deletion and the blat target will delete
       the packages after a configured number of days. This operation requires
       that a cron job be configured on the target machine.
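
The four methods can be compared with a short sketch. It is written in Python
for illustration only; the mode names, the 'marked_at' dictionary and the rule
that a package loses its mark when it rejoins the current set are assumptions,
not taken from the Blat sources. The two aged modes use the same logic here
and differ only in which machine would run it.

```python
SECONDS_PER_DAY = 86400


def packages_to_delete(mode, on_target, current_set, marked_at, now, max_age_days):
    """Return the packages that should be deleted in this pass.

    marked_at maps a package to the time it was first marked for deletion
    and is updated in place.
    """
    if mode not in ("none", "immediate", "aged_master", "aged_target"):
        raise ValueError("unknown aging mode: %r" % mode)
    stale = sorted(set(on_target) - set(current_set))
    if mode == "none":
        return []
    if mode == "immediate":
        return stale
    # Aged modes: mark on first sight, delete once the mark is old enough.
    doomed = []
    for pkg in stale:
        first_marked = marked_at.setdefault(pkg, now)
        if now - first_marked >= max_age_days * SECONDS_PER_DAY:
            doomed.append(pkg)
    # Assumption: a package that is current again is no longer marked.
    for pkg in list(marked_at):
        if pkg in current_set:
            del marked_at[pkg]
    return doomed
```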

dpkg_archive content indexing
=============================
Blat provides a utility that can be run by the transfer target, as a cron job,
that will maintain a list of files and folders in the package archive.

This list greatly simplifies the process of locating a file in the archive.
The user simply greps the file list, rather than searching the directory tree.

The file list is in a file .../dpkg_archive/.dpkg_archive/dpkg_archive_list.txt
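
As an illustration of why a flat list helps, the sketch below builds such a
list and then searches it. It is written in Python and is not the Blat utility
itself; the one-relative-path-per-line format and both function names are
assumptions.

```python
import os


def build_index(archive_root, index_path):
    """Write one relative path per line for every folder and file."""
    lines = []
    for dirpath, dirnames, filenames in os.walk(archive_root):
        dirnames.sort()  # gives a stable traversal order
        for name in sorted(dirnames + filenames):
            full = os.path.join(dirpath, name)
            lines.append(os.path.relpath(full, archive_root))
    # Write to a temporary file and rename, so readers never see a partial list.
    tmp_path = index_path + ".tmp"
    with open(tmp_path, "w") as fh:
        fh.write("\n".join(lines) + "\n")
    os.replace(tmp_path, index_path)


def search_index(index_path, needle):
    """Return every indexed path that contains `needle` (like a plain grep)."""
    with open(index_path) as fh:
        return [line.rstrip("\n") for line in fh if needle in line]
```

Searching the list is a single sequential read of one file, whereas searching
the archive itself means walking every package directory.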

S3 Bucket Delivery
==================
Blat has been extended to provide CI/CD support via an S3 bucket.
The s3Sync task will maintain a single S3 bucket with ZIP files of
packages from Releases that support S3Sync.

Host System Requirements
========================
    1) Unix
       It has been designed for a Unix environment - not Windows
    2) Perl
       Blat is written in Perl
    3) Java
       Required for the Database interface
    4) Shell
       Start and stop scripts are in shell
    5) Utilities
           ssh
           gtar
           gzip
           aws cli (for s3Sync)

Target System Requirements - dpkg_archive sync
==============================================
    1) Unix
       It has been designed for a Unix environment - not Windows
    2) Perl
    3) Shell
       Blat will execute a number of scripts on the target in order
       to control the process. These are in Shell and Perl
    4) Utilities
           ssh
           gtar
           gunzip
    5) User with write access to the dpkg_archive - (pkgadmin)
    6) Link from the user's home directory to the package archive
       This link is called 'dpkg_archive'

Shared requirements
===================
Blat uses ssh for the transfer process. It uses an 'identity' file to allow
passwordless authentication with the target. The public part of the identity
file must be appended to the target user's .ssh/authorized_keys file.

The private part of the identity file is held by the Blat Daemon.

Design assumptions
==================
Blat is designed to transfer dpkg_archive packages in one direction.

Blat makes assumptions about the structure of packages:
    - They contain a descpkg file
    - They are read-only when fully released
    - The contents of packages do not change
    - It is not necessary to check every file in the package

The Blat master is designed to run in a single directory tree.
The config file should be in a 'config' directory under the location
of the blat master program.

Installation :: Target System
=============================
    1) Create or acquire a user that has write access to the package archive

    2) Create or acquire a passwordless identity file and its associated
       public key. One set is available in the 'ssh' subdirectory.

       Append the public part of the identity file (id_rsa_pkg_admin.pub) to
       ~/.ssh/authorized_keys

       I suggest using 'ssh-copy-id'.

    3) Create a link from the user's home directory to dpkg_archive
       The link must be called dpkg_archive

    4) Transfer the blat receiver scripts to a directory accessible to the
       transfer user, e.g. ~/bin
       The required receiver files are:
           get_plist.pl
           receive_file
           receive_package
           delete_package
           pkg_mon.pl
           pkg_purge.pl
       Ensure the programs are executable by the transfer user.
       Only get_plist.pl is really needed. The others will be transferred
       when detected missing.

    5) Set up cron jobs (optional)
       Will be used to maintain package information
       Suggested crontab entry - may vary for each installation

Installation :: Host System
===========================
This section really deals with the configuration of a new target.

    1) Create a new config file in Blat's config directory - with a .conf
       suffix. This is best done by cloning an existing entry.

       Note: The blat master will automatically spawn a daemon as soon
             as a new config file is seen. It's best to create the file elsewhere
             and copy it to the directory when ready.

       Note: The Blat daemon will detect changes to its own config file and
             re-read it on the fly.
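
The two notes above amount to watching the config directory for new and
modified .conf files. The following Python sketch shows one way such a check
could work; it is not the blat master's code, the function name is
hypothetical, and it folds the master's "new file" check and the daemon's
"changed file" check into a single pass for brevity.

```python
import glob
import os


def scan_configs(config_dir, known):
    """Compare *.conf files with `known` (path -> mtime).

    Returns (new, changed, removed) lists of paths and updates `known`.
    """
    pattern = os.path.join(config_dir, "*.conf")
    current = {path: os.path.getmtime(path) for path in glob.glob(pattern)}
    new = sorted(p for p in current if p not in known)
    changed = sorted(p for p in current
                     if p in known and current[p] != known[p])
    removed = sorted(p for p in known if p not in current)
    known.clear()
    known.update(current)
    return new, changed, removed
```

Because only names ending in .conf are considered, a half-written file is
never picked up as long as it is prepared elsewhere and copied in when ready.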

Useful Tricks
=============

    kill -usr1 pid-of-daemon
        Will force the daemon to perform a repository sync check.

    kill -hup pid-of-daemon
        Will force the daemon to roll its own log files

    kill pid-of-daemon
        Will force the daemon to exit. It will be restarted.

    Remove the daemon's pid file
        Will force the daemon to exit. It will be restarted.
        Useful for debugging on a live system

    kill -usr1 pid_of_master
        Will signal -usr1 to all daemons
        Will force all daemons to perform a repository sync check.

    kill -hup pid_of_master
        Will signal -hup to all daemons
        Will force all daemons to roll their own log files

    kill pid_of_master
        Will shut down system gracefully by sending kill to all
        children.

    ssh-to <name or ip address>
        Will ssh to the target machine as the pkgadmin user

    ssh-copy-id -i ssh/id_rsa_pkg_admin pkgadmin@<name or ip address>
        Will copy the public part of the ssh identity to the target machine
        You will need the password of the 'pkgadmin' user as configured on the target machine
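
The signal behaviour of a daemon (usr1 for a sync check, hup to roll logs,
plain kill to exit) follows a common Unix pattern: the handler only records
the request and the main loop acts on it later. A sketch of that pattern in
Python follows; the class and function names are hypothetical and the real
daemons are Perl programs.

```python
import os
import signal


class DaemonFlags:
    """Requests recorded by the signal handlers for the main loop to act on."""

    def __init__(self):
        self.sync_requested = False
        self.roll_logs_requested = False
        self.exit_requested = False


def install_handlers(flags):
    """Map SIGUSR1, SIGHUP and SIGTERM onto the three flags."""

    def on_usr1(signum, frame):
        flags.sync_requested = True

    def on_hup(signum, frame):
        flags.roll_logs_requested = True

    def on_term(signum, frame):
        flags.exit_requested = True

    signal.signal(signal.SIGUSR1, on_usr1)
    signal.signal(signal.SIGHUP, on_hup)
    signal.signal(signal.SIGTERM, on_term)  # plain 'kill' sends SIGTERM
```

Keeping the handlers this small avoids doing real work (database queries,
file transfers) in signal context.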

Debug verbosity is controlled via the 'verbose' config item

The pkg.xxxx config items are very special.
    If the named package-version is a symlink, then both the
    link and the package it addresses will be transferred.
    The link MUST address another version of the same package.
    This is intended to support the 'jats2_current' link.
    When a new version of JATS is released, then the new package
    will be transferred, as well as the new link.
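
The symlink rule can be sketched as follows. This Python example assumes an
archive layout in which each package directory holds one entry per version,
with the link sitting beside the versions; that layout and the function name
are assumptions made for illustration.

```python
import os


def items_to_transfer(package_dir, version):
    """Return the entries to transfer for one pkg.xxxx item.

    For a symlink this is the addressed version followed by the link itself.
    """
    path = os.path.join(package_dir, version)
    if not os.path.islink(path):
        return [version]
    target = os.readlink(path)
    resolved = os.path.normpath(os.path.join(package_dir, target))
    # The link must address another version of the same package.
    if os.path.dirname(resolved) != os.path.normpath(package_dir):
        raise ValueError("link %s leaves its package: %s" % (version, target))
    return [os.path.basename(resolved), version]
```

Sending the addressed version before the link means the link never points at
a package that has not yet arrived on the target.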

Config items that control a time period allow the following suffixes:
    s - Seconds. Same as no suffix
    m - Minutes
    h - Hours
    d - Days
    Multiple are allowed, e.g. 1h10m
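
A small parser makes the suffix rules concrete. This is a Python sketch, not
Blat's own parsing code; it assumes that several suffixed parts simply add
up, and it rejects forms the notes do not mention (such as a bare number
following a suffixed part).

```python
import re

_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def parse_duration(text):
    """Convert a value such as '90', '5s' or '1h10m' into seconds."""
    text = text.strip()
    if re.fullmatch(r"\d+", text):
        return int(text)  # no suffix means seconds
    parts = re.findall(r"(\d+)([smhd])", text)
    # Re-join the matched parts to make sure nothing was skipped over.
    if not parts or "".join(num + unit for num, unit in parts) != text:
        raise ValueError("bad time period: %r" % text)
    return sum(int(num) * _UNIT_SECONDS[unit] for num, unit in parts)
```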

Config items that control a file size in blocks allow the following suffixes:
    k - Kilobytes (Same as no suffix)
    b - Blocks (Same as no suffix)
    m - Megabytes
    g - Gigabytes
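
The size suffixes can be handled the same way. In this Python sketch a block
is taken to be one kilobyte (since k and b both equal "no suffix") and the
larger units use a multiplier of 1024; the multiplier and the function name
are assumptions.

```python
import re

# Number of blocks represented by each suffix; '' is the no-suffix case.
_UNIT_BLOCKS = {"": 1, "k": 1, "b": 1, "m": 1024, "g": 1024 * 1024}


def parse_size_blocks(text):
    """Convert a value such as '500', '500k', '2m' or '1g' into blocks."""
    match = re.fullmatch(r"(\d+)([kbmg]?)", text.strip())
    if match is None:
        raise ValueError("bad size: %r" % text)
    return int(match.group(1)) * _UNIT_BLOCKS[match.group(2)]
```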


ToDo
====
    1) Better handling of soft-links for core_devl
       Works, but it's prone to error
       There is no test to ensure the link exists. If the link
       is deleted, then it won't be recreated.