Implementing a Backup Catalog…on a Student Budget
Setting Up a CatBackup Machine
Intended Audience: Novice
You have some SQL experience, some Perl,
are comfortable with everyday Unix commands,
can use a Unix
based text editor.
Hardware and O.S.
Obtain a suitable PC. It doesn’t have to be very big or very fast (if you have a larger collection, build times will
benefit from a faster machine). Our first machine ran at 450Mhz. The rack machine we presently us
e is a
900Mhz Pentium III. It has three drives in a RAID configuration, resulting in about a 33gig “drive”, similar in
size to the first machine. We’ve used RedHat 7.2 and 7.3 in the past, and are now using 8.0 on our production
RedHat still has
RedHat Linux (currently version 9) as we know it available through April 2004. Go to
to download (free). After that, I imagine that it will be
available from the many mirror sites for some time to come. If you
want to buy it, you’ll probably want to
buy it now. Fedora is what’s replacing it; RedHat says it’s for “developers and early high
tech enthusiasts using
Linux in non
critical computing environments”. I think that’s their way of saying that they’re not goi
officially support it. Go to
for details on Fedora. To
download Fedora, go to
. If you’re feeling adventurous, you could try
one of the many other Linux distributions.
The instructions here address RedHat Linux 8, but should probably be the same, if not similar, for the other
versions or distributions.
If you want to download, you’ll need to have a fast connection. You could do it over a phone modem , but it
a very long time (days) if it worked at all. You’ll be downloading around two gigs of data in three
pieces. Select “burn disk image” when you put the .iso files on your blank CDs.
If you’re new to the system administrations aspect, I recommend you look at
the RedHat installation guide and
if you feel uncomfortable about doing some of that, having a knowledgeable person assist you might be a good
idea. Security consideration: you may want to consult with someone knowledgeable on this to make sure that
ill have an at least reasonably secure configuration.
Install Linux on your machine (start with disk 1). The install process (wizard) walks you through the steps.
You’ll want to setup adequately large partitions except for /usr/local, which should be “fil
l to disk”, or “free
space”. We have the following partitions on our catbackup machine:
free or remaining space
in the right column are approximate allocations. You can choose Disk Druid or fdisk to do this.
Your installation probably includes Perl and Apache; you may want to install newer (more secure?) versions. I
had Perl 5.6.1, and Apache 1.3.23 (tes
t bed machine not accessible outside the university, so that’s ok;
production machine has newer versions installed).
You’ll need to make some changes to the
file for Apache; do this as root. This file is often found in
. If it’s
This should show you where to find it.
Changes needed for httpd.conf:
(occurs more than once, leave defaults as is. you may need to add ExecCGI)
(search for this line)
script.cgi (and uncomment it)
(Aliases section, add the following)
Alias /images/ “/var/www/images/”
Your IP address refers to your catbackup machine. Some of these settings may already be as desired. If you
have multiple databases on your catbackup machine, you’ll have to create virtual URLs for each
database/catalog. You should find templates for this in the file. I used name
based virtual hosting. (Some of the
above lines will be in each virtual host
definition section, and will be different for each virtual host. You still
should do the above.)
For the remainder of the software, you’ll want the newest stable versions. In each case, I got the xxx.tar.gz files.
Download Postgres from the Post
(These instructions are for version 7.3.4 and the commands may be the same for your version)
, (source distribution), which w
ill be a subdirectory “down” one level
from the one in which you unzipped and untar’d the original file (the v’s represent the version number).
Enter the following commands:
su (to root, if you’re not already)
chown SOBackup /usr/local/pgsql/data
D /usr/local/pgsql/data >logfile 2>&1 &
These last two commands create a test database and log you into it;
logs you out of the database. While
you’re logged in, issue the
(slash ell) command to get a listing of databases. You
should see a database called
SOBackup. If you don’t, log out and use the createdb command above to create it.
Test things out to make sure all is ok.
Making sure Postgres starts on system bootup:
where xxx is the complete path spec for your source distribution directory referenced above.
You’ll have to edit this
chmod 755 postgresql
To verify that Postgres starts up, rebo
ot and then:
ef | grep postgres
As root, give the SOBackup account a password:
Get the tar’d and gzip’d modules, following this procedure:
go to the URL for each piece of software
click on the distribution for th
e module you’re getting
click the download button on the following page.
Get DBD (for Postgres) and DBI from the CPAN site.
for the DBI module, and (1.34 9/03)
for the DBD
ule (1.22 9/03)
Get the Unicode modules from CPAN also.
for Jcode (0.83 9/03)
The numbers in parentheses at the ends of the lines indicate the most current version available as of that date.
DBI and DBD enable the catbackup softwar
e to talk to the Postgres database(s). The other modules are used for
the Unicode character translation (you may not need them, depending on your Voyager version and Perl
These modules must be installed in the specified order:
The procedure is the same for each and you should be root.
gunzip and untar the module, then cd to its directory just created.
Look at the README file; make sure your version of
Perl is new enough.
Issue the following commands, and scan the output from each to make sure that things are ok:
Creating the Home Environment
Log in as user SOBackup, and in that home directory,
create the following directories:
pwd (should show /home/SOBackup, if not cd to it)
Protect the SOBackup directories to facilitate web access:
chmod 755 control
Log in as root and make the following directories you’re creating belong to user SOBackup:
cd to /usr/local
chown SOBackup SOB
chgrp SOBackup SOB
chown SOBackup indata
chgrp SOBackup ind
chown SOBackup temp
chgrp SOBackup temp
chown SOBackup words
chgrp SOBackup words
Log in as SOBackup:
cd /home/SOBackup (if you’re not already there)
s /usr/local/SOB/indata indata
s /usr/local/SOB/words words
These last three lines create soft links to these new directories. Above, you created directories in
Creating these links to the actual directories makes it look as if these directories are i
directories will contain a number of huge files that would not fit in the
partition. Recall that when
was created, it got all the remaining space on the hard drive. This is where all the incomng and in
is located. Recall from the Postgres setup that its data also resides in the
(If you will be using multiple databases, you’ll want to keep things separate. I created subdirectories under
is ok as is.)
to facilitate SOBackup’s use of Postgres, you’ll need to add its binaries’ location to your path.
(unless you set up a different shell) in
You’ll find a line in there that looks something like the following, but may cont
ain more data:
Using this example, you should end up with the following:
Note that the colon “:” separates the individual entries in the line.
Getting Catbackup Specific Files
is a number of files that are unique to the catbackup implementation. These are available from
Go to the section for EUGM 2004. Click on
download it. gunzip and untar it (see earlier instructions). T
his process will create, off of your current directory,
a directory called
. You should print out the
file there and follow its instructions on
customization. Further subdirectories contain the catbackup files, organized by category.
les for the Build Process
Login as SOBackup
Put the build files in here:
where X represents the database. This is useful when more than one database is being used on your catalog
backup machine. If you have only one database, you can leave the X part off. The rest of the installation refers
to a single database installat
ion. The order of the files shown above is also the order of execution.
Marc2unicode is presently required with Perl 5.8.0. Once you upgrade to a Voyager version that works with
Unicode, this step probably will not be needed, and your build process would
take appreciably less time.
is the “cron” script to call buildx.
merely provides a convenient way to call buildnew.pl, supplying it with X.
l handles the marching of the database versions through several generations, and
creates the current
version of the database.
is the backbone of the build process, and logs all activity.
The incoming data file is gunzip’d.
does the MARC to Unicode translation on the data file.
processes the cour
se reserve and opac suppression information in there. Some
housekeeping/clean up is done.
takes this preprocessed data and prepares it for use on the catbackup machine.
creates callnumber and bib tables and indexes. This run
but some overlap in processing is possible. You may need to adjust the value of the “wait”
variable in the
script to avoid this.
creates a dictionary of searchable words.
builds the other tables
and indexes that make up a backup catalog database.
Then “pointers” to the generations of the database are updated.
Lastly, the oldest version of the database is deleted.
contains data used for the course reserve handling.
There are two fiel
ds, the first is a department abbreviation (6 characters), and the second is the full
department name (remaining characters). This file is used in
. You will have to populate this
file with data for your institution.
l: this pr
ogram, along with the
file, ensures that items suppressed in OPAC, but
on course reserve, do show up in the backup catalog.
Apply the following protections:
chmod 755 *.pl *.sql *.script dobuild
chmod 664 dept.map wmuDB
It’s what the user sees when first interacting with your catbackup implementation. The user enters parameters
and starts a search here. This file should be called
, or you can call it something else, or put it
elsewhere, and ma
l a symlink to that other file (see use of the
command above). You will have
to customize it for your installation, particularly the “
” line, which needs a URL that references
your catbackup machine.
Protect this file:
For multiple databases, you’ll need separate HTML files for each database. I put them in their own directories
Login as root.
Put these files there:
The xxxsmall_logo.gif is an optional file supplied by you. It would contain a small
copy of an image associated
with this catalog (the one you’re backing up on this machine). You’ll probably want a different one for each
catalog if you have more than one. The xxxinstitution_logo.gif file is also optional. It typically would be an
your institution logo.
Install XsearchMe.cgi and, as root, apply the needed protection:
chmod 755 XSearchMe.cgi
Edit it, searching for
, then change the url in those lines to point to your machine. For multiple da
you’ll have multiple
files here, otherwise drop the X identifier.
HOW TO FEED YOUR CATBACKUP MACHINE
On Your Voyager Machine
There are several files needed for the extract process:
gets scheduled to run in cron. I
does just what its name implies.You’ll need to put the readonly username and password
for your database in the script.
queries the Voyager database to get the data for this backup extract, and sorts and gzips the re
handles the secureftp file transfer.
is copied to the transfer directory. when the cron process on catbackup finds it, the build process
is used as a data staging area for the extract process.
ts for cron
On your Voyager machine, run
from cron. On your catbackup machine,
should be scheduled in cron. It should run as SOBackup.
How to Use cron
A “cron job” is a task scheduled to run via Unix’s cron process. It takes
its instructions from a crontab file.
lists the contents of your crontab file,
(sorry!) on the file so that you can
edit it. The contents of the crontab file follow a specific format:
[minute] [hour] [day of month] [m
onth] [day of week] task
(starting with Sunday at 0)
Here’s the crontab entry to run the build for one of our catbackup databases:
0 3 * * 1,2,4,5 /home/SOBackup/Build/buildwmu.script >/tmp/wmub
will run at 3 AM on Monday, Tuesday, Thursday and Friday of each week. It can run on any
day of the month, so we specify “*” for any day, and it will run every month, thus the “*”. The
redirects program output intera
ctively seen on the screen (stdout) to the specified path and file. If we experience
problems, we can go look at
to check for error messages, etc. You should schedule your
build runs via cron as user SOBackup.
When the appropriate
scripts have been correctly set up with cron, you’ll have a fresh backup image on your
catbackup machine whenever the scripts run. In our case, we run the extract and build process on Monday,
Tuesday, Thursday, and Friday. We omit Wednesday because we have
a full backup of our Voyager machine
running that night. The build process is finished by around 7 a.m..
Searches of the backup catalog are logged. I created
to provide usage reports, useful for those times
when we rely on the backup catalog as
the catalog. The search process creates monthly log files. You tell
logcat.pl which logfile to use and which days to look at. The report, as supplied, is set up for our situation with
two institutions on two databases in one Voyager implementation. You’ll
need to edit
for your use.
Searching for “wmu” and “kvcc” will show you which areas to change.
Our Voyager box is a Sun UltraSPARCII with 4 processors, 4 gigs RAM, and over 80 gigs drive space at the
time of this writing. The bac
kup machine specs are listed earlier in this document.
We have somewhere over a million and quarter bib records. The extract takes about two and a half hours. The
build process takes about three and three quarter hours. Searches are very fast. Your milea
ge will vary.
I may have forgotten some details. You’ll have to be resourceful (like I was when I put this together and
modified the basic process, a lot). If something does not work, check permissions on related files and
at the web server log file may also be helpful. In the setup detailed above, that is
. I will not provide support. I have a job. I might answer simple quick questions.
You implement this software and anything associated with it at
COMMENTS ON COST:
Price of PC
as much as several thousand, or possibly free if you have an available machine
Price of Software
Cost of Labor
free (built into overhead)
Benefit of Implementation
priceless, when you need it! (
and it might not cost you anything!)