This document describes the steps necessary to install SAMGrid on a Grid3/OSG site. The SAMGrid system will use the established gatekeeper and its own jobmanager to handle jobs. The SAMGrid system will also use the VDT installation at the Grid3/OSG site, but does require the use of VO specific services. The SAMGrid system consists of a suite of servers running on the head node. In the implementation described here the system needs to run a single SAM server to communicate with the VO specific service provided by an offsite SAM station.
The procedure for installing and configuring SAMGrid at a Grid3/OSG site is outlined below with links to detailed information.
Installing SAMGrid At a Grid3/OSG Site
Register for FNAL product access.
The FNAL package management system is used to install the servers. As such the FNAL package management system needs to be installed. Access to FNAL packages requires registration. Please register at: http://www.fnal.gov/cd/forms/upd_registration.html
Bootstrap the FNAL package management system.
The FNAL packages ups and upd form the FNAL package management system.
To bootstrap the system from scratch see: http://www.fnal.gov/docs/products/ups/ReferenceManual/html/bootstrap.html#8463
The bootstrap files can be found at: ftp://ftp.fnal.gov/products/bootstrap/current/
Additional documentation can be found at: ftp://ftp.fnal.gov/products/bootstrap/current/manual_install.html
Note: The installer may wish to customize the installation. The standard installation installs the FNAL packages as the user products. To simplify administration of the SAM and SAMGrid servers, change the products account to the sam account after the installation so that all further product installations are done as user sam. The sam UID is 7816. Change the ownership of the files formerly owned by products to sam.
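The ownership change described in the note could be sketched as follows. This is only an illustration: the function names list_owned_by and reown_tree are hypothetical, and the root of the products tree is site specific and must be supplied by the installer.

```shell
# List everything under a directory tree that is owned by a given account.
#   $1: root of the FNAL products tree (site specific, an assumption here)
#   $2: account name or numeric UID to look for (the text uses products)
list_owned_by() {
    find "$1" -user "$2" -print
}

# Hand those files to a new owner (the text uses sam).
# Changing ownership to another account requires root privileges.
#   $1: tree root, $2: current owner, $3: new owner
reown_tree() {
    find "$1" -user "$2" -exec chown "$3" {} +
}
```

Running list_owned_by first shows what would be affected; a call such as reown_tree /path/to/products products sam, run as root, would then make the change.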
Install and configure SAMGrid.
Install the packages vdt, sam_gsi_config, tomcat, xmldb_server, xmldb_client, jim_config, sam_gsi_config_util, jim_job_managers, jim_sandbox, jim_gridftp, jim_advertise, sam, sam_bootstrap, sam_cp, sam_gridftp, sam_fcp, and server_run. These will bring in dependency packages.
The document that describes the installation of a SAMGrid execution site is at:
http://www-d0.fnal.gov/computing/grid/SAMGridManual.htm
Also read as preparation:
http://www-d0.fnal.gov/computing/grid/SAMGridFabricJobSubmission.htm
Omit the packages for a Submission site and a Monitoring site, and install the packages for an Execution site. All references to the products user should be changed to the sam user if as recommended above the sam user manages the FNAL package system. Also create a user account named samgrid. This is the account SAMGrid jobs will run as. Do install the FNAL vdt package per the instructions. This will not actually install vdt via pacman (Globus, Condor, etc.), but will enter the existence of vdt into the FNAL package database. After installation the vdt package is tailored. Here the installer can point the FNAL package database to the actual location of VDT used for Grid3/OSG operation.
Note: The vdt tailoring should be done before any other packages are installed.
The next package to install, sam_gsi_config, is the GSI security for SAM. As a Grid3/OSG site the host certificate already exists and instructions for that can be skipped. Before installing, as root create the directory /etc/grid-security/sam and make it owned by the sam user so that sam has write access to it. In this directory and its subdirectories the security devices for SAM are stored. Create the directories /etc/grid-security/sam/gridftp and /etc/grid-security/sam/jim_gridftp as user sam. This scheme separates the security for SAM from the security of the Grid3/OSG installation as much as possible. The only shared security file is the gatekeeper's grid-mapfile. When tailoring sam_gsi_config the location of the SAM certificates should be given as /etc/grid-security/sam. Tailor sam_gsi_config for a gatekeeper, jim_gridftp, and sam_gridftp.
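The directory layout described above might be prepared with a small helper like the one below. It is a sketch only: the function name make_sam_security_dirs is hypothetical, and it creates the whole tree in one pass as root and then changes the owner, rather than creating the subdirectories as user sam. The resulting ownership should be the same.

```shell
# Create the SAM security directory tree.
#   $1: base directory (the text uses /etc/grid-security/sam)
#   $2: optional owner account (the text uses sam); when omitted the
#       ownership is left alone, which allows a trial run in a scratch area
make_sam_security_dirs() {
    mkdir -p "$1/gridftp" "$1/jim_gridftp" || return 1
    if [ -n "${2:-}" ]; then
        chown -R "$2" "$1" || return 1
    fi
}
```

As root this would be invoked as make_sam_security_dirs /etc/grid-security/sam sam.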
After tailoring sam_gsi_config, a custom sam_gsi_config configuration is needed to achieve the proper separation of security devices. This is accomplished by editing the sam_gsi_config configuration file in the products database as follows:
$ setup sam_gsi_config
Edit $SAM_GSI_CONFIG_FILE.
Change the SAM_GSI_JIM_GRIDFTP_CERT_DIR value to /etc/grid-security/sam/jim_gridftp/certificates.
Change the SAM_GSI_JIM_GRIDFTP_MAPFILE value to /etc/grid-security/sam/jim_gridftp/certificates/grid-mapfile.
Run $ ups install_ca sam_gsi_config -q vdt as both sam and root to install the certificates from the CA.
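After editing, it can help to confirm that the two jim_gridftp settings carry the intended values. The sketch below makes no assumption about the syntax of the configuration file; it only prints the lines that mention the two variable names so they can be checked by eye. The function name show_jim_gridftp_settings is hypothetical.

```shell
# Print every line of a configuration file that mentions the two
# jim_gridftp security settings discussed above.
#   $1: path to the file (after "setup sam_gsi_config" this would be
#       $SAM_GSI_CONFIG_FILE)
show_jim_gridftp_settings() {
    grep -E 'SAM_GSI_JIM_GRIDFTP_(CERT_DIR|MAPFILE)' "$1"
}
```

Both printed values should point below /etc/grid-security/sam/jim_gridftp/certificates.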
To use SAM transport mechanisms a SAM service
certificate is needed from the DOEGrids CA. Please
see http://d0db.fnal.gov/sam/doc/install/fileTransfer.shtml#sam_gridftp
for more detailed (and contemporary) instructions on getting a SAM
service certificate. When the certificate arrives place it in
/etc/grid-security/sam and send the SAM
certificate subject to <d0sam-admin@fnal.gov> asking
that it be added to the SAM station master grid-mapfile.
After the SAM certificate is in place be sure to update the package default grid-mapfiles using the sam_gsi_get_gridmap product e.g.:
$ setup sam_gsi_config_util -q vdt
$ sam_gsi_get_gridmap --gridftp # etc.
$ sam_gsi_get_gridmap --help # for more info
Note: The grid user must be mapped to the local user samgrid (if it was so configured) in the gatekeeper grid-mapfile, and to local user sam in the jim_gridftp grid-mapfile.
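The two mappings in the note can be checked with a lookup along these lines. It assumes the usual grid-mapfile layout of a double-quoted certificate subject followed by the local account name; the function name mapped_user and the subject used in the example are hypothetical.

```shell
# Print the local account a certificate subject maps to in a grid-mapfile.
#   $1: certificate subject (DN), without the surrounding quotes
#   $2: path to the grid-mapfile
# Assumes each entry looks like:  "<subject>" <local account>
mapped_user() {
    awk -v dn="$1" 'index($0, "\"" dn "\"") == 1 { print $NF }' "$2"
}
```

For a given grid user, the gatekeeper grid-mapfile should then yield samgrid and /etc/grid-security/sam/jim_gridftp/certificates/grid-mapfile should yield sam.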
As per the instructions install tomcat and xmldb_server, and tailor them. Then be sure to run the xmldb server so it can be used for subsequent package installations. The XMLDB service created from these two packages is another VO specific service that, like the SAM station, can be run remotely. If use of an XMLDB service provided by the VO is secured, these packages need not be installed.
The local_storage location referred to during the configuration of jim_config is known as durable storage in that files there are stored temporarily but exist beyond the end of a job. For instance thumbnail files are stored there until ready for merging. It is the location from which the SAM stager at the execution site serves files. The files are served to the FSS at the SAM station for disposition either to a durable location or to tape. The SAM station location to be used is configured via:
$ ups configure_site jim_config
One of the questions is station name. The configuration of jim_config is stored in the file system at $PRODUCTS/jim_config/site/site_config and in the XMLDB database. Check it via the interface to the configuration:
$ setup jim_config
$ jim_config_manager_cmd.py gss
The SAM station name is also needed when configuring sam_batch_adapter, see below.
Install and configure SAM.
Install SAM per the instructions at http://d0db.fnal.gov/sam/doc/install/. The following packages and their dependencies should be installed: sam_station, sam_config, sam_bootstrap, sam_gridftp, and sam_cp. Ignore references to sam_bbftp as it is obsolete. Ignore references to sam_kerberos_rcp and sam_encp as they are not appropriate for this installation. See the SAM configuration document http://d0db.fnal.gov/sam/doc/install/samConfig.shtml and the sam_bootstrap document http://d0db.fnal.gov/sam/doc/install/samBootstrap.shtml. See the document describing SAM transport configuration http://d0db.fnal.gov/sam/doc/install/fileTransfer.shtml.
Since the site will be using an external SAM station,
when tailoring sam_bootstrap configure a stager only, with the options
--with-fss and --without-sm.
This will create a server list file with just one line similar to
stager station_prd v4_2_1_77 station_name --with-fss --without-sm --logfile=stager_log
where station_prd is the station
configuration identifier and station_name
is the SAM station name.
When sam_batch_adapter is configured
a config file is created in $SAM_BATCH_ADAPTER_CONFIG_DIR (setup
sam_batch_adapter to define this variable). The file must have a name
of the form
station-name__config__.py
where station-name is the name of the
SAM station being used by the execution site. In
this file the StationName_ variable should be set to
the SAM station to be used. Make sure the path to
the sam batch handler script is correct.
If a firewall or other network security is implemented at the site, then ensure that access to the head node is allowed for the SAM station and the SAMGrid submission sites. A list of the sites is available at http://samgrid.fnal.gov:8080/list_of_schedulers.php?.
Obtain the use of and configure an external SAM station.
An external SAM station is needed for SAMGrid operation (assuming one cannot install a full station at the Grid3/OSG site). The SAM station needs to be configured appropriately to serve the execution site. The configuration of the execution site must specify the SAM station to use. See jim_config and sam_batch_adapter above for details. Secure the use of a SAM station and an agreeable manager for use by the execution site. A SAM station physically close to the execution site is more efficient but any could be used.
Note: Use of an external SAM station is only possible if the worker nodes allow incoming network connections. This is because the SAM station must be able to do call backs to the client. Without this criterion satisfied the accessibility of the workers by the station can be provided by a station on the local network of the workers allowing direct access to them. Workers on private networks may use Network Address Port Translation (Masquerading) techniques to provide call back access without compromising security. Future SAMGrid releases will lift the requirement on client accessibility by implementing polling interfaces in place of call backs.
The SAM station must run a jim_gridftp data server to communicate
with the execution site. The SAM station must be configured for the
group mcc99 to run Monte Carlo jobs and group d0production to run
reprocessing jobs. The sam_cp package on the SAM station must be
configured to use sam_gridftp as the transport. The host name of the
execution site must be added to the SAM database (send an email to <d0sam-admin@fnal.gov>). The SAM station
must run an FSS server for the execution site. The
FSS must be configured to route files from the head
node to a location where they can be stored to tape or to a durable
location. As a rule of thumb, durable storage at the SAM station or
execution site should provide 500MB per processor at
the execution site. The SAM storage location at the execution site
(the jim_config
local_storage) needs to be registered with the SAM database as a
storage location. A SAM administrator can do that with commands
similar to:
octarine-clued0:~> samadmin add data disk \
    --fullpath=cmssrv10.fnal.gov:/storage/remote/data1/sam
octarine-clued0:~> samadmin add disk location \
    --fullpath=cmssrv10.fnal.gov:/storage/remote/data1/sam
Disk location cmssrv10.fnal.gov:/storage/remote/data1/sam has been registered: id = 3646, type = 'disk'
Send email to <d0sam-admin@fnal.gov> asking to add the
durable storage location to the SAM database.
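The 500MB per processor rule of thumb for durable storage is simple arithmetic, sketched here as a one-line helper. The function name durable_storage_mb is hypothetical and the processor count is whatever the execution site provides.

```shell
# Estimate the durable storage needed, in MB, from the rule of thumb
# of 500MB per processor at the execution site.
#   $1: number of processors at the execution site
durable_storage_mb() {
    echo $(( $1 * 500 ))
}
```

For example, durable_storage_mb 100 prints 50000, that is, roughly 50GB for a 100 processor site.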
Other considerations when installing SAMGrid.
The D-Zero Monte Carlo requires about 4GB of space for the sandbox on the worker nodes. Ensure that sufficient local disk space is available and configured on the worker nodes.
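A quick way to verify the sandbox space on a worker node is sketched below. The function name has_free_space is hypothetical, and the directory that holds the sandbox depends on the local configuration, so it must be supplied by the installer.

```shell
# Succeed if the filesystem holding a directory has at least the
# requested number of gigabytes available.
#   $1: directory where the sandbox will live (site specific)
#   $2: required free space in GB (the text suggests about 4)
has_free_space() {
    avail_kb=$(df -Pk "$1" | awk 'NR == 2 { print $4 }')
    need_kb=$(( $2 * 1024 * 1024 ))
    [ -n "$avail_kb" ] && [ "$avail_kb" -ge "$need_kb" ]
}
```

On each worker node something like has_free_space /path/to/sandbox 4 || echo "not enough sandbox space" would flag nodes that fall short.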
Note: The worker nodes of the cluster must be able to access the head node using the head node's fully qualified domain name. The worker nodes also need to know their own fully qualified domain name in order for sam_cp to work. The worker nodes must be time synchronized.
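Whether a worker node knows its own fully qualified domain name can be checked roughly as follows. This is a coarse sketch: the hypothetical helper is_fqdn only tests that the name contains a dot and neither starts nor ends with one, which does not prove that the name resolves.

```shell
# Succeed if a host name looks fully qualified, meaning it contains
# at least one dot and does not begin or end with a dot.
#   $1: host name to examine
is_fqdn() {
    case "$1" in
        .*|*.) return 1 ;;
        *.*)   return 0 ;;
        *)     return 1 ;;
    esac
}
```

On a Linux worker node, is_fqdn "$(hostname -f)" || echo "host name is not fully qualified" gives a first indication; the same test applied to the head node's name as configured on the workers covers the other half of the requirement.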
This section documents some tests to verify the operation of the site along with suggestions for diagnosing problems.
The execution site and SAM station should show up on the SAMGrid List of Resources at http://samgrid.fnal.gov:8080/list_of_resources.php?
As a partial test of the jim_gridftp head server get a valid grid proxy then:
$ jim_gridftp `hostname`:/path/to/anyfile `pwd`/tmp \
    --user-proxy=/tmp/x509_u1012 --server-type=head_server
where the proxy file is your proxy.
Check that the stager at the execution site is recognized by the SAM station by clicking on the SAM station entry on the SAM-At-A-Glance page: http://d0db-prd.fnal.gov/sam_local/SamAtAGlance/.
The xmldb server must be running for the package information to be stored in the database. You can check if it is running and browse the database by looking at http://your.configured.server:7080/Xindice/.
The globus and jobmanager logs are useful:
$ setup vdt
$ cd $GLOBUS_LOCATION/var
Check the logs globus-gatekeeper.log and jim_jobmanager_samgrid.log.
Check jim_config configuration:
$ setup jim_config
$ jim_config_manager_cmd.py gss
Check jim_gridftp configuration:
$ setup jim_config
$ setup jim_gridftp
$ jim_config_manager_cmd.py gca -n jim_gridftp -p
Check jim_advertise configuration:
$ setup jim_config
$ setup jim_advertise
$ jim_config_manager_cmd.py gcs -n jim_advertise
Check sam_gsi_config configuration:
$ setup sam_gsi_config
$ sam_gsi_read_config
Test sanity of the batch system configuration:
$ setup sam_batch_adapter
$ ${SAM_BATCH_ADAPTER_DIR}/etc/testBatchAdapters.py
This submits a dummy job using the adapter, then tries to look it up and interpret
its results.
Thanks to Gabriele Garzoglio, Robert Illingworth, and Joseph Kaiser for their invaluable help. All errors are mine alone. Comments, suggestions, corrections, bug reports are welcome.