
RunSAS

This shell script lets you run a single SAS program or a batch of SAS programs from a Bash environment, with several interactive and non-interactive execution modes. It is well suited to SAS environments that are restrictive by design and need a batch execution tool, especially for running batches of SAS Data Integration Studio (DI) jobs.

Introduction

runSAS is essentially a Bash shell script designed to execute SAS programs or SAS Data Integration Studio jobs. It is feature-rich, with support for concurrency (as flows), job failure recovery options, email notifications, log monitoring, and many useful interactive and non-interactive modes.

The primary motivation behind this side project was to provide existing SAS 9.x environments with a simple CLI-based tool to manage SAS programs/jobs without the need for additional third-party software.

This is useful for SAS sites where a third-party job scheduler such as LSF or Control-M is not installed. Projects can extend runSAS to fit their needs and, preferably, contribute their changes back.

Screenshots

  • Flows: runSAS in action
  • Load balancing: runSAS flows
  • Error handling: runSAS on error

Prerequisites

A SAS 9.x environment (Linux) with the SAS BatchServer component, or an equivalent launcher script (e.g., sas.sh, sasbatch.sh), is required for runSAS to execute. This is typically present in every SAS 9.x installation.

All other script dependencies are checked at every launch of the script automatically.
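
As a rough illustration of the prerequisite above, the following sketch checks that a SAS batch launcher script exists and is executable before a batch is started. The function name check_sas_batch_script is hypothetical and not part of runSAS, and the path in the example call is only the sample location used elsewhere in this README; adjust both to your site.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight helper (not part of runSAS): verify that the SAS
# batch launcher script exists and is executable.
check_sas_batch_script() {
    local batch_dir="$1"   # e.g. /SASInside/SAS/Lev1/SASApp/BatchServer
    local sas_sh="$2"      # e.g. sasbatch.sh
    local script_path="${batch_dir%/}/${sas_sh}"   # tolerate a trailing slash

    if [ ! -f "$script_path" ]; then
        echo "Missing: $script_path" >&2
        return 1
    fi
    if [ ! -x "$script_path" ]; then
        echo "Not executable: $script_path" >&2
        return 2
    fi
    echo "OK: $script_path"
}

# Example call with the sample paths from this README (assumed, site-specific).
if check_sas_batch_script "/SASInside/SAS/Lev1/SASApp/BatchServer" "sasbatch.sh"; then
    echo "SAS batch script found."
else
    echo "SAS batch script not usable; check your installation paths."
fi
```

The helper returns distinct codes for "missing" (1) and "not executable" (2) so a calling script can report the two cases differently.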

Get started?

  • Download runSAS.sh (or clone the repo) and transfer it to a SAS compute server (Linux based)
  • Open runSAS.sh in edit mode and:
    • Set the user parameters in the header section as per your SAS environment configuration
    • Specify the list of the job(s)/program(s) you want to run with dependencies
  • Run the script with the ./runSAS.sh command as a user who has job/program execution privileges (both OS and SAS Metadata privileges)

Tip: The script has many invocation options and useful features, all of which are discussed in later sections.

Configuring Parameters

runSAS has 4 user parameter sections within the script:

  • SAS 9.4 parameters – Provide SAS environment details

    • SAS_HOME_DIRECTORY: SAS Home directory (e.g.: /SASInside/SASHome)
    • SAS_INSTALLATION_ROOT_DIRECTORY: SAS installation root directory (e.g.: /SASInside/SAS)
    • SAS_APP_SERVER_NAME: Name of the SAS Application Server Context (e.g.: SASApp)
    • SAS_LEV: SAS level (e.g.: Lev1 or Lev2 ...)
    • SAS_DEFAULT_SH: SAS shell script name that executes the SAS program/job (e.g.: sasbatch.sh)
    • SAS_APP_ROOT_DIRECTORY: Path for the SAS Application Server Context directory (e.g.: /SASInside/SAS/Lev1/SASApp)
    • SAS_BATCH_SERVER_ROOT_DIRECTORY: Path for the SAS Batch Server directory (e.g.: /SASInside/SAS/Lev1/SASApp/BatchServer)
    • SAS_LOGS_ROOT_DIRECTORY: Path for the logs directory (e.g.: /SASInside/SAS/Lev1/SASApp/BatchServer/Logs)
    • SAS_DEPLOYED_JOBS_ROOT_DIRECTORY: Path for the deployed jobs directory (e.g.: /SASInside/SAS/Lev1/SASApp/SASEnvironment/SASCode/Jobs)
  • Job/flow list – Provide a list of jobs/flows to run (append the optional parameters to the mandatory parameters with no whitespace)

    • Syntax: flow-id|flow-nm|job-id|job-nm|dependent-job-id|dependency-type|job-rc-max|job-run-flag|options|sub-options|sasapp-dir|batchserver-dir|sas-sh|log-dir|job-dir|

    • Details:

      • flow-id: Flow identifier (has to be a number)
      • flow-nm: Flow name (no spaces allowed in the name)
      • job-id: Job identifier (has to be a number)
      • job-nm: Job name (typically a deployed job file name, no spaces allowed in the name)
      • dependent-job-id: Job ID(s) of the dependent job(s); if there is more than one, specify them as a comma-delimited list (e.g.: 1,2,3) or as a hyphenated range (e.g.: 1-3)
      • dependency-type: Possible values here are AND (i.e., run after all of its dependents have completed) and OR (i.e., run if at least one of the dependents has completed)
      • job-rc-max: Specify the limit (e.g., 0 or 4), return codes above this will halt the run
      • job-run-flag: Y to run this job, N to deactivate it in the flow
      • options (optional): Possible values here are --prompt (the user will be prompted to continue) and --server (specific SAS server options can be provided for a job to override the global options)
      • sub-options (optional if options is not --server): Name of the SAS Application Server Context (e.g.: SASApp)
      • sasapp-dir (optional if options is not --server): Path for the SAS Application Server Context directory (e.g.: /SASInside/SAS/Lev1/SASApp)
      • batchserver-dir (optional if options is not --server): Path for the SAS Batch Server directory (e.g.: /SASInside/SAS/Lev1/SASApp/BatchServer)
      • sas-sh (optional if options is not --server): SAS shell script name that executes the SAS program/job (e.g.: sasbatch.sh)
      • log-dir (optional if options is not --server): Path for the logs directory (e.g.: /SASInside/SAS/Lev1/SASApp/BatchServer/Logs)
      • job-dir (optional if options is not --server): Path for the deployed jobs directory (e.g.: /SASInside/SAS/Lev1/SASApp/SASEnvironment/SASCode/Jobs)

    Example (specifying a flow with jobs):

    1|Flow_A|1|Job_1|1|AND|4|Y|
    1|Flow_A|2|Job_2|2|AND|0|Y|--prompt
    1|Flow_A|3|Job_3|3|AND|4|Y|
    2|Flow_B|4|Job_4|1,2,3|AND|4|Y|
    2|Flow_B|5|Job_5|5|AND|4|Y|
    

    Tip: If you're not using flows, you can simply provide a list of jobs and nothing else. runSAS will automatically create one flow (or multiple flows, depending on the GENERATE_SINGLE_FLOW_FOR_ALL_JOBS parameter) with all jobs in it.

    Example (specifying just jobs):

    Job_1
    Job_2,--skip
    Job_3
    Job_4
    Job_5
    
  • Email settings – Configure the email parameters (script has an inline explanation for each parameter)

    • ENABLE_EMAIL_ALERTS: Y to enable all 4 alert types (YYYY is the extended format, <trigger-alert><job-alert><error-alert><completion-alert>)
    • EMAIL_ALERT_TO_ADDRESS: Provide email addresses separated by a semicolon
    • EMAIL_ALERT_USER_NAME: This is used as the FROM address for the email alerts
  • runSAS script overrides – A collection of script behavior control parameters (script has an inline explanation for each parameter, keep the defaults if you're unsure)

    • ENABLE_DEBUG_MODE=N: Enables the debug mode, specify Y/N
    • RUNTIME_COMPARISON_FACTOR=30: Runtime change threshold, increase this to display only higher % difference
    • KILL_PROCESS_ON_USER_ABORT=Y: The rogue processes are automatically killed by the script on user abort
    • ENABLE_RUNSAS_RUN_HISTORY=N: Enables runSAS flow/job runtime history, specify Y/N
    • ABORT_ON_ERROR=N: Set to Y to abort as soon as runSAS sees an ERROR in the log file (i.e., don't wait for the job to complete)
    • ENABLE_SASTRACE_IN_JOB_CHECK=Y: Set to N to turn off the warnings on sastrace
    • ENABLE_RUNSAS_DEPENDENCY_CHECK=Y: Set to N to turn off the script dependency checks
    • BATCH_HISTORY_PERSISTENCE=ALL: Specify a positive number to control the number of batches preserved by runSAS (e.g. 50 will preserve the last 50 runs), ALL will keep everything
    • CONCURRENT_JOBS_LIMIT=ALL: Specify the available job slots as a number (e.g. 2), ALL will use the CPU count instead (nproc --all) and MAX will spawn all jobs
    • CONCURRENT_JOBS_LIMIT_MULTIPLIER=1: Specify a positive number to increase the available job slots (e.g. 1x, 2x, 3x ...), it is used as a multiplier for the above parameter
    • ERROR_CHECK_SEARCH_STRING="^ERROR": This is what is grepped in the log
    • STEP_CHECK_SEARCH_STRING="Step:": This is the string searched for to find the SAS step in the log
    • SASTRACE_SEARCH_STRING="^options sastrace": This is used to search for the options sastrace ... option in the SAS log
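
To make the pipe-delimited job/flow list syntax described above more concrete, here is a minimal Bash sketch that splits one entry into its mandatory fields and expands the dependent-job-id value (a comma-delimited list or a hyphenated range). The helper names expand_dependencies and describe_job_line are hypothetical; this is not how runSAS itself parses the list, only an illustration of the format under the field order given in the syntax line.

```shell
#!/usr/bin/env bash
# Illustrative only: these helpers are hypothetical, not runSAS's own parser.

# Expand a dependency spec such as "1,2,3" or "1-3" into "1 2 3".
expand_dependencies() {
    local spec="$1" part i first last
    local -a parts out=()
    IFS=',' read -r -a parts <<< "$spec"
    for part in "${parts[@]}"; do
        if [[ "$part" =~ ^([0-9]+)-([0-9]+)$ ]]; then
            first="${BASH_REMATCH[1]}"
            last="${BASH_REMATCH[2]}"
            for ((i = first; i <= last; i++)); do out+=("$i"); done
        else
            out+=("$part")
        fi
    done
    echo "${out[*]}"
}

# Split one job/flow entry on "|" and print a one-line summary.
describe_job_line() {
    local line="$1"
    local flow_id flow_nm job_id job_nm dep_ids dep_type rc_max run_flag options rest
    IFS='|' read -r flow_id flow_nm job_id job_nm dep_ids dep_type \
        rc_max run_flag options rest <<< "$line"
    echo "flow=${flow_id}(${flow_nm}) job=${job_id}(${job_nm})" \
         "deps=[$(expand_dependencies "$dep_ids")] type=${dep_type}" \
         "rc_max=${rc_max} run=${run_flag} options=${options:-none}"
}

# Summarize the example flow entries from this README.
while IFS= read -r entry; do
    [ -n "$entry" ] && describe_job_line "$entry"
done <<'EOF'
1|Flow_A|1|Job_1|1|AND|4|Y|
1|Flow_A|2|Job_2|2|AND|0|Y|--prompt
2|Flow_B|4|Job_4|1,2,3|AND|4|Y|
EOF
```

Any fields after options (the --server related values) are collected in the rest variable and ignored by this sketch.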

Additional "Hidden" Script Parameters

There is an additional set of script behavior control parameters, intentionally kept hidden deep in the bottom third of the script. They typically don't require changing, and the defaults should work fine. An inline explanation for each parameter is provided in the script for reference.

  • Parameters

    • EMAIL_USER_MESSAGE=: This will be appended to the email subject
    • GENERATE_SINGLE_FLOW_FOR_ALL_JOBS=N: If set to Y runSAS will create a single flow for all jobs instead of one flow per job
    • EMAIL_ATTACHMENT_SIZE_LIMIT_IN_BYTES=8000000: This is the log file size limit (sent as email attachment)
    • SERVER_PACKAGE_INSTALLER_PROGRAM=yum: The package installer used by runSAS for auto-installation of dependencies
    • RUNSAS_LOG_SEARCH_FUNCTION=egrep: runSAS uses this as the search utility to detect errors in job logs
    • RUNSAS_DETECT_CYCLIC_DEPENDENCY=Y: If set to N runSAS will NOT detect cyclic dependencies in job flows before the batch run
    • RUNSAS_PRINT2DEBUG_LOGGING=Y: This outputs useful batch run related info to the .tmp/.runsas.debug file
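
The interaction between the search utility and ERROR_CHECK_SEARCH_STRING can be pictured with a small sketch. It assumes the default pattern "^ERROR" and uses grep -E, which is equivalent to egrep. The function name first_error_in_log and the sample log lines are made up for illustration, and the log monitoring in runSAS itself is likely more involved than this.

```shell
#!/usr/bin/env bash
# Illustrative sketch only, not runSAS's actual log monitoring code.
ERROR_CHECK_SEARCH_STRING="^ERROR"   # default pattern from the parameter list

# Print the first log line matching the error pattern, prefixed with its
# line number. Returns 0 when a match is found and 1 when the log is clean
# (grep itself returns 2 if the file cannot be read).
first_error_in_log() {
    local log_file="$1"
    grep -E -n -m 1 -- "$ERROR_CHECK_SEARCH_STRING" "$log_file"
}

# Demo with a made-up log file.
demo_log="$(mktemp)"
printf '%s\n' \
    "NOTE: sample step started." \
    "ERROR: sample failure message." \
    "NOTE: sample step ended." > "$demo_log"

if match="$(first_error_in_log "$demo_log")"; then
    echo "Error detected -> $match"
else
    echo "No errors found"
fi
rm -f "$demo_log"
```

Because the pattern is anchored with ^, a line that only mentions ERROR mid-line (for example inside a WARNING) is not treated as a failure.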

Can I save batch status info to a SAS environment (as a dataset or into a database table)?

Yes, runSAS can save batch run related info in real-time to a SAS dataset or to a database table of your choice. To enable this feature, set the following parameters (these parameters can be found in the "Hidden" script parameters section). An inline explanation for each parameter is provided in the script for reference.

In a nutshell, runSAS generates a new SAS program file by using the parameters. This SAS program is designed to take in the batch run details and write them in real time to the specified table/dataset, runSAS
