Thursday, March 19, 2015

DataStage Checksum Stage. Do's and Dont's

DataStage has provide many powerful plugin tools for data inergration and loading purpose into datawarehouse. In this blog I am giving a brief overview of checksum stage and present you the do's and dont's of this stage.

Checksum Stage : This stage is used to calcuate the unique checksum of a given string or given set of columns. This stage can be used to generate a unique identifier for a composite key or can be used for detection of changes between source and target set of records during CDC process while loading to datawarehouse.

Do's:

1) Ensure the column name specfication between the source and target at the same. The order of column need not be the same , but the spelling on the column should be the same.

As the checksum stage will auto align the columns in alphabetical ascending order it is very important that the spelling of the column names be the same between source and target.

2) Ensure the column lenght and scale are same between the source and target to generate the same unique identifier.

3) Ensure the column exclude or include to be same between both the source and taget columns.

4) For better debugging purpose always use a option Buffer Output Column Name to store the buffer value for identifying the columns and data used to build the cheksum.


Dont:

If using checksum stage in a job with run time column propagation , it is advised not use the column exlucusion as the maintainability of the jobs will be difficult for future changes. Inclusion of column is a best approach for better maintainability in such scenario. Vice versa for non run time column propagation columns.

Avoid performing a checksum on a existing checksum field as this will not produce the same result always.


Let me know your view and toughts abt this post ..

Windows batch Script to Export Datastage jobs using command line

The below script is to automate extraction of export on to desktop from Datastage server for usage into code version systems.
 
@ECHO OFF
:: -----------------------------------------------------------------
:: -----------------------------------------------------------------
:: Name: export.bat
:: -----------------------------------------------------------------
:: Description: To extract datastage components using Command line.
:: -----------------------------------------------------------------
:: VERSION CREATED BY DATE  DESCRIPTION
:: 1.0  Vijay Reddy  16/02/2015  Initial Version
:: -----------------------------------------------------------------
:: -----------------------------------------------------------------
:: Required Components:
::     dsexport.exe (Windows Version)
::     dssearch.exe (Windows Version)
::     exportRec.bat (Windows Batch Script file )
:: -----------------------------------------------------------------
:: Command Line Parameters:
:: 1. Single Mode or Multi Mode
:: 2. Job Name or File list all the job names for export
:: 3. Project Name
:: 4. Target Path to save the exported files
:: 5. Recursive extraction mode to extracted all the referenced jobs.
:: -----------------------------------------------------------------
:: -----------------------------------------------------------------
:: -----------------------------------------------------------------
:: SETTING GLOBAL VARIABLES
:: -----------------------------------------------------------------
SET OPTION=%1
SET JOBNAME=%2
SET JOBLISTFILENAME=%2
SET PROJECTNAME=%3
SET TARGETPATH=%4
SET RECUREXTRACT=%5

:: -----------------------------------------------------------------
:: SETTING DATASTAGE ENVIRONMENT VARIABLES
:: -----------------------------------------------------------------
SET DOMAIN=
SET USER=
SET PWD=
SET SERVER=

:: -----------------------------------------------------------------
:: VALIDATING PARAMETERS
:: -----------------------------------------------------------------
SET CLIENTPATH=C:\IBM\InformationServer87\Clients\Classic
SET UTILPATH=C:\IBM\InformationServer87\JobExport\ExportUtil
:: -----------------------------------------------------------------
:: EXPORTING in SINGLE MODE
:: -----------------------------------------------------------------
if  "%1"==""   ( goto usage ) else (if NOT %OPTION% == M ( if NOT %OPTION% == S goto ERROR1 ) )
if  "%2"==""   ( goto usage ) else (if %OPTION% == M (if not exist %JOBLISTFILENAME% goto ERROR2 ))
if  "%3"==""   ( goto usage )
if  "%4"==""   ( goto usage ) else ( if not exist %TARGETPATH% goto ERROR3 )
if  "%5"==""   ( SET RECUREXTRACT=N )

:: -----------------------------------------------------------------
:: EXPORTING in MULTIPLE MODE
:: -----------------------------------------------------------------
IF %OPTION% == S (
echo Exporting %JOBNAME% from Project: %PROJECTNAME%
%CLIENTPATH%\dsexport.exe /D=%DOMAIN% /U=%USER% /P=%PWD% /JOB=%JOBNAME% /NODEPENDENTS %SERVER%/%PROJECTNAME% %TARGETPATH%\%JOBNAME%.dsx
IF %RECUREXTRACT% == R (
CALL %UTILPATH%\exportRec.bat %JOBNAME% %PROJECTNAME% %TARGETPATH%
)

) else (

 for /F "tokens=1" %%i in (%JOBLISTFILENAME%) do (
        echo Exporting %%i from Project: %PROJECTNAME%
        %CLIENTPATH%\dsexport.exe /D=%DOMAIN% /U=%USER% /P=%PWD% /JOB=%%i /NODEPENDENTS %SERVER%/%PROJECTNAME% %TARGETPATH%\%%i.dsx
        IF %RECUREXTRACT% == R (
        CALL %UTILPATH%\exportRec.bat %%i %PROJECTNAME% %TARGETPATH%
        )
    )
)

cd /D  %~dp0
EXIT /B
:usage
echo  Options 1-4 are Mandatory
echo  Command Options : Option 1 - "S" for Single Job extraction Mode and "M" for Multi Job Extraction Mode 
echo  Option 2 - In single job extraction Mode - Job Name to extract 
echo  Option 2 - In Multi job extraction Mode - Path and Filename of the list of jobs for extraction 
echo  Option 3 - DataStage Project Name for extraction 
echo  Option 4 - Target path to store exported dsx files
echo  option 5 - Recursive export of components ( Downward recursive only ). This parameter is optional.
cd /D  %~dp0
EXIT /B
:ERROR1
echo  Not a valid Option, Please use letters S - for Single job extraction or M - for Multi Job extraction.
echo  Please find the options summary below
goto  usage

:ERROR2
echo  %JOBLISTFILENAME% not found
cd /D  %~dp0
EXIT /B
:ERROR3
echo Not a valid directory %TARGETPATH%
cd /D  %~dp0
EXIT /B
 
 
============================================
 
@ECHO OFF
:: ------------------------------------------------------------------------------------
:: ------------------------------------------------------------------------------------
:: Name: exportRec.bat
:: ------------------------------------------------------------------------------------
:: Description::: To extract datastage components using Command line in recursive mode
:: ------------------------------------------------------------------------------------
:: VERSION CREATED BY DATE  DESCRIPTION
:: 1.0  Vijay Reddy  16/02/2015  Initial Version
:: ------------------------------------------------------------------------------------
:: ------------------------------------------------------------------------------------
:: Required Components:
::     dsexport.exe (Windows Version)
::     dssearch.exe (Windows Version)
:: ------------------------------------------------------------------------------------
:: Command Line Parameters:
:: 1. Job Name or File list all the job names for export
:: 2. Project Name
:: 3. Target Path to save the exported files
:: ------------------------------------------------------------------------------------
:: ------------------------------------------------------------------------------------
SET JOBNAME=%1
SET PROJECTNAME=%2
SET TARGETPATH=%3
SET WORKDIR=C:\IBM\InformationServer87\JobExport\ExportUtil
SET RECURJOBLIST=tmprecurjoblist.txt
SET DOMAIN=
SET USER=
SET PWD=
SET SERVER=

SET CLIENTPATH=C:\IBM\InformationServer87\Clients\Classic
echo Exporting Recursive jobs
%CLIENTPATH%\dssearch.exe -domain %DOMAIN% -user %USER% -password %PWD% -server %SERVER% -ljobs -uses %PROJECTNAME% %JOBNAME% > %WORKDIR%\%RECURJOBLIST%
:: ------------------------------------------------------------------------------------
:: CHECKING ERROR CODES
:: ------------------------------------------------------------------------------------
IF %ERRORLEVEL% EQU 0 (
echo No recursive jobs found for %JOBNAME% in Project: %PROJECTNAME%
EXIT /B 0
)
:: ------------------------------------------------------------------------------------
:: EXPORTING in RECURSIVE MODE
:: ------------------------------------------------------------------------------------
IF %ERRORLEVEL% NEQ 1 (
 for /F "tokens=1" %%i in (%WORKDIR%\%RECURJOBLIST%) do (
        echo Exporting %%i from Project: %PROJECTNAME%
        %CLIENTPATH%\dsexport.exe /D=%DOMAIN% /U=%USER% /P=%PWD% /JOB=%%i /NODEPENDENTS %SERVER%/%PROJECTNAME% %TARGETPATH%\%%i.dsx
)
)
 
Usage manual will be updated later. For more information pls contact me