Phase-III Macro System

From phenixxenia.org
Wolf-Dieter Batz

Revision as of 27 June 2013, 14:42

Topic domain: ZAZY

General

The Phase-III Macro System is a flexible, data-independent, parameter-controlled set of SAS macros.

The Phase-III Macro System is not an end-to-end reporting tool.

  • It is a collection of closely interacting macro modules that provide transformation methods for datasets emerging from a study, using all the information available in the descriptor portion of the dataset being processed. The user receives one or more output datasets containing character columns with standard names and externally controlled attributes.
  • The Phase-III Macro System provides subroutines that take care of data types, formats, labels, headers, missing values, loops and more. Information generated at run time to control processing is kept in standardized data structures: macro-variable lists (mlists), SAS formats and datasets.
  • Input data structures may need some pre-processing, and output data structures some post-processing, to meet requirements fully. The Phase-III Macro System already supports these steps to some extent through its condense, struct and missline functions.
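
As an illustration of the "description part" of a dataset mentioned above, the following sketch lists the name, type, length, format and label of every column from the SAS dictionary tables. The dataset WORK.DEMO and its variables are made up for this example, and the query is not taken from the macro system itself.

```sas
/* Hypothetical demo dataset with labels and formats attached */
data work.demo;
  attrib visit  label='Visit number'   format=3.
         weight label='Body weight kg' format=6.1;
  visit = 1; weight = 72.5; output;
run;

/* Read the descriptor information from the dictionary tables */
proc sql;
  select name, type, length, format, label
    from dictionary.columns
    where libname = 'WORK' and memname = 'DEMO';
quit;
```

This kind of header information is what lets a module work without hard-coded labels or formats.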

Objective

The Phase-III Macro System is intended as the base for an extensible system that provides mechanisms for shaping input datasets, performing calculations and generating SAS datasets with ready-made text content.

Scope

The Phase-III Macro System interacts with and makes use of other available programs, modules, systems and datasets. Communication and information interchange rely on SAS macro variables, operating-system environment variables and data structures compatible with the SAS System.
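
A minimal sketch of how an operating-system environment variable can be picked up from macro code, using the standard SYSEXIST function and %SYSGET. The macro name SHOW_ENV and the variable HOME are assumptions chosen for illustration; the source does not say which environment variables the system reads.

```sas
/* Hypothetical helper, not part of the Phase-III Macro System */
%macro show_env(var);
  %if %sysfunc(sysexist(&var)) %then
    %put NOTE: &var=%sysget(&var);
  %else
    %put NOTE: Environment variable &var is not set.;
%mend show_env;

/* HOME is typical on Unix hosts; pick another name on other systems */
%show_env(HOME);
```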

Input data streams will generally require pre-processing to assign formats and labels. Output datasets will need post-processing, mainly with merge and set operations.
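
The pre- and post-processing steps described above could look roughly like the following. All dataset names, variables and formats are invented for this sketch; only the general pattern (attach labels and formats in the dataset header, then stack outputs with SET) follows the text.

```sas
/* Hypothetical raw input without labels or formats */
data work.raw;
  input subjid sex $ trt;
  datalines;
1 F 1
2 M 2
3 F 2
;
run;

proc format;
  value $sexf 'F'='Female' 'M'='Male';
  value trtf  1='Placebo' 2='Active';
run;

/* Pre-processing: store labels and formats in the dataset header */
proc datasets lib=work nolist;
  modify raw;
    label  sex='Sex' trt='Treatment group';
    format sex $sexf. trt trtf.;
  run;
quit;

/* Post-processing: stack two hypothetical output blocks with SET */
data work.block1; length text $40; text='Sex';         run;
data work.block2; length text $40; text='  Female  2'; run;
data work.table;
  set work.block1 work.block2;
run;
```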

Characteristics

Modules are kept small (no more than three screen pages) for maintainability and avoid hard-coded references to application-related information such as data types, labels and formats. The coding style makes broad use of automatic documentation and of metadata and lookup tables generated at run time.

Approach

  1. Avoid making programs dependent on data scope, study characteristics or personal style.
  2. Implement modules so that they operate in any emerging environment.
  3. Be prepared to add new output structures without substantial delay.
  4. Produce a wide variety of output with a minimum set of modules.
  5. Minimize maintenance effort through self-documenting, compact program code.
  6. Maximize validation throughput by adopting a non-mutual-impact architecture.

Architecture

Info Modules

Provide information about datasets and variables for correct processing.

Service Modules

Perform frequently requested tasks in a standard format with a limited parameter set.

Core Modules

Perform input transformation, calculations and output transformation.

User Modules

Generate datasets carrying subtables controlled by user-supplied parameters.

Module Details

Info Modules

%GET_ATTR()

Function

Return a single attribute such as label or format.

Description

Reads the dataset header and returns the attribute in an undeclared macro variable named after the requested attribute. The information becomes available to the caller only when that variable has been declared in the calling environment with a %global or %local statement.
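
The scoping behaviour this relies on can be shown in isolation. In the sketch below, SET_ATTR is a hypothetical stand-in that assigns to a macro variable it has not declared itself; it is not the module's actual code. When the caller has declared the variable, the assignment lands in the caller's symbol table; otherwise the value stays local to the helper and is lost on return.

```sas
/* Hypothetical helper: assigns to a variable it does not declare */
%macro set_attr(attrib=, value=);
  %let &attrib = &value;
%mend set_attr;

%macro caller_declared;
  %local label;                       /* declared in the caller */
  %set_attr(attrib=label, value=Age in years);
  %put NOTE: label=&label;            /* value is visible here */
%mend caller_declared;

%macro caller_undeclared;
  %set_attr(attrib=label, value=Age in years);
  /* 0 is expected here as long as no global LABEL exists */
  %put NOTE: label exists after call: %symexist(label);
%mend caller_undeclared;

%caller_declared;
%caller_undeclared;
```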

Source

%MACRO GET_ATTR(dsn=,source=,attrib=) / store des="Get attribute from SAS Variable";
%LOCAL name;
%LET name=GET_ATTR;
%IF &DSN ne and &SOURCE ne and &ATTRIB ne %THEN %DO;
  %IF %INDEX(&DSN,.) eq 0 %THEN %DO;
    %LET dsn=WORK.&DSN;
  %END;
proc datasets nolist lib=%SCAN(&DSN,1);
  contents noprint  data=%SCAN(&DSN,2)(keep=&SOURCE)
                     out=work.tmp_data(keep=&ATTRIB);
  run;
quit;
proc sql noprint;
  select &ATTRIB
    into :&ATTRIB
  from work.tmp_data
  ;
%IF %UPCASE(&ATTRIB) ne LABEL %THEN %DO;
  %LET &ATTRIB = &&&ATTRIB;
%END;
quit;
%PUT &NAME._MESSAGE: Temporary SAS dataset WORK.TMP_DATA created ;
%PUT &NAME._MESSAGE: Field %UPCASE(&SOURCE) in dataset %UPCASE(&DSN) has &ATTRIB = %BQUOTE(&&&ATTRIB.). ;
%PUT &NAME._MESSAGE: Information stored into Local Macrovariable of calling environment: ;
%PUT &NAME._MESSAGE: &ATTRIB=%BQUOTE(&&&ATTRIB);
%PUT ;
%END;
%ELSE %DO;
%PUT vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv;
%PUT &NAME._ERROR: Missing Keyword Parameter(s).;
%PUT &NAME._STATUS: Macro processing abended. ;
%PUT &NAME._STATUS: Global Macrovariable(s) not available. ;
%PUT ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^;
%GEN_MAIL(name=&NAME,rc=1);
%END;
%MEND GET_ATTR;

%GRP_DESC()

Function

Provide information about a categorical variable.

Description

Investigates the given categorical variable and provides the results in undeclared macro variables: &n_grp (number of distinct values), &v_grp (structured list of distinct unformatted values) and &l_grp (structured list of distinct formatted values).
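
A rough idea of how such group information can be collected with PROC SQL is sketched below. This is not the module's source: the use of SASHELP.CLASS, the format $SEXF and the list delimiters (blank and '#') are assumptions, since the layout of the "structured list" is not specified here.

```sas
/* Hypothetical format used to produce the formatted values */
proc format;
  value $sexf 'F'='Female' 'M'='Male';
run;

proc sql noprint;
  /* number of distinct values */
  select count(distinct sex)
    into :n_grp trimmed
    from sashelp.class;
  /* distinct unformatted values, blank-separated (assumed layout) */
  select distinct sex
    into :v_grp separated by ' '
    from sashelp.class;
  /* distinct formatted values, '#'-separated (assumed layout) */
  select distinct put(sex, $sexf.)
    into :l_grp separated by '#'
    from sashelp.class;
quit;

%put NOTE: n_grp=&n_grp v_grp=&v_grp l_grp=&l_grp;
```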

%CHK_LIST()

Function

Provide information about a list-type macro variable.

Description

Reads the supplied list of tokens and returns undeclared macro variables: &n_lst (number of list elements) and &v_lst (structured list of the supplied elements). Input list elements may be separated by blanks and commas only.
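
The token handling described above might be approximated as follows. LIST_INFO is a hypothetical macro written for this illustration, and the blank-separated output layout of &v_lst is an assumption. As in the description, the result variables are not declared inside the macro, so the caller declares them first.

```sas
/* Hypothetical sketch, not the CHK_LIST source */
%macro list_info(list=);
  %local i;
  %if %length(&list) = 0 %then %do;
    %let n_lst = 0;
    %let v_lst = ;
    %return;
  %end;
  /* count tokens separated by blanks and/or commas */
  %let n_lst = %sysfunc(countw(%superq(list), %str( ,)));
  /* rebuild the list with single blanks as separators */
  %let v_lst = ;
  %do i = 1 %to &n_lst;
    %let v_lst = &v_lst %scan(%superq(list), &i, %str( ,));
  %end;
%mend list_info;

%global n_lst v_lst;                  /* caller declares the results */
%list_info(list=%str(age, sex  height));
%put NOTE: n_lst=&n_lst v_lst=&v_lst;
```

The %STR in the call masks the commas so that the whole list is passed as one parameter value.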

User Modules

%TWO_CATV()

Function

Deliver a percentage/count table from two nested categorical variables.

Description

Performs nested processing of two categorical variables, looping the context variable of the row_* modules over the categories of the "outer" variable.


Parameters

Name             Description
dsn              input dataset name
row, row2        categorical variable name; row2 = nested variable
exclude          decode of the group to exclude from &ROW
weight           Y/N (multiply percentages for &ROW and &ROW2)
col              categorical variable name used for columns
head2            Y/N (block header for nested variable)
indent, indinc   n (number of indent columns; increment for nested variable)
num              n (sequence number of output)
stat             Y/N (column with statistics names)
space            1/2/3 (blank line before or after output and between nesting levels)
struct, struct2  name of reference dataset used for full decode structure; struct2 = nested variable
condense         var#value (non-distinct variable and true value for &ROW)
misslin2         Y/N (force missing line for nested variable)

Source

/* declarations and upper-level processing */
%MACRO TWO_CATV(dsn=
               ,exclude=
               ,row=
               ,row2=
               ,col=
               ,indent=0
               ,num=
               ,stat=N
               ,weight=Y
               ,space=2
               ,condense=
               ,struct=
               ,struct2=
               ,head2=N
               ,misslin2=
               ,indinc=2)
/ store des="" 
;
%LOCAL n_grp v_grp n name;
%LET name=TWO_CATV;
%IF &STRUCT  eq %THEN %LET struct =&DSN;
%IF &STRUCT2 eq %THEN %LET struct2=&DSN;
%GRP_DESC(dsn=&DSN
         ,grp=&ROW
         ,miss=n)
;
%TOP_FILT(dsn=&DSN
         ,grp=&ROW
         ,by=&COL
         ,grplvl=&NUM
         ,var=
         ,condense=&CONDENSE)
;
%TOP_FREQ(dsn=top_filt
         ,struct=&STRUCT
         ,grp=&ROW
         ,by=&COL)
;
%TOP_OUTC(dsn=top_freq
         ,head=n
         ,total=n
         ,stat=&STAT
         ,indent=&INDENT
         ,grp=&ROW
         ,rev=n
         ,use=
         ,by=&COL
         ,missline=)
;
/* loop for lower-level processing */
%DO n=1 %TO &N_GRP;
  %IF %SCAN(&V_GRP,&N) ne &EXCLUDE %THEN %DO;
    %ROW_FILT(dsn=&DSN
             ,context=&ROW
             ,subgrp=&N
             ,grp=&ROW2
             ,by=&COL
             ,var=
             ,miss=n)
    ;
    %ROW_FREQ(dsn=row_filt
             ,sum=top_freq
             ,struct=&STRUCT2
             ,context=&ROW
             ,grp=&ROW2
             ,by=&COL
             ,weight=&WEIGHT)
    ;
    %ROW_OUTC(dsn=row_freq
             ,sum=main_3rd
             ,head=&HEAD2
             ,stat=&STAT
             ,indent=%EVAL(&INDENT+&INDINC)
             ,context=&ROW
             ,grp=&ROW2 
             ,by=&COL
             ,missline=&MISSLIN2)
    ;
  %END;
%END;
/* take care of naming and send completion mail */
%IF &TAB_NAME ne %THEN %DO;
  data %SUBSTR(&TAB_NAME,1,3)&NUM%SUBSTR(&TAB_NAME,5,4);
   set
  %DO n=1 %TO &N_GRP;
    %IF &SPACE eq 1 %THEN dummy ;
    %IF %SCAN(&V_GRP,&N) ne &EXCLUDE %THEN row&NUM._&N ;
    %IF &SPACE eq 2 %THEN dummy ;
  %END;
    %IF &SPACE eq 3 %THEN dummy ;
   ;
  run;
%END;
%GEN_MAIL(name=&NAME);
%MEND TWO_CATV;
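
The closing step of the listing builds the dataset list of a SET statement inside a macro loop. The self-contained sketch below isolates that technique. STACK_BLOCKS and the demo datasets are hypothetical; they mimic the row&NUM._&N naming from the source but are not produced by the real modules, and the separator placement corresponds to only one of the &SPACE variants.

```sas
/* Hypothetical block datasets standing in for the row&NUM._&N outputs */
data work.row1_1; length text $20; text='Block 1'; run;
data work.row1_2; length text $20; text='Block 2'; run;
data work.dummy;  length text $20; text=' ';       run;

%macro stack_blocks(num=, n_grp=, out=);
  %local n;
  data &out;
    set
    %do n = 1 %to &n_grp;
      work.row&num._&n
      work.dummy            /* blank separator line after each block */
    %end;
    ;
  run;
%mend stack_blocks;

%stack_blocks(num=1, n_grp=2, out=work.table1);
```

Because the loop only emits dataset names, the semicolon that ends the SET statement has to follow the %END on its own.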