Data Step Programming (from Fraktal SAS Programming)
Revision as of 8 July 2014, 10:29
What is this?
The three-part name "Data Step Programming" is best explained word by word:
- DATA is the SAS technical term for values operated on.
- STEP is the SAS conceptual name for a segment-wise oriented coding structure.
- PROGRAMMING is the SAS term for coding a scripted (not compiled) algorithm.
Data Step Programming is done using the "Data Step Language" (DSL). The Data Step Language is a full-featured third-generation language, modelled on IBM's PL/I, which was once regarded as a candidate successor to FORTRAN.
The Data Step
Simply speaking, SAS processes one "observation" at a time when generating a "dataset". An observation is a data line, known as a "row" to relational database (RDBMS) specialists who write SQL; the concept derives from the punch card of the pioneering days of IT. Hence, a dataset is a table made of observations that share a common structure.
Observations are processed in a one-line register called the "Program Data Vector" (PDV).
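One way to look at the PDV is to print it to the log on every pass through the step. The following is a minimal sketch, assuming the sample table sashelp.class that normally ships with SAS is available; any other input dataset would serve equally well.

```sas
/* Sketch only: sashelp.class is assumed to be installed.         */
/* _NULL_ means no output dataset is created by this step.        */
data _null_;
    set sashelp.class(obs=3);  /* load one observation into the PDV per pass */
    put _all_;                 /* write every PDV variable to the log,       */
                               /* including the automatic _N_ and _ERROR_    */
run;
```

Each log line produced this way corresponds to the state of the PDV for one observation at the point where the PUT statement runs.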
Generally speaking, each line of DSL code applies some function to the PDV, the content of which is then written to the dataset being generated, either implicitly at the end of each pass through the step or explicitly with an "OUTPUT" statement.
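The two ways of writing the PDV to a dataset can be sketched as follows. The dataset and variable names (work.heights_cm, work.squares, height_cm, n, square) are hypothetical and chosen for illustration only, and the first step again assumes that sashelp.class is available.

```sas
/* Implicit output: there is no OUTPUT statement, so the PDV is   */
/* written once at the end of every pass through the step.        */
data work.heights_cm;
    set sashelp.class;           /* one observation per pass           */
    height_cm = height * 2.54;   /* Height is stored in inches         */
run;

/* Explicit output: an OUTPUT statement writes the PDV on request  */
/* and switches off the implicit output at the end of the step.    */
data work.squares;
    do n = 1 to 3;
        square = n * n;
        output;                  /* one observation per loop iteration */
    end;
run;
```

The second step has no input dataset and therefore runs through only once; it still produces one observation per loop iteration because OUTPUT is called inside the loop.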