Using Sabre: Examining and manipulating data
The two ways in which raw data can be visually displayed in SABRE are through the LOOK and HISTOGRAM commands. The FACTOR command is used to change variables into factors, while the TRANSFORM command may be used for either data transformations or the creation of interaction terms.

The LOOK command allows printing of up to six variables. For example, to print variables A and B type

    LOOK a b

This will print observations in blocks of twenty, with the user prompted at the end of each block whether she wants to carry on printing. If specific blocks of observations are required, the start and end line numbers can be added as arguments. Thus, to print observations 28 to 67 type

    LOOK a b 28 67

Note that, for factor and pseudo factor (see below) arguments, the printed value is the (pseudo) factor level.

The HISTOGRAM command draws a histogram and prints group frequencies. The command can be issued for both variables and factors. With variables, an optional argument can be added to specify the maximum number of groups (ie, bars) in the histogram. The actual number of bars printed may be less than this argument due to the suppression of groups with zero frequency. Thus, to draw a histogram with at most nine groups for variable A type

    HISTOGRAM a 9

If a factor (see below) is used as the argument, group specification is not allowed. However, in general, better histograms are obtained by using the factor rather than the variable from which the factor was produced.

The FACTOR command produces a series of dummy variables which indicate the presence or absence of a level of the factor. There can be up to 100 factor levels. When using SABRE, variables can be factorised in two ways. If the variable has a discrete number of values the command is simply

    FACTOR var fac

where var is the name of the variable to be factorised and fac is the name of the resulting factor. If the variable is continuous, cut-points can be specified which divide the variable into the relevant number of levels n.
The form of the command is then

    FACTOR var fac [n-1 cut-points]

For example, to create a factor (afac) which indicates whether a variable (bvar) is less than, or greater than or equal to 10, the command is

    FACTOR bvar afac 10

and would result in

    bvar             afac
      3              1 0
      9    mapped    1 0
     12      to      0 1
      8              1 0

If a variable element is equal to a cut-point, the element is placed in the higher cut-point group.

Data transformations may be performed using the TRANSFORM command. The only permitted functional transformations are exponentiation and natural logarithm. Variables can also be raised to a constant power or acted upon by the arithmetic operators multiplication, division, addition and subtraction. In the case of arithmetic manipulation, the post-operator argument may be either another variable or a constant (eg, both A + B and A + 2 are valid expressions). Although the command allows only a single transforming operation at a time, compound transformations may be dealt with simply by issuing a sequence of related TRANSFORM commands. For example, given variables A, B and C, the transformed variable

    D = exp((A**(1/4) - loge(B)) / (3*C**2 + 1))

could be derived via the following series of instructions
    TRANSFORM var1 a ^ 0.25

The TRANSFORM command may also be used to create interactions between variables and/or factors. A variable by variable interaction is formed simply by multiplying together the corresponding values of each of the two variables. For example, to create an interaction term cint from variables avar and bvar, the command is

    TRANSFORM cint avar * bvar

and would result in

    avar   bvar             cint
      3     10               30
      9      5    mapped     45
     12      2      to       24
      8      6               48

A variable by factor (or factor by variable) interaction is formed by replacing the unit element in the factor by the variable value while keeping all the remaining elements at zero. For example, to create an interaction term cint from a variable avar and a 3-level factor bfac, the command is

    TRANSFORM cint avar * bfac

and would result in

    avar    bfac                cint
      3    0 1 0              0  3  0
      9    1 0 0    mapped    9  0  0
     12    0 0 1      to      0  0 12
      8    0 1 0              0  8  0

Note that the interaction term cint is neither a variable (it has more than one level) nor a factor (it doesn't consist of just 0's and 1's). However, it is similar to a factor in the sense that it contains only a single non-zero entry. For this reason, we refer to such structures as pseudo factors; ie, an interaction between a variable and an n-level factor produces an n-level pseudo factor.

An m-level factor by n-level factor interaction forms an mn-level factor. If the unit elements of the original factors are at indices i and j respectively, then index (i-1)n + j of the new factor will be the unit element. For example, to create an interaction term cint from a 4-level factor afac and a 3-level factor bfac, the command is

    TRANSFORM cint afac * bfac

and would result in

     afac      bfac                        cint
    0 1 0 0   1 0 0              0 0 0 1 0 0 0 0 0 0 0 0
    0 0 0 1   0 1 0    mapped    0 0 0 0 0 0 0 0 0 0 1 0
    1 0 0 0   0 0 1      to      0 0 1 0 0 0 0 0 0 0 0 0
    0 0 1 0   0 1 0              0 0 0 0 0 0 0 1 0 0 0 0

As noted above (with m=4 and n=3), cint is indeed a 12-level factor.
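The factorisation and interaction rules described above can be checked outside SABRE with a few lines of Python. The following is only an illustrative sketch, not SABRE syntax and not a description of how SABRE is implemented. The function names (factorise, variable_by_factor, factor_by_factor) are hypothetical, and factors are assumed to be stored as one list of 0/1 indicators per observation. The data are taken from the examples in the text.

```python
# Illustrative Python only; this is not SABRE code.
# All function names below are hypothetical helpers invented for this sketch.

def factorise(values, cut_points):
    """Return one 0/1 indicator row per value, with len(cut_points) + 1 levels.

    A value equal to a cut-point is placed in the higher group,
    matching the rule stated for the FACTOR command.
    """
    rows = []
    for v in values:
        level = sum(v >= c for c in cut_points)  # 0-based level index
        row = [0] * (len(cut_points) + 1)
        row[level] = 1
        rows.append(row)
    return rows


def variable_by_factor(values, factor_rows):
    """Replace the unit element of each factor row by the variable value."""
    return [[v * f for f in row] for v, row in zip(values, factor_rows)]


def factor_by_factor(a_rows, b_rows):
    """Combine an m-level and an n-level factor into an mn-level factor."""
    out = []
    for a, b in zip(a_rows, b_rows):
        n = len(b)
        i, j = a.index(1) + 1, b.index(1) + 1  # 1-based unit positions
        row = [0] * (len(a) * n)
        row[(i - 1) * n + j - 1] = 1  # index (i-1)n + j, shifted to 0-based
        out.append(row)
    return out


# FACTOR bvar afac 10
bvar = [3, 9, 12, 8]
afac = factorise(bvar, [10])

# TRANSFORM cint avar * bfac (variable by 3-level factor)
avar = [3, 9, 12, 8]
bfac = [[0, 1, 0], [1, 0, 0], [0, 0, 1], [0, 1, 0]]
pseudo = variable_by_factor(avar, bfac)

# TRANSFORM cint afac * bfac (4-level factor by 3-level factor)
afac4 = [[0, 1, 0, 0], [0, 0, 0, 1], [1, 0, 0, 0], [0, 0, 1, 0]]
bfac3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 1, 0]]
cint = factor_by_factor(afac4, bfac3)
unit_positions = [row.index(1) + 1 for row in cint]  # 1-based unit indices
```

With these inputs, afac, pseudo and cint reproduce the three "mapped to" tables shown above, and unit_positions gives the 1-based index (i-1)n + j of the unit element in each row of the 12-level factor.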