SAS Safety 9 Manual

A SERVICE OF

next previous

Data Integrity and Security Potential Result Set Differences When Processing Null Data 27

quit;

When the libref is assigned, the SAS/ACCESS engine connects to the DBMS and

passes a command to the DBMS to execute the stored procedure MY_PROCEDURE. By

default, a new connection to the DBMS is made for every table that is opened for

updating, so MY_PROCEDURE is executed a second time after a connection is made to

update the table MYTABLE.

To execute a DBMS command or stored procedure

only after the ﬁrst connection in a

library assignment, you can use the DBLIBINIT= option. Similarly, the DBLIBTERM=

option enables you to specify a command to be executed prior to the disconnection of

only the ﬁrst library connection, as in the following example:

libname myoralib oracle user=testuser password=testpass

dblibinit="EXEC MY_INIT" dblibterm="EXEC MY_TERM";

Potential Result Set Differences When Processing Null Data

When your data contains null values or when internal processing generates

intermediate data sets that contain null values, you might get different result sets

depending on whether the processing is done by SAS or by the DBMS. Although in

many cases this does not present a problem, it is important to understand how these

differences occur.

Most relational database systems have a special value called null, which means an

absence of information and is analogous to a SAS missing value. SAS/ACCESS

translates SAS missing values to DBMS null values when creating DBMS tables from

within SAS and, conversely, translates DBMS null values to SAS missing values when

reading DBMS data into SAS.

There is, however, an important difference in the behavior of DBMS null values and

SAS missing values:

A DBMS null value is interpreted as the absence of data, so you cannot sort a

DBMS null value or evaluate it with standard comparison operators.

A SAS missing value is interpreted as its internal ﬂoating point representation

because SAS supports 28 missing values (where a period "." is the most common

missing value). Because SAS supports multiple missing values, you can sort a SAS

missing value and evaluate it with standard comparison operators.

This means that SAS and the DBMS interpret null values differently, which has

signiﬁcant implications when SAS/ACCESS passes queries to a DBMS for processing.

This can be an issue in the following situations:

when ﬁltering data (for example, in a WHERE clause, a HAVING clause, or an

outer join ON clause). SAS interprets null values as missing; many DBMS exclude

null values from consideration. For example, if you have null values in a DBMS

column that is used in a WHERE clause, your results might differ depending on

whether the WHERE clause is processed in SAS or is passed to the DBMS for

processing. This is because the DBMS removes null values from consideration in a

WHERE clause, but SAS does not.

when using certain functions. For example, if you use the MIN aggregate function

on a DBMS column that contains null values, the DBMS does not consider the null

values, but SAS interprets the null values as missing, which affects the result.

when submitting outer joins where internal processing generates nulls for

intermediate result sets.

when sorting data. SAS sorts null values low; most DBMSs sort null values high.

(See “Sorting DBMS Data” on page 31 for more information.)