A SERVICE OF

logo

Data Integrity and Security Potential Result Set Differences When Processing Null Data 27
quit;
When the libref is assigned, the SAS/ACCESS engine connects to the DBMS and
passes a command to the DBMS to execute the stored procedure MY_PROCEDURE. By
default, a new connection to the DBMS is made for every table that is opened for
updating, so MY_PROCEDURE is executed a second time after a connection is made to
update the table MYTABLE.
To execute a DBMS command or stored procedure
only after the first connection in a
library assignment, you can use the DBLIBINIT= option. Similarly, the DBLIBTERM=
option enables you to specify a command to be executed prior to the disconnection of
only the first library connection, as in the following example:
libname myoralib oracle user=testuser password=testpass
dblibinit="EXEC MY_INIT" dblibterm="EXEC MY_TERM";
Potential Result Set Differences When Processing Null Data
When your data contains null values or when internal processing generates
intermediate data sets that contain null values, you might get different result sets
depending on whether the processing is done by SAS or by the DBMS. Although in
many cases this does not present a problem, it is important to understand how these
differences occur.
Most relational database systems have a special value called null, which means an
absence of information and is analogous to a SAS missing value. SAS/ACCESS
translates SAS missing values to DBMS null values when creating DBMS tables from
within SAS and, conversely, translates DBMS null values to SAS missing values when
reading DBMS data into SAS.
There is, however, an important difference in the behavior of DBMS null values and
SAS missing values:
A DBMS null value is interpreted as the absence of data, so you cannot sort a
DBMS null value or evaluate it with standard comparison operators.
A SAS missing value is interpreted as its internal floating point representation
because SAS supports 28 missing values (where a period "." is the most common
missing value). Because SAS supports multiple missing values, you can sort a SAS
missing value and evaluate it with standard comparison operators.
This means that SAS and the DBMS interpret null values differently, which has
significant implications when SAS/ACCESS passes queries to a DBMS for processing.
This can be an issue in the following situations:
when filtering data (for example, in a WHERE clause, a HAVING clause, or an
outer join ON clause). SAS interprets null values as missing; many DBMS exclude
null values from consideration. For example, if you have null values in a DBMS
column that is used in a WHERE clause, your results might differ depending on
whether the WHERE clause is processed in SAS or is passed to the DBMS for
processing. This is because the DBMS removes null values from consideration in a
WHERE clause, but SAS does not.
when using certain functions. For example, if you use the MIN aggregate function
on a DBMS column that contains null values, the DBMS does not consider the null
values, but SAS interprets the null values as missing, which affects the result.
when submitting outer joins where internal processing generates nulls for
intermediate result sets.
when sorting data. SAS sorts null values low; most DBMSs sort null values high.
(See “Sorting DBMS Data” on page 31 for more information.)