Thursday, May 26, 2005
On Vacation
I start my vacation tomorrow. The next update will be June 13th.
Please enjoy the 8 posts from the last 2 weeks, and comments are welcome.
Please enjoy the 8 posts from the last 2 weeks, and comments are welcome.
Optimizing Sybase Queries and Stored Procedures
This is meant for technical specialists, not DBAs, although some of the information obtained can be passed onto DBAs for server-level tuning.
This is also not a step-by-step or a "how to," is it instead a collection of tools that can be used to find and resolve common performance issues.
Basics:
How to log your results
isql -P password -e -i input_file -o outputfile
isql -P password < input_file > outputfile
Getting information about the tables and indexes
sp_help
sp_helpdb
sp_tables
Diagnosing the Problem:
Execute the query again, with some of the below options to get more information:
See the query plan Sybase ASE is using, look for table scans in the results
set showplan on
(optional) set noexec on (this negates statistics io)
Gather some statistics looks for I/O, which usually takes a lot of time.
set statistics io on
Gather time statistics so you can see where the time is spent
set statistics time on
If you think the problem is with an index, here is how to see they are chosen
dbcc traceon(302) and dbcc traceon(310)
Best way to determine size of a table
sp_spaceused
This tool will take samples studying all areas of database activity
sp_sysmon
Sybase ASE has an advanced tool for studying statistics
optdiag
You can get additional network information using sp_monitor, here are some examples
sp_monitor
@@pack_sent - Number of packets sent by Adaptive Server
@@pack_received - Number of packets received
@@packet_errors - Number of errors
Solving the Problem:
Often the solution is to create an index
create index [name] on [table] ([column])
Clean-up tables that have been updated frequently
rebuild reorg
The optimizer needs up-to-date statistics
update statistics [table]
After updating statistics, consider re-compiling the stored procedures
sp_recompile [table]
There are also more advanced tuning techniques to turn options on and off, but consult an advanced DBA for those.
This is also not a step-by-step or a "how to," is it instead a collection of tools that can be used to find and resolve common performance issues.
Basics:
How to log your results
isql -P password -e -i input_file -o outputfile
isql -P password < input_file > outputfile
Getting information about the tables and indexes
sp_help
sp_helpdb
sp_tables
Diagnosing the Problem:
Execute the query again, with some of the below options to get more information:
See the query plan Sybase ASE is using, look for table scans in the results
set showplan on
(optional) set noexec on (this negates statistics io)
Gather some statistics looks for I/O, which usually takes a lot of time.
set statistics io on
Gather time statistics so you can see where the time is spent
set statistics time on
If you think the problem is with an index, here is how to see they are chosen
dbcc traceon(302) and dbcc traceon(310)
Best way to determine size of a table
sp_spaceused
This tool will take samples studying all areas of database activity
sp_sysmon
Sybase ASE has an advanced tool for studying statistics
optdiag
You can get additional network information using sp_monitor, here are some examples
sp_monitor
@@pack_sent - Number of packets sent by Adaptive Server
@@pack_received - Number of packets received
@@packet_errors - Number of errors
Solving the Problem:
Often the solution is to create an index
create index [name] on [table] ([column])
Clean-up tables that have been updated frequently
rebuild reorg
The optimizer needs up-to-date statistics
update statistics [table]
After updating statistics, consider re-compiling the stored procedures
sp_recompile [table]
There are also more advanced tuning techniques to turn options on and off, but consult an advanced DBA for those.
Wednesday, May 25, 2005
Investigating Locks
I had a colleague come across a situation where an investigation of a performance issue revealed blocking in the database, but wasn't sure how to investigate further.
The first step is usually to execute sp_who.
1> sp_who
2> go
This will list all the tasks and what they are doing. Some of them may be waiting on locks being use by another task. This task will also be in the list, and will helpfully include the user and the machine where this task originated
We can get more information about these tasks from the sysprocesses table, using those spids from sp_who. Here we get to see the name of blocking process, and the process ID on wherever it is being executed.
1> select * from sysprocesses where spid = [spid]
2> go
Before you get up and go to that machine and ask the user what he is doing with that application, you can execute sp_lock to get more information about the resource the task is blocking.
1> sp_lock
2> go
The results of sp_lock will reveal what kind of locks, and what object is being locked.
From there we look at sysobjects with the objectid from sp_lock.
1> select * from sysobjects where id = [objectid]
2> go
Now you know exactly:
1. Which task is blocking which
2. What those tasks are, including who is running them on which machine
3. Which resource is being blocked
That should give you the information you need to start thinking about how to avoid this problem.
The first step is usually to execute sp_who.
1> sp_who
2> go
This will list all the tasks and what they are doing. Some of them may be waiting on locks being use by another task. This task will also be in the list, and will helpfully include the user and the machine where this task originated
We can get more information about these tasks from the sysprocesses table, using those spids from sp_who. Here we get to see the name of blocking process, and the process ID on wherever it is being executed.
1> select * from sysprocesses where spid = [spid]
2> go
Before you get up and go to that machine and ask the user what he is doing with that application, you can execute sp_lock to get more information about the resource the task is blocking.
1> sp_lock
2> go
The results of sp_lock will reveal what kind of locks, and what object is being locked.
From there we look at sysobjects with the objectid from sp_lock.
1> select * from sysobjects where id = [objectid]
2> go
Now you know exactly:
1. Which task is blocking which
2. What those tasks are, including who is running them on which machine
3. Which resource is being blocked
That should give you the information you need to start thinking about how to avoid this problem.
Tuesday, May 24, 2005
Sybase ASE Glossary
Recently I had a client with a tricky database performance issue. So I read the Sybase Performance and Tuning Guide. It is an excellent document, but very long (1000 pages). I think it explains basically everything you need to know about performance and tuning:
http://sybooks.sybase.com/onlinebooks/group-as/asg1250e/ptallbk
While I read this guide, I made note of a number of confusing terms I came across, and what they meant. Here are a few of them, in my own words.
1. Cache hit ratio
All the data is stored on a disk, but a subset of the data is kept in a cache. This cache is quick to access, whereas trips to the disk are time consuming.
Put simply, the cache hit ratio tells you how often requested data was found in the fast cache (thus avoiding a trip to the disk).
2. Latching
Basically they're non-transactional locks, and for data and index pages only (thus they don't apply to APL below). They are held only briefly whereas locks persist.
3. Isolation level
This is a key concept in database theory, and has an ANSI-SQL standard. A quick google gives plenty of pages that explain what each one means, but here is a brief breakdown.
Low isolation level: less locking, which results in less overhead, but also increases the chance of two tasks using the same data and thus a greater chance of having incorrect data.
High isolation level: more locking, which results in more overhead, but also helps guarantee that your view of the data is completely correct.
4. Clustered index
A clustered index simply means the data is organised in order in the database. Therefore inserts can cause some overhead, but it is quicker to find data that is close together. Obviously you can only have a single clustered index on a table.
5. Row Forwarding
When you update a row with variable-length fields, you sometimes increase the amount of space that row requires. In some cases that row will no longer fit back into its original location on disk. If so, it is placed somewhere else, and a marker in its original location points to its new location.
There is never a chain of these forwards, the first one is simply updated every time it must move. There are DBA commands (see 'rebuild') that will rearrange things so that the rows can be restored to their proper locations.
6. APL/DOL
These are two locking models for tables. APL means "All Pages Locked" and DOL means "Data Only Locked." Those are good titles, but trust me, the concepts are much more complex than that. But basically, APL will lock data pages AND the index page, and DOL will not lock the index page.
7. Index covering
Imagine you have a table with many columns, but you have a common query that wants only a small subset of those columns. You could then create an index that includes that small subset. Now when you run the query, all the information you need is right in the index pages, saving you a trip to the table's pages. That approach, which is helpful in certain specific situations, is called index covering.
8. Checkpoint
The checkpoint commonly refers to the point in the cache where the pages are written back to disk. This is done for two reasons:
- Keep the disk current, and clean out the transaction log (which keeps track of the differences between the cache and the disk), and
- Make room for loading more pages into memory
When new pages are loaded or accessed, they are placed at the front of the cache list*. This pushes the other pages down the list, potentially past the checkpoint where they would be written to disk.
* Actually, there are strategies whereby some pages are placed at the back (and thus stay in the cache only briefly), and some pages receive preferential treatment to stay in the cache longer.
9. Buffer wash
The area past the aforementioned checkpoint is referred to as the buffer wash area. A buffer wash is the act of writing the dirty (updated) pages from the cache to the disk. Think of buffer as another word for the cache.
10. Free write
Sybase ASE has limited resource to access the disk. A free write is when such a resource exists. Sybase likes to use free writes for housekeeping activities, like updating system statistics and performing buffer washes. Think of free as another worked for available.
http://sybooks.sybase.com/onlinebooks/group-as/asg1250e/ptallbk
While I read this guide, I made note of a number of confusing terms I came across, and what they meant. Here are a few of them, in my own words.
1. Cache hit ratio
All the data is stored on a disk, but a subset of the data is kept in a cache. This cache is quick to access, whereas trips to the disk are time consuming.
Put simply, the cache hit ratio tells you how often requested data was found in the fast cache (thus avoiding a trip to the disk).
2. Latching
Basically they're non-transactional locks, and for data and index pages only (thus they don't apply to APL below). They are held only briefly whereas locks persist.
3. Isolation level
This is a key concept in database theory, and has an ANSI-SQL standard. A quick google gives plenty of pages that explain what each one means, but here is a brief breakdown.
Low isolation level: less locking, which results in less overhead, but also increases the chance of two tasks using the same data and thus a greater chance of having incorrect data.
High isolation level: more locking, which results in more overhead, but also helps guarantee that your view of the data is completely correct.
4. Clustered index
A clustered index simply means the data is organised in order in the database. Therefore inserts can cause some overhead, but it is quicker to find data that is close together. Obviously you can only have a single clustered index on a table.
5. Row Forwarding
When you update a row with variable-length fields, you sometimes increase the amount of space that row requires. In some cases that row will no longer fit back into its original location on disk. If so, it is placed somewhere else, and a marker in its original location points to its new location.
There is never a chain of these forwards, the first one is simply updated every time it must move. There are DBA commands (see 'rebuild') that will rearrange things so that the rows can be restored to their proper locations.
6. APL/DOL
These are two locking models for tables. APL means "All Pages Locked" and DOL means "Data Only Locked." Those are good titles, but trust me, the concepts are much more complex than that. But basically, APL will lock data pages AND the index page, and DOL will not lock the index page.
7. Index covering
Imagine you have a table with many columns, but you have a common query that wants only a small subset of those columns. You could then create an index that includes that small subset. Now when you run the query, all the information you need is right in the index pages, saving you a trip to the table's pages. That approach, which is helpful in certain specific situations, is called index covering.
8. Checkpoint
The checkpoint commonly refers to the point in the cache where the pages are written back to disk. This is done for two reasons:
- Keep the disk current, and clean out the transaction log (which keeps track of the differences between the cache and the disk), and
- Make room for loading more pages into memory
When new pages are loaded or accessed, they are placed at the front of the cache list*. This pushes the other pages down the list, potentially past the checkpoint where they would be written to disk.
* Actually, there are strategies whereby some pages are placed at the back (and thus stay in the cache only briefly), and some pages receive preferential treatment to stay in the cache longer.
9. Buffer wash
The area past the aforementioned checkpoint is referred to as the buffer wash area. A buffer wash is the act of writing the dirty (updated) pages from the cache to the disk. Think of buffer as another word for the cache.
10. Free write
Sybase ASE has limited resource to access the disk. A free write is when such a resource exists. Sybase likes to use free writes for housekeeping activities, like updating system statistics and performing buffer washes. Think of free as another worked for available.
Friday, May 20, 2005
Dynamic SQL
I got the following question this past week from a colleague.
QUESTION:
Is there a way to query a field by having it as a variable?
Example
Sybase isql prompt:
declare @var varchar(20)
@var="param_0"
select @var from MyTable where @var <> NULL
ANSWER:
Yes, it is possible. Incidentally, this is called "Dynamic SQL." That basically means any SQL code that is generated on the fly (ie. dynamically), as opposed to being previously prepared.
Here is how to achieve your result:
declare @c varchar(100)
declare @var varchar(20)
select @c = "select " + @var + " from MyTable where " + @var + " <> NULL"
exec (@c)
This only works in Sybase ASE 12.0 or greater.
There is also a way to do dynamic SQL in 11.x and earlier as well, using sp_remote_sql, but it is more complex.
I don't want to re-invent the wheel because this matter has been very well addressed by Rob Verschoor. Here is where to find his articles.
Simulating dynamic SQL in ASE
http://www.sypron.nl/dynsql.html
Simulating dynamic SQL through CIS
http://www.sypron.nl/dynsqlcis.html
There are plenty of other articles about "Dynamic SQL" in general. I just googled it and found a pretty reasonable treatment of the subject, which also covers SQL Injection, one of the security risks of Dynamic SQL.
The Curse and Blessings of Dynamic SQL
Erland Sommarskog
http://www.sommarskog.se/dynamic_sql.html
QUESTION:
Is there a way to query a field by having it as a variable?
Example
Sybase isql prompt:
declare @var varchar(20)
@var="param_0"
select @var from MyTable where @var <> NULL
ANSWER:
Yes, it is possible. Incidentally, this is called "Dynamic SQL." That basically means any SQL code that is generated on the fly (ie. dynamically), as opposed to being previously prepared.
Here is how to achieve your result:
declare @c varchar(100)
declare @var varchar(20)
select @c = "select " + @var + " from MyTable where " + @var + " <> NULL"
exec (@c)
This only works in Sybase ASE 12.0 or greater.
There is also a way to do dynamic SQL in 11.x and earlier as well, using sp_remote_sql, but it is more complex.
I don't want to re-invent the wheel because this matter has been very well addressed by Rob Verschoor. Here is where to find his articles.
Simulating dynamic SQL in ASE
http://www.sypron.nl/dynsql.html
Simulating dynamic SQL through CIS
http://www.sypron.nl/dynsqlcis.html
There are plenty of other articles about "Dynamic SQL" in general. I just googled it and found a pretty reasonable treatment of the subject, which also covers SQL Injection, one of the security risks of Dynamic SQL.
The Curse and Blessings of Dynamic SQL
Erland Sommarskog
http://www.sommarskog.se/dynamic_sql.html
Thursday, May 19, 2005
Propagating Session-Level Parameters
I received a question on parallelism, and even after reviewing sections 24 and 25 of the Sybase Performance Tuning Guide, it still called for a test
http://sybooks.sybase.com/onlinebooks/group-as/asg1250e/ptallbk
I have a stored procedure that calls 20 others. I want to set the parallel degree to 1 for this stored procedure, and everything it calls. This works fine if I set it in isql before calling the stored procedure. However, normally this stored procedure is integrated into our application, so I set the parallel degree to 1 inside this stored procedure and called it normally. There was some question
whether that would affect only the main stored procedure and not propogate to the procedures that were called.
The answer is that yes, the parallel degree set by a main procedure does trickle to any other stored procedures that are called. I imagine that would be the same for all parameters. Have a look.
Here are the results of my test:
*** main procedure sp_testparadeg1 has parallel degree 0 ***
*** set parallel degree to 1 ***
*** After being set to 1, parallel degree is 1 ***
*** sub-procedure sp_testparadeg2 has parallel degree 1 ***
(return status = 0)
*** Deleting proc dbo.sp_testparadeg1 ***
*** Deleting proc dbo.sp_testparadeg2 ***
Here is the test:
dump tran sybsystemprocs with truncate_only
go
use sybsystemprocs
go
if exists (select * from sysobjects where name = "sp_testparadeg1" and type = 'P')
drop proc sp_testparadeg1
go
if exists (select * from sysobjects where name = "sp_testparadeg2" and type = 'P')
drop proc sp_testparadeg2
go
-- create test procedures
create procedure sp_testparadeg2
as
print '*** sub-procedure sp_testparadeg2 has parallel degree %1! ***', @@parallel_degree
go
create procedure sp_testparadeg1
as
print '*** main procedure sp_testparadeg1 has parallel degree %1! ***', @@parallel_degree
print '*** set parallel degree to 1 ***'
set parallel_degree 1
print '*** After being set to 1, parallel degree is %1! ***', @@parallel_degree
exec sp_testparadeg2
go
if object_id('sp_testparadeg1') is not null
begin
print '*** Created procedure dbo.sp_testparadeg1 ***'
grant execute on dbo.sp_testparadeg1 to public
end
else
begin
print '*** Failed creating proc dbo.sp_testparadeg1 ***'
end
go
if object_id('sp_testparadeg2') is not null
begin
print '*** Created procedure dbo.sp_testparadeg2 ***'
grant execute on dbo.sp_testparadeg2 to public
end
else
begin
print '*** Failed creating proc dbo.sp_testparadeg2 ***'
end
go
-- do your tests
sp_configure 'max parallel degree'
go
exec sp_testparadeg1
go
-- clean up
print '*** Deleting proc dbo.sp_testparadeg1 ***'
if exists (select * from sysobjects where name = "sp_testparadeg1" and type = 'P')
drop proc sp_testparadeg1
go
print '*** Deleting proc dbo.sp_testparadeg2 ***'
if exists (select * from sysobjects where name = "sp_testparadeg2" and type = 'P')
drop proc sp_testparadeg2
go
http://sybooks.sybase.com/onlinebooks/group-as/asg1250e/ptallbk
I have a stored procedure that calls 20 others. I want to set the parallel degree to 1 for this stored procedure, and everything it calls. This works fine if I set it in isql before calling the stored procedure. However, normally this stored procedure is integrated into our application, so I set the parallel degree to 1 inside this stored procedure and called it normally. There was some question
whether that would affect only the main stored procedure and not propogate to the procedures that were called.
The answer is that yes, the parallel degree set by a main procedure does trickle to any other stored procedures that are called. I imagine that would be the same for all parameters. Have a look.
Here are the results of my test:
*** main procedure sp_testparadeg1 has parallel degree 0 ***
*** set parallel degree to 1 ***
*** After being set to 1, parallel degree is 1 ***
*** sub-procedure sp_testparadeg2 has parallel degree 1 ***
(return status = 0)
*** Deleting proc dbo.sp_testparadeg1 ***
*** Deleting proc dbo.sp_testparadeg2 ***
Here is the test:
dump tran sybsystemprocs with truncate_only
go
use sybsystemprocs
go
if exists (select * from sysobjects where name = "sp_testparadeg1" and type = 'P')
drop proc sp_testparadeg1
go
if exists (select * from sysobjects where name = "sp_testparadeg2" and type = 'P')
drop proc sp_testparadeg2
go
-- create test procedures
create procedure sp_testparadeg2
as
print '*** sub-procedure sp_testparadeg2 has parallel degree %1! ***', @@parallel_degree
go
create procedure sp_testparadeg1
as
print '*** main procedure sp_testparadeg1 has parallel degree %1! ***', @@parallel_degree
print '*** set parallel degree to 1 ***'
set parallel_degree 1
print '*** After being set to 1, parallel degree is %1! ***', @@parallel_degree
exec sp_testparadeg2
go
if object_id('sp_testparadeg1') is not null
begin
print '*** Created procedure dbo.sp_testparadeg1 ***'
grant execute on dbo.sp_testparadeg1 to public
end
else
begin
print '*** Failed creating proc dbo.sp_testparadeg1 ***'
end
go
if object_id('sp_testparadeg2') is not null
begin
print '*** Created procedure dbo.sp_testparadeg2 ***'
grant execute on dbo.sp_testparadeg2 to public
end
else
begin
print '*** Failed creating proc dbo.sp_testparadeg2 ***'
end
go
-- do your tests
sp_configure 'max parallel degree'
go
exec sp_testparadeg1
go
-- clean up
print '*** Deleting proc dbo.sp_testparadeg1 ***'
if exists (select * from sysobjects where name = "sp_testparadeg1" and type = 'P')
drop proc sp_testparadeg1
go
print '*** Deleting proc dbo.sp_testparadeg2 ***'
if exists (select * from sysobjects where name = "sp_testparadeg2" and type = 'P')
drop proc sp_testparadeg2
go
Wednesday, May 18, 2005
Variables in Cursors
You can use variables in cursor declarations, along these lines:
1. declare a cursor, using a variable
2. open that cursor, and use it
3. close that cursor
4. change the variable used in the declaration
5. open the cursor, and use it
6. close that cursor
(repeat 4-6 as you wish)
Here is an example.
dump tran sybsystemprocs with truncate_only
go
use sybsystemprocs
go
-- prepare the tests
if exists (select * from sysobjects where name = "TableA" and type = 'U')
drop table TableA
go
create table TableA (a int, b int)
go
insert into TableA values (1, 1)
go
insert into TableA values (2, 2)
go
if exists (select * from sysobjects where name = "sp_testcursor" and type = 'P')
drop proc sp_testcursor
go
-- create test procedure
create procedure sp_testcursor
as
declare @acol int
declare @bcol int
-- Step #1: create a cursor on a variable, and fetch
select @bcol = 1
declare var_cursor cursor for select a from TableA where b = @bcol for read only
open var_cursor
fetch var_cursor into @acol
while @@sqlstatus = 0
begin
print '*** Cursor fetched %1!, Expecting 1 ***', @acol
fetch var_cursor into @acol
end
close var_cursor
-- Step #2: change the variable, re-open the cursor, and fetch
select @bcol = 2
open var_cursor
fetch var_cursor into @acol
while @@sqlstatus = 0
begin
print '*** Cursor fetched %1!, Expecting 2 ***'', @acol
fetch var_cursor into @acol
end
go
if object_id('sp_testcursor') is not null
begin
print '*** Created procedure dbo.sp_testcursor ***'
grant execute on dbo.sp_testcursor to public
end
else
begin
print '*** Failed creating proc dbo.sp_testcursor ***'
end
go
-- do your tests
exec sp_testcursor
go
-- clean up
print '*** Deleting proc dbo.sp_testcursor ***'
if exists (select * from sysobjects where name = "sp_testcursor" and type = 'P')
drop proc sp_testcursor
go
print '*** Dropping table tableA ***'
drop table TableA
go
1. declare a cursor, using a variable
2. open that cursor, and use it
3. close that cursor
4. change the variable used in the declaration
5. open the cursor, and use it
6. close that cursor
(repeat 4-6 as you wish)
Here is an example.
dump tran sybsystemprocs with truncate_only
go
use sybsystemprocs
go
-- prepare the tests
if exists (select * from sysobjects where name = "TableA" and type = 'U')
drop table TableA
go
create table TableA (a int, b int)
go
insert into TableA values (1, 1)
go
insert into TableA values (2, 2)
go
if exists (select * from sysobjects where name = "sp_testcursor" and type = 'P')
drop proc sp_testcursor
go
-- create test procedure
create procedure sp_testcursor
as
declare @acol int
declare @bcol int
-- Step #1: create a cursor on a variable, and fetch
select @bcol = 1
declare var_cursor cursor for select a from TableA where b = @bcol for read only
open var_cursor
fetch var_cursor into @acol
while @@sqlstatus = 0
begin
print '*** Cursor fetched %1!, Expecting 1 ***', @acol
fetch var_cursor into @acol
end
close var_cursor
-- Step #2: change the variable, re-open the cursor, and fetch
select @bcol = 2
open var_cursor
fetch var_cursor into @acol
while @@sqlstatus = 0
begin
print '*** Cursor fetched %1!, Expecting 2 ***'', @acol
fetch var_cursor into @acol
end
go
if object_id('sp_testcursor') is not null
begin
print '*** Created procedure dbo.sp_testcursor ***'
grant execute on dbo.sp_testcursor to public
end
else
begin
print '*** Failed creating proc dbo.sp_testcursor ***'
end
go
-- do your tests
exec sp_testcursor
go
-- clean up
print '*** Deleting proc dbo.sp_testcursor ***'
if exists (select * from sysobjects where name = "sp_testcursor" and type = 'P')
drop proc sp_testcursor
go
print '*** Dropping table tableA ***'
drop table TableA
go
Tuesday, May 17, 2005
Scoping in T-SQL
Is there scoping in T-SQL? Actually ... no, not really.
The term "scope" refers to the time or place that the value of a declared variable is accessible. Normally variables declared in a loop, for instance, are not available outside that loop.
In T-SQL, a local variable is available basically anywhere in the stored procedure. There are also global variables, which endure beyond the session and can be referenced by other procedures or triggers. So I suppose the variables are scoped, but only by stored procedure/trigger.
The situation gets slightly more complex when it involves cursors, but the idea is still the same. For example, here is a quote from the Sybase manual:
"If a declare cursor statement is part of a stored procedure or trigger, the cursor created within it applies to that scope and to the scope that launched the stored procedure or trigger. However, cursors declared inside a trigger on an inserted or a deleted table are not accessible to any nested stored procedures or triggers. Such cursors are accessible within the scope of that trigger. Once the stored procedure or trigger completes, Adaptive Server deallocates the cursors created within it. "
Here is a test that demonstrates the relative non-scoping of T-SQL variables.
dump tran sybsystemprocs with truncate_only
go
use sybsystemprocs
go
if exists (select * from sysobjects where name = "sp_testscope" and type = 'P')
drop proc sp_testscope
go
-- create test procedure
create procedure sp_testscope
as
-- Step 1: Create a variable inside
if (1=1)
begin
declare @scoped_var int
select @scoped_var = 1
print "*** Value of @scoped_var is %1! ***", @scoped_var
end
-- Step 2: Try to use it outside the scope
print "*** Value of @scoped_var is %1! ***", @scoped_var
select @scoped_var = 2
print "*** Value of @scoped_var is %1! ***", @scoped_var
-- Step 3: Try to declare one that will hide the other (this returns an error)
-- declare @scoped_var int
-- select @scoped_var = 3
-- print "*** Value of @scoped_var is %1! ***", @scoped_var
go
-- grant execute
if object_id('sp_testscope') is not null
begin
print '*** Created procedure dbo.sp_testscope ***'
grant execute on dbo.sp_testscope to public
end
else
begin
print '*** Failed creating proc dbo.sp_testscope ***'
end
go
-- do the tests
exec sp_testscope
go
-- clean up
print '*** Deleting proc dbo.sp_testscope ***'
if exists (select * from sysobjects where name = "sp_testscope" and type = 'P')
drop proc sp_testscope
go
Monday, May 16, 2005
NULLs in Sybase ASE
Here is where my story begins. For potential performance reasons, I was trying to change a "where" clause that looked like this:
ta.b & isnull (tb.b, ta.b) = isnull (tb.b, ta.b)
to one that looked like this:
ta.b & tb.b = tb.b
(ta and tb are different tables)
But I wasn't sure if this would work if tb.b was null. Would ta.b & null = null? So I did this test:
declare @a int
declare @b int
select @a = null
select @b = 1
select @b where (@b & @a = @a)
go
-- This returns 1 result.
This means @b & @a = @b is true if @a is null. So I thought I could go ahead. But I was wrong! Observe:
create table TableA (a int, b int null)
insert into TableA values (1, 1)
create table TableB (a int, b int null)
insert into TableB values (1, null)
select ta.b from TableA ta, TableB tb
ta.b & isnull (tb.b, ta.b) = isnull (tb.b, ta.b)
go
-- This returns 1 result
select ta.b from TableA ta, TableB tb
ta.b & tb.b = tb.b
go
-- This returns 0 results
What is the explanation for this?
The quick answer is that Sybase ASE treats NULLs differently in search clauses (the first case) than in join clauses (the second case, when you use a "where").
Search clauses: NULL = NULL
Join clauses: NULL = nothing, not even itself
Where does this come from? Oddly enough, from the ANSI SQL-92 Standard itself. The following quote is from Sybooks (look for the description of the "ansinull" configuration parameter):
"The SQL standard requires that if either one of the two operands of an equality comparison is NULL, the result is UNKNOWN. Transact-SQL treats NULL values differently. If one of the operands is a column, parameter, or variable, and the other operand is the NULL constant or a parameter or variable whose value is NULL, the result is either TRUE or FALSE."
Incidentally, "ansinull" is a Sybase ASE configuration parameter that will toggle the treatment of NULLs from ANSI to Sybase ASE.
ANSI: NULL = NULL is NULL, which is neither TRUE nor FALSE, it is UNKNOWN
Sybase ASE: NULL = NULL is TRUE
For example, consider TableB above and run this query:
select * from TableB where b <> "1"
ANSINULL = ON: This returns 0 results (ie. not the NULL row)
ANSINULL = OFF: This returns 1 result (ie. the NULL row)
If you want to get a result, but don't want to mess with ANSINULL, here are two approaches:
select * from TableB where b != 1 or b IS NULL
select * from TableB where isnull(b,0) != 1
Also make note that in Sybase ASE, an empty string doesn't exist, because that is the string representation of NULL.
So what is a guy to do?
The answer is to remember that NULLs don't always behave as you'd expect, and to follow these guidelines:
1. Use functions like ISNULL and NULLIF, and even COALESCE
2. Don't use = or != with NULL, use 'IS NULL' and 'IS NOT NULL' instead
ISNULL: Takes two values, using the second one if and only if the first one is NULL
NULLIF: Takes an expression and a value, returns NULL if the expression is true, otherwise the value
COALESCE: Takes a list of at least 2 values, and returns the first one that is NOT NULL
ta.b & isnull (tb.b, ta.b) = isnull (tb.b, ta.b)
to one that looked like this:
ta.b & tb.b = tb.b
(ta and tb are different tables)
But I wasn't sure if this would work if tb.b was null. Would ta.b & null = null? So I did this test:
declare @a int
declare @b int
select @a = null
select @b = 1
select @b where (@b & @a = @a)
go
-- This returns 1 result.
This means @b & @a = @b is true if @a is null. So I thought I could go ahead. But I was wrong! Observe:
create table TableA (a int, b int null)
insert into TableA values (1, 1)
create table TableB (a int, b int null)
insert into TableB values (1, null)
select ta.b from TableA ta, TableB tb
ta.b & isnull (tb.b, ta.b) = isnull (tb.b, ta.b)
go
-- This returns 1 result
select ta.b from TableA ta, TableB tb
ta.b & tb.b = tb.b
go
-- This returns 0 results
What is the explanation for this?
The quick answer is that Sybase ASE treats NULLs differently in search clauses (the first case) than in join clauses (the second case, when you use a "where").
Search clauses: NULL = NULL
Join clauses: NULL = nothing, not even itself
Where does this come from? Oddly enough, from the ANSI SQL-92 Standard itself. The following quote is from Sybooks (look for the description of the "ansinull" configuration parameter):
"The SQL standard requires that if either one of the two operands of an equality comparison is NULL, the result is UNKNOWN. Transact-SQL treats NULL values differently. If one of the operands is a column, parameter, or variable, and the other operand is the NULL constant or a parameter or variable whose value is NULL, the result is either TRUE or FALSE."
Incidentally, "ansinull" is a Sybase ASE configuration parameter that will toggle the treatment of NULLs from ANSI to Sybase ASE.
ANSI: NULL = NULL is NULL, which is neither TRUE nor FALSE, it is UNKNOWN
Sybase ASE: NULL = NULL is TRUE
For example, consider TableB above and run this query:
select * from TableB where b <> "1"
ANSINULL = ON: This returns 0 results (ie. not the NULL row)
ANSINULL = OFF: This returns 1 result (ie. the NULL row)
If you want to get a result, but don't want to mess with ANSINULL, here are two approaches:
select * from TableB where b != 1 or b IS NULL
select * from TableB where isnull(b,0) != 1
Also make note that in Sybase ASE, an empty string doesn't exist, because that is the string representation of NULL.
So what is a guy to do?
The answer is to remember that NULLs don't always behave as you'd expect, and to follow these guidelines:
1. Use functions like ISNULL and NULLIF, and even COALESCE
2. Don't use = or != with NULL, use 'IS NULL' and 'IS NOT NULL' instead
ISNULL: Takes two values, using the second one if and only if the first one is NULL
NULLIF: Takes an expression and a value, returns NULL if the expression is true, otherwise the value
COALESCE: Takes a list of at least 2 values, and returns the first one that is NOT NULL