Friday, January 27, 2006


Join Optimization

Tuning queries is a required skill for a Sybase specialist. One of the many ways to tune a query is to make sure that the very best query plan is being used. One clear advantage one plan may have over another is the join order that it uses when querying several tables. This advantage can be based on such factors as the where clauses, the number of rows in the tables, how they are stored, and the available indexes.

The Sybase query optimizer is in charge of evaluating the available join orders and choosing the best one. You know that we can determine what method it used using showplan:

set showplan on

But did you also know that we can use certain DBA tracing parameters to see the various plans it evaluated, and the estimated costs of each one? Here's how:

dbcc traceon(3604, 302, 310, 317)

Incidentally, the 3604 sends the output to the monitor, or to a specified file. That prevents it from filling up your error log (3605).

You may already be familiar with 302, but 310 and 317 are the trace commands that can tell you which join orders were considered and rejected, and what the costs were. Together they'll show you ALL the plans, so be prepared for a lot of output. Otherwise use 310 by itself.

There is a way to force the optimizer to choose a certain plan. Specifically, the join order specified by the query itself, in the "from" clause.

set forceplan [on|off]

The advantage is that it saves you the time the optimizer would have spent choosing a join order. The bad news is, that time is negligible anyway.

The disadvantage is that there is usually a reason the optimizer chooses its plan, and it can vary from execution to execution depending on the data and the variables of the query. Instead of forcing a plan you think is best, you may want to consider why the optimizer is choosing the "wrong" plan. That's where the aforementioned tracing can help.

As a quick fix, though, make sure your statistics and stored procedures are up to date! The optimizer depends on its accuracy for the best results.

update statistics table_name [index_name]
exec proc_name with recompile.

There is much more to know about join optimization than I can do justice to in this short article. If you find yourself examining join optimization as part of your query tuning, you'll definitely want to review Sybase's Performance and Tuning Guide, Chapter 38. It talks about dbcc 302, 310, 317 as well as the principles of join optimization.

I also found a really good article from Sybase entitled "Analyzing and Resolving Optimizer Problems/Symptoms":

G'day Rob,

I am digging into the troika of #temp tables (with indexes), statistics and "with recompile". I was trying to capiche exactly the sequence of events that occur around these when shaken not stirred.

I have a number of sprocs that produce v.large #temp datasets that I am endeavouring to ensure will present optimal query plans (as you do). So far I am using a mixture of index hints and force plans, but wanted a more comprehensive general approach.

I have update stats statements inline after the #tables have been filled with data but I assume the plans have been formed by then and won't be affected by the histograms?

If I apply "with recompile" on the execs I expect I will get new plans for each statement before entering the sproc which is before the stats are populated, or will each statement be compiled just prior to execution? The latter case I assume would see the fresh stats included in the plan.

Are there any sources you know of on this topic? Having googled to your blog my next stop is usenet.

Cheers, Frank.

That's a stumper. I don't know off the top of my head and it didn't jump out after a quick survey of the Sybase Performance & Tuning Guide (the best source of info on these matters, I think).

I think you're heading the right direction with the right approach.

I think we'd have to put together some tests to answer your questions about the effect of with_recompile. Luckily the information we'd need to figure out the behaviour is queryable at run-time provided you're sa. I don't have time to put those tests together right now, but I'll make a note to cover this interesting topic shortly.

In the meantime, I would consider avoiding the #temp tables, and use permanent temporary tables (at least as a preliminary exercise).

After understanding the data better, I'd consider ways to figure out and force the best plan dynamically.

Sorry, that's all I have at the top of my head at the end of a day Friday!


Did you ever get any where with setting up tests to find more about this.

This comment has been removed by a blog administrator.
Post a Comment

<< Home

This page is powered by Blogger. Isn't yours?