SQL Major Concepts

 

SQL MAJOR.doc (190,5 kB)

 

 

CASE and COALESCE

Perhaps the most commonly used function, which is classified as a system function, is CASE. There are two general uses of CASE. The first one is used to replace occurrences of one value with other values, as specified by the programmer. Syntax for this flavor of CASE is as follows:

SELECT column_name = CASE
WHEN column_name = 'a' THEN 'b'
...
ELSE 'c'

END

 

For instance, suppose that we want to specify the salary level for each job category within my publishing company. Using the pubs database, we could write the following query:

SELECT job_desc, salary_level =

CASE

WHEN job_desc = 'New Hire - Job Not Specified' THEN '25K'

WHEN job_desc = 'Chief Executive Officer' THEN '500K'

WHEN job_desc = 'Business Operations Manager' THEN '350K'
WHEN job_desc = 'Chief Financial Officier' THEN '400K'
WHEN job_desc = 'Publisher' THEN '75K'
WHEN job_desc = 'Managing Editor' THEN '65K'
WHEN job_desc = 'Marketing Manger' THEN '55K + commissions'
ELSE 'you get the idea'
END

FROM jobs      

 

Results:

job_desc

salary_level

New Hire—job not specified

25K

Chief Executive Officer

500K

Business Operations Manager

350K

Chief Financial Officer

400K

Publisher

75K

Managing Editor

65K

Marketing Manager

You get the idea

Public Relations Manager

You get the idea

 


The other variation of CASE, which is sometimes referred to as the searched CASE, evaluates a Boolean expression and returns different values accordingly. For instance, we could use the searched CASE to categorize the price of titles as cheap, affordable, expensive, or unknown, as follows:

 

SELECT title_id, price,  category= CASE

       WHEN price IS NULL THEN 'unknown'

       WHEN price < = 7 THEN 'cheap'

       WHEN price BETWEEN 7.1 AND 15 THEN 'affordable'

       ELSE 'expensive'

       END

FROM titles

 

Results:

title_id

price

Category

BU1032

19.99

Expensive

BU1111

11.95

Affordable

BU2075

2.99

Cheap

BU7832

19.99

Expensive

MC2222

19.99

Expensive

MC3021

2.99

Cheap

MC3026

NULL

Unknown

PC1035

22.95

Expensive

PC8888

20

Expensive

PC9999

NULL

Unknown

PS1372

21.59

Expensive

PS2091

10.95

Affordable

PS2106

7

Cheap

PS3333

19.99

Expensive

PS7777

7.99

Affordable

TC3218

20.95

Expensive

TC4203

11.95

Affordable

TC7777

14.99

Affordable

 

The COALESCE function hardly qualifies as a system function because it doesn't retrieve any system information. COALESCE simply returns the first value out of a list that is not NULL. COALESCE is a powerful tool if you are returning numerous values to a user and want to substitute occurrences of NULL with values from a different column or with an expression.

For instance, suppose that we need to return a list of cities and states. If a publisher is located in a country that does not have a state, we'll substitute a NULL value with an expression:

SELECT

        city,

        COALESCE(state, 'no state specified') AS state

FROM publishers

 

 

Results:

City

State

Boston

MA

Washington

DC

Berkeley

CA

Chicago

IL

Dallas

TX

München

no state specified

New York

NY

Paris

no state specified

 

The only requirement of COALESCE is that all expressions in the list must have compatible data types. For instance, you can't coalesce DATETIME and INTEGER. Notice that the COALESCE function can accept multiple values.

ISNULL, NULLIF, and GETANSINULL

The ISNULL function is similar to COALESCE, but accepts only two parameters. The first parameter will be checked, and if NULL value is found, it will be replaced with the second parameter. Furthermore, ISNULL requires that both parameters have the same (not just similar) data type.

For instance, we can return 0 instead of NULL for titles that do not have any royalties associated with them, as follows:

SELECT

TOP 10

title_id,

ISNULL(royalty, 0) AS royalty

FROM titles

Results:

title_id     royalty    

--------     -----------

BU1032          10

BU1111          10

BU2075          24

BU7832          10

MC2222          12

MC3021          24

MC3026           0

PC1035          16

PC8888          10

PC9999           0

 

The NULLIF function returns a NULL value if the two parameters it accepts are equivalent. NULLIF can be thought of as an opposite of ISNULL; for instance, we could use NULLIF if we wanted to return NULLs whenever royalty is 10:

SELECT

TOP 10

      title_id,

      NULLIF(royalty, 10) AS royalty

FROM titles

Results:

title_id     royalty    

--------     -----------

BU1032         NULL

BU7832         NULL

MC2222           12

MC3021           24

MC3026         NULL

PC1035           16

The GETANSINULL function provides a quick way of checking whether column nullability is determined according to the ANSI 92 standard. (I have not found much use for this function; you can refer to online documentation if you want to learn more about GETANSINULL).

CAST and CONVERT

The CAST and CONVERT functions are very similar: Both translate a value from one data type to another. Although their performance is also similar, their syntax and potential usage is slightly different.

Both CAST and CONVERT have a straightforward syntax:

CAST(expression AS new_data_type)

 

CONVERT(new_data_type, expression, [style])

 

The expression must already have a data type that is translatable into the new_data_type. For instance, you can't convert an alphanumeric string into an integer.

NOTE

CONVERT has an optional parameter: style. This parameter is allowed only for cases when working with date and time values. SQL Server supports numerous formats for presenting date and time values; the style parameter is used to specify such format.

For example, suppose that we want to retrieve order dates from the sales table. However, we don't care about the time portion; all we need to know is the order date. We could use either CAST or CONVERT to do this, as in the following queries:

SELECT TOP 1 CAST(ord_date AS VARCHAR(12)) FROM sales

 

or

SELECT TOP 1 CONVERT(VARCHAR(12), ord_date, 109) FROM sales

 

Both return the same results:

------------

Sep 14 1994

 

In this example, we retrieved a date value in its default format (on my server). In fact, CONVERT(VARCHAR(12), ord_date) would bring the same result. Now, suppose that we need a date value in which month, day, and year are separated by dashes. In such a case, we have no choice but to resort to CONVERT with the style 110:

 

SELECT TOP 1 CONVERT(VARCHAR, ord_date, 110) FROM sales

Results:

------------

09-14-1994

 

 

IDENTITY- and GUID-Related Functions

The @@IDENTITY, IDENTITY, SCOPE_IDENTITY, IDENT_CURRENT(), and NEWID() functions deal with IDENTITIY values or globally unique identifiers.

The NEWID() function must be used if you want to provide a default value for a column with the UNIQUEIDENTIFIER data type. It returns a new GUID each time it is executed.

The IDENTITY function has a limited use; in rare cases, when you use SELECT…INTO syntax you can supply identity values for the newly created table using the IDENTITY function. For instance, suppose we want to add an identity column to the sales table (within a temporary table). We could use the following statement to copy all rows from sales table into #new_sales and add an identity column, all in one shot:

SELECT IDENTITY(INT, 1,1) AS sales_key, *

INTO #new_sales

FROM sales

The other three IDENTITY-related functions deserve more attention. You will often need to populate multiple related tables, perhaps within a single transaction. For instance, you could be populating the order and order_details tables in one transaction. If the order table has an identity column, you'll have to look up the identity value just inserted into the order table before you can add a related record in order_details. The @@IDENTITY, SCOPE_IDENTITY(), and IDENT_CURRENT() functions help you look up the last identity value inserted, but their behavior is slightly different from each other, as follows:


 

·         @@IDENTITY returns the last IDENTITY value inserted on the current connection. Suppose that you have an INSERT trigger on the order table that populates the audit_trail table, which also has an IDENTITY column. In such a case, the @@IDENTITY function will return the last identity value inserted, which would be the identity inserted in the audit_trail table instead of the identity value added to the order table. Therefore, if you try populating order_details with a value returned by the @@IDENTITY function, your data integrity will be compromised.

·         The IDENT_CURRENT() function accepts a table name as the parameter and returns the last identity value generated in that table by any connection. If you're trying to populate order_details with the last identity value inserted into the order table, you might think that IDENT_CURRENT('order') would suffice. Well, it would if you were the only user of the system, but it is possible that some other user might add a row to the order table a few milliseconds after you added the row to the same table. Therefore, using IDENT_CURRENT() in a multiuser system might also compromise your data integrity.

·         The SCOPE_IDENTITY() function takes no parameters and returns the last identity value inserted within the current scope. What this means is that if an INSERT statement populating the order table fires a trigger and adds a row to the audit_trail table, SCOPE_IDENTITY() will return the last value added to the order table, whereas @@IDENTITY will return the last value added to audit_trail. These functions have appropriate uses, but they're often used incorrectly

 

USER- and HOST-Related Functions

The CURRENT_USER, SYSTEM_USER, USER_NAME(), and SESSION_USER functions return information about the current user of the system. The output of these functions is useful for enforcing security—you can allow reading or modifying certain data only to authorized users. These functions are also handy when you want to create an audit trail of data changes: Every time someone modifies the data, you can store the user name in the updated_by column, or perhaps even insert a row in the audit tail table.


 

CURRENT_USER, SESSION_USER and USER_NAME() are equivalent—all three return the current database user name. Notice, however, that the USER_NAME() function accepts a user_id as the parameter, defaulting to the current user if no parameter is passed. All these functions can also be successfully used for auditing purposes. USER_NAME can also be used to look up a database user name by its identifier.

SYSTEM_USER returns the SQL Server or Windows login name, depending on the authentication method used.

HOST_ID() and HOST_NAME() return workstation identifier and name, respectively. They also can be used to troubleshoot problems specific to a particular computer.

 

Data Type Validation Functions

The ISDATE and ISNUMERIC functions are similar to each other: They let you know whether the parameter passed is of a valid data type (date and time data type in case of ISDATE and numeric data type in case of ISNUMERIC). Both return a BIT value, as in the following:

SELECT

ISDATE('february 39, 2002') AS 'february 39, 2002',

    ISDATE('1/1/2002') AS '1/1/2002',

    ISNUMERIC('abc') AS 'abc',

    ISNUMERIC('123') AS '123'

Results:

february 39, 2002

1/1/2002

abc

123

0

1

0

1

 


Other System Functions

The CURRENT_TIMESTAMP function works exactly the same way as GETDATE(): It returns current date and time. Oddly enough, CURRENT_TIMESTAMP is classified as a system function, whereas GETDATE falls into the date and time category.

The DATALENGTH function is similar to the LEN function, which returns the length of a particular string expression. DATALENGTH returns the number of bytes used to represent an expression of any data type.

The @@TRANCOUNT function returns the number of open transactions on a particular connection. If you experience locking problems or SQL Server complains about transactions being opened, you can check the value of @@TRANCOUNT to troubleshoot such issues. @@TRANCOUNT can also be used for error handling; if @@TRANCOUNT returns anything other than 0, something must have gone wrong and you have uncommitted transactions. (Refer to my article about transactions and locking for more information on this topic.)

The @@ERROR function returns the number of the last error encountered on the current connection. If there are no errors, @@ERROR returns 0. Not surprisingly, @@ERROR is used for error handling.

The @@ROWCOUNT function returns the number of rows affected by your last query. This function can be used effectively to find out whether the number of rows modified is the same as what you intended to modify. If the two values are different, something went wrong.

The ROWCOUNT_BIG() function does the same thing as @@ROWCOUNT, but returns a BIGINT instead of an INT data type.

NOTE

It's easy to confuse the functionality of ROWCOUNT and @@ROWCOUNT. The former advises SQL Server to affect only a specified number of rows (similar to the TOP keyword); the latter simply counts the number of rows affected, as shown here:

/* first limit the output to 2 rows */

SET ROWCOUNT 2

 

/* this query will affect only 2 rows */

SELECT title_id, title FROM titles

 

/* now use functions to count the number of

affected rows */

SELECT @@ROWCOUNT AS '@@rowcount_output',

       ROWCOUNT_BIG() AS 'rowcount_big_output'

Results:

title_id       title                                                                            

--------     ---------------------------------------------------------------

PC1035       But Is It User Friendly?

PS1372       Computer Phobic AND Non-Phobic Individuals: Behavior Variations

 

@@rowcount_output     rowcount_big_output 

-----------------     --------------------

2                     2

The APP_NAME() function returns a string with the name of the application that initiated the database connection. APP_NAME() can be helpful if you're troubleshooting a connection and want to know which app initiated the offending process.

>> Download document for full notes.