Saturday 5 December 2009

Query XML column in a table as record set in SQL Server 2005 and above

Since SQL Server 2005, running SELECT queries against XML columns has become a lot easier. SQL Server 2005 and 2008 introduced several XML functions which can be used to extract XML data as a record set and even apply filters to it. Following are the four XML data type methods, each of which takes an XQuery expression and returns either part of the XML or a value (a short sketch follows the list).

  1. query() – Returns part of the XML from the XML data
  2. value() – Returns the value of an XML element or attribute
  3. exist() – Returns 1 if the XQuery expression evaluates to a non-empty node list, otherwise 0
  4. nodes() – Returns a rowset of nodes matching the XQuery expression it takes; it is typically used together with value() and exist()
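
As a quick illustration (a minimal sketch of my own, not tied to the table used later in this post), here is how the four methods behave against a standalone XML variable:

DECLARE @x XML
SET @x = '<Orders><Order Id="1010" PaymentMethod="CC" /></Orders>'

-- query() returns an XML fragment, value() a scalar, exist() a 1/0 flag
SELECT @x.query('/Orders/Order') AS OrderFragment,
@x.value('(/Orders/Order/@Id)[1]', 'INT') AS OrderId,
@x.exist('/Orders/Order[@PaymentMethod="CC"]') AS PaidByCard

-- nodes() shreds the XML into rows which value() can then read per node
SELECT o.ord.value('@Id', 'INT') AS OrderId
FROM @x.nodes('/Orders/Order') o(ord)
GO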

Here I'm going to use the nodes() and value() methods to query an XML column in a CustomerOrderHistory table. The table has an OrderHistory column of the XML data type which holds historical customer order details. Here is the table structure and the data within it:

CREATE TABLE CustomerOrderHistory
(
CustomerId INT IDENTITY(1, 1)
PRIMARY KEY CLUSTERED,
CustomerName VARCHAR(50),
OrderHistory XML
)
GO

INSERT INTO CustomerOrderHistory
(
CustomerName,
OrderHistory
)
VALUES (
'William Strong',
'<Orders>
<Order Id="1010" OrderDate="11-05-2009" PaymentMethod="CC">
<OrderDetails>
<OrderItem ProductId="102030" Quantity="20" />
<OrderItem ProductId="102230" Quantity="10" />
<OrderItem ProductId="102045" Quantity="4" />
</OrderDetails>
</Order>
<Order Id="2312" OrderDate="11-15-2009" PaymentMethod="DC">
<OrderDetails>
<OrderItem ProductId="102343" Quantity="25" />
<OrderItem ProductId="102233" Quantity="15" />
<OrderItem ProductId="102445" Quantity="42" />
</OrderDetails>
</Order>
</Orders>'
),
(
'Andy Clarke',
'<Orders>
<Order Id="1232" OrderDate="11-05-2009" PaymentMethod="CC">
<OrderDetails>
<OrderItem ProductId="123231" Quantity="34" />
<OrderItem ProductId="123211" Quantity="12" />
<OrderItem ProductId="454345" Quantity="64" />
</OrderDetails>
</Order>
<Order Id="3424" OrderDate="11-15-2009" PaymentMethod="Cash">
<OrderDetails>
<OrderItem ProductId="542342" Quantity="15" />
<OrderItem ProductId="234234" Quantity="65" />
<OrderItem ProductId="234222" Quantity="62" />
</OrderDetails>
</Order>
</Orders>'
)
GO

SELECT CustomerId,
CustomerName,
OrderHistory
FROM CustomerOrderHistory
GO

I've now created a table-valued function which takes an XML data type parameter and returns all the order history from the XML as a data set.

IF OBJECT_ID(N'dbo.FN00_GetOrderHistory', N'TF') IS NOT NULL
DROP FUNCTION dbo.FN00_GetOrderHistory
GO

CREATE FUNCTION FN00_GetOrderHistory
(
-- Add the parameters for the function here
@OrderHistoryXML XML
)
RETURNS
@OrderHistory TABLE
(
-- Add the column definitions for the TABLE variable here
OrderId INT,
OrderDate DATETIME,
PaymentMethod VARCHAR(10),
ProductId INT,
Quantity INT
)
AS
BEGIN
-- Fill the table variable with the rows for your result set
INSERT INTO @OrderHistory
SELECT R.nref.value('@Id', 'INT') AS OrderId,
R.nref.value('@OrderDate', 'DATETIME') AS OrderDate,
R.nref.value('@PaymentMethod', 'VARCHAR(10)') AS PaymentMethod,
T.sref.value('@ProductId', 'INT') AS ProductId,
T.sref.value('@Quantity', 'INT') AS Quantity
FROM @OrderHistoryXML.nodes('/Orders/Order') R(nref)
-- Use a relative path here; a leading // would search from the document root
-- and pair every order with every order item in the document
CROSS APPLY R.nref.nodes('./OrderDetails/OrderItem') T(sref)

RETURN
END
GO

In this function we pass the XQuery expression for the order elements (/Orders/Order) to nodes(). We also want every order item joined to its parent order, so I have used CROSS APPLY with a relative path (./OrderDetails/OrderItem) to get the order item elements belonging to the current order node.

This function can now simply be used when selecting rows from the CustomerOrderHistory table, as follows:



SELECT  CustomerId,
CustomerName,
b.*
FROM CustomerOrderHistory a
CROSS APPLY dbo.FN00_GetOrderHistory(OrderHistory) b

This query returns the results as expected:

[Image: query results showing one row per order item for each customer]
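
Because the function flattens the XML into a normal rowset, relational filters can be applied on top of it. For example (a small follow-up sketch of my own, using the sample data inserted above), to find which customer ordered product 102030:

SELECT a.CustomerName,
b.OrderId,
b.Quantity
FROM CustomerOrderHistory a
CROSS APPLY dbo.FN00_GetOrderHistory(a.OrderHistory) b
WHERE b.ProductId = 102030
GO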

Invoking a WCF Method Asynchronously and Signalling the Calling Thread

When calling a WCF service method which takes a long time to complete, you often need the WCF client to remain responsive while the call is in progress. Visual Studio allows developers to generate a proxy for calling a WCF service by adding a service reference, and when generating the proxy it can also produce asynchronous method stubs for calling the service methods asynchronously.

The proxy creates two styles of asynchronous method stubs for clients to use. Consider the following simple WCF calculator service:

	
[ServiceContract]
public interface ICalculatorService
{
[OperationContract]
float Add(float p1, float p2);

[OperationContract]
float Substract(float p1, float p2);
}

public class CalculatorService : ICalculatorService
{
#region ICalculatorService Members

public float Add(float p1, float p2)
{
Console.WriteLine("Adding {0} + {1}", p1, p2);

return p1 + p2;
}

public float Substract(float p1, float p2)
{
Console.WriteLine("Substracting {0} - {1}", p1, p2);

Thread.Sleep(5000);

return p1 - p2;
}

#endregion
}


On the client side, the proxy generates both styles of asynchronous method stubs. For instance, for Add() we get the following stubs:

   
public System.IAsyncResult BeginAdd(float p1, float p2, System.AsyncCallback callback, object asyncState)

public float EndAdd(System.IAsyncResult result)

private void OnAddCompleted(object state)

public void AddAsync(float p1, float p2)

public void AddAsync(float p1, float p2, object userState)


Many of us are familiar with the BeginAdd() and EndAdd() methods; the same functionality can be achieved using delegates, and similar stubs are generated for Substract(). The client can call any of these to invoke the WCF operation asynchronously, as in the short Begin/End sketch below. AddAsync() is the one I found more interesting: it is used together with the AddCompleted event, whose handler receives the return value of the method call, and that is the pattern used in the client code that follows.
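
First, a minimal usage sketch of my own (not part of the generated code) showing the BeginAdd()/EndAdd() stubs listed above:

// Begin/End (IAsyncResult) pattern: kick off the call, do other work, then harvest the result
CalculatorServiceClient beginEndClient = new CalculatorServiceClient();

IAsyncResult asyncResult = beginEndClient.BeginAdd(2, 3, null, null);

// ... the calling thread stays free to do other work here ...

// EndAdd() blocks until the operation has finished and returns the result
float sum = beginEndClient.EndAdd(asyncResult);
Console.WriteLine("BeginAdd/EndAdd result = {0}", sum);

beginEndClient.Close();

With that out of the way, here is how the event-based SubstractAsync() and SubstractCompleted members are used on the client side: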

        
static WaitHandle[] waits = new WaitHandle[] { new AutoResetEvent(false) };

static void Main(string[] args)
{
try
{
using (CalculatorServiceClient client = new CalculatorServiceClient())
{
Console.WriteLine(Thread.CurrentThread.ManagedThreadId);
Console.WriteLine("Calling service methods asynchronously method 2");
client.SubstractCompleted += new EventHandler<SubstractCompletedEventArgs>(client_SubstractCompleted);
client.SubstractAsync(6, 3, waits[0]);
WaitHandle.WaitAll(waits);
}

Console.WriteLine("Press <ENTER> to terminate client");
Console.ReadLine();
}
catch (Exception e)
{
Console.WriteLine(e.Message);
}
}

static void client_SubstractCompleted(object sender, SubstractCompletedEventArgs e)
{
Console.WriteLine("Async Substraction result = {0}", e.Result);
Console.WriteLine(Thread.CurrentThread.ManagedThreadId);
((AutoResetEvent)e.UserState).Set();
}

Note that I had to pass a WaitHandle (the first element of the waits array) as the userState parameter to SubstractAsync() and also call WaitHandle.WaitAll(waits) afterwards. If we don't do that, the client_SubstractCompleted() handler throws a TargetInvocationException. The reason is that SubstractAsync() takes a while to complete, and before it finishes the using block disposes the CalculatorServiceClient instance. To handle this, the calling code needs to wait until the thread where the subtract operation takes place signals the main thread that it has completed.

For this we create an array of WaitHandles and pass its first element as the userState parameter of SubstractAsync(). The calling code then waits for the AutoResetEvent to signal the main thread, which happens when Set() is called inside client_SubstractCompleted(), before the using block completes. One thing to note here is that client_SubstractCompleted() also executes on a separate thread: Thread.CurrentThread.ManagedThreadId outputs different values for the main thread and the thread where the subtract operation takes place.

Friday 20 November 2009

Using Custom Configuration Section & Reflection to create Dynamic Factory Class

The Factory pattern can be implemented in several ways, depending on what a developer is comfortable with and how future extensions need to be addressed. A factory class encapsulates object creation logic that other classes can use to get object instances without knowing the details of how the objects are created. It gives developers a central location for creational logic rather than scattering that logic throughout the application code.

I'm going to show an example here of how I used reflection and a custom configuration section in the .NET Framework to create my factory class. The example is based on a simple ATM system which communicates with Card Service Providers (CSPs) to carry out transactions. A CSP in turn communicates with its underlying bank network (Bank Service Providers) to complete the transaction. Because there can be several card service providers and banking networks, I've extracted a common abstraction for each of them. Here is how the UML looks for them:

[Image: UML class diagram of the card service provider and bank network types]

In the above class diagram CardServiceProvider is an abstract class with a virtual ConnectBank() and an abstract LoadBankNetworks(). Every deriving CSP is responsible for loading its own banking network. The ATM client is not aware of any of the concrete CSP or banking network implementations; it only knows the static ServiceProviderFactory class and the CSP and banking network abstractions. The following code shows both the CSP base class and the banking network interface.

public abstract class CardServiceProvider
{
/// <summary>
/// This dictionary holds the bank service provider for a given card number
/// prefix (the first few digits), depending on the card service provider;
/// e.g. a card number starting with 6465 belongs to ANZ bank within the Visa network
/// </summary>
protected Dictionary<int, IBankServiceProvider> _bankNetworkList;

public CardServiceProvider()
{
_bankNetworkList = new Dictionary<int, IBankServiceProvider>();
}

/// <summary>
/// Connects to the bank network for transaction completion
/// </summary>
public virtual TransactionNotification ConnectBank(TransactionInfo transactionInfo)
{
//Only load list if it is not loaded yet
if (_bankNetworkList.Count == 0)
{ LoadBankNetworks(); }

TransactionNotification transactionNotification = new TransactionNotification();

int identifier = GetBankNetworkIdentifier(transactionInfo.CardNumber);

IBankServiceProvider bankServicerProvider = _bankNetworkList[identifier];
bankServicerProvider.TransactionInfo = transactionInfo;

if (bankServicerProvider.CheckAccountValidity())
{
switch (transactionInfo.TransactionType)
{
case TransactionTypeCode.BalanceEnquiry:
transactionNotification.Balance = bankServicerProvider.BalanceEnquiry();
transactionNotification.CSPNotificationType = NotificationTypeCode.BankTransactionSuccess;
break;
case TransactionTypeCode.WithDrawCash:
transactionNotification.IssueCash = bankServicerProvider.DebitCash(transactionInfo.WithDrawCashAmount);
transactionNotification.CSPNotificationType = transactionNotification.IssueCash ? NotificationTypeCode.BankTransactionSuccess : NotificationTypeCode.NotEnoughBalance;
break;
case TransactionTypeCode.MiniStatementEnquiry:
transactionNotification.MiniStatement = bankServicerProvider.MiniStatementEnquiry(transactionInfo.StatementStartDate,
transactionInfo.StatementEndDate);
transactionNotification.CSPNotificationType = NotificationTypeCode.BankTransactionSuccess;
break;
default:
transactionNotification.CSPNotificationType = NotificationTypeCode.AccountBlocked;
break;
}
}
else
{ transactionNotification.CSPNotificationType = NotificationTypeCode.AccountBlocked; }

return transactionNotification;
}

/// <summary>
/// Every card service provider maintains its own list of banks
/// </summary>
public abstract void LoadBankNetworks();

/// <summary>
/// By default the first 4 digits identify the bank network
/// </summary>
/// <param name="cardInfo">Card number</param>
/// <returns>4 digit bank network identifier</returns>
protected virtual int GetBankNetworkIdentifier(string cardInfo)
{
return Convert.ToInt32(cardInfo.Substring(0, 4));
}
}


	public interface IBankServiceProvider
{
/// <summary>
/// Gets balance from the account
/// </summary>
float BalanceEnquiry();

/// <summary>
/// Gets mini transaction statement
/// </summary>
/// <param name="StartDate">Transaction start date</param>
/// <param name="EndDate">Transaction end date</param>
DataTable MiniStatementEnquiry(DateTime StartDate, DateTime EndDate);

/// <summary>
/// If the transaction is successful we also want to log the transaction
/// </summary>
/// <param name="Amount">Withdraw cash requested</param>
/// <returns>True if transaction was successful otherwise false</returns>
bool DebitCash(float Amount);

/// <summary>
/// Checks if the account is valid
/// </summary>
bool CheckAccountValidity();

TransactionInfo TransactionInfo { get; set; }
}


In order to instantiate classes implementing the above abstract class and interface, I've created a custom configuration section in the configuration file. This allows developers to add more concrete CSPs and bank networks in the future, keeping the application open for extension. My configuration section looks like this:



<?xml version="1.0" encoding="utf-8"?>
<configuration>
<configSections>
<section name="CardServices" type="ATM.Transaction.Lib.CardServiceProviderSection, ATM.Transaction.Lib"/>
</configSections>

<CardServices>
<CardProvider>
<add keyvalue="Visa" type="ATM.Transaction.Lib.VisaProvider">
<Codes>
<add keyvalue="6465" banktype="ATM.Transaction.Lib.ANZServiceProvider" />
<add keyvalue="2323" banktype="ATM.Transaction.Lib.PNBServiceProvider"/>
</Codes>
</add>
<add keyvalue="Link" type="ATM.Transaction.Lib.LinkProvider">
<Codes>
<add keyvalue="5554" banktype="ATM.Transaction.Lib.PNBServiceProvider" />
<add keyvalue="6464" />
</Codes>
</add>
<add keyvalue="MasterCard" type="ATM.Transaction.Lib.MasterCardProvider">
<Codes>
<add keyvalue="7271" />
<add keyvalue="9595" />
</Codes>
</add>
</CardProvider>
</CardServices>
</configuration>


A given CSP can connect to several banking networks. A few bank network types have intentionally been left out to indicate that they can be added later on. The CardServiceProviderSection class reads the configuration section and dynamically loads instances of the concrete classes using reflection; it derives from the System.Configuration.ConfigurationSection class. In the above configuration section, <CardProvider> holds a collection represented by ProviderElement objects, and within each ProviderElement we have one or more CodeElement entries. To represent these collections I've created a generic CustomElementCollection<T> class which extends System.Configuration.ConfigurationElementCollection. The real trick of using reflection to dynamically create CSP and bank network instances happens inside the ProviderElement class. See the code below for this class:



	internal class CardServiceProviderSection : ConfigurationSection
{
[ConfigurationProperty("CardProvider")]
public CustomElementCollection<ProviderElement> CardServiceProviders
{
get
{
return this["CardProvider"] as CustomElementCollection<ProviderElement>;
}
}
}

internal class ProviderElement : CustomElementBase
{
[TypeConverter(typeof(ServiceProviderTypeConverter))]
[ConfigurationProperty("type", IsRequired = true)]
public CardServiceProvider ProviderInstance
{
get
{
return this["type"] as CardServiceProvider;
}
}

[ConfigurationProperty("Codes")]
public CustomElementCollection<CodeElement> Codes
{
get
{
return this["Codes"] as CustomElementCollection<CodeElement>;
}
}
}

internal class CodeElement : CustomElementBase
{
[TypeConverter(typeof(ServiceProviderTypeConverter))]
[ConfigurationProperty("banktype", IsRequired = false)]
public IBankServiceProvider BankProviderInstance
{
get
{
return this["banktype"] as IBankServiceProvider;
}
}
}

internal sealed class ServiceProviderTypeConverter : ConfigurationConverterBase
{
public override bool CanConvertFrom(ITypeDescriptorContext ctx, Type type)
{
return (type == typeof(string));
}

public override object ConvertFrom(ITypeDescriptorContext context, System.Globalization.CultureInfo culture, object value)
{
Type t = Type.GetType(value.ToString());

return Activator.CreateInstance(t);
}
}


The ProviderInstance property of the ProviderElement class is decorated with a TypeConverterAttribute which specifies ServiceProviderTypeConverter. That class overrides ConvertFrom() from ConfigurationConverterBase and uses reflection to return an instance of the type named in the value parameter. The Codes property uses a similar technique to create instances of the bank networks.



To represent the ProviderElement collection I have used CustomElementCollection<ProviderElement>. The code is shown below.



	
internal class CustomElementCollection<T> : ConfigurationElementCollection where T : CustomElementBase, new()
{
public T this[int index]
{
get
{
return base.BaseGet(index) as T;
}
set
{
if (base.BaseGet(index) != null)
{
base.BaseRemoveAt(index);
}
this.BaseAdd(index, value);
}
}

protected override ConfigurationElement CreateNewElement()
{
return new T();
}

protected override object GetElementKey(ConfigurationElement element)
{
return ((T)element).KeyValue;
}
}

internal class CustomElementBase : ConfigurationElement
{
[ConfigurationProperty("keyvalue", IsRequired = true)]
public string KeyValue
{
get
{
return this["keyvalue"] as string;
}
}
}


Now the main thing left is how to get instances of the concrete classes specified in the configuration. As you may have guessed, it is the static ServiceProviderFactory class which the ATM client uses to get these instances.



	public static class ServiceProviderFactory
{
private static CardServiceProviderSection csps;

static ServiceProviderFactory()
{
csps = ConfigurationManager.GetSection("CardServices") as CardServiceProviderSection;
}

public static CardServiceProvider CreateCardServiceProvider(string cardNumber)
{
CardServiceProvider csp = null;

foreach (ProviderElement pe in csps.CardServiceProviders)
{
if (pe.Codes.OfType<CustomElementBase>().Select(elm => elm.KeyValue).Contains(cardNumber.Substring(0, 4)))
{
csp = pe.ProviderInstance;
break;
}
}

return csp;
}

public static Dictionary<int, IBankServiceProvider> CreateBankServiceProvider(string keyValue)
{
var providers = from pe in csps.CardServiceProviders.OfType<ProviderElement>()
from c in pe.Codes.OfType<CodeElement>()
where pe.KeyValue == keyValue
select new { KeyValue = Convert.ToInt32(c.KeyValue), Provider = c.BankProviderInstance };

Dictionary<int, IBankServiceProvider> bankProviders = new Dictionary<int, IBankServiceProvider>();
foreach (var p in providers)
{ bankProviders.Add(p.KeyValue, p.Provider); }

return bankProviders;
}
}


Inside the static constructor I read all the custom configuration details; the static constructor is called the first time ServiceProviderFactory is used. The other static methods in this class are simply responsible for returning the correct CSP or bank network instances when called.
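
To make the loop back to LoadBankNetworks() concrete, here is a hedged sketch of how a derived provider such as VisaProvider might use the factory (the post does not show a concrete CSP, so the body is an assumption of mine):

public class VisaProvider : CardServiceProvider
{
public override void LoadBankNetworks()
{
// "Visa" is the keyvalue of this provider in the <CardServices> configuration section
_bankNetworkList = ServiceProviderFactory.CreateBankServiceProvider("Visa");
}
}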



The ATM client can use the following code to get an instance of a CardServiceProvider class:



CardServiceProvider csp = ServiceProviderFactory.CreateCardServiceProvider(cardNumber);


More providers can be added later on in the configuration, and neither the object creation technique nor the ATM client needs to change to accommodate them.

Saturday 10 October 2009

Finding Index Usage Statistics for a Table having multiple Indexes

Indexing tables is one of the best ways to improve query performance. Having millions of records in a table without indexes will not only make queries against that table perform very slowly, but the client applications using those queries will also suffer.

However, simply creating indexes will not necessarily improve query performance; creating the right indexes, and the right number of them, also matters. Recently I was working on a project to store data validation logs in a database and report the outstanding error logs to QA and analysts. The tables contained not thousands but millions of logs stored in XML format. Several types of reports were created, so I had to write different kinds of SQL queries to retrieve data from the Logs table. In the development environment I created the indexes my queries needed to perform well, and I soon ended up with 4 – 6 indexes on the Logs table.

Unfortunately, the users did not want any latency in the reports, so we could not use OLAP for reporting and had to query the OLTP database directly. This meant that while new validation logs were being produced, other users were querying the same table to produce reports. Soon I started to see the adverse effects of too many indexes on the Logs table, because whenever DML statements ran against it, all of its indexes were also updated.

Although my reporting queries were performing well enough, the user processes, on the other hand, were not. It would have been too time consuming to run every report query and work out which indexes were utilized and which were not, so I decided to use the index metadata that SQL Server maintains to find the index usage statistics. SQL Server 2005 and above ships with the sys.dm_db_index_usage_stats dynamic management view, which you can query to retrieve index usage statistics for one or more tables.

USE [AdventureWorks]
GO

SELECT *
FROM Sales.[SalesOrderHeader]
WHERE [CustomerID] = 227
GO

SELECT a.index_id,
b.name,
a.user_seeks,
a.user_scans,
a.user_lookups,
a.user_updates,
a.last_user_seek,
a.last_user_scan,
a.last_user_lookup
FROM sys.dm_db_index_usage_stats a
INNER JOIN sys.indexes b ON a.object_id = b.object_id
AND a.index_id = b.index_id
WHERE a.object_id = OBJECT_ID('Sales.SalesOrderHeader')
GO


Running the first query gets all columns from the Sales.SalesOrderHeader table for CustomerID 227. Running the next query gives information about the indexes on Sales.SalesOrderHeader: the names of the indexes along with other details such as when those indexes were last used and how (seek, scan or lookup).


I found this a great way to identify the indexes on my table which have never been used but are still being updated because of DML activity on the table. To improve DML performance, I can simply drop the indexes that are never used.
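
For example, here is a hedged sketch along those lines (my own query, not from the original post) that lists the non-clustered indexes on the table with no reads recorded, together with a ready-made DROP statement to hand over to the DBAs:

SELECT b.name AS IndexName,
'DROP INDEX ' + QUOTENAME(b.name) + ' ON Sales.SalesOrderHeader' AS DropStatement
FROM sys.indexes b
LEFT JOIN sys.dm_db_index_usage_stats a ON a.object_id = b.object_id
AND a.index_id = b.index_id
AND a.database_id = DB_ID()
WHERE b.object_id = OBJECT_ID('Sales.SalesOrderHeader')
AND b.index_id > 1 -- ignore the heap/clustered index
AND b.is_primary_key = 0
AND b.is_unique_constraint = 0
AND ISNULL(a.user_seeks + a.user_scans + a.user_lookups, 0) = 0
GO

Remember that this DMV is reset when the SQL Server instance restarts, so let it collect a representative workload period before trusting the numbers.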


Another way to perform a similar analysis would be to use the Database Engine Tuning Advisor (DTA). But sometimes on a UAT or production server you may not have permission to use the DTA; running the query above is then a great way to find unused indexes and ask the DBAs to drop them.

Tuesday 22 September 2009

Finding missing indexes which SQL Server recommends for T-SQL performance improvement

Recently I have been working quite a lot on SQL Server development: stored procedures, T-SQL queries and what not. While working with huge amounts of data and bigger queries, developers often find themselves stuck when queries are not running as fast as expected, and they receive support calls once these SQL objects are deployed to the UAT or production environment and the application using them responds very slowly.

I was in a similar situation with my recent development. I had to work with a huge number of data rows and several SQL Server objects, and some of my queries requesting large amounts of data from our OLTP database were performing very slowly. With SQL Server 2005 and above you can view the execution plan the query optimizer uses graphically. The most common way to show the execution plan is to either click the "Display Estimated Execution Plan" button on the toolbar, or toggle the "Include Execution Plan" button and then run the query.

However, the same thing can also be achieved by using the following T-SQL SET statements. Let's create two tables without any indexes.

IF(NOT EXISTS( SELECT * FROM sys.tables WHERE object_id = OBJECT_ID('MasterTable') ))
CREATE TABLE MasterTable
(
MasterId INT IDENTITY(1, 1),
[Description] VARCHAR(MAX)
)
GO

IF(NOT EXISTS( SELECT * FROM sys.tables WHERE object_id = OBJECT_ID('DetailsTable') ))
CREATE TABLE DetailsTable
(
SubId INT IDENTITY(1, 1),
MasterId INT
)
GO

INSERT INTO MasterTable ([Description]) VALUES ('Sample 1'), ('Sample 2'), ('Sample 3'), ('Sample 4')
GO

DECLARE @i INT = 1
WHILE (@i <= 10000)
BEGIN
INSERT INTO DetailsTable(MasterId) VALUES (1), (1), (2), (2), (2), (2), (3), (3), (4), (4)
SET @i = @i + 1
END
GO

Now run the following statement:
SET SHOWPLAN_XML ON
GO

This causes SQL Server not to execute the T-SQL statements. Instead, it returns detailed information about how the statements would be executed, in the form of a well-defined XML document. Then run the following query:

SELECT * FROM DetailsTable
INNER JOIN MasterTable ON DetailsTable.MasterId = MasterTable.MasterId
WHERE SubId IN (1, 2, 3, 4)
GO

This will not show the query result but rather an XML link in the results pane. Click the XML to view the execution plan.



Right-click on the text where it says "Missing Index...". SQL Server produces the actual T-SQL for you to create the missing index and improve query performance, and it also states by what percentage the query cost is expected to improve, in this case by 82.2327%. When you click "Missing Index Details...", you will see the following script produced for you to run.



/*
Missing Index Details from ExecutionPlan2.sqlplan
The Query Processor estimates that implementing the following index could improve the query cost by 82.2327%.
*/


USE [YourDatabase]
GO
CREATE NONCLUSTERED INDEX []
ON [dbo].[DetailsTable] ([SubId])
INCLUDE ([MasterId])
GO
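
One small point worth adding (my note, not in the original post): while SET SHOWPLAN_XML ON is still in effect for the session, statements, including the CREATE INDEX above, are only compiled and not executed, so switch it off before actually creating the index:

SET SHOWPLAN_XML OFF
GO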

I found this very useful in my day-to-day tasks for finding those missing indexes that could really improve my query performance.
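
As a related aside (an addition of mine, not part of the original walk-through), the same missing-index suggestions also accumulate in the missing-index DMVs, so they can be reviewed for the whole instance without opening individual execution plans:

SELECT d.statement AS TableName,
d.equality_columns,
d.inequality_columns,
d.included_columns,
s.user_seeks,
s.avg_user_impact -- estimated % improvement, similar to the hint in the plan
FROM sys.dm_db_missing_index_details d
INNER JOIN sys.dm_db_missing_index_groups g ON d.index_handle = g.index_handle
INNER JOIN sys.dm_db_missing_index_group_stats s ON g.index_group_handle = s.group_handle
ORDER BY s.avg_user_impact DESC
GO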