Friday, April 18, 2014

ASP.NET application pool, worker process, app domain and thread

This is my answer to a stackoverflow question:

One IIS server may have multiple application pools.
One web application binds to one application pool.
One application pool may have more than one worker process (when Web Garden is enable).
On worker process can have multiple app domains. One app domain lives only in one worker process.
One app domain may have multiple threads. One thread can be shared by different app domains in different time.

The meaning to ASP.NET developers: to make your web site scalable, don't use in-proc session and don't use static class variable lock for synchronization.

Upsert : Update a row if it exists else insert a new row

Some applications need an "upsert" operation: update a row if it exists else insert a new row. The following is a summary of solutions I found online. 

Suppose we have the following table in a SQL Server database that is mapped to a C# class using Entity Framework. 

CREATE TABLE dbo.Foo
(
ID int NOT NULL CONSTRAINT PK_Foo PRIMARY KEY,
Bar int NOT NULL
);
GO

The following C# code is a naive implementation of the upsert logic: 

int id =3; 
int bar = 199;
Foo foo = this.dbContext.Foos.find(id);
if (foo == null) 
{
foo = new Foo 

ID = id,
Bar = bar
};

this.dbContext.Foos.Add(foo);

else 
{
foo.Bar = bar;
}


this.dbContext.SaveChanges();

The above code doesn't work correctly when multiple users run the same code simultaneously. A duplicated key error happens when more than one users  try to insert different rows with the same key value.  A possible solution is to catch the duplication key error and try the above operation again. The benefit is that we handle everything in C# code. The disadvantage is the added complexity because of the error handling and try logic. As described below, it has race conditions that don't work correctly for non-idempotent update. 

Adding C# synchronization such as a lock in a static class variable helps. However, the solution breaks when web garden is enabled in the hosting ASP.NET application pool. Each worker process has its own copy of a static class variable. To solve the problem, one needs to use named mutex to enable inter-processes synchronization. The named mutex solution fails when we run multiple Web applications in different server. Another cons is that  the above C# code requires two round trips to the back end database. 

The right solution is to move the upsert operation to the back end database engine and add appropriate database locks. The tricky thing is to find the right "locks". Though work correctly, using a table level exclusive lock hurts performance. This Conditional INSERT/UPDATE blog gives a good solution:  

CREATE PROCEDURE dbo.Insert_Or_Update_Foo
      @ID int,
      @Bar int
AS

SET NOCOUNT, XACT_ABORT ON

BEGIN TRAN

IF EXISTS(SELECT * FROM dbo.Foo WITH (UPDLOCK, HOLDLOCK) WHERE ID = @ID)
  BEGIN
      UPDATE dbo.Foo
      SET bar = @bar
      WHERE ID = @ID
END
ELSE
BEGIN
      INSERT INTO dbo.Foo (ID, Bar)
      VALUES (@ID, @Bar)
END

COMMIT

RETURN @@ERROR

The key point in the above solution is setting the UPDLOCK and HOLDLOCK locks to avoid race conditions. At any time, only one instance of the stored procedure is allowed to run. The blog author also describes another solution using the SQL merge command:

CREATE PROCEDURE dbo.Merge_Foo
     @ID int,
     @Bar int
AS 
SET NOCOUNT, XACT_ABORT ON;  
 
MERGE dbo.Foo WITH (HOLDLOCK) AS f
USING (SELECT @ID AS ID, @Bar AS Bar) AS new_foo
ON f.ID = new_foo.ID
WHEN MATCHED THEN
UPDATE SET f.Bar = new_foo.Bar
WHEN NOT MATCHED THEN
INSERT (ID, Bar)
VALUES (new_foo.ID, new_foo.Bar);

RETURN @@ERROR;

GO

My original solution was to use a similar merge command but didn't set the HOLDLOCK option. As the blog author pointed out: without HOLDLOCK, the merge command releases its UPDLOCK after the existence checking and opens the door for multiple concurrent inserts. 

Another possible solution that doesn't use an explicit lock is to insert first, check error code for duplicated key error and update in the second step.

BEGIN TRY
 INSERT INTO dbo.Foo (ID, Bar) VALUES (@ID, @Bar)
END TRY
BEGIN CATCH
IF ERROR_NUMBER() = 2627
UPDATE dbo.Foo  SET bar = @bar  WHERE ID = @ID
END CATCH

This stackoverflow  answer discussed this solution and its potential performance issues of error catching. Except the first insert, all following updates incur the expense of error catching. This solution only works in a write-once-read-many situation that has very few updates. If there are many update operations, it is better to do an extra update first. 

UPDATE dbo.Foo  SET bar = @bar  WHERE ID = @ID
IF @@ROWCOUNT = 0
BEGIN 
    BEGIN TRY
INSERT INTO dbo.Foo (ID, Bar) VALUES (@ID, @Bar)
    END TRY
    BEGIN CATCH
IF ERROR_NUMBER() = 2627
UPDATE dbo.Foo  SET bar = @bar  WHERE ID = @ID
    END CATCH
END    

However, if the update is not an idempotent operation such as the above simple value setting, the above TRY-CATCH solutions (including both T-SQL stored procedure and the C# error handling solution) still have race conditions. For example, if the update is "SET bar = @bar + 1", those solutions don't work correctly. As a general upsert solution, the two stored procedures that use UPDLOCK and HOLDLOCK don't have a race condition and work correctly. 

Tuesday, April 8, 2014

JavaScript note: prototype, [[Protyotype]] and prototype property

JavaScript has two types of objects: function object and non-function object. Conceptually, all objects have a prototype (NOT A PROTOTYPE PROPERTY). Internally, JavaScript names an object's prototype as [[Prototype]].
There are two approaches to get any object (including non-function object)'s [[prototype]]: theObject.getPrototypeOf() method and the __proto__ property. The __proto__ property is supported by many browsers and Node.js. It is to be standardized in ECMAScript 6.
Only a function (a callable) object has the prototype property. This prototype property is a regular property that has no direct relationship with the function's own [[prototype]]. When used as a constructor ( after the new operator), the function's prototype property will be assigned to the [[Prototype]] of a newly created object. In a non-function object, the prototype property is undefined . For example,
var objectOne = {x: 5}, objectTwo = Object.create({y: 6});
Both objectOne and objectTwo are non-function objects therefore they don't have a prototype property

Thursday, April 3, 2014

A better way to structure a .net solution

Being able to quickly navigate source files in a large .net solution that has many projects is highly desirable for a programmer. My experience with some projects such as nopCommerce (http://www.nopcommerce.com/) was not a good one. Often it took me two or three tries to the right project folder for a namespace.

This YoutTube video  made by Roger Harford describes a good idea to structure a .net project. Most of the 15 minutes 47 seconds video discusses the reasons why the Visual Studio default structure is not a good one when when source control and multiple projects are added. For example, we usually don't want to use the same name for a solution and its first project. The method to structure a .net solution is rather simple and is summarized below.

When you create a solution in Visual Studio at the beginning, choose the Installed --> Templates -->Other Project Types --> Visual Studio Solution --> Blank Solution. Then name your solution based on your company and product name: MyCompany.MyProduct. Visual Studio will create a folder and solution files based on the new name. 

The "Blank Solution" template creates an empty solution. You can add to this solution a number of projects that are named after your namespace. For example: MyCompany.MyProduct.Web,   MyCompany.MyProduct.Domain,  and MyCompany.MyProduct.Database are good names for a MVC Web site project. 

When you add more projects and source files,  the above solution structure allows a clear and natural mapping between many file names, folder names and namespaces. Additionally, it is easy to move files and folders around and keep a consistent mapping between a namespace and its files.