How to optimize download performance for files streamed from your database via ASP.NET

I had a requirement to create a download page that streamed Office documents to the client from our website. The way Office (Word and Excel at least) opens documents from URLs is that the browser downloads the file first, then Word or Excel calls back to the server to make sure nothing has changed.

Normally, this results in you streaming the file twice without knowing it. That is bad for performance and for your bandwidth. No one wants a slow, laggy operation.

So, what can you do? Well, if you inspect the response headers of a normally hosted document, you will see a header called ETag. This is basically a version identifier for the document (often a hash) that the client can send back when the follow-up request occurs.

OK, in English, right?

So, the solution I implemented is pretty easy. I created a unique tag (it can be a true hash, or simply a row ID and the record timestamp combined) and set it as the ETag header on the response.
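If you go the row ID plus timestamp route, building the tag could look something like the sketch below. The `ETagBuilder` class, the `FromRow` method, and the `lastModifiedUtc` parameter are names made up for illustration, not part of the code later in this post. The value is wrapped in double quotes because that is how HTTP expects entity tags to be written.

```csharp
using System;
using System.Globalization;

public static class ETagBuilder
{
    // Hypothetical helper: combines a row id with the record's last-modified
    // timestamp, so the tag changes whenever the record is updated.
    public static string FromRow(Guid rowId, DateTime lastModifiedUtc)
    {
        string raw = rowId.ToString("N") + "-" +
                     lastModifiedUtc.Ticks.ToString(CultureInfo.InvariantCulture);

        // Entity tags are sent wrapped in double quotes.
        return "\"" + raw + "\"";
    }
}
```

Anything that changes when the file changes will do here; a real content hash works just as well, it only costs more to compute.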

You ask, “Well, how does setting an ETag header prevent duplicate effort?” By itself, it doesn’t. Here is the work you have to do first.

When a request comes in, look for a header called If-None-Match. If you see it, you need to run a little bit of simple logic.

First, calculate your ETag value and compare it to the value of If-None-Match. If it, well, matches, you get to do something special. If it doesn’t match, act like nothing happened and continue on with the normal data return.

So, what is the something special, you ask? Simple: clear your response contents, set your StatusCode to 304 (Not Modified), and end the response.
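The comparison step can be sketched as a small helper. This is only an illustration: `ConditionalGet` and `IsNotModified` are hypothetical names. It goes a little further than a plain string compare, because a client is allowed to send several tags separated by commas, a weak tag prefixed with `W/`, or a bare `*`. Two simplifications to be aware of: it assumes your tags contain no commas, and it treats `*` as a match on the assumption that the document exists.

```csharp
using System;

public static class ConditionalGet
{
    // Drops the optional weak-validator prefix so W/"abc" and "abc" compare equal.
    private static string StripWeakPrefix(string tag)
    {
        return tag.StartsWith("W/", StringComparison.Ordinal) ? tag.Substring(2) : tag;
    }

    // Returns true when the If-None-Match header value lists the current ETag.
    public static bool IsNotModified(string ifNoneMatch, string currentETag)
    {
        if (string.IsNullOrWhiteSpace(ifNoneMatch) || string.IsNullOrEmpty(currentETag))
            return false;

        string current = StripWeakPrefix(currentETag);

        // Simplification: assumes individual tags never contain a comma.
        foreach (string part in ifNoneMatch.Split(','))
        {
            string candidate = part.Trim();
            if (candidate == "*")
                return true; // assumes the requested document exists
            if (StripWeakPrefix(candidate) == current)
                return true;
        }
        return false;
    }
}
```

In a page, you would pass in `Request.Headers["If-None-Match"]` and your freshly calculated tag, and only do the 304 dance when this returns true.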

Now I hear, “Geez, you made this sound very complicated. Is there an easier way to explain this?”

Sure…I’ll try.

 

Here are the steps for simplicity:

  1. Check for the If-None-Match header
    1. If it exists, compare the value of the header to your hash (ID combo, CRC, whatever you want to use)
      1. If it matches:
        1. Set StatusCode = 304
        2. Clear the response contents
        3. End the response
      2. Else:
        1. Continue as normal
  2. Normal response
    1. Generate the ETag hash (again, however you want to identify this file as unique; something that changes if the file changes)
    2. Respond with the ETag in the header

What does that mean in code? I don’t speak flow.

Here is a code example.

    private const string _wordMimeType = "application/msword";        // Microsoft Word files
    private const string _excelMimeType = "application/vnd.ms-excel"; // Microsoft Excel files

    // Needs: using System; using System.IO; using System.Linq;
    private void StreamDocumentfromDatabase(Guid documentID)
    {
        string etagFilter = Request.Headers["If-None-Match"];
        string etag;
        using (var context = new EvidenceReviewContext())
        {
            if (!string.IsNullOrEmpty(etagFilter))
            {
                // Check the db to see whether the client's cached copy is still current.
                // Only the blob id is selected here, so the blob itself is not loaded.
                var blobCheck = (from ev in context.Documents
                                 where ev.documentID == documentID &&
                                 ev.BlobDataId.HasValue
                                 select ev.BlobDataId
                                     ).FirstOrDefault();
                // Selecting the nullable id makes "no matching row" come back as null.
                if (blobCheck.HasValue)
                {
                    // HTTP expects the ETag value wrapped in double quotes.
                    etag = "\"" + documentID.ToString() + blobCheck.Value.ToString() + "\"";
                    if (etagFilter == etag)
                    {
                        Response.ClearContent();
                        Response.StatusCode = 304; // Not Modified: the client reuses its copy
                        Response.AddHeader("ETag", etag);
                        Response.End();
                    }
                }
            }

            // No match (or no If-None-Match header): load the bits into memory.
            var theBlob = (from ev in context.Documents
                           where ev.documentID == documentID &&
                           ev.BlobDataId.HasValue
                           select new
                           {
                               Data = ev.BlobData.Data,
                               FileExtension = ev.FileExtension,
                               BlobID = ev.BlobDataId.Value
                           }).FirstOrDefault();

            if (theBlob == null)
                return;

            if (theBlob.FileExtension.ToLower().StartsWith(".doc"))
                Response.ContentType = _wordMimeType;
            else if (theBlob.FileExtension.ToLower().StartsWith(".xls"))
                Response.ContentType = _excelMimeType;
            Response.AddHeader("Content-Disposition", "attachment; filename=" + documentID.ToString().Replace("-", "") + theBlob.FileExtension);
            // Same format as the check above, so the next request can match it.
            etag = "\"" + documentID.ToString() + theBlob.BlobID.ToString() + "\"";
            Response.AddHeader("ETag", etag);

            // Write the file directly to the HTTP content output stream.

            // NOTE: THIS IS NOT MEMORY EFFICIENT AT ALL. TODO: change compression technique for documents
            byte[] dataToSend = null;
            using (MemoryStream dataStream = new MemoryStream(theBlob.Data))
            {
                using (MemoryStream stream = UnzippedData(dataStream))
                {
                    dataToSend = stream.ToArray();
                }
            }
            // Relieve some pressure by writing the data outside of the blocks above, so their memory can be freed.
            Response.BinaryWrite(dataToSend);
            Response.Flush();
        }
        Response.End();
    }
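On the memory TODO in that code: one possible direction is to decompress straight into the output stream instead of building a second full byte array. The sketch below assumes the stored blob is GZip-compressed, which may not be what `UnzippedData` actually handles, since that helper is not shown here. `BlobStreaming` and `CopyDecompressed` are hypothetical names.

```csharp
using System.IO;
using System.IO.Compression;

public static class BlobStreaming
{
    // Assumption: the blob is GZip-compressed. The real storage format may differ.
    public static void CopyDecompressed(byte[] compressedBlob, Stream output)
    {
        using (var source = new MemoryStream(compressedBlob, writable: false))
        using (var gzip = new GZipStream(source, CompressionMode.Decompress))
        {
            // CopyTo moves the data in chunks rather than materializing
            // the whole decompressed document in one array.
            gzip.CopyTo(output);
        }
    }
}
```

In the method above, the call would be something like `BlobStreaming.CopyDecompressed(theBlob.Data, Response.OutputStream);` in place of the two nested `MemoryStream` blocks and the `BinaryWrite`. The compressed blob is still fully in memory, so this only trims the decompressed copy.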

I hope this article helped you out a little and saves you some serious download time. (When Office revisits the file, it remembers the ETag, so you get efficiencies all over the place.)

Happy Coding!
