Asynchronously Caching Web Pages Using Web Streams in C#


This article shows how to use the .NET Web Streams feature to cache web pages asynchronously in C#, with a code snippet.

.NET provides Web Streams support for reading web content. With this feature, you can read from any web page on the Internet just as easily as from a stream provided by a custom server. WebRequest and WebResponse are the two core classes related to Web Streams. A WebRequest is an object that requests the resource identified by a Uniform Resource Identifier (URI), such as the URL of a web page. Calling GetResponse() on the WebRequest object returns a WebResponse object that encapsulates the resource the URI points to (e.g., a web page). You can then call GetResponseStream() on that WebResponse object to obtain a Stream over the contents of the resource (e.g., the HTML of the web page).
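The request, response, and stream chain described above can be sketched synchronously as follows. This is a minimal illustration, not part of the original example: the class name WebStreamDemo and the method ReadAll are hypothetical, and the sketch assumes the resource is text that a StreamReader can decode with its default encoding.

```csharp
using System;
using System.IO;
using System.Net;

// Hypothetical helper, named here for illustration only.
public static class WebStreamDemo
{
    // Reads the whole resource behind the URI into a string, blocking until done.
    public static string ReadAll(string uri)
    {
        // Create picks the concrete request class from the URI scheme.
        WebRequest request = WebRequest.Create(uri);

        // GetResponse blocks until the server (or file system) answers.
        using (WebResponse response = request.GetResponse())
        using (Stream stream = response.GetResponseStream())
        using (StreamReader reader = new StreamReader(stream))
        {
            return reader.ReadToEnd();
        }
    }
}
```

A call such as WebStreamDemo.ReadAll("http://timesofindia.indiatimes.com/") would return the page markup as a string, but the calling thread is blocked for the whole download.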

This example presents a WebCache class that retrieves the contents of a web page (here, the home page of The Times of India) as a stream and writes them to a local disk file. It is built on the asynchronous Begin/End callback mechanism, so your code can start the cache operation and proceed with other tasks. This is helpful when caching large web pages or when the network is slow.


using System;
using System.IO;
using System.Net;

public class WebCache
{
    AsyncCallback ResponseCallBack;
    AsyncCallback ReadCallBack;
    HttpWebRequest request;
    HttpWebResponse response;
    static int bufferSize = 1024;
    byte[] bytes = new byte[bufferSize];
    Stream stream;
    FileStream fileStream;

    // Starts the download and returns immediately; the work continues in the callbacks.
    public void AsyncCache(string uri, string cacheName)
    {
        fileStream = File.Open(cacheName, FileMode.Create, FileAccess.Write, FileShare.None);
        ResponseCallBack = new AsyncCallback(OnResponse);
        // WebRequest.Create is a factory method: it inspects the URI scheme and
        // returns an instance of the matching derived class. For an http:// or
        // https:// URI that class is HttpWebRequest. The declared return type,
        // however, is WebRequest, so you must cast the returned value to
        // HttpWebRequest.
        request = (HttpWebRequest)WebRequest.Create(uri);
        request.BeginGetResponse(ResponseCallBack, null);
    }

    // Called when the response headers have arrived.
    void OnResponse(IAsyncResult ar)
    {
        if (ar.IsCompleted)
        {
            response = (HttpWebResponse)request.EndGetResponse(ar);
            stream = response.GetResponseStream();
            ReadCallBack = new AsyncCallback(OnRead);
            // Request the first chunk of the response body.
            stream.BeginRead(bytes, 0, bytes.Length, ReadCallBack, null);
        }
    }

    // Called each time a chunk has been read from the response stream.
    void OnRead(IAsyncResult ar)
    {
        int bytesRead = stream.EndRead(ar);
        if (bytesRead > 0)
        {
            // Append the chunk to the cache file, then request the next one.
            fileStream.Write(bytes, 0, bytesRead);
            stream.BeginRead(bytes, 0, bytes.Length, ReadCallBack, null);
        }
        else
        {
            // A read of zero bytes means the end of the stream was reached.
            Console.WriteLine("Caching Completed!");
            stream.Close();
            fileStream.Close();
            response.Close();
        }
    }
}
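The read loop in OnRead can be tried in isolation, without a network connection. The following is a rough sketch under stated assumptions: ChunkedReader is a hypothetical class that is not part of WebCache, it reads from any Stream (a MemoryStream works), and it adds a ManualResetEvent so that the caller can wait for completion, which WebCache itself does not offer.

```csharp
using System;
using System.IO;
using System.Threading;

// Hypothetical class, named here for illustration only.
public class ChunkedReader
{
    private readonly Stream source;
    private readonly MemoryStream collected = new MemoryStream();
    private readonly byte[] buffer = new byte[4]; // deliberately small to force several reads
    private readonly ManualResetEvent done = new ManualResetEvent(false);

    public ChunkedReader(Stream source)
    {
        this.source = source;
    }

    // Starts the asynchronous read loop and blocks until it has finished.
    public byte[] ReadAll()
    {
        source.BeginRead(buffer, 0, buffer.Length, OnRead, null);
        done.WaitOne();
        return collected.ToArray();
    }

    private void OnRead(IAsyncResult ar)
    {
        int bytesRead = source.EndRead(ar);
        if (bytesRead > 0)
        {
            // Keep the chunk and ask for the next one.
            collected.Write(buffer, 0, bytesRead);
            source.BeginRead(buffer, 0, buffer.Length, OnRead, null);
        }
        else
        {
            // Zero bytes read: end of stream, so release the waiting caller.
            done.Set();
        }
    }
}
```

For example, new ChunkedReader(new MemoryStream(data)).ReadAll() returns a copy of data after several callback rounds. Blocking in ReadAll is only for demonstration; a real caller would do other work before waiting on the event.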

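Here is test code for the WebCache class. Because the callbacks run on background thread-pool threads, a console program must keep running (for example, by blocking on Console.ReadLine()) until "Caching Completed!" appears: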

new WebCache().AsyncCache("http://timesofindia.indiatimes.com/", @"D:\cache.html");
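For comparison, the same flow can probably be written more compactly with the Task-based methods available from .NET Framework 4.5 onward. This is a sketch of an alternative technique, not the callback approach used by WebCache: the class WebCacheTasks and the method CacheAsync are hypothetical names, and the sketch stays with WebRequest to match the article even though recent .NET versions mark it obsolete in favor of HttpClient.

```csharp
using System;
using System.IO;
using System.Net;
using System.Threading.Tasks;

// Hypothetical class, named here for illustration only.
public static class WebCacheTasks
{
    // Copies the resource behind the URI into the cache file without blocking the caller.
    public static async Task CacheAsync(string uri, string cacheName)
    {
        WebRequest request = WebRequest.Create(uri);

        // The using blocks close the response and both streams even if an exception occurs.
        using (WebResponse response = await request.GetResponseAsync())
        using (Stream stream = response.GetResponseStream())
        using (FileStream fileStream = File.Open(cacheName, FileMode.Create, FileAccess.Write, FileShare.None))
        {
            // CopyToAsync runs the read/write loop internally.
            await stream.CopyToAsync(fileStream);
        }
    }
}
```

The returned Task gives the caller a completion signal: it can continue with other work and later await the task, or call Wait() on it in a console program.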

