Divide by zero
# Tuesday, June 17, 2008
dasBlog and the Validation of viewstate MAC failed error

I have been seeing the following error a lot in the dasBlog event log.

System.Web.HttpException: Validation of viewstate MAC failed. If this application is hosted by a Web Farm or cluster, ensure that <machineKey> configuration specifies the same validationKey and validation algorithm. AutoGenerate cannot be used in a cluster. ---> System.Web.UI.ViewStateException: Invalid viewstate.

As this blog is not hosted on a web farm, I searched high and low but couldn't find anyone else reporting the same problem. After some further investigation, I think I have found the cause. The error always occurs on the CommentView.aspx page. I tried adding comments in various browsers, and that worked without a hitch. So in frustration I did what I should have done in the first place, i.e. read the error message completely. In most instances there was an interesting user agent string such as MRSPUTNIK 1, 5, 0, 19 SW. This is mentioned in a lot of forum entries as a harvester or scraper. I suspect it is trying to post a spam message to the comments page, but because it hasn't followed the normal process of requesting the page first, its request doesn't contain the view state that the page is expecting. This is almost certainly a bot, which is why there is no valid viewstate from a previous page to validate. I also checked the IP addresses and port numbers. Many of the IP addresses turned up in forum entries; one example was 89.149.205.199. I found an interesting site called IPillion which traces IP addresses. Adding that IP address to the URL gives http://www.ipillion.com/?ip=89.149.205.199 which reports this IP address as sending lots of spam comments.

So it seems that dasBlog is sort of preventing the spam comments, although accidentally. I hope this entry helps others having the same issue, as I couldn't find anyone using dasBlog who had the same problem, or who had traced it to spammers using bots.
 


Tuesday, June 17, 2008 11:07:26 PM (AUS Eastern Standard Time, UTC+10:00)   #    Comments [6]  Webmaster | dasBlog
# Wednesday, June 11, 2008
Where are your blog visitors coming from in DasBlog?

See part II of this post here.

By combining the Google Maps API with an IP address location service such as HostIP.info, it is possible to display the current visitors to your blog on a map. This implementation shows how to do this for dasBlog. The result will look something like this:

Notice that the shadow on the marker for the United States is darker, indicating more hits for that area. Each of the marker icons can be clicked to show the location details and the number of hits for that location.

Client side or server side location determination

The location of an IP Address can be determined at the server side, or at the client side. Each has its advantages and disadvantages. A brief but not complete list of pros and cons is:

  • Client side
    • Pros
      • No delay in serving the page, as the processing is done after the page has been served to the client.
      • Easy to implement
    • Cons
      • Requires a call to a web service on a different domain which will cause security concerns
      • Does not allow for users to be able to modify what is their perceived location
  • Server side
    • Pros
      • More control as you are not dependent on an external web service
      • It is easier to cache results if required
    • Cons
      • More difficult to implement, especially if you want to keep to the architecture of the software you are developing for
      • You must maintain and update your own database of IP address ranges and locations
The client side solution

This is a client side solution, taking advantage of the excellent Prototype JavaScript Framework. Further down the track I might change this to a server side solution, but as it is only for an administrative page it is not so important. Note that there are security concerns with doing this client side. It is necessary to make an AJAX call to an external web service, and this could be malicious, especially if a site were hacked. In order to view the example for this, you would have to add http://blog.focas.net.au to your trusted zones in Internet Explorer. If you do this, please make sure you remove it immediately afterwards. Usually I would only add a trusted site if I were using it frequently. The example page can be viewed at Visitors.aspx (please scroll down to the bottom of the page as there are styling issues; the page may need refreshing).

I created a server page called Visitors.aspx, and a user control called VisitorsBox.ascx. These are directly based upon Referrers.aspx and ReferrersBox.ascx in the standard DasBlog source code.

How it works

C# code in the page creates JavaScript which is then registered with the RegisterClientScriptBlock method of the ClientScript object. The JavaScript requires some dynamic variables to be created, which is why it is generated in the page. Also, the way dasBlog is set up seems to make it difficult to add JavaScript to just one page without using this method. A two-dimensional array is created which contains the unique IP addresses and a count of hits for each address. This is then processed on the client side. The JavaScript code to create the map follows. It is well documented in the Google Maps API documentation, so I won't say too much about it.
     
if (GBrowserIsCompatible()) {
     // map is not declared with var, so it is a global that the marker code further down can use
     map = new GMap2(document.getElementById('map_canvas'));
     // zoom level 1 shows most of the world at once
     map.setCenter(new GLatLng(37.4419, -122.1419), 1);
     var mapTypeControl = new GMapTypeControl();
     var topRight = new GControlPosition(G_ANCHOR_TOP_RIGHT, new GSize(10,10));
     var bottomRight = new GControlPosition(G_ANCHOR_BOTTOM_RIGHT, new GSize(10,10));
     map.addControl(mapTypeControl, topRight);
     map.addControl(new GSmallMapControl());
}

Now we need to loop through the IP address array and call the hostip.info page to get the city/state/country information. There are different calls that can be made to return HTML or XML. I used the HTML version as it is a smaller response, and easy to parse with regular expressions. If there is no information for an IP address it will return (Unknown). The regular expression trims those out. This database seems to be maintained by one person, so consider donating to them.

 // Prototype's each() hands every callback its own entry, so the asynchronous
 // onSuccess handler still sees the right address and count when it finally runs
 addresses.each(function(entry) {
     new Ajax.Request('http://api.hostip.info/get_html.php?ip=' + entry[0],
     {
         method: 'get',
         onSuccess: function(originalRequest) {
            var result = originalRequest.responseText;
            var tokens = result.split('\n');
            var count = entry[1];
            // keep the text between the colon and the first opening bracket
            var country = /\:[^\(]+/i.exec(tokens[0])[0].substring(1).strip();
            var city = /\:[^\(]+/i.exec(tokens[1])[0].substring(1).strip();
            var location = (city.length > 0) ? city + ', ' : '';
            location += country;
            var message = location + ': ' + count + ' visitor';
            if (count > 1) message += 's';
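The parsing step in the callback above can be tried on its own, without Prototype or a live request. This is only a minimal sketch: the sample response text is my assumption about what get_html.php returns (one "Label: value" pair per line, country first, then city), and fieldValue and describeVisitors are hypothetical helper names, not part of the download.

```javascript
// Assumed response layout: "Country: ...", then "City: ...", one per line.
var sampleResponse = 'Country: AUSTRALIA (AU)\nCity: Sydney\nIP: 203.0.113.7';

// Return the text between the first colon and the first "(" on a line.
// Unknown values such as "(Unknown City?)" therefore come back empty.
function fieldValue(line) {
    var match = /:[^(]*/.exec(line || '');
    return match ? match[0].substring(1).trim() : '';
}

// Build the caption shown when a marker is clicked.
function describeVisitors(responseText, count) {
    var tokens = responseText.split('\n');
    var country = fieldValue(tokens[0]);
    var city = fieldValue(tokens[1]);
    var place = (city.length > 0 ? city + ', ' : '') + country;
    return place + ': ' + count + ' visitor' + (count > 1 ? 's' : '');
}

console.log(describeVisitors(sampleResponse, 3));
```

Returning an empty string for a missing line, rather than indexing straight into the regular expression result, means a short or malformed response does not throw inside the callback.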
		

Now we can call the GClientGeocoder object to turn the city/state/country information into a latitude/longitude combination. The GClientGeocoder's getLatLng method expects a function to be passed in to handle the callback. In this code, the callback creates the marker on the map, which can be clicked to view more information.

   var geocoder = new GClientGeocoder();
   geocoder.getLatLng(location,
      function(point) {
         if (point) {
            var marker = new GMarker(point);
            map.addOverlay(marker);
            GEvent.addListener(marker, 'click', function() {
               marker.openInfoWindow(message);
            });
         }
      });
		

This JavaScript is created in the C# code using a StringBuilder object, and is then added to the page using the RegisterClientScriptBlock method of the ClientScript object.

Page.ClientScript.RegisterClientScriptBlock(this.GetType(), "ipaddressdata", sb.ToString(), true);
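To make the shape of the generated data concrete, here is a rough sketch of the two steps: counting hits per unique IP address, then emitting the script text that declares the addresses array. The real page does this in C# with a StringBuilder; the JavaScript below only illustrates the idea, and countHits and buildAddressScript are hypothetical names, as is the input of one IP address per logged hit.

```javascript
// Collapse a list of hits into [ip, count] pairs, in first-seen order.
function countHits(ipAddresses) {
    var counts = {};
    var order = [];
    ipAddresses.forEach(function (ip) {
        if (!Object.prototype.hasOwnProperty.call(counts, ip)) {
            counts[ip] = 0;
            order.push(ip);
        }
        counts[ip] += 1;
    });
    return order.map(function (ip) { return [ip, counts[ip]]; });
}

// Produce the script text that a server page could register on the client.
function buildAddressScript(ipAddresses) {
    return 'var addresses = ' + JSON.stringify(countHits(ipAddresses)) + ';';
}

console.log(buildAddressScript(['203.0.113.7', '198.51.100.2', '203.0.113.7']));
```

Serialising with JSON.stringify rather than concatenating strings by hand keeps the quoting right whatever ends up in the data.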

Download

The download contains the Visitors.aspx page and the VisitorsBox.ascx user control. These should be dropped into the newtelligence.DasBlog.Web folder of your dasBlog source code. Before going any further you will need to change the Google Maps key in the VisitorsBox.ascx.cs file.

Page.ClientScript.RegisterClientScriptBlock(this.GetType(), "googlemaps",
    @"<script src='http://maps.google.com/maps?file=api&amp;v=2&amp;key=YOUR KEY GOES HERE'"+" type='text/javascript'></script>");
		

When you have compiled the code, drop those two files into the root of your website, and the newtelligence.DasBlog.Web.dll into the bin folder of your website. Remember to change the trust settings for your blog, and enjoy seeing who is visiting your site. Note that the Google Maps JavaScript and Prototype JavaScripts are served up from the Google servers so you don't need to download them.

Visitors.zip (10.57 KB)


Wednesday, June 11, 2008 11:53:11 PM (AUS Eastern Standard Time, UTC+10:00)   #    Comments [0]  Downloads | JavaScript | Webmaster
# Tuesday, June 03, 2008
Web log file parsing with c#

In the past week I have been doing a lot of analysis on the web analytics reports that are being generated by AWStats (a good Open Source product; don't be put off by the arrogant download Firefox message) and by a commercial product that we have an Enterprise license for. I won't name this product, as I am very disillusioned with it; its accuracy is very questionable.

In order to determine how accurate the statistics were, I chose one day's log file for a reasonably busy website. This site is mainly a Monday to Friday website with peaks around 10 AM to 4 PM, and another small peak from 6 PM to 8 PM. Weekends see lower usage, so I used a Sunday as this would be the smallest log I could reasonably analyse. The log contained just over 100,000 entries. In order to analyse it, I used Visual Studio 2008, filtering using regular expressions. After a while I realised I would like to review these stats on an ongoing basis, so I wrote a small log parser. There is an excellent free parser available from Microsoft called Log Parser 2.2; however, I wanted direct control, because I also want to create some graphs. My first version creates a beautiful graph using the excellent Open Source Silverlight data visualisation component called Visifire. I haven't included that code in this download as I wanted to clean it up and make it more reusable and extensible.

This tool is not particularly robust as it is just a throwaway utility, but I tried to make it a bit extensible for future requirements. The log files I am parsing are Apache log files, so I designed a simple ILogFormatReader interface and created an ApacheLogParser implementation. This doesn't populate all the fields, but it's easy to see how it works and how to finish off the implementation if more information is necessary.

The main issue with parsing log files is how the fields are separated. In this case the log file is space separated. Any fields that have spaces in them are surrounded by double quotes or square brackets. Another consideration is what to do if a log entry comes in with an invalid format. I didn't worry about this too much, as the web logs should work well if the first line is correct. If this code were to go into production then of course that would have to be refactored.
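The splitting rule described above (spaces separate fields, except inside double quotes or square brackets) fits in one regular expression. The following is a sketch of that rule only, not the ApacheLogParser from the download; splitLogLine is a hypothetical name and the sample line is made up in the style of the Apache combined format.

```javascript
// Split one log line on spaces, keeping "quoted" and [bracketed] fields whole.
// Escaped quotes inside a quoted field are not handled in this sketch.
function splitLogLine(line) {
    var fields = [];
    var pattern = /"([^"]*)"|\[([^\]]*)\]|(\S+)/g;
    var match;
    while ((match = pattern.exec(line)) !== null) {
        if (match[1] !== undefined) {
            fields.push(match[1]);      // contents of a quoted field
        } else if (match[2] !== undefined) {
            fields.push(match[2]);      // contents of a bracketed field
        } else {
            fields.push(match[3]);      // plain space-delimited field
        }
    }
    return fields;
}

var sampleLine = '203.0.113.7 - - [01/Jun/2008:10:15:32 +1000] ' +
    '"GET /index.html HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (Windows)"';
console.log(splitLogLine(sampleLine));
```

A caller can deal with badly formed entries by checking the number of fields returned before using them, which is one cheap way to cover the invalid-format case.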

Not all log files are created equal

Each web server can be customised to record different information in the log files. Additionally, proxy servers can modify what is sent through to the web server, so it is important to check the format of any log file before parsing. The log files I am parsing use an Apache log file format. More information can be found on the Apache log files page and the Microsoft Log file formats in IIS (6.0) page.

For those interested, I found the statistics in AWStats very close to what I believe they should be. There were inaccuracies that I couldn't explain. To be fair, we are running an old version of AWStats, so I assume that this might have been addressed in a newer version. The very expensive commercial application we use reports approximately twice the bandwidth that it should, and does not understand how to handle the JSESSIONID tacked on to the end of some JSP application URLs. It confuses the real resource with the session ID, and we get very inaccurate statistics as a result.
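One way around the JSESSIONID problem is to normalise each requested URL before counting it. This is a sketch of that idea, assuming the session ID appears as a ;jsessionid=... path parameter before any query string; stripSessionId and countByResource are hypothetical names and the sample URLs are invented.

```javascript
// Remove a ";jsessionid=..." path parameter, leaving any query string intact.
function stripSessionId(url) {
    return url.replace(/;jsessionid=[^?#]*/i, '');
}

// Tally hits per resource after normalising away the session ID.
function countByResource(urls) {
    var counts = {};
    urls.forEach(function (url) {
        var resource = stripSessionId(url);
        counts[resource] = (counts[resource] || 0) + 1;
    });
    return counts;
}

console.log(countByResource([
    '/shop/cart.jsp;jsessionid=ABC123',
    '/shop/cart.jsp;jsessionid=DEF456',
    '/index.jsp'
]));
```

Without the normalisation step the two cart.jsp requests would be reported as two different resources.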

There is some minimalistic reporting to give an idea of how to use the parser. The output looks like the image here.

There is a download included below. I would be interested to hear if anyone finds this useful. I hope to update this application at some stage to include some graphing output. I might use Visifire as mentioned earlier, or possibly Microsoft Excel. I think Excel is the better option as it would allow for richer manipulation of the reporting after the log files have been parsed. Excel has such powerful features that it would require good justification not to use it as a reporting mechanism. I think that AWStats doesn't do its reports justice by presenting them as it does. Everyone's reporting requirements are different.

Download

The application can be downloaded here:

LogParsing.zip (7 KB)


Tuesday, June 03, 2008 11:33:35 PM (AUS Eastern Standard Time, UTC+10:00)   #    Comments [0]  Downloads | Webmaster
# Tuesday, May 27, 2008
How to add a provider to the Internet Explorer search bar

Internet Explorer 7 and above has a search bar with a default provider of Live Search. It also contains a link to other search providers such as Microsoft and Wikipedia. It is very easy to add a new provider to this search bar. It can be done manually, but this article describes how to change a web page to enable auto-detection of a search provider, plus how to create a link to that provider. Firefox 2 and above also has a search sidebar that supports this feature.

The search provider requires an XML file which adheres to the OpenSearch Description schema. It is easy enough to create this, but if you want to use Internet Explorer to create it for you then you can use the Find more providers option in the search bar dropdown. On the page it displays you place the search page for your site with the word TEST instead of a valid search keyword. This will only work for sites that use GET rather than POST for search queries. Once you have entered the URL, there is a View XML option which will display the necessary XML. It will look like this:

<?xml version="1.0" encoding="UTF-8"?>
<OpenSearchDescription xmlns="http://a9.com/-/spec/opensearch/1.1/">
  <ShortName>blog.focas.net.au</ShortName>
  <Description>blog.focas.net.au provider</Description>
  <InputEncoding>UTF-8</InputEncoding>
  <Url type="text/html"
       template="http://blog.focas.net.au/SearchView.aspx?q={searchTerms}"/>
</OpenSearchDescription>

The ShortName will show up in the search bar. This file should be saved somewhere on your website. It doesn't matter where, as it is the metadata in the pages that will point to its location.
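The template attribute is the part that does the work: the browser substitutes the encoded query for {searchTerms}. A small sketch of both directions follows. buildSearchUrl and templateFromTestUrl are hypothetical names, and the TEST substitution is my reading of what the Find more providers page does, not documented behaviour.

```javascript
// Fill an OpenSearch Url template with a URL-encoded query.
function buildSearchUrl(template, terms) {
    return template.replace('{searchTerms}', encodeURIComponent(terms));
}

// Turn a search URL that was run with the keyword TEST into a template.
function templateFromTestUrl(testUrl) {
    return testUrl.replace('TEST', '{searchTerms}');
}

var template = templateFromTestUrl('http://blog.focas.net.au/SearchView.aspx?q=TEST');
console.log(template);
console.log(buildSearchUrl(template, 'viewstate MAC'));
```

This also shows why the approach only suits sites that search with GET: the whole query has to be expressible in the URL.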

On any page where you want Internet Explorer to automatically detect the search provider, you need to add some metadata into the head section of the page. This will look like this:

<link title="blog.focas.net.au search"
      rel="search"
      type="application/opensearchdescription+xml"
      href="http://blog.focas.net.au/blog.focas.net.au.searchprovider.xml" />

This tells Internet Explorer that there is a search provider, because the rel attribute contains the value search. The type is a MIME type referring to the OpenSearch description XML format. The href is the location of the XML file you saved earlier.

Once this is in the page, when you browse to this page, Internet Explorer will change the drop-down icon colour on the search toolbar to orange, like this:


Figure 1: Search provider glowing when it has discovered a search provider

By clicking on the drop-down, the provider will be displayed. It has not been installed at this stage, but is available whenever the page is viewed:


Figure 2: Search provider displayed in the search provider drop-down

To allow the user of a page to install the search provider in Firefox 2+ or Internet Explorer 7+, you need to create a link on a page that calls some JavaScript. The window.external.AddSearchProvider method must be called in order for the link to work. The following script is a modified version of a script on the Mozilla Developer Center:

function installSearchEngine(url) {
  if (window.external && ("AddSearchProvider" in window.external)) {
    // Firefox 2 and IE 7, OpenSearch
    window.external.AddSearchProvider(url);
  } else {
    // No search engine support (IE 6, Opera, etc).
    alert("Sorry, your browser doesn't provide search engine support");
  }
}

This page has a search provider in its metadata, so you can see the effect in the search box if using IE7+. Alternatively you can try installing the search engine by clicking here.


Tuesday, May 27, 2008 10:26:21 PM (AUS Eastern Standard Time, UTC+10:00)   #    Comments [0]  Metadata | Webmaster
# Saturday, May 17, 2008
Image Gallery using metadata

This is a small application that takes all the images in a directory and creates a lightbox style AJAX image gallery that is web ready.  It reads the metadata in the picture to extract the title, description, keywords and rating. Primarily I wrote this to experiment with some C# 3 features, such as LINQ.

There are no options in the program. It's fairly simple: point it at an input directory and an output directory, click Make the gallery! and it does its stuff. It's not overly sophisticated; it doesn’t generate thumbnails, it just uses width and height attributes on the img tags. All the necessary support files will be copied across to the output directory. A page called index.html is generated and automatically displayed when complete.

Metadata collection

Metadata is collected using the System.Windows.Media.Imaging namespace. This is part of the Windows Presentation Foundation. When I tested this it worked well on Windows Vista, but when I tested it on a machine running Windows XP SP2, I got a codec not available error when accessing the metadata. I got around this by installing Microsoft Photo Info, which is a fantastic utility for XP that incorporates read/write access to image metadata. It has Explorer integration, and I highly recommend it if you like adding metadata to your images. It can also be very helpful for those who have upgraded from Windows Vista to Windows XP.

The code to access the metadata is straightforward:

using (Stream stream = fii.OpenRead()) {
    // the stream has to stay open while the metadata is being read
    BitmapDecoder decoder = BitmapDecoder.Create(stream, BitmapCreateOptions.None, BitmapCacheOption.Default);
    BitmapFrame frame = decoder.Frames[0];
    BitmapMetadata metadata = (BitmapMetadata)frame.Metadata;
    caption = metadata.Title;
    if (metadata.Subject != null) {
        caption += " - " + metadata.Subject;
    }
}

I tried closing the stream first, hoping that the metadata would be cached, but I suspect it uses lazy loading because it threw an error as soon as I accessed the metadata.

LINQ seemed a good idea for filtering the files. There may be a better way but this worked just fine.

            string[] files = Directory.GetFiles(directoryToProcess);
            var query = from f in files
                        where (new string[] { ".jpg", ".png", ".gif", ".jpeg" }).Contains(Path.GetExtension(f).ToLower())
                        select f;

            return query;

Having the list of file extensions as the first part of the where clause didn't please me, but that is just aesthetics.
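Giving the extension list a name is one way to tidy that up. Here is the same filter sketched in JavaScript rather than C#, purely to show the shape; IMAGE_EXTENSIONS, extensionOf and imageFiles are hypothetical names and are not in the download.

```javascript
// The accepted extensions, named once instead of inline in the filter.
var IMAGE_EXTENSIONS = ['.jpg', '.jpeg', '.png', '.gif'];

// Lower-cased extension including the dot, or '' when there is none.
function extensionOf(fileName) {
    var dot = fileName.lastIndexOf('.');
    return dot < 0 ? '' : fileName.substring(dot).toLowerCase();
}

// Keep only the files whose extension is in the list.
function imageFiles(fileNames) {
    return fileNames.filter(function (name) {
        return IMAGE_EXTENSIONS.indexOf(extensionOf(name)) !== -1;
    });
}

console.log(imageFiles(['beach.JPG', 'notes.txt', 'logo.png', 'README']));
```

With the list out of the way, the filter reads as "files whose extension is an image extension", which is the order the original where clause was missing.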

Javascript/CSS

The lightbox effect is achieved by using the MooTools JavaScript framework, and another library and example from phatfusion which creates the lightbox. This program just encapsulates the HTML generation,  metadata extraction and file copying.

Scope for improvement

There is a lot of scope for improvement. The class layout is fairly simple. Interfaces could be added, and a plug-in approach to allow for different LightBox or similar effects. As this was more of an experiment than a robust utility I didn’t get too precious about such design considerations. I hope it helps a few people to make a gallery for themselves.

Focas.NET.ImageGallery.zip (84.1 KB)
Saturday, May 17, 2008 10:11:52 PM (AUS Eastern Standard Time, UTC+10:00)   #    Comments [0]  Downloads | Metadata | Webmaster
# Sunday, May 11, 2008
Wikipediaise - a c# VSTO Word addin

Wikipediaise - What is it?

Wikipediaise is a Visual Studio Tools for Office (VSTO) add-in for Microsoft Word, developed in Microsoft Visual Studio 2008. It is written in C#. It was designed to hyperlink acronyms and jargon to Wikipedia.

I do a lot of technical documentation for my work, and the IT industry being what it is, the documents end up with a ridiculous number of acronyms. To make life easier, we usually put an abbreviation section at the top of the document, but this is a time-consuming process to go through every time, so I automated it. Additionally I added another method which will seek out the first occurrence of an abbreviation or acronym, and hyperlink it. First I will describe how this works, then how to use and customize the functionality.

Initially Wikipedia was used as the reference point, as it is an excellent reference point for technical information. After a while it became clear that many acronyms were better documented elsewhere, or in internal company documents, so I added the ability to use alternative reference sources.

Note that although I refer to acronyms, the addin is good for jargon and technical terms as well.

The following images show a before and after shot of a simple document. A third image shows the document with an acronym table inserted at the top.


Figure 1 - Before shot of a Word document with acronyms and jargon to be hyperlinked


Figure 2 - The same document after it has been hyperlinked


Figure 3 - The same document, hyperlinked, and with an acronym table inserted at the beginning

How it works

The application comes with an embedded XML file with a set of pre-defined acronyms. This serves as an example only. The application will look in the %mydocuments% folder for a file called wikipediaise.dic. If this file exists it will override the embedded file, so the application can be customized for most requirements.

Format of the XML file

There are two elements available in the wikipediaise.dic file shown below.

Table 1 - Elements available in wikipediaise.dic

| Element | Description | Comment |
|---|---|---|
| excludeStyle | Lists a Word style to be excluded from the process. | This could be a built-in style or a user-defined style. If a word is in this style it will not be hyperlinked. |
| entry | Contains a mandatory key term that will be searched for. Optional attributes will be described later. | This text will be searched for in a case-sensitive manner. If the term is found in the middle of a word, it will still be matched. For this reason, position longer superset acronyms earlier, e.g. place https before http. |

The excludeStyle element has no attributes, so just looks like this


Figure 4 - excludeStyle element example

The entry element has attributes, these are described below.

| Attribute name | Mandatory | Description | Comment |
|---|---|---|---|
| key | Yes | The term that will be searched for and hyperlinked. | Case sensitive. Position longer superset acronyms earlier, e.g. place https before http. |
| wikipediaEntry | No | This attribute is only used for entries in Wikipedia where the page name is not the same as the key, e.g. the entry in Wikipedia for Apache has a page name of Apache_HTTP_Server. | |
| description | No | This will be used as a tooltip when a hyperlink is created in Word. It will also be used in the acronym table if that feature is used. | |
| url | No | This is an alternative URL if Wikipedia is not to be the source of reference. | |

Table 2 - entry element attributes
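The way these attributes interact can be sketched in a few lines. This is not the add-in's code (which is C#), just an illustration of the rules in the tables under my own assumptions: url wins over wikipediaEntry, which wins over key; matching is case sensitive and happens inside words; and text already claimed by an earlier entry is skipped, which is why https has to be listed before http. The Wikipedia URL prefix, the example.org link and the function names are all hypothetical.

```javascript
// Pick the link target for an entry: url, then wikipediaEntry, then key.
function referenceUrl(entry) {
    if (entry.url) return entry.url;
    var page = entry.wikipediaEntry || entry.key;
    return 'http://en.wikipedia.org/wiki/' + encodeURIComponent(page);
}

// Find the first unclaimed occurrence of each key, in dictionary order.
function firstOccurrences(text, entries) {
    var claimed = [];   // [start, end) ranges that already have a link
    var hits = [];
    entries.forEach(function (entry) {
        var from = 0;
        while (true) {
            var start = text.indexOf(entry.key, from);   // case sensitive
            if (start < 0) return;                       // key not present
            var end = start + entry.key.length;
            var overlaps = claimed.some(function (r) {
                return start < r[1] && end > r[0];
            });
            if (!overlaps) {
                claimed.push([start, end]);
                hits.push({ key: entry.key, index: start, url: referenceUrl(entry) });
                return;
            }
            from = start + 1;                            // try further along
        }
    });
    return hits;
}

var entries = [
    { key: 'https' },
    { key: 'http', url: 'http://example.org/http-notes' },
    { key: 'Apache', wikipediaEntry: 'Apache_HTTP_Server' }
];
console.log(firstOccurrences('Use https or http with Apache', entries));
```

If http were listed first it would claim the first four letters of https, and the longer term would never get its own link.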

Focas.NET.wikipediaise.zip (21.4 KB)
Sunday, May 11, 2008 5:29:29 PM (AUS Eastern Standard Time, UTC+10:00)   #    Comments [3]  Downloads | VSTO | Word 2007
# Wednesday, April 23, 2008
Googlebot frequency

While researching how Google crawls websites, I found this great piece of information on the Google Webmasters site.

I am certain Google doesn't crawl every website every few seconds. Does this actually mean that while the Googlebot is crawling a website, it won't access that site more often than once every few seconds for the duration of the crawl? Is this to avoid looking like a potential DoS attack? I think the wording could be clearer here!

For reference, the above image was taken from this URL: http://www.google.com/support/webmasters/bin/answer.py?answer=34439&ctx=sibling


Wednesday, April 23, 2008 2:54:15 PM (AUS Eastern Standard Time, UTC+10:00)   #    Comments [0]  Webmaster
# Monday, March 10, 2008
Windows Vista grievance

I just tried to install Microsoft PowerShell on a Windows Vista Home Basic installation. OK, so it is a power application, and on a home machine, but there are some scripts I really wanted to run here. But guess what, it won't install on Vista Home Basic. I checked the system requirements at the PowerShell home page and sure enough, no mention of Vista Home Basic.

But what I think is lousy about this is that when you go to the Choose an edition page of the Vista site, it doesn't say anywhere that you cannot run PowerShell on Vista Home Basic. I find this a little sneaky. I can understand their logic behind it: perhaps if I want to use a power application, I shouldn't run it on a Home Basic installation. However, let's be honest up front about it.

Well, let's play the game and upgrade to Ultimate or even Home Premium. That is a fairly simple option as the control panel explains; in fact it states You can learn more about editions of Windows Vista, or you can upgrade immediately. How cool is that! All I have to do is click the button, purchase my upgrade, and I will have a shiny new bells and whistles Vista edition, and now I can run PowerShell.

Unfortunately it's not that simple. The upgrade options fire up a Windows Anytime Upgrade web page to begin the process. Here I can select my billing location, and proceed. Unfortunately the only options are to bill to the United States or Canada. Here in Australia we actually have the internet and are capable of online shopping. Why is it so difficult to provide an upgrade option online? So I can't really upgrade instantly as advertised.

It looks like I will have to put this off unless I feel like going to a shop to upgrade. I will be upgrading, but due to the experiences with various versions of Vista, I will be upgrading to Windows XP Service Pack 2.


Monday, March 10, 2008 10:28:08 PM (AUS Eastern Daylight Time, UTC+11:00)   #    Comments [0]  PowerShell | Vista
# Monday, December 17, 2007
Handy PowerShell commands

These are some PowerShell commands I have created that I find really handy. Some of the first ones are just helper methods for aiding more useful scripts. These are stored in my PowerShell profile located at:
%userprofile%\my documents\WindowsPowerShell\Microsoft.PowerShell_profile.ps1.  There is a special variable to point to this file, called $profile. It is easy to edit my PowerShell profile by typing: notepad $profile which will open up the profile in notepad. I actually use NotePad2 which I have renamed to n2 to make it easy to use from the run prompt, command line, or PowerShell. To use NotePad2, I type n2 $profile.

To make it easier later, I have set an alias to run Internet Explorer. I don’t use this alias in PowerShell usually, but it does get used in some cmdlets later. I call the alias ie:

set-alias ie "${env:programfiles}\Internet Explorer\iexplore.exe"

Now the cmdlet I use a lot. This one just retrieves the latest item from a podcast feed, and plays it in the default Internet Explorer media player. This cmdlet looks like this:

function play-Podcast($url) {
    ie ([xml](new-object net.webclient).DownloadString($url)).rss.channel.item[0].enclosure.url
}

This takes a URL, retrieves the contents as a string, turns it into an XML object, then walks the equivalent of the XPath /rss/channel/item[1]/enclosure/@url (PowerShell indexes from zero, XPath from one, and url is an attribute of the enclosure element). It then passes the result to the Internet Explorer alias set earlier, and the effect is to play the latest podcast entry.
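For anyone who wants to see the lookup without PowerShell's XML adapter, here is a rough JavaScript sketch that pulls the same value out of the feed text. It uses regular expressions instead of a real XML parser, which is a shortcut good enough for illustration only (entities such as &amp; in the URL are not decoded); latestEnclosureUrl and the sample feed are hypothetical.

```javascript
// Return the url attribute of the <enclosure> in the first <item>, or null.
function latestEnclosureUrl(rssText) {
    var item = /<item[\s>][\s\S]*?<\/item>/i.exec(rssText);
    if (!item) return null;
    var enclosure = /<enclosure\b[^>]*\burl\s*=\s*["']([^"']+)["']/i.exec(item[0]);
    return enclosure ? enclosure[1] : null;
}

var sampleFeed =
    '<rss version="2.0"><channel><title>Example cast</title>' +
    '<item><title>Episode 2</title>' +
    '<enclosure url="http://example.org/ep2.mp3" length="1" type="audio/mpeg"/></item>' +
    '<item><title>Episode 1</title>' +
    '<enclosure url="http://example.org/ep1.mp3" length="1" type="audio/mpeg"/></item>' +
    '</channel></rss>';

console.log(latestEnclosureUrl(sampleFeed));
```

Like the PowerShell version, this relies on the feed listing the newest item first.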

My colleague David laughs at me for this. He thinks I should just use a podcast client, or a bookmark in a browser, but I find this handy. Now I can set up other cmdlets so I can hear my favourite podcasts easily. I like a security podcast by Patrick Gray called Risky Business. The cmdlet I have created is:

function risky-Business() {