Posts

Showing posts from October, 2015

Doing Delegates Differently

This is an article about using delegates in a way you may not have thought of using them before. As I started to write, it became obvious that this would be a long article, so I am splitting it in two. This first part is an overview of delegates and how to group functionality and cross-cutting concerns with them. Part two will delve into using delegates to create a validator which can validate any type of object dynamically.
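As a rough illustration of the idea (not the code from the linked article), the sketch below wraps an arbitrary operation in logging and error handling by passing it in as a delegate. The names `DelegateSketch` and `WithLogging` are hypothetical and chosen only for this example.

```csharp
using System;
using System.Collections.Generic;

public static class DelegateSketch
{
    // Runs any operation passed in as a Func<T>, surrounding it with the
    // cross-cutting concerns (here: logging and failure reporting).
    // The log sink is itself a delegate, so callers decide where messages go.
    public static T WithLogging<T>(string name, Func<T> operation, Action<string> log)
    {
        log($"Starting {name}");
        try
        {
            T result = operation();
            log($"Finished {name}");
            return result;
        }
        catch (Exception ex)
        {
            log($"Failed {name}: {ex.Message}");
            throw; // rethrow so the caller still sees the original exception
        }
    }

    public static void Main()
    {
        var messages = new List<string>();

        // The business logic is just a lambda; the wrapper adds the logging.
        int sum = WithLogging("Add", () => 2 + 3, messages.Add);

        Console.WriteLine(sum);              // prints 5
        foreach (var m in messages)
            Console.WriteLine(m);            // prints "Starting Add", then "Finished Add"
    }
}
```

The point of the design is that the logging code is written once and reused for any operation, rather than being repeated inside every method.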
Part 1 - Cross-cutting concerns
http://www.codeproject.com/Articles/1045523/Doing-Delegates-Differently-Part

Part 2 - Dynamic Validator
http://www.codeproject.com/Articles/1051860/Doing-Delegates-Differently-Part
Earlier I wrote that I would include an article on actually getting the data from a web site using mechanisms such as WebClient, HttpWebRequest, etc.

Someone beat me to the punch; see this well-written article on CodeProject. The author demonstrates ScrapySharp. Although I don't have much experience with this product, it looks worth exploring. The article also shows how to use Fiddler, as I would, to examine the requests going back and forth and to see what needs to be sent as parameters.

http://www.codeproject.com/Articles/1041115/Webscraping-with-Csharp

Steve

Using HtmlAgilityPack and CssSelectors

To start, I don't claim to be an expert in XPath or regular expressions, but the following are some observations I have made while parsing HTML documents for client projects.
In the following examples I am using HtmlAgilityPack (HAP) to load the HTML into a document object model (DOM) and parse it into nodes. Additionally, there are cases where I have had to parse the document on items which are not true element nodes, such as comments.
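A minimal sketch of that workflow, assuming the HtmlAgilityPack NuGet package is referenced. The sample markup and the names `HapSketch`, `ItemTexts` and `Comments` are hypothetical and are not taken from the example project.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack; // NuGet package: HtmlAgilityPack

public static class HapSketch
{
    // Hypothetical sample markup, standing in for the project's HTML file.
    private const string Html =
        "<html><body><!-- price: 42 --><div class=\"item\">Widget</div></body></html>";

    // Loads the markup into HAP's DOM and selects element nodes with XPath.
    public static List<string> ItemTexts()
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(Html);

        // SelectNodes returns null (not an empty collection) when nothing matches.
        var nodes = doc.DocumentNode.SelectNodes("//div[@class='item']");
        return nodes == null
            ? new List<string>()
            : nodes.Select(n => n.InnerText).ToList();
    }

    // Comments are not elements, but HAP exposes them as HtmlCommentNode
    // instances among the descendants of the document node.
    public static List<string> Comments()
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(Html);

        return doc.DocumentNode
                  .Descendants()
                  .OfType<HtmlCommentNode>()
                  .Select(c => c.Comment)
                  .ToList();
    }

    public static void Main()
    {
        foreach (var text in ItemTexts()) Console.WriteLine(text);
        foreach (var comment in Comments()) Console.WriteLine(comment);
    }
}
```

The null check on `SelectNodes` matters in practice, since iterating the result directly throws when the XPath matches nothing.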
In addition to observations about HAP in general, I’ll point out extension methods provided by the HAP.CSSSelectors package, which allow for much easier selection.
Packages for the example will need to be imported using NuGet. The package references are included in the project, but you will need to set the NuGet package manager to restore the libraries. In the project I have included a really simple HTML file with examples of issues I have needed to address in my projects.
To test without any modifications, you will need to copy the HTML file to the following d…