There are a lot of ways to read an HTML file: you can read it as plain text, you can read it with XmlReader or XmlDocument (which only works if the markup is well-formed XML), or you can use 'Html Agility Pack'.
Basically, Html Agility Pack is an agile HTML parser that builds a read/write DOM and supports plain XPath or XSLT (you don't actually HAVE to understand XPath or XSLT to use it, don't worry...). It is a .NET code library that allows you to parse "out of the web" HTML files.
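To give a rough idea of what that looks like, here is a minimal sketch that loads some HTML into the Html Agility Pack DOM and pulls out the links with an XPath query. It assumes the HtmlAgilityPack NuGet package is installed. The LinkExtractor class, the ExtractLinks helper and the sample markup are made up for illustration and are not part of the library.

```csharp
using System;
using System.Collections.Generic;
using HtmlAgilityPack; // assumption: NuGet package "HtmlAgilityPack" is referenced

public static class LinkExtractor
{
    // Hypothetical helper: returns the href value of every <a> element that has one.
    public static List<string> ExtractLinks(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html); // builds the DOM from a string of markup

        var links = new List<string>();

        // SelectNodes can return null (rather than an empty collection)
        // when nothing matches, so check before iterating.
        var anchors = doc.DocumentNode.SelectNodes("//a[@href]");
        if (anchors == null)
        {
            return links;
        }

        foreach (var anchor in anchors)
        {
            // The second argument is the default used if the attribute is missing.
            links.Add(anchor.GetAttributeValue("href", string.Empty));
        }
        return links;
    }

    public static void Main()
    {
        // Made-up sample markup; note the unclosed <p> and <b> tags.
        string html = "<html><body><p>Some <b>bold text <a href='/one'>first</a>"
                    + "<p><a href=\"/two\">second</a></body></html>";

        foreach (var href in ExtractLinks(html))
        {
            Console.WriteLine(href);
        }
    }
}
```

If the HTML is in a file instead of a string, doc.Load(path) can be used in place of doc.LoadHtml(html). The "//a[@href]" expression is about as much XPath as you need for simple cases like this.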
See the link below.
Editor, DotNetSpider MVM
Microsoft MVP 2014 [ASP.NET/IIS]