HTML Application (HTA) – Email IDs Extractor

I wrote this application for anyone who wants a simple, clean tool for extracting Email IDs from text. I had needed such a tool myself for a long time, ever since I was asked to build a database of Email IDs from the Internet. I tried many of the tools available online, but they were all so complex that they quickly became more trouble than they were worth. So I decided to write one myself: simple, and usable by anyone with a basic knowledge of applications. The big question was which language or platform to pick. The answer was HTML with JavaScript and VBScript.

Here I will show you how to use this simple but useful tool. First, download the application from here.

You now have the RAR archive containing the tool. When you run the tool, you will see the screen shown here:

[Screenshot: startup]

After the welcome screen, a prompt tells you that you don’t have any previously saved Email IDs in the database:

[Screenshot: firsttimeload]

Click OK and you are ready to scrape any text containing Email IDs. Just grab some text containing Email IDs, paste it into the left-hand text box, and press Enter.

[Screenshot: idsindatabase]

There you go: all the unique Email IDs are extracted from the text and collected in the right-hand list in a fraction of a second. The tool is also smart enough not to create a duplicate database entry for an Email ID that already exists. Whenever you reopen the Email IDs Extractor, it keeps the previously saved Email IDs as long as the emailids.txt file exists in the same location as the tool (the database is a plain text file).
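
For readers curious about how this kind of extraction and duplicate check can work, here is a minimal sketch. It is not the tool's actual source: the function names, the deliberately simple email pattern, and the choice to treat addresses as case-insensitive are all assumptions made for illustration, and it is written in modern JavaScript (for example, for Node.js) rather than the script dialect an HTA would use.

```javascript
// Illustrative sketch only: names and pattern are assumptions, not the tool's code.
// The pattern is a simple approximation of an email address, not a full validator.
const EMAIL_PATTERN = /[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+/g;

// Return the unique email addresses found in a block of text.
function extractEmails(text) {
  const matches = text.match(EMAIL_PATTERN) || [];
  // Assumption: lower-case so "Bob@Example.com" and "bob@example.com" count as one ID.
  return [...new Set(matches.map((m) => m.toLowerCase()))];
}

// Merge newly found addresses into an existing list without adding duplicates.
function mergeIntoDatabase(existingIds, text) {
  const seen = new Set(existingIds.map((id) => id.toLowerCase()));
  const added = [];
  for (const id of extractEmails(text)) {
    if (!seen.has(id)) {
      seen.add(id);
      added.push(id);
    }
  }
  return { all: [...existingIds, ...added], added };
}

// Example: one address is already saved, the pasted text repeats it and adds a new one.
const saved = ["alice@example.com"];
const pasted = "Write to Bob@Example.com or alice@example.com; cc bob@example.com.";
const result = mergeIntoDatabase(saved, pasted);
console.log(result.added);          // only the address that was not already saved
console.log(result.all.join("\n")); // one ID per line, as a text-file database might store them
```

Keeping the saved IDs in a set makes the duplicate check a single lookup per address, which is one plausible reason a tool like this can respond almost instantly even as the list grows.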

Whenever you need the database, just open emailids.txt in any text editor.

It's that simple!

I hope you enjoy this tool. Don’t hesitate to drop me an email if you have any trouble using it. I will keep upgrading the tool based on the responses I get.

Enjoy!!!

Regular Expression with C#: Part 1

In this tutorial you will learn how to:

  1. Extract text from web pages
  2. Use a dedicated tool (Expresso) to make writing Regular Expressions easier
  3. Write and use Regular Expressions in C#

This tutorial is intended for people who can already write basic Regular Expressions and know the syntax of the language. Since there is a lot of material to cover, I won’t go over everything.

Until recently I never thought much of Regular Expressions. I had used them a couple of times just to validate an email address, a web address, and the like. That changed the day I decided to write a web scraping program that would process pages from a site and store all the grabbed information in an XML file.

The need to grab content came up while I was studying RSS. With RSS you send the user output in a predefined XML format with some extra tags. This tutorial will not go that far, but it will give you an idea of how to grab the data from a site. How you use that knowledge is then up to you.

I will use a Google search results page as the example, since its results are listed in a well-organized manner. We will pick out some text from the page for each result and display it in my own format.

Step 1: Create a web application to host our functionality.

  1. Open Visual Studio 2005
  2. File Menu > New > Web Site
  3. Select ASP.NET Web Site from the listed project templates
  4. Select Visual C# as the language and click OK.

You will see the Default.aspx page in the Solution Explorer; if not, add it.

Step 2: Write code to get the desired page source (Google Search Results).

  1. Drag and drop a TextBox, a Button and a Label control onto the page in design mode.
  2. Rename TextBox1 to txtSearch, Button1 to btnSearch and Label1 to lblResults.
  3. In the btnSearch Click event handler, write the following code:
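protected void btnSearch_Click(object sender, EventArgs e)
{
    // Fall back to a default value when the text box is empty.
    if (string.IsNullOrEmpty(txtSearch.Text.Trim()))
    {
        txtSearch.Text = "Google";
    }
    string URL = txtSearch.Text;
    // WebPage.getContents and getRating are helper methods not shown in this excerpt.
    string IMDBData = WebPage.getContents(URL);
    // Strip all line breaks from the page source before extracting from it.
    string fullRating = getRating(IMDBData.Replace("\r", "").Replace("\n", ""));
    lblResults.Text = fullRating;
}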
