Suffering from Spam?

One of the main ways spammers collect email addresses is by crawling web pages and extracting every address they find. As part of your own web site development, it's a good idea to check your pages to make sure you didn't accidentally include any email addresses. Here's a little VB .NET utility that scans a web page and extracts most of the email addresses it contains.

    Private Sub cmdScan_Click(ByVal sender As System.Object, _
            ByVal e As System.EventArgs) Handles cmdScan.Click
        ScanAPage(txtURL.Text)
    End Sub

The Regex class (in the System.Text.RegularExpressions namespace) does the scanning. This particular pattern breaks down as follows:
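    \w+        One or more word characters (letters, digits, or underscore): the part before the @.
    @          A literal @ sign.
    (\w+)      One or more word characters: the first part of the domain name.
    (\.\w+)+   A period followed by one or more word characters, repeated one or more times: the rest of the domain, such as .com or .co.uk.

    Private m_EmailRE As New Regex("\w+@(\w+)(\.\w+)+")

This expression is not perfect. It will report some strings that are not valid email addresses, and it can truncate or miss some valid ones (for example, an address with a period before the @ is reported only from the last period onward). Keep in mind what it is designed for, though: this application isn't a tool for spammers to extract email addresses; it's a tool to help you spot email addresses you've inadvertently left in your code. So false positives aren't harmful.

The AddToList routine converts each address to lowercase and checks for duplicates before adding it to the list.

    Private Sub AddToList(ByVal s As String)
        ' Compare and store in lowercase so differently cased copies count as duplicates
        If Not ListBox1.Items.Contains(s.ToLower) Then
            ListBox1.Items.Add(s.ToLower)
        End If
    End Sub

The ScanAPage routine does the real work. It uses the WebRequest class (in the System.Net namespace) to retrieve the specified web page, then applies the regular expression to find every string that matches the email address pattern and adds each one to the list. The code assumes the form file imports System.Net and System.Text.RegularExpressions.

    Private Sub ScanAPage(ByVal url As String)
        Dim response As WebResponse
        Dim pageinfo As String
        Try
            ' Create can throw on a malformed URL, so it belongs inside the Try as well
            Dim req As WebRequest = WebRequest.Create(url)
            response = req.GetResponse()
        Catch ex As Exception
            ' Bad URL or failed request: nothing to scan
            Exit Sub
        End Try
        ' Read the whole page into a string, then release the connection
        Dim sr As New IO.StreamReader(response.GetResponseStream)
        pageinfo = sr.ReadToEnd()
        sr.Close()
        response.Close()
        Dim reresults As MatchCollection
        Dim onematch As Match
        ' Get Email addresses and add to list
        reresults = m_EmailRE.Matches(pageinfo)
        If Not reresults Is Nothing Then
            For Each onematch In reresults
                AddToList(onematch.Value)
            Next
        End If
    End Sub

For more information on regular expressions, check out the eBook "Regular Expressions in .NET" by Dan Appleman.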
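To get a feel for how the pattern behaves before pointing it at a live page, it can help to run it against a fixed string. The following is a minimal console sketch, not part of the original utility: the module name EmailPatternDemo, the ExtractAddresses helper, and the sample HTML fragment are all made up for illustration. Only the pattern itself and the lowercase duplicate check come from the article.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module EmailPatternDemo
    ' Same pattern as the article's m_EmailRE.
    Private ReadOnly EmailRE As New Regex("\w+@(\w+)(\.\w+)+")

    ' Hypothetical helper: returns the distinct lower-cased matches,
    ' in order of first appearance (mirrors what AddToList does with the list box).
    Function ExtractAddresses(ByVal text As String) As List(Of String)
        Dim found As New List(Of String)
        For Each m As Match In EmailRE.Matches(text)
            Dim addr As String = m.Value.ToLower()
            If Not found.Contains(addr) Then found.Add(addr)
        Next
        Return found
    End Function

    Sub Main()
        ' Made-up page fragment: a duplicated address in two casings,
        ' an image file name, and an address with a period before the @.
        Dim sample As String = _
            "<a href=""mailto:Sales@Example.com"">sales@example.com</a> " & _
            "<img src=""logo@2x.png""> john.smith@example.co.uk"

        For Each addr As String In ExtractAddresses(sample)
            Console.WriteLine(addr)
        Next
    End Sub
End Module
```

Run against this sample, the sketch lists three entries: sales@example.com (reported once despite the two casings), logo@2x.png (a false positive from an image file name), and smith@example.co.uk (the john. prefix is lost because \w does not match a period). For a tool meant to flag addresses you left on a page by accident, all three are still useful hints about where to look.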
Copyright © 2012 Desaware, Inc. All Rights Reserved.