DNSO Archives: [nc-whois]

Re: [nc-impwhois] RE: [nc-whois] WHOIS and SPAM - survey show no connection

To: Bruce Tonkin <Bruce.Tonkin@melbourneit.com.au>
Subject: Re: [nc-impwhois] RE: [nc-whois] WHOIS and SPAM - survey show no connection
From: Rick Wesson <wessorh@ar.com>
Date: Wed, 15 Jan 2003 20:45:57 -0800 (PST)
cc: Philip Sheppard <philip.sheppard@aim.be>, <steve@stevecrocker.com>, <dnssac-comment@icann.org>, <nc-whois@dnso.org>, "NC (list)" <council@dnso.org>, Louis Touton ICANN <touton@icann.org>, <nc-impwhois@dnso.org>
In-Reply-To: <AFEF39657AEEC34193C494DBD71792221A7824@phoenix.mit>
Sender: owner-nc-whois@dnso.org

another data point for y'all:

see http://postini.com, who do commercial spam filtering. on their home
page they have some spam stats, one of which is that in the last 24 hours
they processed some 48M messages and in that same period thwarted 30K
directory harvest attacks.

30,000 attacks in 24 hours on one SMTP service is significant. There are
more SMTP servers than there are whois servers, and this single statistic
points to a much greater attack than any that can be performed by mining
whois.

In another experiment, collecting email addresses from a 48 Mb/s news
feed, we were able to collect 2.1M unique email addresses in 24 hours.

now, can't we agree that whois is a potential source of email addresses,
but it's not the only source, and that there are many other significant
sources for harvesting email addresses?

for me, this topic distracts us from the real issues of privacy that have
yet to be dealt with by any of us.

-rick




> Hello Philip,
>
> Thanks for posting this information.
>
> The FTC analysis was an interesting experiment - but be careful not to
> jump to too many conclusions.
>
> For example, use of port 43 WHOIS data is often as a result of a two
> phase search
> (1) Phase 1 - find websites that are real - ie qualify the lead
> (2) Run WHOIS search against the domain name associated with the website
>
> Just creating a random domain name, and setting up WHOIS contact data,
> will not necessarily pick up this usage unless the website is
> established and real in the first place.  There are other techniques
> available as well but often leave a trace.  The process above can be
> done reasonably anonymously.
>
> Registrars could provide data on WHOIS usage by IP address, and this
> could show the amount of data mining going on (after removing IP
> addresses from registrars checking WHOIS for transfer authorisation
> purposes).  ie if WHOIS was being used as it was intended the number of
> queries would be close to the number of unique IP addresses, but there
> are often high peaks from a few IP addresses.
>
> Note what was picked up in the analysis below, is that when a real
> website is established - email addresses found on that website are used.
>
> Regards,
> Bruce
>
>
> -----Original Message-----
> From: Philip Sheppard [mailto:philip.sheppard@aim.be]
> Sent: Wednesday, January 15, 2003 8:43 PM
> To: steve@stevecrocker.com; dnssac-comment@icann.org
> Cc: nc-whois@dnso.org; NC (list); Louis Touton ICANN
> Subject: [nc-whois] WHOIS and SPAM - survey show no connection
>
>
> Steve, interesting to read the Security and Stability Advisory Committee
> recommendation on Whois. In relation to privacy you state: "it is widely
> believed that Whois data is a source of e-mail addresses for the
> distribution of spam".  This may be a wide belief but empirical evidence
> from the US Federal Trade Commission tells us otherwise. See the last
> sentence of the note below in particular.
> Philip
> ------------------
> http://www.ftc.gov/bcp/conline/pubs/alerts/spamalrt.htm.
> To find out which fields spammers consider most fertile for harvesting,
> investigators "seeded" 175 different locations on the Internet with 250
> new, undercover email addresses. The locations included web pages,
> newsgroups, chat rooms, message boards, and online directories for web
> pages, instant message users, domain names, resumes, and dating
> services. During the six weeks after the postings, the accounts received
> 3,349 spam emails. The investigators found that:
>
> * 86 percent of the addresses posted to web pages received spam. It
>   didn't matter where the addresses were posted on the page: if the
>   address had the "@" sign in it, it drew spam.
>
> * 86 percent of the addresses posted to newsgroups received spam.
>
> * Chat rooms are virtual magnets for harvesting software. One address
>   posted in a chat room received spam nine minutes after it first was
>   used.
>
> Addresses posted in other areas on the Internet received less spam, the
> investigators found. Half the addresses posted on free personal web page
> services received spam, as did 27 percent of addresses posted to message
> boards and nine percent of addresses listed in email service
> directories. Addresses posted in instant message service user profiles,
> "Whois" domain name registries, online resume services, and online
> dating services did not receive any spam during the six weeks of the
> investigation.
>
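
The quoted suggestion that registrars report WHOIS usage by IP address could be sketched roughly as follows. This is a minimal illustration, not anything proposed on the list: the function name, the input format (one source IP string per query) and the example addresses are all assumptions, and a real registrar log would need its own parsing.

```python
from collections import Counter


def summarize_whois_queries(source_ips, excluded_ips=frozenset(), top_n=3):
    """Summarize WHOIS query volume per source IP address.

    source_ips   -- iterable of source IP strings, one entry per query
                    (hypothetical log format, assumed for illustration).
    excluded_ips -- addresses to ignore, e.g. registrars checking WHOIS
                    for transfer authorisation.
    top_n        -- how many of the busiest sources to report.
    """
    counts = Counter(ip for ip in source_ips if ip not in excluded_ips)
    total = sum(counts.values())
    unique = len(counts)
    top = counts.most_common(top_n)
    return {
        "total_queries": total,
        "unique_ips": unique,
        # Close to 1.0 if most sources query only once or twice.
        "queries_per_ip": total / unique if unique else 0.0,
        "top_sources": top,
        # Fraction of all counted queries sent by the busiest sources.
        "top_share": sum(n for _, n in top) / total if total else 0.0,
    }


if __name__ == "__main__":
    # Made-up sample using documentation address ranges.
    sample = ["192.0.2.1"] * 8 + ["192.0.2.2", "192.0.2.3"] + ["198.51.100.7"] * 5
    print(summarize_whois_queries(sample, excluded_ips={"198.51.100.7"}, top_n=1))
```

A queries-per-IP figure well above 1, together with a large share of traffic from a handful of sources, would be the "high peaks from a few IP addresses" pattern described in the quoted message. It would not by itself show what the data was used for.
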
References:
- RE: [nc-whois] WHOIS and SPAM - survey show no connection
  - From: "Bruce Tonkin" <Bruce.Tonkin@melbourneit.com.au>