
Re: [ga] Privacy and Whois databases



On Fri, 15 Oct 1999 21:19:05 -0500 Peter Veeck <veeck@texoma.net>
wrote:

> I use whois to fight spam abuse.  Are  Spam complaints going 
> to be taken over by ICANN or a subset thereof?


(This note is going to be long and a bit technical.  I apologize
in advance and anyone who believes that all problems are easy 
should just skip it.  Additional disclaimer: these are personal 
impressions based on a bit of experience and thought --I have no
idea whether anyone else in MCI WorldCom would agree and they 
certainly aren't corporate positions.)

Peter,

This case worries me a lot, because I can argue either that 
whois is important to it or that it is nearly irrelevant.  The 
problem also looks different depending on whether you see those 
tables as sources of information for fighting spammers (and you 
and I do) or as sources of addresses for use by the spammers 
(the amount of spam I get as the result of being in those tables
is trivial compared to what shows up from other sources, and 
the CDs of millions of addresses for people to bother don't 
appear to be significantly populated from Whois).  

For background, in my day job, I've ended up with administrative 
responsibility for MCI.NET; if you check the Whois tables, 
you'll find my name and phone number there.  Until 13 
months ago, MCI.NET (with a fairly deep hierarchy) was the
management domain for internetMCI: there were never supposed to 
be any user/customer mail addresses in the domain, but there 
were many routers, mail and web servers, system management 
stations, etc.  internetMCI was pretty aggressively antispammer,
with a significant full-time staff dedicated to fighting the 
activity, and there are a good number of ex-spammers, would-be 
spammers, and even a few ex-large-bandwidth customers who can 
attest to that.   When we sold internetMCI to Cable and 
Wireless, most of the spam-fighting apparatus went to them along
with the equipment, customers, etc.   

But the spammers --or those who supply them with software and 
tools-- either don't know that the sale occurred or don't care, 
so MCI.NET has become a popular address for faking into 
messageIDs, "From:" fields, bogus server names, etc., and is 
used far more in those ways than it was, e.g., two years ago.   
That the addresses are being faked is, in almost all cases, 
obvious to anyone who has a clue about email and who takes a 
minute to examine the trash that they have received.

It is also worth noting that, as for most business activities, 
when things get large, they get specialized:  even if 
information is public, for a large domain, the top-level
contacts in the Whois tables are _exactly_ what the specs say 
they are, i.e., administrative, technical, and billing contacts 
for _namespace_ management.  They may not have much to do with 
email systems or, in especially bad cases, may not be more 
effective at reaching the email people in their organizations 
than an end user might be.

So, let's see what happens today.  A user receives spam and 
finds it offensive.  There are a bunch of neat tools on the 
market that either intercept the stuff sight unseen or take a 
referral from that user and start sending out complaint messages 
-- to postmaster, root, any address in whois, etc., at all of 
the apparently-relevant domains.  But those tools aren't too 
smart, especially in the hands of clueless users (we recently 
had the authors of one tell us that being more careful would 
slow down the software and be inefficient (!)).

So, these faked addresses produce a large flow of messages (some
of them quite abusive and threatening) to people who aren't 
responsible for the spam or its relaying and have little or no 
control over organizational mail servers.  And if there are 
specific people in the organization whose jobs focus on 
spammer-fighting and who have the skills and tools to do so, 
they don't get reached. I, and I assume most of us, do forward 
those notes to the right places, but some considerable time gets
lost in the process.  

And time is important: typically, the real offenders are 
originating the junk from short-lived dialup accounts.  If they 
can be tracked down at all, one has to capture the dialup 
address and timestamps from the email header, identify the ISP,
get to _their_ antispam people, and find out which customer was 
using that address at that time (that assumes little relaying 
and fakery goes on; otherwise the tracing process has to be done
recursively, one site/organization at a time).  Now, here, the 
whois tables might help us identify a site contact to discuss 
things with, but, as in our case, the larger and better-staffed 
the ISP is, the less likely it is that the whois path will be 
particularly efficient.   And many ISPs don't keep those 
detailed logs for a very long time: if the spammer can succeed 
in evading identification for long enough (in some cases we have
encountered, only 24 hours), the spammer can't be found at all.
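
The first step above, pulling the addresses and timestamps out 
of the header, can be sketched roughly as follows.  This is an 
illustration only, not a description of any tool mentioned in 
this note: the function name trace_hops, the sample message, 
and all of the hostnames and addresses in it are made up, and 
the sketch assumes IPv4 addresses written in square brackets 
with the timestamp after the last semicolon of each "Received:" 
field.  Any of those fields can be forged, so the result is a 
list of leads to check one site at a time, not proof.

```python
import re
from email import message_from_string
from email.utils import parsedate_to_datetime

# Assumption: relays record the peer address as a bracketed IPv4 literal.
IP_IN_BRACKETS = re.compile(r"\[(\d{1,3}(?:\.\d{1,3}){3})\]")


def trace_hops(raw_message):
    """Return (ip, timestamp) pairs from Received fields, newest hop first.

    Either element is None when it cannot be found or parsed.
    """
    msg = message_from_string(raw_message)
    hops = []
    for header in msg.get_all("Received", []):
        # By convention the timestamp follows the last semicolon.
        _, _, date_part = header.rpartition(";")
        try:
            when = parsedate_to_datetime(date_part.strip())
        except (TypeError, ValueError):
            when = None
        match = IP_IN_BRACKETS.search(header)
        hops.append((match.group(1) if match else None, when))
    return hops


# Hypothetical message; every name and address here is invented.
SAMPLE = """\
Received: from relay.example.net (relay.example.net [192.0.2.10])
\tby mx.example.org with SMTP; Fri, 15 Oct 1999 21:20:11 -0500
Received: from dialup-17.example.com (unverified [198.51.100.17])
\tby relay.example.net with SMTP; Fri, 15 Oct 1999 21:19:05 -0500
From: someone@forged.example
Subject: test

body
"""

for ip, when in trace_hops(SAMPLE):
    print(ip, when)
```

The last pair in the list is the earliest hop, which is the 
address and time one would take to the ISP's antispam people.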

Even if we (or someone closer to the user -- we really shouldn't
be involved at all in this part of the process) find the right 
ISP, privacy and business considerations often prevent their 
identifying the customer to us.  If they care (some do more than
others), they must identify the customer and take responsibility
for discouraging the behavior (noting that shutting down the 
account of a dialup user is nearly pointless -- it just shows up 
somewhere else a few minutes later).   But those are other 
issues.

Conclusion: the whois data, even if available, aren't an 
especially good tool for fighting spam, although they may be 
better than anything else right now (see below).  And, if they 
are needed, substitutes such as snail mail, inquiries to 
registrars, or proofs of why the information is important just 
aren't going to be adequate because of those timeout problems.

However, it is often extremely important to be able to use the 
Whois data for the reasons for which they (and the rule that 
sites running email must support a "postmaster" address) were 
originally intended: to get a message to someone about 
something, in the name space, on the mail system, or elsewhere 
relevant, that the involved system is broken and needs fixing up
from the inside.  In the Whois case, relying on a DNS SOA record
(or something similar) to obtain the contact information can be 
pointless -- the canonical complaint is "your DNS server is 
broken and is causing network damage", and that requires a path 
that doesn't depend upon being able to access the DNS server.   
Remember that, ultimately, the information in those tables is 
about the management of a name space... it is not about who runs
a business, where to find the web master, or who is the chief 
poo-bah in charge of cutting off customers who violate network 
norms.  

Oddly, the trademark issues that keep coming up as examples of 
why the data need to be public may be less difficult, just 
because obtaining information in strictly real-time may be a bit
less important.  I haven't seen anything that feels to me like 
the right formula yet (some of the ideas that have been floated 
feel distinctly not-right, but I think there may be a reasonable
one somewhere).  For example, there may be some possibilities 
involving registering or credentialing people who would engage 
in legitimate intellectual property searches to get them 
different access than random users might have while ensuring 
those mechanisms don't create another monopoly or another 
"business opportunity" for registries or registrars.  And, if 
_their_ privacy is important, we could imagine third-party 
organizations, keys, and certificates that would provide 
credentials while protecting privacy.

That obviously isn't a case for either "should be completely 
open" or "should be completely closed" or even for "user 
option".  It is a strong suggestion that there are more 
possibilities if we think creatively about the issues and what 
we are trying to accomplish.

And that brings us back to the fighting of the spammers.  I 
think some creative work is needed.  It isn't clear to me that 
ICANN is the right place to do the work or to make whatever 
guidelines are needed.  I think most ISPs, and companies who 
receive a lot of spam complaints, would be delighted to publish, 
either as part of Whois data that was always exposed or through 
some agreed-upon DNS entry, contact information for anyone who 
believes spam is originating from their sites, and I think the 
odds of persuading others to go along are pretty good.  A "for 
alleged spam, contact" address could be published, even for a 
domain whose real contact information needed to be hidden from 
general view, by pointing to a third party (since many of the 
sites requiring anonymity don't run mail servers, they might 
find recruiting someone to accept such mail and return a 
brief response, ideally after an automated review, quite easy). 
Or we could try to standardize another address like 
"postmaster".   But we would all need a convention about where 
to put the information and how to present it that could be used 
by low-clue users and whatever tools they select.
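
To make the idea of a shared convention a little more concrete, 
here is a rough sketch of what a complaint tool might do with 
it.  Everything in it is hypothetical: no such convention 
exists in this note, the function name spam_contact is invented, 
and the "published" dictionary merely stands in for whatever 
Whois field or DNS entry might eventually be agreed upon.  The 
only part taken from the discussion above is the fallback to 
"postmaster".

```python
def spam_contact(domain, published):
    """Pick the address a complaint tool should use for `domain`.

    `published` is a plain dict standing in for the (hypothetical)
    agreed-upon lookup; keys are domain names, values are addresses.
    """
    labels = domain.lower().rstrip(".").split(".")
    # Try the full name first, then each parent domain, so that a host
    # such as mail.example.com inherits the contact for example.com.
    # The bare top-level label is never consulted.
    for i in range(len(labels) - 1):
        candidate = ".".join(labels[i:])
        if candidate in published:
            return published[candidate]
    # Nothing published: fall back to the postmaster address.
    return "postmaster@" + ".".join(labels)


# Hypothetical published data, including a domain that hides its own
# contact details behind a third party.
published = {
    "example.com": "spam-desk@example.com",
    "hidden.example": "complaints@thirdparty.example",
}

print(spam_contact("mail.example.com", published))
print(spam_contact("hidden.example", published))
print(spam_contact("unlisted.example", published))
```

The point of the sketch is that the tool needs exactly one 
well-known place to look and one well-known fallback; where 
that place should be is the open question.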

Like it or not, these are complex systems.  Everything is 
related to everything else.  Answers that are developed from 
only a single perspective, or with the needs of only a single 
user group in mind, will almost always be wrong because they 
will foul up something else of [nearly] equal importance.  We 
need to figure out how to work together to get all of the issues
and considerations onto the table, to eliminate the fantasies, 
and then to construct a solution space and see what can be 
created in it.

My impression is that the turmoil of the last few years has made
it hard to think creatively about these problems and to inject 
any solutions that might be found into the systems.  Too much 
else has been going on, and it has been too tempting to identify
any change or suggestion as a plot with one sinister purpose or 
another. But maybe now, or right after we get through the 
election, is the right time. And maybe the GA would be a good 
place to at least initiate the discussion, rather than just 
turning into a series of simplistic straw polls on a small 
fraction of the options or arguments about which objective is 
most important.

     john