What We Can Learn from Internet Email Headers

By: Shally Steckerl

This is a thorough explanation of how to read email headers. It occurred to me that some folks may be interested in knowing what we can learn from Internet email headers, and I thought this would be a good opportunity to share a few things about email with the recruiting community. If you already know this stuff, can explain it better, or are not interested, my apologies, and please stop reading! This is intended as an overview. I have experience in this area from my own research after receiving malicious spam on several occasions, but I am by no means a definitive authority on the subject.

The email header is text that is encoded in the message but that we don't normally see. In Outlook, while looking at the email message, go to the View menu and then Options, and you should see something like the following (from a sample message):

----- ACTUAL MESSAGE HEADER -----
Return-Path:
Received: from unknown ([151.202.177.115]) by imf04bis.bellsouth.net
(InterMail vM.5.01.01.01 201-252-104) with SMTP
id <20010809213458.KISJ11435.imf04bis.bellsouth.net@unknown>;
Thu, 9 Aug 2001 17:34:58 -0400
From:
Subject: CICS, COBOL II, DB2, VSAM and MVS-ES. - 3 Yrs with Merrill Lynch
Date: Thu, 9 Aug 2001 14:00:17
Message-Id: <39.231076.732715@>

SHALLY'S TRANSLATION

[address removed] is the sender (using Intermedia as the mail provider)
Their IP address is 151.202.177.115
They use Bell Atlantic (NETBLK-BELL-ATLANTIC1)
Netname: BELL-ATLANTIC1
Netblock: 151.196.0.0 - 151.205.255.255
Maintainer: BAIS
Their website or ISP host is:
Verizon Global Networks Inc. (ZV20-ARIN) [address removed]
(703) 295-4583
imf04bis.bellsouth.net is the recipient's mail server

The "unknown" and the blank space after the @ in the following lines tell me this is a forged email; the sender is intentionally manipulating the headers.

20010809213458.KISJ11435.imf04bis.bellsouth.net@unknown
39.231076.732715@

This is email fraud, if you want to get serious about it. My educated guess is that it's an H-1B dependent shop spamming recruiters with candidates. You can copy the header and send it to the offender's ISP for analysis, if you think it's worth the trouble. You can also send me a header sample and I can help decode it to help you decide if it's worth pursuing.

Superficially, it appears that email is passed directly from the sender's machine to the recipient's. Normally, this isn't true; a typical piece of email passes through at least four computers during its lifetime. This happens because most organizations have a dedicated machine to handle mail, called a "mail server." It's normally not the same machine that users are looking at when they read their mail. In the case of an ISP whose users dial in from their home computers, the "client" computer is the user's home machine, and the "server" is some machine that belongs to the ISP.

When a user sends mail, she normally composes the message on her own computer, then sends it off to her ISP's mail server. At this point her computer is finished with the job, but the mail server still has to deliver the message. It does this by finding the recipient's mail server, talking to that server and delivering the message. The message then sits on that second mail server until the recipient comes along to read her mail. When she retrieves it on her own computer, it is normally deleted from the mail server in the process.

Example Spam Message (fictitious names and ID's used to protect privacy):

Diane sends a letter to an association. She composes it at her workstation (which is called [address removed]). The composed text is passed from there to the mail server, [address removed] or mq.egroups.com. This is the last Diane sees of it; the rest is handled by machines with no intervention from her. The mail server, seeing that it has a message for someone at egroups.com, contacts its mail server (called, hypothetically, l10.egroups.com) and delivers the mail to it. Now the message is stored on l10.egroups.com until yahoogroups processes it.

During all this processing, headers will be added to the message three times:
1. At composition time, by Diane's Outlook;
2. When that program hands control off to mail.yahoogroups.com; and
3. At the transfer from egroups to yahoo.

Let's watch the evolution of these headers. As generated by Diane's mailer and handed off to yahoogroups.com:

From: "Diane Smith"
To:
Date: Tue, Mar 18 1997 14:36:14 PST
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2910.0)
Subject: [someassociation] Porno Spam

Here is the header when the email system at Diane's someisp.com account transmits the message to the mail host at yahoogroups.com (which is mq.egroups.com ).

Received: from Smiths (A010-1198.PHNX.splitrock.net [209.254.234.182]) by pimout4-int.someisp.com (8.11.0/8.11.0) with SMTP id f6EHrsk152696 for ; Sat, 14 Jul 2001 13:53:54 -0400
To:
Message-ID:
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2910.0)
From: "Diane Smith"
Date: Sat, 14 Jul 2001 10:51:40 -0700
Subject: [someassociation] Porno Spam

When mq.egroups.com finishes processing the message and gives it to a Prodigy server called pimout4-int.someisp.com, the first line below is added:

Received: from unknown (HELO pimout4-int.someisp.com ) (207.115.63.103) by mta1 with SMTP; 14 Jul 2001 17:53:56 -0000

Received: from Smiths (A010-1198.PHNX.splitrock.net [209.254.234.182]) by pimout4-int.someisp.com (8.11.0/8.11.0) with SMTP id f6EHrsk152696 for ; Sat, 14 Jul 2001 13:53:54 -0400
To:
Message-ID:
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2910.0)
From: "Diane Smith"
Date: Sat, 14 Jul 2001 10:51:40 -0700
Subject: [someassociation] Porno Spam

This last set of headers is the one that I see on the letter when I download and read mail. Here's a line-by-line analysis of these headers and exactly what each one means.

Received: from mq.egroups.com ([208.50.144.79]) by ehost002.intermedia.net with SMTP (Microsoft Exchange Internet Mail Service Version 5.5.2653.13) id N8JKS1JG; Sat, 14 Jul 2001 10:57:50 -0700

This piece of mail was received from a machine calling itself mq.egroups.com with the IP address [208.50.144.79]. It was received by ehost002.intermedia.net which is running MS Exchange. The receiving machine assigned the ID number N8JKS1JG to the message. (This is used internally by the machine---it's something an administrator would need to know to look up the message in the machine's log files, but it's not usually meaningful to anyone else.)

Received: from Smiths (A010-1198.PHNX.splitrock.net [209.254.234.182]) by pimout4-int.someisp.com (8.11.0/8.11.0) with SMTP id f6EHrsk152696 for ; Sat, 14 Jul 2001 13:53:54 -0400

The message was addressed to [address removed]. Note that this header is not related to the To: line.

Sat, 14 Jul 2001 13:53:54 -0400

This mail transfer happened on Sat, 14 Jul 2001 13:53:54 -0400 (1:53:54 in the afternoon), Eastern Daylight Time (which is 4 hours behind Greenwich Mean Time; hence the "-0400").
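If you would rather not do the time zone arithmetic in your head, here is a minimal Python sketch using only the standard library. It parses the timestamp from the Received: line above and restates it in UTC; the variable names are my own.

```python
from datetime import timezone
from email.utils import parsedate_to_datetime

# Timestamp copied from the Received: line discussed above.
stamp = parsedate_to_datetime("Sat, 14 Jul 2001 13:53:54 -0400")

# The same instant, restated in UTC (what the article calls Greenwich Mean Time).
utc_stamp = stamp.astimezone(timezone.utc)

# Offset from UTC in hours, as recorded by the receiving machine.
offset_hours = stamp.utcoffset().total_seconds() / 3600

print(offset_hours)           # -4.0
print(utc_stamp.isoformat())  # 2001-07-14T17:53:54+00:00
```

Converting every timestamp to UTC this way makes it easier to compare hops that were logged in different time zones.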

Received: from unknown (HELO pimout4-int.someisp.com ) (207.115.63.103) by mta1 with SMTP; 14 Jul 2001 17:53:56 -0000

This line documents the mail handoff from someisp.com (Diane's ISP) to mta1; this handoff happened at 14 Jul 2001 17:53:56 -0000 (10:53:56 Phoenix time). The sending machine called itself pimout4-int.someisp.com in its HELO; the receiver logged its name as "unknown" and its IP address as 207.115.63.103.

From: "Diane Smith"

The mail was sent by [address removed], who gives her real name as Diane Smith.

To:

The letter is addressed to [address removed].

Date: Sat, 14 Jul 2001 10:51:40 -0700

The message was composed at 10:51:40 Arizona time on Saturday, July 14, 2001.

Message-ID:

The message has been given this number by prodigy to identify it. This ID is different from the SMTP and ESMTP ID numbers in the Received: headers because it is attached to this message for life. The other IDs are only associated with specific mail transactions at specific machines, so that one machine's ID number means nothing to another machine. Sometimes, as was the case in this example, the Message-ID has the sender's email address embedded in it; more often, it has no intelligible meaning of its own.

X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2910.0)

The message was sent using MS Outlook; the header even gives you the build number.
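The same line-by-line reading can be done by a program. The sketch below is only an illustration: it feeds an abbreviated copy of the sample headers above to Python's standard email package and pulls out individual fields. The placeholder body text is my own addition.

```python
from email import message_from_string

# Abbreviated copy of the sample headers above, followed by a placeholder body.
raw = """\
Received: from unknown (HELO pimout4-int.someisp.com ) (207.115.63.103) by mta1 with SMTP; 14 Jul 2001 17:53:56 -0000
Received: from Smiths (A010-1198.PHNX.splitrock.net [209.254.234.182]) by pimout4-int.someisp.com (8.11.0/8.11.0) with SMTP id f6EHrsk152696 for ; Sat, 14 Jul 2001 13:53:54 -0400
From: "Diane Smith"
Date: Sat, 14 Jul 2001 10:51:40 -0700
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2910.0)
Subject: [someassociation] Porno Spam

(body omitted)
"""

msg = message_from_string(raw)

# Headers that can repeat (like Received:) are fetched as a list, top to bottom.
received = msg.get_all("Received")
print(len(received), "Received: lines")

# Headers that appear once can be read like dictionary entries.
for name in ("From", "Date", "X-Mailer", "Subject"):
    print(f"{name}: {msg[name]}")
```

Header names are matched without regard to case, so msg["subject"] and msg["Subject"] return the same value.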

Regular Mail Protocols

This section is a little more technical than the others, and focuses on the details of how mail gets from one point to another. You don't need to understand every word, but familiarity with this subject can do a lot to clarify what's happening in strange situations. Since email spammers often intentionally create such strange situations (partly to confuse their victims), the ability to understand those situations can be quite helpful.

To communicate over a network, computers often use "points of entry" called ports. You might think of a port as a channel, like on your TV or radio, through which a computer listens for communications from the network. To listen to many communications at once, a computer needs to have multiple ports; to distinguish them, they're generally numbered. On systems connected to the Internet (or any systems using the same protocols for email), port 25 is of particular importance for the present discussion. That's the port used to transmit and receive mail. As an interesting aside, port 80 is where most of your web browsing occurs.

Normal Behavior

Let's return to the example of the last section, and specifically to the point where prodigy communicates with egroups. What really happens here is that pimout4-int.someisp.com opens a connection to port 25 of mq.egroups.com, and sends the mail through that connection, along with some administrative data. The commands it uses to do this, and the responses issued by the receiving system, are more or less human-readable. They're commands in a rudimentary language called SMTP, for Simple Mail Transfer Protocol. Someone eavesdropping on the "conversation" between the machines would see something like this:

220 ehost002.intermedia.net with SMTP (Microsoft Exchange Internet Mail Service Version 5.5.2653.13) id N8JKS1JG; Sat, 14 Jul 2001 10:57:50 -0700
HELO pimout4-int.someisp.com
250 ehost002.intermedia.net Hello pimout4-int.someisp.com [207.115.63.103], pleased to meet you
MAIL FROM: [address removed]
250 [address removed]... Sender ok
RCPT TO: [address removed] (it was actually sent from yahoogroups to me)
250 [address removed]... Recipient ok
DATA
354 Enter mail, end with "." on a line by itself
Received: from mq.egroups.com ([208.50.144.79]) by ehost002.intermedia.net
ehost002.intermedia.net (8.11.0) id N8JKS1JG; Sat, 14 Jul 2001 10:57:50 -0700
From: [address removed] (Diane Smith)
To: [address removed]
Date: Sat, 14 Jul 2001 10:51:40 -0700
Message-ID:
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2910.0)
Subject: [someassociation] Porno Spam

I need some advice/direction on how to block pornographic Spam. I get soliciting emails that contain inappropriate content and language. I'm no prude, but this stuff is not what I want to receive on my email.

Thanks,

Diane Smith
BBS Consulting
.
250 LAA20869 Message accepted for delivery
QUIT
221 ehost002.intermedia.net closing connection

This whole transaction depends on five commands at the core of SMTP (there are a few others, but they're sideline issues to the actual process of passing mail from one place to another): HELO, MAIL FROM, RCPT TO, DATA, and QUIT.

HELO identifies the sending machine; "HELO pimout4-int.someisp.com " should be read as "Hello, I'm pimout4-int.someisp.com ". The sender can lie; nothing, in principle, prevents mail.bieberdorf.edu from saying "Hello, I'm thefonz.xyz.gov" (HELO thefonz.xyz.gov) or even "Hello, I'm a misconfigured computer" (HELO a misconfigured computer). However, in most circumstances, the receiver has some tools with which to discover this and find out the sending machine's real identity.

MAIL FROM initiates mail processing; it means "I have mail to deliver from so-and-so". The address given turns into the so-called "Envelope From"; it need not be the same as the sender's own address! This apparent security hole is inevitable (after all, the receiving machine doesn't know anything about who has what username on the sending machine), and in certain circumstances it turns out to be a useful feature.

RCPT TO is dual to MAIL FROM; it specifies the intended recipient of the mail. One piece of mail can be sent to multiple recipients simply by including multiple RCPT TO commands (see the section below on mail relaying, which explains how this feature is sometimes abused on insecure systems). The given address turns into the so-called "Envelope To." It actually determines who the mail will be delivered to, regardless of what the To: line in the message says.

DATA starts the actual mail entry. Everything entered after a DATA command is considered part of the message; there are no restrictions on its form. Lines at the beginning of the message (before the first blank line) that start with a single word and a colon are considered to be headers by most mail programs. A line consisting only of a period terminates the message.

QUIT terminates the connection.

SMTP is fully defined in RFC 821 (since superseded by RFC 2821 and, later, RFC 5321). Copies of the RFCs are widely available on the Web. It's well worth reading, as it sheds much light on the intricacies of mail processing.
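To make the five commands concrete, here is a small Python sketch that builds the lines a sending machine would transmit for one message. It does no networking and does not handle the server's replies. The function name and the example.com addresses are hypothetical, and the "dot-stuffing" step (doubling a leading period in the message) comes from the RFC rather than from the walkthrough above.

```python
def smtp_client_lines(helo_name, envelope_from, envelope_to, message):
    """Return the lines a client would send for one message (no network I/O)."""
    lines = [f"HELO {helo_name}", f"MAIL FROM:<{envelope_from}>"]
    # One RCPT TO per recipient; together these form the "Envelope To".
    lines += [f"RCPT TO:<{rcpt}>" for rcpt in envelope_to]
    lines.append("DATA")
    for line in message.splitlines():
        # Dot-stuffing: a message line that starts with "." gets an extra "."
        # so it is not mistaken for the end-of-message marker.
        lines.append("." + line if line.startswith(".") else line)
    lines.append(".")     # a lone period terminates the message
    lines.append("QUIT")  # and QUIT terminates the connection
    return lines


# Hypothetical example: two envelope recipients, one message.
example = smtp_client_lines(
    "client.example.com",
    "diane@example.com",
    ["list@example.org", "archive@example.org"],
    "From: diane@example.com\nSubject: test\n\nHello\n.a line starting with a period",
)
for line in example:
    print(line)
```

Notice that the recipients are passed separately from the message text; nothing forces them to agree with whatever To: line the message contains.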

Unusual Scenarios

The scenario above is a little bit oversimplified. The biggest assumption is that the mail servers of the two organizations involved have free access to one another. This was almost always true in the early days of the Internet, and it's still sometimes the case today, but as security has become a greater concern, and as organizations and networks have gotten bigger sometimes requiring many separate mail servers, it has become more and more unusual.

Firewalls

Many, perhaps most, organizations with computers on the Internet are protected by some kind of firewall. A firewall is just a computer whose primary job is to act as a gatekeeper between an organization's own machines and the great unwashed world of the net (so that, for instance, crackers can't easily connect to a piece of IBM's corporate network and start stealing corporate secrets). From the standpoint of another computer trying to deliver mail to a system behind a firewall, what this means is that you can't talk directly to the system; you have to talk to the firewall.

No surprises here; this just introduces another "hop" in the journey of a piece of email, with the firewall acting as just another machine that passes mail. If ehost002.intermedia.net had a firewall in place, here's what the headers from our sample piece of email might look like. Notice the first Received: line. (Let's pretend that the firewall machine is named firewall.ehost002.intermedia.net; in fact, giving a machine a name like "firewall" is tantamount to inviting every teenage cracker-wannabe in the world to try to break in, so firewalls usually have perfectly ordinary, innocuous names.)

Received: from firewall.ehost002.intermedia.net (firewall.ehost002.intermedia.net
[121.214.13.129]) by ehost002.intermedia.net (8.11.0/8.11.0) with SMTP id
LAA20869 for ; Sat, 14 Jul 2001 10:57:50 -0700
Received: from mq.egroups.com ([208.50.144.79]) by
firewall.ehost002.intermedia.net (8.11.0/8.11.0) with ESMTP id LAA20869 for;
Sat, 14 Jul 2001 10:57:50 -0700

In similar fashion, if all outgoing mail from intermedia.net (my Exchange host for jobmachine.net) were routed through a firewall, there would be another Received: line inserted by that firewall machine. By the same token, there might be machines involved that aren't strictly firewalls, but simply common points for routing. Intermedia.net may maintain machines in many physical locations, with several separate mail servers. It may use a single machine (called, for example, mailgate.intermedia.net) to decide which server incoming mail should be routed to.

The history of the message can be reconstructed by reading the Received: headers from bottom to top: pimout4-int.someisp.com received it from Smiths (which is A010-1198.PHNX.splitrock.net, at IP address 209.254.234.182). Pimout4-int.someisp.com then sent it to mta1, which in turn routed it to mq.egroups.com (probably the egroups master email server), which sent it to 10.1.4.56, from where it went to ehost002.intermedia.net, which knew how to get hold of my inbox.
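Reading from bottom to top is easy to automate. The following Python sketch is a rough illustration, not a complete parser: it picks the "from" and "by" names out of each Received: line with a simple pattern of my own and lists the hops in the order they happened. Real-world Received: lines vary a lot, so lines that don't fit the pattern are simply skipped.

```python
import re

# Rough pattern: the word after "from" and the word after the first "by".
HOP = re.compile(r"from\s+(\S+).*?\bby\s+(\S+)", re.S)


def route(received_lines):
    """Return (sender, receiver) pairs, oldest hop first."""
    hops = []
    # Each machine adds its line at the top, so the oldest hop is at the bottom.
    for line in reversed(received_lines):
        match = HOP.search(line)
        if match:
            hops.append((match.group(1), match.group(2)))
    return hops


# Two Received: lines from the sample message, in the order they appear (top first).
sample = [
    "Received: from unknown (HELO pimout4-int.someisp.com ) (207.115.63.103) "
    "by mta1 with SMTP; 14 Jul 2001 17:53:56 -0000",
    "Received: from Smiths (A010-1198.PHNX.splitrock.net [209.254.234.182]) "
    "by pimout4-int.someisp.com (8.11.0/8.11.0) with SMTP id f6EHrsk152696 for ; "
    "Sat, 14 Jul 2001 13:53:54 -0400",
]

for sender, receiver in route(sample):
    print(f"{sender} -> {receiver}")
```

Keep in mind that the name after "from" is whatever the sending machine claimed, so this listing is only a starting point for the checks described later.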

Relaying

Here are some possible headers from a message that had a very different "life cycle" than anything described so far:

Received: from unwilling.intermediary.com (unwilling.intermediary.com
[98.134.11.32]) by mail.someisp.com (8.8.5) id 004B32 for
; Wed, Jul 30 1997 16:39:50 -0800 (PST)
Received: from turmeric.com ([104.128.23.115]) by unwilling.intermediary.com
(8.6.5/8.5.8) with SMTP id LAA12741; Wed, Jul 30 1997 19:36:28 -0500 (EST)
From: Anonymous Spammer
To: (recipient list suppressed)
Message-Id:
X-Mailer: Massive Annoyance
Subject: WANT HOT PORN???

A variety of things in this header might clue the reader in to the fact that this is a piece of electronic junk mail, but the thing to focus on here is the Received: lines. This message originated at turmeric.com, was passed from there to unwilling.intermediary.com, and from there to its final destination at mail.someisp.com. All well and good, but how did unwilling.intermediary.com get in there, since it has nothing to do with either the sender or the recipient?

Understanding the answer requires some knowledge of SMTP. In essence, turmeric.com simply connected to the SMTP port at unwilling.intermediary.com and told it "Send this message to [address removed]". It did this, probably, in the most direct manner imaginable, by saying RCPT TO: [address removed]. At that point, unwilling.intermediary.com took over processing the message, since it had been told to send it to a user at some other domain (someisp.com). It went out and found the mail server for that domain and handed off its mail in the usual manner. This process is known as mail relaying.

Historically, there are good reasons for allowing relaying. On much of the Net until about the late 1980s, machines rarely sent mail by talking directly to each other. Rather, they worked out a route for a message to travel, and sent it step-by-step along that route. It was a cumbersome system, especially since the sender often had to work out the route by hand! By way of analogy, imagine sending a letter from Phoenix to New York, and having to address the envelope thus:

Phoenix, Flagstaff, Albuquerque, Salt Lake City, Rock Springs, Laramie, North Platte, Lincoln, Omaha, Des Moines, Cedar Rapids, Dubuque, Rockford, Chicago, Gary, Elkhart, Fort Wayne, Toledo, Cleveland, Erie, Elmira, Williamsport, Newark, New York City, Greenwich Village, #86 Deadbeat Row, Apt. #2b, Bob Dylan.

It's clear why this is a useful addressing model if you're a postal worker: the post office in Gary, Indiana only has to be able to communicate with the adjacent offices in Chicago and Elkhart, rather than having to devote its resources to figuring out how to get something to New York. (It's also clear why this isn't a good idea from the standpoint of the letter-writer, and why email is no longer commonly routed this way!) This is exactly how email was sent, so it was important that one machine be able to give another instructions that said "I have email for [address removed], to be sent from you to turmeric.com to galangal.org to asafoetida.com to [address removed]"; hence relaying.

In modern times, however, relaying is usually used by unethical advertisers as a technique for concealing the source of their messages, deflecting complaints to the innocent relay site rather than to the spammers' own ISPs. It also offloads the work of processing addresses and contacting recipients from the spammers' machines to those of an uninvolved third party. For that reason, it's widely felt that relaying, especially large-scale relaying, constitutes theft of service. Reporting to the ISP is therefore the best you can do, since they truly are motivated to stop this if they are unwilling participants. If they are not, they will still do it because you can sue based on the Telecommunications Act.

The essential point here is to realize that the content of the message was formulated at the sending point (turmeric.com, in the example above). Unwilling.intermediary.com is involved only as an unwilling intermediary. They have no control over the sender, just as the Flagstaff post office has no real influence over someone writing letters in Phoenix. The intermediate link does, however, have the power to turn off relaying at their site!

One more thing to notice in the sample headers: The Message-Id: line was filled in, not by the sending machine (turmeric.com), but by the relayer (unwilling.intermediary.com). This is a common feature of relayed mail. It just reflects the fact that the sending machine didn't supply a Message-Id.

Envelope Headers

The section on SMTP, above, alluded to a distinction between "message" and "envelope" headers. This distinction and some of its consequences are detailed here.

Briefly, the "envelope" headers are actually generated by the machine that receives a message, rather than by the sender. By this definition, Received: headers are envelope headers. However, the term usually refers to the "envelope From" and "envelope To" only.

The envelope From header is the header derived from the information in a MAIL FROM command. For instance, if a sending machine says MAIL FROM: [address removed], the receiving machine will generate an envelope From header that looks like this:

From [address removed]

Notice the absence of the colon---"From", not "From:". Frequently, envelope headers don't have colons after them. This convention is not universal, but it is common enough to pay attention to.

Symmetrically, the envelope To is derived from a RCPT TO command. If the sender says RCPT TO: [address removed], then the envelope To is [address removed]. There often isn't an actual header containing this information; sometimes it's embedded in the Received: headers.

An important consequence of the existence of envelope information is that the message From: and To: headers are meaningless. The contents of the From: header are provided by the sender; and so, counterintuitively, are the contents of the To: header. Mail is routed only based on the envelope To, never based on the message To: header.

To see this in action, consider an SMTP transaction like this:

HELO galangal.org
250 [address removed] Hello turmeric.com [104.128.23.115], pleased to meet you
MAIL FROM: [address removed]
250 [address removed]... Sender ok
RCPT TO: [address removed]
250 [address removed]... Recipient OK
DATA
354 Enter mail, end with "." on a line by itself
From: [address removed]
To: (your address suppressed for stealth mailing and annoyance)
.
250 OAA08757 Message accepted for delivery

Here are the corresponding headers (excerpted for clarity):

From [address removed]
Received: from galangal.org ([104.128.23.115]) by [address removed] (8.8.5) for ...
From: [address removed]
To: (your address suppressed for stealth mailing and annoyance)

Notice that the contents of the envelope From, the message From:, and the message To: are all dictated by the sender, and have no bearing whatsoever on reality! This example illustrates why the From, From:, and To: headers can never be trusted in mail that might be forged. They are simply too easy to falsify.
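The split between envelope and message can be shown in a few lines of Python. This is a toy model, not a mail server: the function name and the victim@example.com address are made up for illustration. It simply shows that the delivery list comes from the envelope while the To: header is only text inside the message.

```python
from email import message_from_string


def recipients_for_delivery(envelope_to, message_text):
    """Toy model: delivery follows the envelope; the To: header is only displayed."""
    displayed_to = message_from_string(message_text)["To"]
    return {"deliver_to": list(envelope_to), "displayed_to": displayed_to}


# Message text as the sender wrote it; the To: line says nothing useful.
message_text = (
    "From: someone@example.com\n"
    "To: (your address suppressed for stealth mailing and annoyance)\n"
    "\n"
    "Body of the message.\n"
)

# Hypothetical envelope recipient, as it would be given in RCPT TO.
result = recipients_for_delivery(["victim@example.com"], message_text)
print(result["deliver_to"])    # where the mail actually goes
print(result["displayed_to"])  # what the recipient sees in the To: line
```

Python's own smtplib follows the same split: SMTP.sendmail() takes the envelope sender and recipients as arguments separate from the message text.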

The Importance of Received: Headers

We've seen already, in the examples above, that the Received: headers provide a detailed log of a message's history, and so make it possible to draw some conclusions about the origin of a piece of email even when other headers have been forged. This section explores some details associated with these singularly important headers, and, in particular, how to circumvent common forgery techniques.

Unquestionably, the single most valuable forgery protection in the Received: headers is the information logged by the receiving host from the sender. Recall that the sender can lie about its identity by putting garbage in its HELO command to the receiver. Fortunately, modern mail transfer programs are able to detect such false information and correct it.

If, for instance, the machine turmeric.com, whose IP address is 104.128.23.115, sends a message to [address removed], but falsely says HELO galangal.org, the resultant Received: line might start like this:

Received: from galangal.org ([104.128.23.115]) by [address removed] (8.8.5)...

(The rest of the line is omitted for clarity.)

Notice that, although the someisp.com machine doesn't explicitly announce that galangal.org wasn't really the sending machine, it does record the correct IP address of the sender. If someone receiving the mail had reason to think that galangal.org appeared in the headers through the work of a forger, they could look up the IP address 104.128.23.115 (you can go to www.amnesi.com for this) and find that that address in fact belonged to turmeric.com (and not galangal.org). In other words, logging the IP address of the sending machine provides enough information to confirm a suspected forgery.
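Here is a minimal Python sketch of that check. It pulls the claimed name and the logged IP address out of a Received: line and compares the claim with whatever a lookup says about the address. The lookup is passed in as a function so the example can run with a small made-up table; the receiving host name mail.example.com is also an assumption of mine. For a live check you might pass a wrapper around socket.gethostbyaddr instead.

```python
import re

# Claimed name after "from", then the logged IP address in square brackets.
# An optional looked-up name may sit before the bracketed address.
RECEIVED_FROM = re.compile(r"from\s+(\S+)\s+\((?:\S+\s+)?\[([\d.]+)\]\)")


def check_sender(received_line, lookup):
    """Compare the name a sender claimed with the name found for its IP address."""
    match = RECEIVED_FROM.search(received_line)
    if match is None:
        return None  # line does not follow the expected layout
    claimed, ip = match.groups()
    actual = lookup(ip)  # None means the lookup found nothing
    mismatch = actual is not None and actual.lower() != claimed.lower()
    return {"claimed": claimed, "ip": ip, "actual": actual, "mismatch": mismatch}


# Stand-in for a real lookup service (assumption for this example only).
known_hosts = {"104.128.23.115": "turmeric.com"}

line = "Received: from galangal.org ([104.128.23.115]) by mail.example.com (8.8.5)"
report = check_sender(line, known_hosts.get)
print(report)
```

A mismatch is a reason to look closer rather than proof of forgery, since plenty of legitimate machines are configured with names that do not match their addresses.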

Many modern mail programs actually automate this process, looking up the name of the sending machine on their own. The lookup process is called reverse DNS (for Domain Name System); "reverse" because it reverses the usual process of translating a name to an address for routing purposes. If [address removed] were using software that did this, the Received: line would start something like this:

Received: from galangal.org (turmeric.com [104.128.23.115]) by [address removed]...

Here the forgery is crystal clear; this line effectively says "turmeric.com, whose address is 104.128.23.115, reported its name as galangal.org." Needless to say, information like this is extremely helpful in identifying and tracking forged email! (For this very reason, spammers try to avoid using relaying machines that report reverse DNS information. Sometimes they even find machines that don't do the kind of IP logging described in the previous paragraph, though there aren't very many of those around on the net any more.)

Another trick used by forgers of email, which is increasingly common, is adding spurious Received: headers before sending the offending mail. This means that the hypothetical email sent from turmeric.com might have Received: lines that looked something like this:

Received: from galangal.org ([104.128.23.115]) by [address removed] (8.8.5)...
Received: from nowhere by fictitious-site (8.8.3/8.7.2)...
Received: No Information Here, Go Away!

Obviously, the last two lines are complete nonsense, written by the sender and attached to the message before it was sent.

Since the sender has no control over the message once it leaves turmeric.com, and Received: headers are always added at the top, the forged lines have to appear at the bottom of the list. This means that someone reading the lines from top to bottom, tracing the history of the message, can safely throw out anything after the first forged line. Even if the Received: lines after that point look plausible, they're guaranteed to be forgeries.
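The "throw out everything after the first forged line" rule can be sketched in Python as follows. The hard part, deciding that a line is forged, is reduced here to a crude heuristic of my own: a line is accepted only if it has the usual "from ... by ..." shape and both names look like dotted host names. Treat that test as a placeholder; it will not catch a plausible-looking forgery, and it would wrongly reject a legitimate machine with an undotted name.

```python
import re

HOP = re.compile(r"^Received:\s+from\s+(\S+).*?\bby\s+(\S+)", re.S)


def looks_like_host(name):
    # Crude heuristic (an assumption of this sketch): require a dotted name.
    return "." in name.strip("[]()")


def trusted_prefix(received_lines):
    """Keep lines from the top down, stopping at the first one that looks forged."""
    trusted = []
    for line in received_lines:
        match = HOP.match(line)
        if not match or not all(looks_like_host(h) for h in match.groups()):
            break  # everything from here down was under the sender's control
        trusted.append(line)
    return trusted


# Lines in the order they appear in the message (top first); the receiving
# host name mail.example.com is made up for the example.
lines = [
    "Received: from galangal.org ([104.128.23.115]) by mail.example.com (8.8.5)",
    "Received: from nowhere by fictitious-site (8.8.3/8.7.2)",
    "Received: No Information Here, Go Away!",
]

for line in trusted_prefix(lines):
    print(line)
```

Only the first line survives, which matches the reading given above: it is the one line written by a machine the forger did not control.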

Of course, the sender doesn't have to use obvious garbage; a really devious forger could create a plausible list of Received: lines like this:

Received: from galangal.org ([104.128.23.115]) by [address removed] (8.8.5)...
Received: from lemongrass.org by galangal.org (8.7.3/8.5.1)...
Received: from graprao.com by lemongrass.org (8.6.4)...

Here the only dead giveaway is the inaccurate IP address for galangal.org in the very first Received: line. The forgery would be harder still to detect if the forger had written in correct IP addresses for lemongrass.org and graprao.com. However, the IP mismatch in the first line would still reveal that the message had been forged and "injected" into the network at the site 104.128.23.115 (i.e., turmeric.com). Most header forgeries are considerably less sophisticated, and the extra Received: lines are obvious garbage.

List of Common Headers

Apparently-To: Messages with many recipients sometimes have a long list of headers of the form "Apparently-To: [address removed]" (one line per recipient). These headers are unusual in legitimate mail. They are normally a sign of a mailing list, and in recent times, mailing lists have generally used software sophisticated enough not to generate a giant pile of headers. In our case, with good old someassociation, we are dealing with a mediocre mail list system, so all someassociation list emails will have the X-Apparently-To:. We can tell who eGroups addresses the email to in X-eGroups-Return:, which in the example, shows my ID.

Bcc: (stands for "Blind Carbon Copy") If you see this header on incoming mail, something is wrong. It's used like Cc: (see below), but does not appear in the headers. The idea is to be able to send copies of email to persons who might not want to receive replies or to appear in the headers. Blind carbon copies are popular with spammers, since mail that doesn't appear to be addressed to the recipient confuses many inexperienced email users.

Cc: (stands for "Carbon Copy", which is meaningful if you remember typewriters) This header is sort of an extension of "To:". It specifies additional recipients. The difference between "To:" and "Cc:" is essentially connotative. Some mailers also deal with them differently when generating replies.

Comments: This is a nonstandard, free-form header field. It's most commonly seen in the form "Comments: Authenticated sender is <address>". A header like this is added by some mailers (notably the popular freeware program Pegasus) to identify the sender. However, it is often added by hand (with false information) by spammers, as well. Treat with caution.

Content-Transfer-Encoding: This header relates to MIME, a standard way of enclosing non-text content in email. It has no direct relevance to the delivery of mail, but it affects how MIME-compliant mail programs interpret the content of the message.

Content-Type: Another MIME header, telling MIME-compliant mail programs what type of content to expect in the message.

Date: This header does exactly what you'd expect: It specifies a date, normally the date the message was composed and sent. If this header is omitted by the sender's computer, it might conceivably be added by a mail server or even by some other machine along the route. It shouldn't be treated as gospel truth. Forgeries aside, there are an awful lot of computers in the world with their clocks set wrong.
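One rough way to see how far a Date: header can be trusted is to compare it with the timestamp a mail server wrote into a Received: line. The sketch below uses Python's standard email.utils module. The Received: value is shortened from the sample message, and the -0400 zone on the Date: value is my assumption for the example (the sample's Date: line carries no zone at all).

```python
from datetime import timezone
from email.utils import parsedate_to_datetime

def _aware(dt):
    # Assumption for this sketch: a timestamp with no zone is treated as UTC.
    return dt if dt.tzinfo is not None else dt.replace(tzinfo=timezone.utc)

def date_skew_seconds(date_value, received_value):
    """Seconds from the Date: header to the timestamp of a Received: line."""
    # The timestamp of a Received: line follows its last semicolon.
    parts = received_value.rsplit(";", 1)
    if len(parts) != 2:
        raise ValueError("Received: value has no timestamp")
    sent = _aware(parsedate_to_datetime(date_value))
    received = _aware(parsedate_to_datetime(parts[1].strip()))
    return (received - sent).total_seconds()

received = ("from unknown ([151.202.177.115]) by imf04bis.bellsouth.net "
            "with SMTP; Thu, 9 Aug 2001 17:34:58 -0400")
print(date_skew_seconds("Thu, 9 Aug 2001 14:00:17 -0400", received))
# prints 12881.0
```

A gap of a few seconds or minutes is ordinary. A gap of hours, or a negative gap, suggests a wrong clock or a forged header, though it cannot tell you which.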

Errors-To: Specifies an address for mailer-generated errors to go to, like "no such user" bounce messages (instead of the sender's address). This is not a particularly common header, as the sender usually wants to receive any errors at the sending address, which is what most (essentially all) mail server software does by default.

From (without colon) This is the "envelope From" discussed above.

From: (with colon) This is the "message From:" discussed above.

Message-Id: (also Message-id: or Message-ID:) The Message-Id is a more-or-less unique identifier assigned to each message, usually by the first mail server it encounters. Conventionally, it is of the form "<gibberish@machine-name>", where the "gibberish" part could be absolutely anything and the second part is the name of the machine that assigned the ID. Sometimes, but not often, the "gibberish" includes the sender's username. Any email in which the message ID is malformed (e.g., an empty string or no @ sign), or in which the site in the message ID isn't the real site of origin, is probably a forgery.
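The two Message-Id checks just described (malformed ID, or a site that isn't the real origin) might be sketched like this. The function name and the idea of passing in the origin host by hand are my own choices for illustration, and the pattern is deliberately looser than the formal grammar.

```python
import re

# "<left@right>": neither side may be empty or contain spaces, "<", ">" or "@".
MESSAGE_ID = re.compile(r"^<([^\s<>@]+)@([^\s<>@]+)>$")

def message_id_problem(value, origin_host=None):
    """Return a short description of what looks wrong, or None."""
    match = MESSAGE_ID.match(value.strip())
    if match is None:
        return "malformed"
    site = match.group(2).lower()
    if origin_host is not None:
        origin = origin_host.lower()
        # Accept the origin itself or any machine underneath it.
        if site != origin and not site.endswith("." + origin):
            return "site does not match origin"
    return None

# The ID from the sample message at the top has nothing after the @ sign.
print(message_id_problem("<39.231076.732715@>"))
# prints malformed
```

The origin host would normally come from the first trustworthy Received: line. Here it is simply an argument, so the sketch does not decide which line to trust.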

In-Reply-To: This header gives the message ID of some previous message which is being replied to. It is defined for email as well as being familiar from Usenet, and many mail programs add it to replies so that conversations can be threaded, so its presence alone proves little. Spammers have been known to use it, probably in an attempt to evade filtration programs.

Mime-Version: (also MIME-Version:) Yet another MIME header, this one just specifies the version of the MIME protocol that was used by the sender. Like the other MIME headers, this one is usually eminently ignorable. Most modern mail programs will do the right thing with it.

Newsgroups: This header only appears in email that is connected with Usenet, either email copies of Usenet postings or email replies to postings. In the first case, it specifies the newsgroup(s) to which the message was posted. In the second, it specifies the newsgroup(s) in which the message being replied to was posted. The semantics of this header are the subject of a low-intensity holy war, which effectively assures that both sets of semantics will be used indiscriminately for the foreseeable future.

Organization: A completely free-form header that normally contains the name of the organization through which the sender of the message has Net access. The sender can generally control this header, and silly entries like "Royal Society for Putting Things on Top of Other Things" are commonplace.

Priority: An essentially freeform header that assigns a priority to the mail. Most software ignores it. It is often used by spammers, usually in the form "Priority: urgent" (or something similar), in an attempt to get their messages read.

Received: Discussed in detail above.

References: The References: header is rare in email except for copies of Usenet postings. Its use on Usenet is to identify the "upstream" posts to which a message is a response. When it appears in email, it's usually just a copy of a Usenet header. It may also appear in email responses to Usenet postings, giving the message ID of the post being responded to as well as the references from that post.

Reply-To: Specifies an address for replies to go to. Though this header has many legitimate uses (perhaps your software mangles your From: address and you want replies to go to a correct address), it is also widely used by spammers to deflect criticism. Occasionally a naive spammer will actually solicit responses by email and use the Reply-To: header to collect them, but more often the Reply-To: address in junk email is either invalid or an innocent victim.
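A quick way to notice a Reply-To: that points somewhere other than the From: address is to compare the two domains. This is a minimal sketch using parseaddr from Python's standard library. The addresses are invented, and as noted above a difference has many legitimate explanations, so treat the result as a hint and nothing more.

```python
from email.utils import parseaddr

def reply_to_differs(from_value, reply_to_value):
    """True when Reply-To: names a different domain than From:."""
    from_addr = parseaddr(from_value)[1]
    reply_addr = parseaddr(reply_to_value)[1]
    # Everything after the last "@" is taken as the domain.
    from_domain = from_addr.rpartition("@")[2].lower()
    reply_domain = reply_addr.rpartition("@")[2].lower()
    return from_domain != reply_domain

print(reply_to_differs("Some Sender <sender@example.com>",
                       "offers@example.net"))
# prints True
```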

Sender: This header is unusual in email (X-Sender: is usually used instead), but appears occasionally, especially in copies of Usenet posts. It should identify the sender. In the case of Usenet posts, it is a more reliable identifier than the From: line.

Subject: A completely free-form field specified by the sender, intended, of course, to describe the subject of the message.

To: The "message To:" described above. Note that the To: header need not contain the recipient's address!
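Whether a message was visibly addressed to you can be checked by collecting the addresses in To: and Cc: and looking for your own. The sketch below uses Python's standard email package. The sample message and all of its addresses are invented for the example.

```python
from email import message_from_string
from email.utils import getaddresses

def addressed_to(raw_message, my_address):
    """True if my_address appears in the To: or Cc: headers."""
    msg = message_from_string(raw_message)
    fields = msg.get_all("To", []) + msg.get_all("Cc", [])
    listed = {addr.lower() for _name, addr in getaddresses(fields)}
    return my_address.lower() in listed

raw = ("From: someone@example.com\n"
       "To: Friend <friend@example.org>\n"
       "Subject: hello\n"
       "\n"
       "body text\n")
print(addressed_to(raw, "friend@example.org"))  # prints True
print(addressed_to(raw, "me@example.net"))      # prints False
```

A False result on mail that did land in your inbox means you were reached through the envelope only, as happens with blind carbon copies and with many mailing lists.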

Mailing-List: Sometimes used by mailing lists to identify the name of the actual list; in our case, it's [list address]. They are even nice enough to provide an address for the list owner to aid in abuse issues. They even provide the unsubscribe address!

Delivered-To: Usually for mailing lists; it shows the list's address.

Precedence: Again usually for mailing lists. Not very significant, unless being used to analyze content for junk mail rules.

X-headers is the generic term for headers starting with a capital X and a hyphen. The convention is that X-headers are nonstandard and provided for information only, and that, conversely, any nonstandard informative header should be given a name starting with "X-". This convention is frequently violated.
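Pulling the X-headers out of a message is straightforward with Python's standard email package. This sketch simply filters on the "X-" prefix. The sample message and the mailer name "ExampleMailer" are made up.

```python
from email import message_from_string

def x_headers(raw_message):
    """Return (name, value) pairs for every header starting with "X-"."""
    msg = message_from_string(raw_message)
    return [(name, value) for name, value in msg.items()
            if name.lower().startswith("x-")]

raw = ("From: someone@example.com\n"
       "X-Mailer: ExampleMailer 1.0\n"
       "X-Priority: 1\n"
       "Subject: hello\n"
       "\n"
       "body text\n")
print(x_headers(raw))
# prints [('X-Mailer', 'ExampleMailer 1.0'), ('X-Priority', '1')]
```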

X-Confirm-Reading-To: This header requests an automated confirmation notice when the message is received or read. It is typically ignored. Presumably, some software acts on it.

X-Distribution: In response to problems with spammers using his software, the author of Pegasus Mail added this header. Any message sent with Pegasus to a sufficiently large number of recipients has a header added that says "X-Distribution: bulk". It is explicitly intended as something for recipients to filter against.

X-Errors-To: Like Errors-To: This header specifies an address for errors to be sent to. It is probably less widely obeyed.

X-Mailer: (also X-mailer:) A freeform header field intended for the mail software used by the sender to identify itself (as advertising or whatever.) Since much junk email is sent with mailers invented for this purpose, this field can provide much useful fodder for filters. Typical of Micro$oft to plug their MIME here.

X-PMFLAGS: This is a header added by Pegasus Mail. Its semantics are not obvious. It appears in any message sent with Pegasus, so it doesn't obviously convey any information to the recipient that isn't covered by the X-Mailer: header.

X-Priority: Another priority field, used notably by Eudora to assign a priority (which appears as a graphical notation on the message.)

X-Sender: The usual email analogue to the Sender: header in Usenet news. This header purportedly identifies the sender with greater reliability than the From: header. In fact, it is nearly as easy to forge, and should therefore be viewed with the same sort of suspicion as the From: header.

X-UIDL: This is a unique identifier used by the POP protocol for retrieving mail from a server. It is normally added between the recipient's mail server and the recipient's actual mail software. If mail arrives at the mail server with an X-UIDL: header, it is probably junk (there's no conceivable use for such a header, but for some unknown reason many spammers add one.)
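Several of the warning signs in the list above lend themselves to a simple automated check. The following is a sketch only, built on Python's standard email package. The choice of which headers to test and the wording of the flags are mine, and none of these signs proves anything on its own.

```python
from email import message_from_string

def red_flags(raw_message):
    """Return a list of header-based warning signs found in the message."""
    msg = message_from_string(raw_message)
    flags = []
    if msg["Bcc"] is not None:
        flags.append("Bcc: present on incoming mail")
    # Only meaningful if checked at the mail server, before POP retrieval.
    if msg["X-UIDL"] is not None:
        flags.append("X-UIDL: present")
    if (msg["Priority"] or "").strip().lower() == "urgent":
        flags.append("Priority: urgent")
    if (msg["X-Distribution"] or "").strip().lower() == "bulk":
        flags.append("X-Distribution: bulk")
    return flags

raw = ("From: someone@example.com\n"
       "Priority: urgent\n"
       "X-UIDL: 123abc\n"
       "Subject: Act now\n"
       "\n"
       "body text\n")
print(red_flags(raw))
# prints ['X-UIDL: present', 'Priority: urgent']
```

An empty list does not mean the message is legitimate. It only means none of these particular headers gave it away.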

Congratulations if you got this far! I hope this was as useful as I intended. Credit is due to Ken Lucke who mastered this back in 1997. Thanks Ken!

http://aces.arbita.net/sourcer/spaminfo