02-Email e-mail Infrastructure

How Email Works

Email is a store-and-forward method of exchanging messages on the Internet. This means a message sent by a user goes through an asynchronous process of delivery, typically involving a series of steps. In each step, the message is stored by an intermediate server on the network to be forwarded at a later time until it finally reaches its destination. The timing of delivery depends on the availability of network connections.

Figure 1 illustrates the delivery process, which involves a sender, Alice, and a recipient, Bob. Both Alice and Bob use specific applications called email clients, which run on their PCs to send and receive emails. These clients do not communicate directly but connect to email servers, which are specialized applications operated by Alice’s and Bob’s organizations or ISPs that manage the delivery process.

Figure 1 – Basic Email Infrastructure

 

The email delivery process involves the following steps:

  1. Alice composes the message using her email client.
  2. The message is formatted by Alice’s email client in a specific Internet email format and then sent to her local email server.
  3. Alice’s email server locates the address of Bob’s email server using the Domain Name System (DNS), the distributed directory of the Internet.
  4. The two email servers exchange the message, which may pass through a series of intermediate servers on the network, until it is finally stored in Bob’s personal mailbox on Bob’s email server.
  5. The message remains in Bob’s mailbox until he reads or downloads it using his email client.
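Steps 1 and 2 above can be sketched with Python's standard email package. This is only a minimal illustration of how a client might format a message before handing it to its server, not the behavior of any particular client; the addresses and the build_message helper are hypothetical.

```python
from email.message import EmailMessage


def build_message(sender: str, recipient: str, subject: str, body: str) -> EmailMessage:
    """Format a plain-text message roughly as an email client would before sending."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)  # also adds the Content-Type and MIME-Version headers
    return msg


# Hypothetical addresses, used only for illustration.
message = build_message(
    "alice@example.org", "bob@example.net", "Hello", "Hi Bob, lunch tomorrow?"
)
print(message.as_string())
```

The printed text is the header block, a blank line, and the body, which is the form in which the message is passed to the local email server.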

The procedure is quite similar to the process Alice and Bob follow when exchanging letters. Local post offices play a role similar to that of local email servers, and letter delivery may go through additional post offices (intermediate servers). In both cases, delivery time and even delivery itself are not guaranteed.

The Internet is a best-effort network, meaning the message, like any other information crossing the network, must pass through several servers run by independent organizations that make no commitment to service availability or quality. Therefore, delivery time cannot be predicted, and the message may even get lost along the way.

However, as we will discuss later in more detail, all clients and servers involved in the delivery process follow a set of strict rules (protocols). This allows for the tracing of all relevant events and the recording of detailed information in a report appended to the message. Additionally, in case of delivery failure, the server may attempt delivery again, and the sender may request delivery reports and receipts to confirm that the message has been delivered and/or read by the recipient.
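In practice, this trace information is carried in Received header fields, which each server adds at the top of the message as it handles it. The sketch below reads them back in travel order; the sample message, host names, and the delivery_path helper are made up for illustration.

```python
from email import message_from_string

# A made-up message as it might look in Bob's mailbox.
RAW = """\
Received: from mx.example.net by mailbox.example.net; Mon, 1 Jan 2024 10:00:05 +0000
Received: from mail.example.org by mx.example.net; Mon, 1 Jan 2024 10:00:03 +0000
From: alice@example.org
To: bob@example.net
Subject: Hello

Hi Bob
"""


def delivery_path(raw_message: str) -> list[str]:
    """Return the Received headers in the order the message travelled."""
    msg = message_from_string(raw_message)
    hops = msg.get_all("Received", [])
    # Each server prepends its own header, so the oldest hop appears last.
    return list(reversed(hops))


for hop in delivery_path(RAW):
    print(hop)
```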

End-User Access to Email

End users can access the email system in several ways:

  1. Email Client: This method corresponds to the basic process discussed in the previous section, where the user runs a special application on their PC designed to interact with the email server. Email clients can be proprietary or open-source software, and a wide variety of them are available. Besides the basic functions of sending and retrieving messages from the email server, which are performed according to standard interaction protocols that ensure interoperability, they usually offer user-friendly interfaces and additional functions to classify and store messages, manage directories, and more. In this setup, messages are typically downloaded and stored on the user’s PC, which may not be convenient for users who need to access their mail from multiple devices.
  2. Webmail: This is the most common way users access email from their home PC, through a service offered by their ISPs or third-party organizations like Hotmail or Gmail. In this setup (see Figure 2), the client application running on the end user’s PC is a web browser (e.g., Internet Explorer or Mozilla Firefox), which connects to a web server running a special webmail application. The web server acts as an intermediary and manages the connection with the email server. Additionally, messages are not downloaded to the user’s PC but are managed and stored directly on the web server. This provides a significant advantage for users who need to access their mail from multiple devices.
  3. Integrated Systems: This is the typical solution used by most corporations and large organizations. It integrates email access into a broader ‘collaborative’ environment that includes additional functions such as direct messaging, calendaring, contacts, and tasks, as well as support for mobile and web-based access to information. It also manages message storage on a central server. Popular products of this kind include Microsoft Exchange and IBM Lotus Domino. Users run proprietary client applications (e.g., Microsoft Outlook or Lotus Notes) on their PCs that connect to the corporate server, which in turn connects to the email server (see Figure 3). To assist mobile users, these systems often include an optional web interface, functionally equivalent to webmail, which allows access through a web browser. However, the primary interface is typically the proprietary one used on the organization’s intranet. Although this setup is specific and includes proprietary elements, it is essential to consider because it represents a significant portion of the market, especially for email archiving in corporations and large organizations.

Figure 2 – Webmail

 

Figure 3 – Corporate Mail with Integrated System

Interoperability of Email Systems

As discussed in previous sections, exchanging a message involves interaction among several agents (email clients and servers), which are generally heterogeneous systems based on different hardware and software platforms. Additionally, these systems are independently designed and implemented by different parties, potentially without any direct coordination.

One of the main challenges in the Internet email system is ensuring interoperability, i.e., correct and reliable communication among these heterogeneous systems. Interoperability is based on two main elements:

  1. Communication Protocols: These are sets of rules governing communication between agents, ensuring that agents can reliably and correctly interact using a common language and standard procedures.
  2. Message Format: This is a set of formal definitions specifying the structure of the message and how the message and its attachments are encoded, ensuring correct interpretation by different email clients and guaranteeing that the content of the message is correctly rendered to its recipient.

Another requirement is that interoperability must also be guaranteed over time. This means that when the definitions of protocols and message formats evolve, they should maintain backward compatibility, i.e., new rules should still be compatible with old ones. For example, a message formatted according to an older version of the message format standard should be presented correctly by an email client compliant with the new version. Unfortunately, this is not always the case, and it is a major concern in email archiving, where ensuring that archived messages remain readable over time, even as standards evolve, is crucial.

Internet Standards

The standardization process of the Internet is somewhat different from the usual ISO/IEC track, so it is worth explaining how these standards are developed and allowed to evolve.

Internet standards are developed and promoted by the Internet Engineering Task Force (IETF), which cooperates closely with major international standard bodies like ISO/IEC and the World Wide Web Consortium (W3C), the main international standards organization for the World Wide Web.

The standardization process, which dates back to the early days of the ARPAnet project, is highly cooperative and based on special documents called Request For Comments (RFC). RFCs are draft documents, mostly proposals for standards, published by the IETF and posted on the network as a ‘request for comments.’ Each RFC is assigned a unique number and is never rescinded or modified. If amendments are needed, a new RFC is issued with a different number, superseding the old one.

As stated in RFC 1796, which discusses the standardization process, “Not all RFCs are standards.” Some are just memoranda, remarks that people wish to share, research papers, or preliminary proposals on any matter concerning the Internet and Internet-based systems. The IETF assigns a status to each RFC.

‘Mature’ RFCs are rated Standards Track and are further divided into Proposed Standard, Draft Standard, and Internet Standard. Each Internet Standard (STD) refers to an RFC (or a set of RFCs) and is given a unique number. Unlike the RFC number, the STD number does not change when the standard evolves; it simply refers to the new RFC that supersedes the original one.

Standardization of Email Transmission

Server-to-server and client-to-server interoperability are ensured by SMTP (Simple Mail Transfer Protocol), which is Internet Standard STD 10. SMTP dates back to August 1982 and is based on RFC 821. However, the protocol currently used by the majority of email applications is known as ESMTP (Extended SMTP) and is defined in RFC 2821, published in April 2001 and later obsoleted by RFC 5321 (October 2008).

Formally, however, RFC 2821 never advanced beyond Proposed Standard, and the official standard is still the one defined by RFC 821. This situation of ‘going ahead of the official standard’ is typical of the Internet world, and it is of no use to argue whether it is right or wrong; we must simply cope with it.

SMTP specifies how the email client interacts with the email server to deliver the message and how email servers (often called SMTP servers) interact with each other to ensure the message passes through several agents and finally reaches its destination. The use of the SMTP protocol in the message delivery process is clearly shown in Figures 1 and 2.
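As a rough illustration of the client side of this interaction, the sketch below lists the command lines a client would send to an SMTP server for one message. Server replies and error handling are left out, and the smtp_commands helper, host name, and addresses are hypothetical.

```python
def smtp_commands(client_host, sender, recipients, message_text):
    """List the lines a client would send to an SMTP server for one message."""
    commands = [f"EHLO {client_host}", f"MAIL FROM:<{sender}>"]
    commands += [f"RCPT TO:<{rcpt}>" for rcpt in recipients]
    commands.append("DATA")
    for line in message_text.splitlines():
        # A leading dot is doubled so it is not mistaken for the end-of-data marker.
        commands.append("." + line if line.startswith(".") else line)
    commands.append(".")  # a single dot on its own line ends the message
    commands.append("QUIT")
    return commands


for line in smtp_commands(
    "pc.example.org",
    "alice@example.org",
    ["bob@example.net"],
    "Subject: Hello\n\nHi Bob",
):
    print(line)
```

In a real program, Python's smtplib module performs this dialogue and checks the server's reply after every command.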

Regarding the problem of email archiving, this standard is important because it defines the basic format of messages that can be handled by SMTP servers and go through the delivery process. This is a very basic format, supporting only simple text messages in plain ASCII (also called 7-bit ASCII or US-ASCII) characters, which are sufficient only for English and a few other languages. This limitation is overcome by defining a special way to encode richer content in plain ASCII characters, allowing the use of a more general set of characters in the message text, and including formatted text and multimedia content in email messages, as we will discuss in section 2.7.

Standardization of Client-Server Communication

Email clients can retrieve email from servers in several ways, supported by both standard and proprietary protocols. This is relevant to email archiving because the process of storing email messages must deal with how they are downloaded and handled by different client applications, which may affect the process and determine the format of archived messages.

POP3

POP3 (Post Office Protocol version 3) is the protocol most commonly used by email clients to retrieve messages from servers. The official Internet Standard is defined in STD 53 and is based on RFC 1939, published in May 1996. This protocol is limited in scope and allows for the download of messages only. It does not include the management of mail folders on the server side (e.g., Inbox, Sent, Drafts) or any other advanced features like server-based search or access to metadata. This is a severe limitation, especially when dealing with multiple clients, such as a PC and a smartphone, where folders should be synchronized.
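The download-only nature of POP3 shows in how little a client needs to do. The following sketch uses Python's standard poplib module; the download_all and fetch_mailbox helpers are hypothetical, and the host and credentials are whatever the caller supplies.

```python
import poplib


def download_all(conn) -> list[bytes]:
    """Download every message in the mailbox over an open POP3 connection."""
    count, _total_size = conn.stat()
    messages = []
    for number in range(1, count + 1):  # POP3 numbers messages from 1
        _response, lines, _octets = conn.retr(number)
        messages.append(b"\r\n".join(lines))
    return messages


def fetch_mailbox(host: str, user: str, password: str) -> list[bytes]:
    """Connect over TLS, log in, download everything, and disconnect."""
    conn = poplib.POP3_SSL(host)
    try:
        conn.user(user)
        conn.pass_(password)
        return download_all(conn)
    finally:
        conn.quit()
```

There is no notion of folders or search here: the client sees one flat list of messages and decides locally what to do with them.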

IMAP4

IMAP4 (Internet Message Access Protocol version 4) is the most advanced and feature-rich protocol. It is defined by RFC 3501, published in March 2003, which has Proposed Standard status and no STD number of its own. IMAP4 supports advanced folder management, server-based search, access to metadata, and offline operations. This makes it much better suited for use with multiple clients. However, it is more complex and demanding in terms of computing and network resources.
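Server-based search, one of the features mentioned above, can be sketched with Python's standard imaplib module. The connect and search_folder helpers are hypothetical names; the folder and sender are supplied by the caller.

```python
import imaplib


def connect(host: str, user: str, password: str) -> imaplib.IMAP4_SSL:
    """Open a TLS connection to an IMAP server and log in."""
    conn = imaplib.IMAP4_SSL(host)
    conn.login(user, password)
    return conn


def search_folder(conn, folder: str, sender: str) -> list[bytes]:
    """Ask the server which messages in a folder come from a given sender."""
    status, _data = conn.select(folder, readonly=True)
    if status != "OK":
        raise RuntimeError(f"cannot open folder {folder!r}")
    # The search runs on the server; only matching message numbers come back.
    status, data = conn.search(None, "FROM", f'"{sender}"')
    if status != "OK":
        raise RuntimeError("search failed")
    return data[0].split()
```

Because the folder stays on the server and is opened read-only, a second client connecting later sees exactly the same mailbox state.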

Webmail Protocols

The Webmail interface uses an Internet browser as the client application and a Web server (or a special Webmail server) as an intermediary that connects to the email server. The protocols used by the browser to communicate with the Web server are HTTP (Hypertext Transfer Protocol) and HTTPS (Secure HTTP). These protocols are not email-specific; they are specified in RFC 2616 (June 1999) and RFC 2818 (May 2000), respectively.

The protocols used by the Web server to communicate with the email server are generally SMTP and IMAP, already discussed in previous sections.

This setup is highly relevant to email archiving, especially when it comes to ensuring that the archived message’s format includes all the information and content needed to faithfully reconstruct the message as seen by the user when accessing it via Webmail.

Standardization of Message Format

The Internet Standard format for email messages is defined by RFC 822 (August 1982), which is Internet Standard STD 11. It was later superseded by RFC 2822 (April 2001), which specifies the format of the email header and body and is the version followed in practice, though it has been further refined by several other RFCs. The standard email format supports only plain text messages in US-ASCII encoding, which is a major limitation for modern email communication.

This limitation is overcome by the Multipurpose Internet Mail Extensions (MIME), defined by a set of five RFCs (RFC 2045 to RFC 2049, published in November 1996). MIME allows for the use of various character sets and multimedia content (e.g., images, sound, video) in email messages. It also supports the encoding of binary content in a 7-bit ASCII format, which is essential for the correct transmission of non-ASCII content in email messages. MIME is fundamental to modern email communication and is supported by almost all email clients and servers.
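The encoding of richer content into 7-bit ASCII can be seen with Python's standard email package. This is a sketch under stated assumptions: the addresses, sample text, and attachment are invented, and quoted-printable is requested explicitly so that the text part is guaranteed to be 7-bit safe.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage

msg = EmailMessage()
msg["From"] = "alice@example.org"
msg["To"] = "bob@example.net"
msg["Subject"] = "Report"
# Non-ASCII text is carried as quoted-printable, so only ASCII goes on the wire.
msg.set_content("Grüße aus Köln", charset="utf-8", cte="quoted-printable")
# Binary attachments are base64-encoded by default.
payload = bytes(range(256))
msg.add_attachment(
    payload, maintype="application", subtype="octet-stream", filename="data.bin"
)

wire = msg.as_bytes()
wire.decode("ascii")  # would raise UnicodeDecodeError if any byte were not 7-bit ASCII

# The receiving client reverses the encodings.
received = message_from_bytes(wire, policy=policy.default)
text = received.get_body(preferencelist=("plain",)).get_content()
attachment = next(received.iter_attachments()).get_content()
print(text.strip(), len(attachment))
```

For archiving, the point is that the stored message must keep the MIME headers along with the encoded parts; without them the original text and attachment cannot be reconstructed.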

Conclusion

The email system is an essential part of modern communication, involving a complex and well-coordinated process of message exchange between various agents (clients and servers) over the Internet. The system is based on a set of standard protocols and message formats that ensure interoperability among heterogeneous systems and reliable communication across the network. These standards are defined and promoted by the IETF through a cooperative and evolving process of RFC publication and review.

The email system supports various access methods for end users, including traditional email clients, Webmail interfaces, and integrated corporate systems. Each method has its own strengths and weaknesses, but all rely on the same underlying protocols and message formats. The system’s success and widespread adoption are due to the interoperability guaranteed by these standards, which allow for seamless communication between different systems, platforms, and applications.

Understanding the standardization of email transmission, client-server communication, and message format is crucial for ensuring the long-term usability and accessibility of archived email messages. By following these standards, organizations can ensure that their archived email messages remain readable and accessible, even as technology evolves.

01-Email Introduction

The first email was sent in 1971 between two computers sitting side by side in the same room, but it traveled through ARPAnet, the ancestor of the Internet. This marked the first time a message was systematically transmitted across a computer network.

The insightful remark by J.C.R. Licklider, quoted above, was made just a few years later when email was still confined to a limited circle within the scientific community, with widespread use at least a decade away. Licklider, a psychologist from MIT who conceived some of the earliest ideas of a global computer network and significantly contributed to ARPAnet, had a remarkably clear vision of what was to come and a prophetic sense of the role that this new medium would play in human communication.

Today, email is by far the most widely used form of written communication. It is estimated that more than 100 billion emails are sent daily, with that number projected to reach 300 billion by 2010. Additionally, over the last decade, it has become increasingly evident that in business, government, and even personal activities, a crucial share of relevant information is exchanged through email. In many cases, this information exists solely in email. For example, it has been estimated that email accounts for about 75% of corporate intellectual property.

Given this, the need to preserve and archive email has become clear. It would be unwise to preserve other documents while neglecting email, where the majority of information is concentrated. As a matter of fact, in recent years, many corporations and government agencies have dedicated significant resources to email archiving, triggering a market expected to reach half a billion dollars in software licenses and maintenance services by 2008.

A more detailed analysis reveals several motivations for email archiving:

Storage Concerns

The volume of email messages that corporations and large organizations must handle is vast and growing rapidly. However, email servers were not designed to store and manage large amounts of messages and attachments for extended periods. Consequently, most organizations enforce size limits on their employees’ mailboxes, often leading users to back up messages they consider important on their own PCs before they disappear from the servers. This process is informal, uncontrolled, and unreliable, with backed-up messages accessible only to the individual users who stored them—if they can still find them. Addressing storage concerns remains the primary motivation for email archiving and the strongest market driver.

Strategic Relevance

Email messages have become an increasingly important and strategic resource for organizations, and therefore should be centrally managed and archived according to precise and well-defined criteria. This approach automates and accelerates business processes, potentially leading to substantial savings by reducing the time spent locating and retrieving messages. Moreover, when an archival solution is implemented, email messages can be integrated with other organizational data and analyzed to monitor business processes and extract knowledge that can inform business strategies.

Regulatory Compliance

In recent years, many companies have faced substantial fines for failing to preserve corporate email records. For instance, in 2005, Morgan Stanley was fined $1.45 billion—an event some have dubbed a “legal Chernobyl”—for being unable to produce corporate email records during an investigation (due to lost or unrecoverable backup tapes). While the fines in other cases have been smaller, the total amount awarded in recent years has reached several billion dollars. In the U.S., new Federal Rules of Civil Procedure Amendments mandate that the production of electronic information is no longer optional. U.S. companies must be prepared to support electronic discovery and be able to quickly produce all records requested by the court, particularly emails, which have played a central role in many recent cases. Although the most prominent cases involve private organizations, government agencies must comply as well. Regulatory compliance has driven many organizations to implement email archiving systems in recent years, making it a significant market driver in the U.S.

Historical Preservation

Last but not least, in many situations, email messages should be archived and preserved as historical records for the benefit of future generations. This is especially important since, as noted earlier, email has become the most important form of communication between individuals, replacing paper-based correspondence and, in many cases, substituting or integrating telephone conversations. Historians of future generations may have a better chance of studying the Internet age than earlier parts of the 20th century when most rapid communication occurred via telephone, leaving almost no record in archives. We have a responsibility to preserve this valuable information.

The purpose of this document is to provide a concise but comprehensive account of the main issues related to email preservation and archiving, highlight the key challenges, and outline the basic policies and procedures. This is no trivial task, as email messages are a unique type of electronic document with a complex structure. Additionally, the specific infrastructure through which they are delivered—namely, the Internet—must be considered to some extent. Therefore, we have included a preliminary section on the email infrastructure and message format, issues that some users may view as technicalities, but which we believe are essential to understanding the challenges associated with preserving and archiving email messages.

2-GCP Region-and-Zone

Regions:-
Regions are independent geographic areas that consist of zones. The choice of region affects pricing, reliability, networking, and performance. E.g., a VM is a zonal resource: it lives in a single zone of a region.
Most regions in GCP have three or more zones.

Zone:-
A zone is a deployment area for Google Cloud resources within a region. A zone should be considered a single failure domain within a region.
To deploy fault-tolerant applications with high availability and help protect against unexpected failures, deploy your application across multiple zones in a region. E.g., App Engine is a regional resource.

Multi-region:-

Multi-regional services are designed to keep functioning after the loss of a single region.

Multi-regional resources:- BigQuery, Bigtable, Cloud Storage, Spanner, Datastore, Firestore, Artifact Registry

If a single region fails, only customers whose resources are in that region are impacted; customers using multi-region products are not.

The fully qualified name for a zone is made up of <region>-<zone>.

E.g., zone a in region us-central1 is us-central1-a.
Region names end with a number (1, 2, 3, ...) and zone names end with a letter (a, b, c, ...).
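The naming rule can be expressed as a small sketch. The helper names split_zone and zone_name are hypothetical, and the code only handles the <region>-<zone> form described here.

```python
def split_zone(full_name: str) -> tuple[str, str]:
    """Split a fully qualified zone name into (region, zone letter)."""
    region, sep, zone = full_name.rpartition("-")
    if not sep or not region or not zone:
        raise ValueError(f"not a fully qualified zone name: {full_name!r}")
    return region, zone


def zone_name(region: str, zone: str) -> str:
    """Build the fully qualified zone name from a region and a zone letter."""
    return f"{region}-{zone}"


print(split_zone("us-central1-a"))
print(zone_name("us-central1", "b"))
```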

For more details, visit:-
https://cloud.google.com/about/locations
and
https://cloud.google.com/compute/docs/regions-zones

07-Email Disposable E-mail Address

As an email marketer, reaching your audience effectively is essential for the success of your campaigns. In today’s online world, many people use temporary email addresses hosted on disposable email domains. This can be both good and bad for email marketers. Understanding how disposable email domains impact your email marketing efforts is crucial for optimizing engagement and maximizing results.

What are Disposable Email Domains?

Disposable email domains are domains that provide temporary email addresses created for short-term use. Users often employ these addresses to sign up for online services, newsletters, or forums without giving out their main email address. Popular disposable email services include 10minutemail.com, Guerrilla Mail, and Temp Mail.

Challenges:-

Low Engagement Rates: Emails sent to disposable addresses may have lower open and click-through rates since users often use them for temporary purposes and may not engage with the content.

Spam Filtering: Many disposable email domains are flagged by spam filters, leading to your emails being automatically routed to the spam folder or rejected altogether.

Data Quality Concerns: Since disposable email addresses are temporary, maintaining accurate subscriber data becomes challenging, impacting the quality of your email list.

Deliverability Issues: Email service providers may view emails sent to disposable addresses as suspicious, affecting deliverability rates and sender reputation.

Strategies for Overcoming Challenges:-

Segmentation: Segment your email list to identify and exclude disposable email addresses. Focus your efforts on engaging with subscribers who are more likely to interact with your content.

Email Verification: Implement email verification processes during the signup phase to detect and block disposable email addresses. This ensures that your list comprises genuine subscribers who are interested in your content.

Content Personalization: Tailor your email content to resonate with your audience’s interests and preferences. Personalized emails are more likely to capture the attention of subscribers, regardless of the email address type.

Optimize Deliverability: Monitor your email deliverability metrics closely and address any issues promptly. Utilize best practices for email authentication, such as SPF, DKIM, and DMARC, to enhance deliverability and inbox placement.

Incentivize Engagement: Offer incentives or exclusive content to encourage subscribers to use their primary email addresses rather than disposable ones. This fosters a more meaningful connection with your audience and increases the likelihood of sustained engagement.
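The segmentation and verification strategies both come down to checking an address's domain against a list of known disposable domains. A minimal sketch, assuming the list has already been loaded into a set; the helper names and sample addresses are hypothetical.

```python
def email_domain(address: str) -> str:
    """Return the lower-cased domain part of an email address."""
    local, sep, domain = address.strip().rpartition("@")
    if not sep or not local or not domain:
        raise ValueError(f"not an email address: {address!r}")
    return domain.lower().rstrip(".")


def is_disposable(address: str, blocklist: set[str]) -> bool:
    """Check the address's domain, and its parent domains, against a blocklist."""
    labels = email_domain(address).split(".")
    # Stop before the bare top-level domain so "com" alone is never tested.
    return any(".".join(labels[i:]) in blocklist for i in range(len(labels) - 1))


# A tiny sample list; a real one would hold thousands of domains.
BLOCKLIST = {"nitwings.com", "10minutemail.com"}

print(is_disposable("someone@nitwings.com", BLOCKLIST))
print(is_disposable("bob@example.net", BLOCKLIST))
```

The same check can run at signup time to reject an address, or over an existing list to segment it.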

Blocking and Verification:-
Blocking in PowerMTA:-

domain-macro Disposable_dom nitwings.com, 0-00.usa.cc, 001.igg.biz

<domain $Disposable_dom>
type discard
discard-as-bounce yes
</domain>

Blocking in Postfix:-

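Add the following line to Postfix’s /etc/postfix/main.cf:
transport_maps = hash:/etc/postfix/transport
After every change to /etc/postfix/transport, rebuild the lookup table by running "postmap /etc/postfix/transport" and reload Postfix.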

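Way one:- Deliver email only to Gmail and Yahoo; mail for all other domains is discarded.

Write /etc/postfix/transport (each entry is a domain, whitespace, then the transport; a lone ":" keeps the default delivery):

gmail.com :
yahoo.com :
* discard: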

Way two:- Discard mail only for specific domains; all other domains are allowed.
Create /etc/postfix/transport:

nitwings.com discard:
blackpost.net discard:
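With a long list of domains, the transport file for way two is easier to generate than to type. A minimal sketch, assuming the domains are already available as a Python list; the helper names are hypothetical, and the output path is passed in by the caller.

```python
def transport_entries(domains):
    """Build Postfix transport map lines that discard mail for each domain."""
    unique = sorted({d.strip().lower() for d in domains if d.strip()})
    return [f"{domain} discard:" for domain in unique]


def write_transport_map(domains, path):
    """Write the entries to a transport file, one per line."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("\n".join(transport_entries(domains)) + "\n")


for line in transport_entries(["nitwings.com", "blackpost.net"]):
    print(line)
```

After writing the file, the lookup table still has to be rebuilt with postmap before Postfix picks up the new entries.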

Where to Get Disposable Domain Lists:-

https://raw.githubusercontent.com/iocium/download.throwaway.cloud/main/list.txt

https://raw.githubusercontent.com/andreis/disposable-email-domains/master/domains.json

https://github.com/ivolo/disposable-email-domains/blob/master/wildcard.json

You can use the free open.kickbox.com API to automate checking whether a domain is disposable, for example:

https://open.kickbox.com/v1/disposable/yopmail.com
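Such automation might look like the sketch below. It assumes the API answers with a small JSON object containing a boolean "disposable" field; that response shape is an assumption and should be checked against the service before relying on it. The helper names and the injectable fetch parameter are hypothetical.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

API_URL = "https://open.kickbox.com/v1/disposable/"


def parse_response(body: bytes) -> bool:
    """Read the 'disposable' flag from the JSON body (assumed response shape)."""
    return bool(json.loads(body).get("disposable", False))


def check_domain(domain: str, fetch=None) -> bool:
    """Ask the API whether a domain is disposable.

    fetch is an optional callable taking a URL and returning the response
    body as bytes; it exists so the function can be tried without network access.
    """
    if fetch is None:
        def fetch(url):
            with urlopen(url, timeout=10) as response:
                return response.read()
    return parse_response(fetch(API_URL + quote(domain.strip().lower())))
```

Calling check_domain("yopmail.com") performs a live request, so results should be cached when checking a large list.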

Small providers need to block disposable domains because spam filter providers use abandoned disposable mailboxes as spam traps, and mail sent to them generates a high number of spam complaints. Disposable addresses are also often involved in phishing and other suspicious activity.