High availability webserver architecture in AWS

Oct 26, 2020


In this article we set up a web server (client-server architecture) on AWS and make it highly available, step by step. A server is a system that provides services to the outside world; when it hosts a website it is a web server, and any system that accesses it for those services is a client (web client).

As a first step, we launch an EC2 instance with an existing key pair and attach an existing security group to it, so that the instance is properly protected when it is accessed from the outside world.

Set up the security group ingress rules so that the instance cannot be accessed by unauthorized users.

SSH access from your system only:

aws ec2 authorize-security-group-ingress --group-id sg-00059bd66a5dxxxx --protocol tcp --port 22 --cidr xx.xxx.xxx.xxx/32

HTTP access from the outside world:

aws ec2 authorize-security-group-ingress --group-id sg-00059bd66a5dxxxx --protocol tcp --port 80 --cidr 0.0.0.0/0

Launch the instance with the existing key pair and security group:

aws ec2 run-instances --image-id ami-0e306788ff2473ccb --count 1 --instance-type t2.micro --key-name xxxxxxx --subnet-id subnet-59037915 --security-group-ids sg-00059bd66axxxxxxxx
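Before connecting over SSH it helps to wait until the instance is actually running and to look up its public IP. The following is a minimal sketch of that step; the function name instance_public_ip is a hypothetical helper, not something from the original setup, and it assumes the AWS CLI is already configured.

```shell
# Hypothetical helper: block until the instance is running, then print its public IP.
# Usage: instance_public_ip <instance-id>
instance_public_ip() {
  local instance_id="$1"
  # "wait instance-running" polls until the instance reaches the running state.
  aws ec2 wait instance-running --instance-ids "$instance_id" || return 1
  # --query uses a JMESPath expression to pick one field out of the JSON response.
  aws ec2 describe-instances \
    --instance-ids "$instance_id" \
    --query 'Reservations[0].Instances[0].PublicIpAddress' \
    --output text
}
```

Calling instance_public_ip with the instance ID returned by run-instances prints the address to use for SSH and for the browser.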

The instance is now up and running. To turn it into a web server we need web server software; here we use the Apache HTTP Server, packaged as httpd, and install it with yum:

yum install httpd

After installing httpd, query the package with rpm to confirm that it is installed.
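The rpm query can be wrapped in a small check. This is a sketch only; is_installed is a hypothetical helper name, and it relies on rpm -q exiting with status 0 when the package is present.

```shell
# Hypothetical helper: report whether a package is installed, using rpm's query mode.
# Usage: is_installed <package-name>
is_installed() {
  # "rpm -q <name>" exits 0 when the package is installed, non-zero otherwise.
  if rpm -q "$1" >/dev/null 2>&1; then
    echo "$1 is installed"
  else
    echo "$1 is NOT installed"
    return 1
  fi
}
```

On the instance, is_installed httpd gives a one-line answer; running rpm -q httpd directly also prints the installed version.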

After installation, start the httpd service:

systemctl start httpd

A service started this way does not come back after a reboot, so the command has to be run again each time. To have the service start automatically at boot, also run systemctl enable httpd (systemctl enable --now httpd enables and starts it in one step).

Check the status of the httpd service.
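Besides systemctl status httpd, the two questions that matter here (is it running now, and will it start at boot) can be answered separately. A minimal sketch, with service_summary as a hypothetical helper name:

```shell
# Hypothetical helper: summarise the current and boot-time state of a systemd service.
# Usage: service_summary <service-name>
service_summary() {
  local svc="$1"
  local active enabled
  # is-active prints "active" when the unit is running.
  active=$(systemctl is-active "$svc" 2>/dev/null)
  # is-enabled prints "enabled" when the unit is set to start at boot.
  enabled=$(systemctl is-enabled "$svc" 2>/dev/null)
  echo "$svc: ${active:-unknown} now, ${enabled:-unknown} at boot"
}
```

If the summary for httpd reports the service as disabled at boot, it was started but not enabled.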

The Apache web server (httpd) serves web pages from its document root, which is /var/www/html by default.

Initially the document root is empty.

Create a simple web page in the document root.

Accessed through the Chrome browser:

This setup has one problem: the OS, the web pages and all website data are stored together on the root volume. If the OS gets corrupted, we lose all the data that is critical for the website.

To overcome this, it is best practice to keep critical data on block storage that is independent of the OS. To achieve this we create an EBS volume, attach it to the instance and keep the document root on it, so that the data survives even if the OS is corrupted. (This is like storing data on a pen drive or an external hard disk.) Whenever the OS is corrupted, we can install a new OS and attach the volume that holds the document root data.

Create an EBS volume in the same Availability Zone as the EC2 instance:

aws ec2 create-volume --availability-zone ap-south-1b --size 1 --volume-type gp2

Attach the EBS volume to the running instance:

aws ec2 attach-volume --device xvdc --instance-id i-04606089e88exxxx --volume-id vol-0644b990daec6xxxx
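If the attach command is run right after create-volume, the volume may still be in the creating state. One way to script the two steps safely is to use the CLI's waiters, as in this sketch; attach_when_ready is a hypothetical helper name, and the IDs passed to it are whatever the earlier commands returned.

```shell
# Hypothetical helper: attach a volume only once it is available,
# then wait until the attachment has completed.
# Usage: attach_when_ready <volume-id> <instance-id> <device>
attach_when_ready() {
  local volume_id="$1" instance_id="$2" device="$3"
  # Poll until the new volume leaves the "creating" state.
  aws ec2 wait volume-available --volume-ids "$volume_id" || return 1
  aws ec2 attach-volume --device "$device" \
    --instance-id "$instance_id" --volume-id "$volume_id" || return 1
  # Poll until the volume reports "in-use", i.e. it is attached.
  aws ec2 wait volume-in-use --volume-ids "$volume_id"
}
```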

fdisk -l

The output lists the new 1 GiB device that we attached.

To use the storage on this new device we have to perform the following steps:

a) Create a partition

b) Format the partition

c) Mount the partition on the document root /var/www/html so that website data is kept on this device, separate from the root volume.

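Create the partition (fdisk is interactive; the new partition appears as /dev/xvdc1):

fdisk /dev/xvdc

Format the partition with the ext4 file system:

mkfs.ext4 /dev/xvdc1

Mount the partition on the document root:

mount /dev/xvdc1 /var/www/html

Mounting over /var/www/html hides any files that were already in that directory, so the page has to be created or copied there again after the mount. The HTML page then lives on the new device, and even if the root volume is corrupted the page is still available on the separately attached volume.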
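A mount made with the mount command does not survive a reboot. One common way to make it permanent is an /etc/fstab entry keyed on the filesystem UUID. The sketch below only prints such an entry; fstab_entry is a hypothetical helper, and the defaults,nofail options are an assumption rather than something the original setup specifies.

```shell
# Hypothetical helper: print an /etc/fstab line for a formatted partition.
# Usage: fstab_entry <device> <mount-dir> [fstype]
fstab_entry() {
  local device="$1" mount_dir="$2" fstype="${3:-ext4}"
  local uuid
  # blkid reads the filesystem UUID, which stays stable even if the device name changes.
  uuid=$(blkid -s UUID -o value "$device") || return 1
  # nofail lets the instance finish booting even if the volume is not attached.
  echo "UUID=$uuid $mount_dir $fstype defaults,nofail 0 2"
}
```

Run as root, fstab_entry /dev/xvdc1 /var/www/html >> /etc/fstab appends the line, and mount -a can be used to confirm that it parses and mounts cleanly.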

If the website needs to display static content such as an image, that file also has to be placed in the document root, which in our case is the default /var/www/html.

image.png is the image file.

vi simple.html
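The exact markup of simple.html is not shown here, so the following is only an illustrative sketch of a page that references the image. write_image_page is a hypothetical helper; the heading text and alt text are assumptions.

```shell
# Hypothetical helper: write a minimal simple.html that displays one image.
# Usage: write_image_page <document-root> <image-src>
write_image_page() {
  local docroot="$1" img_src="$2"
  # The unquoted heredoc lets $img_src expand into the src attribute.
  cat > "$docroot/simple.html" <<EOF
<html>
  <body>
    <h1>Simple page</h1>
    <img src="$img_src" alt="static image">
  </body>
</html>
EOF
}
```

With the image stored next to the page, write_image_page /var/www/html image.png produces a page that loads image.png from the document root.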

Now the HTML code and the image are both in the document root, which sits on EBS storage independent of the root (/) volume. The remaining issue is that an EBS volume exists in a single Availability Zone and, even on the best hardware, can occasionally fail. If we lose the volume due to some failure, the HTML code can be restored from an external repository, but losing critical data such as images would be a problem for the business.

Because a single EBS volume is not the best choice for availability and durability, static critical data should be kept in storage designed for both. AWS provides such a service, S3: we can place the static data there and have the website fetch it from S3.

Now create a bucket through the CLI to hold data such as images:

aws s3api create-bucket --bucket websrvr-static-data --create-bucket-configuration LocationConstraint=ap-south-1

The data is placed in the S3 bucket.
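The upload itself can also be done from the CLI. A minimal sketch, assuming the image is in the current directory; upload_static is a hypothetical helper name.

```shell
# Hypothetical helper: copy one file into a bucket and list the bucket afterwards.
# Usage: upload_static <file> <bucket>
upload_static() {
  local file="$1" bucket="$2"
  local key
  # Use the file's base name as the object key.
  key=$(basename "$file")
  aws s3 cp "$file" "s3://$bucket/$key" || return 1
  # Listing the bucket confirms that the object arrived.
  aws s3 ls "s3://$bucket/"
}
```

For this article's bucket the call would be upload_static image.png websrvr-static-data.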

The S3 bucket and its objects need appropriate permissions so that the website's visitors can read them.

The HTML code is modified to point to the S3 bucket.
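Which permission mechanism was used here is not stated. One possible approach is a bucket policy that allows anonymous reads of the objects. The sketch below only generates such a policy document; public_read_policy is a hypothetical helper, and the Sid value is an arbitrary label.

```shell
# Hypothetical helper: print a bucket policy that allows public read of all objects.
# Usage: public_read_policy <bucket>
public_read_policy() {
  local bucket="$1"
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "PublicReadGetObject",
      "Effect": "Allow",
      "Principal": "*",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::$bucket/*"
    }
  ]
}
EOF
}
```

The output can be saved with public_read_policy websrvr-static-data > policy.json and applied with aws s3api put-bucket-policy --bucket websrvr-static-data --policy file://policy.json. On buckets where S3 Block Public Access is enabled, that setting may have to be relaxed before a public policy is accepted.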
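The URL used in the page is not shown, so as an assumption the sketch below builds a virtual-hosted-style S3 URL from the bucket, region and object key. s3_object_url is a hypothetical helper name.

```shell
# Hypothetical helper: build the virtual-hosted-style HTTPS URL of an S3 object.
# Usage: s3_object_url <bucket> <region> <key>
s3_object_url() {
  local bucket="$1" region="$2" key="$3"
  echo "https://$bucket.s3.$region.amazonaws.com/$key"
}
```

For example, s3_object_url websrvr-static-data ap-south-1 image.png gives the value to put in the src attribute of the img tag in simple.html.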

Webpage displayed as below:

With this setup the static data is highly available and durable, and the web server remains available as well. The next issue is latency: the website is hosted in one region, but customers can come from anywhere in the world.

If a customer is far from the website's location, requests suffer higher latency. AWS offers content delivery as a service (a CDN) by maintaining small data centers, called edge locations, that are connected through AWS's private network.

The AWS service for this is CloudFront. The first time a customer requests the content, CloudFront fetches it from the origin over the high-speed private network and places it in the cache of a nearby edge location; subsequent requests are served from that cache to avoid the latency.

We have to set up a distribution in CloudFront so that data is cached from the origin when it is first accessed. Creating a distribution gives us a URL that routes each request to an appropriate edge location based on where the request comes from, and caches the data there.

Creating the distribution through the CLI:

aws cloudfront create-distribution --origin-domain-name websrvr-static-data.s3.amazonaws.com

All edge locations are being updated to have one more cache:

Deployed distribution:
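The deployment state and the distribution's domain name can also be read from the CLI instead of the console. A minimal sketch; distribution_domain is a hypothetical helper, and the distribution ID is the Id value returned by create-distribution.

```shell
# Hypothetical helper: wait for a distribution to finish deploying,
# then print its *.cloudfront.net domain name.
# Usage: distribution_domain <distribution-id>
distribution_domain() {
  local dist_id="$1"
  # Poll until the distribution status becomes Deployed.
  aws cloudfront wait distribution-deployed --id "$dist_id" || return 1
  aws cloudfront get-distribution --id "$dist_id" \
    --query 'Distribution.DomainName' --output text
}
```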

The HTML code is modified to point to the CloudFront URL so that the data is read from the edge location cache.
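Instead of editing the page by hand, the host name in the image URL can be swapped with sed. This is a sketch under the assumption that the page references the image through an https URL on the S3 host; point_page_at_cdn is a hypothetical helper, and the CloudFront domain in the usage note is a placeholder.

```shell
# Hypothetical helper: replace one host name with another in every https URL of a page.
# Usage: point_page_at_cdn <page> <old-host> <new-host>
point_page_at_cdn() {
  local page="$1" old_host="$2" new_host="$3"
  local old_re tmp
  # Escape the dots so sed matches them literally instead of as wildcards.
  old_re=$(printf '%s' "$old_host" | sed 's/\./\\./g')
  tmp="$page.tmp"
  # Write to a temporary file first, then replace the original.
  sed "s|https://$old_re/|https://$new_host/|g" "$page" > "$tmp" && mv "$tmp" "$page"
}
```

A call such as point_page_at_cdn /var/www/html/simple.html websrvr-static-data.s3.ap-south-1.amazonaws.com d111111abcdef8.cloudfront.net rewrites the image URL, where the last argument stands for the domain name of the actual distribution.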

Webpage after modifying the HTML code in the document root:

This article built the following architecture:

1) EC2 instance with the OS on the root volume.

2) EBS volume mounted on the document root, containing the HTML code.

3) S3 bucket holding the static image displayed on the website.

4) CloudFront distribution that sources the data from the S3 bucket and caches it at edge locations.

Miss and hit statistics from CloudFront:

A miss means the data was fetched from the origin S3 bucket; a hit means it was served from the nearest edge location cache.
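A single request can be checked for a hit or a miss from the command line, because CloudFront reports the result in the X-Cache response header. A minimal sketch; cache_status is a hypothetical helper name.

```shell
# Hypothetical helper: print the X-Cache header value returned for a URL.
# Usage: cache_status <url>
cache_status() {
  # -s silences progress output, -I sends a HEAD request and prints only the headers.
  curl -sI "$1" |
    tr -d '\r' |
    awk -F': ' 'tolower($1) == "x-cache" { print $2 }'
}
```

Requesting the image through the CloudFront URL twice typically shows "Miss from cloudfront" on the first call and "Hit from cloudfront" on the second, once the object is cached at the edge location.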

#awscloud #awscli #aws #vimaldaga #righteducation #educationredefine #rightmentor #worldrecordholder #linuxworld #makingindiafutureready #righeudcation #awsbylw #arthbylw


Written by HVC
