S3 CloudFront RDS DynamoDB


AWS S3

What is S3 ..?
• S3 Stands for Simple Storage Service.

• S3 provides developers and IT teams with secure, durable, highly scalable object storage.

• Amazon S3 is easy to use, with a simple web service interface to store and retrieve any amount of data from
anywhere on the web.
S3- The basics
• S3 is object-based, i.e. it allows you to upload files.

• In storage terms there is object-based and block-based storage. Objects are simply things like videos, photos,
PDF documents or Word documents; they are called flat files.

• S3 is not a place where you can install an OS or run a database; for that we need block-based storage.

• Files can be from 0 bytes to 5 TB in size.

• There is unlimited storage.

• Files are stored in buckets.


S3- The basics
• S3 uses a universal namespace, i.e. bucket names must be unique globally.

• S3 bucket names must always be in lowercase characters.

• DNS Address of bucket : https://s3-eu-west-1.amazonaws.com/testbucket

• Built for 99.99% availability for the S3 platform.

• Amazon guarantees 99.9% availability.

• Amazon guarantees 99.999999999% durability for S3 information (remember: eleven 9s).

• Durability simply means you don't lose files.

• When you upload a file to S3 you will receive an HTTP 200 code if the upload was successful.
S3- S3 is a simple key-value store
• Key (This is simply the name of the object).

• Value (This is simply the data, made up of a sequence of bytes).

• Version ID (Important for versioning).

• Metadata (Data about the data you are storing).
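As an illustration of the key/value model above, here is a minimal sketch using boto3 (the AWS SDK for Python); the bucket name, object key and metadata values are hypothetical placeholders.

```python
import boto3

s3 = boto3.client("s3")

# Upload an object: the key is the object's name, the body is the value
# (a sequence of bytes), and user-defined metadata travels with the object.
response = s3.put_object(
    Bucket="my-example-bucket",          # hypothetical bucket name
    Key="photos/photo.png",
    Body=open("photo.png", "rb"),
    Metadata={"uploaded-by": "training-demo"},
)

# A successful upload returns HTTP 200 in the response metadata.
print(response["ResponseMetadata"]["HTTPStatusCode"])

# Retrieve the object and its metadata by key.
obj = s3.get_object(Bucket="my-example-bucket", Key="photos/photo.png")
print(obj["Metadata"])
print(obj.get("VersionId", "no version ID - versioning not enabled"))
```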


Topics covered in S3
• Data consistency model.
• Storage classes/tiers.
• Versioning.
• Cross-region replication.
• Lifecycle management.
• Static website hosting.
• Transfer acceleration.
• Logging.
• Events.
• Permissions.
• Tags.
• Requester Pays.
• Storage management.
S3- Data Consistency Model
• Read after Write consistency for PUTS of new objects.

• Eventual consistency for overwrite PUTS and DELETES (can take some time to propagate).
S3- Storage Classes / Tiers
• S3 : 99.99% availability, 99.999999999% durability, stored redundantly across multiple devices in multiple
facilities and designed to sustain the loss of 2 facilities concurrently.

• S3 – IA (Infrequently Accessed) : For data that is accessed less frequently, but requires rapid access when
needed. Lower fee than S3, but you are charged a retrieval fee.

• Reduced Redundancy Storage : Designed to provide 99.99% availability and 99.99% durability of objects
over a given year.

• Glacier : Very cheap, but used for archival only. It takes 3-5 hours to restore from Glacier.
S3- Versioning
• Versioning enables you to keep multiple versions of an object in one bucket.

• You can have two objects with the same key but different version IDs such as photo.png (version 111111) and
photo.png (version 22222)

• Versioning-enabled buckets enable you to recover objects from accidental deletion or overwrite. For example:

• If you delete an object, instead of removing it permanently, Amazon S3 inserts a delete marker, which becomes
the current object version. You can always restore the previous version.

• If you overwrite an object, it results in a new object version in the bucket. You can always restore the previous
version.

• Great backup tool.

• Once enabled, versioning cannot be disabled, only suspended.
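A minimal boto3 sketch of enabling versioning and listing an object's versions; the bucket name and key are hypothetical.

```python
import boto3

s3 = boto3.client("s3")

# Turn versioning on for a bucket (it can later only be suspended, not disabled).
s3.put_bucket_versioning(
    Bucket="my-example-bucket",
    VersioningConfiguration={"Status": "Enabled"},  # or "Suspended"
)

# List every version (and any delete markers) stored for an object key.
versions = s3.list_object_versions(Bucket="my-example-bucket", Prefix="photos/photo.png")
for v in versions.get("Versions", []):
    print(v["Key"], v["VersionId"], v["IsLatest"])
for m in versions.get("DeleteMarkers", []):
    print("delete marker:", m["VersionId"])
```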


S3- Cross Region Replication
• Cross-region replication is a bucket-level feature that enables automatic, asynchronous copying of objects
across buckets in different AWS regions.

• The object replicas in the destination bucket are exact copies of the objects in the source bucket. They have the same
key names and the same metadata – for example creation date, owner, user-defined metadata, version ID, ACL and
storage class.

• Amazon S3 encrypts all data in transit across AWS regions using SSL.

• You can apply replication to the whole bucket or to a prefix within the bucket.

• You can change the destination storage class at the time of replication.

• Replication applies only to objects created after you add a replication configuration to the bucket.
S3- Requirements For Cross Region Replication
• The source and destination buckets must be versioning-enabled.

• The source and destination buckets must be in different AWS regions.

• You can replicate objects from a source bucket to only one destination bucket.

• Amazon S3 must have permission to replicate objects from that source bucket to the destination bucket on your
behalf.

• If you are setting up cross-region replication in a cross-account scenario (where the source and destination
buckets are owned by different AWS accounts), the source bucket owner must have permission to replicate
objects in the destination bucket.
S3- Static website hosting
• You can host a static website on Amazon Simple Storage Service (Amazon S3). On a static website, individual
webpages include static content.

• Index document
• Error document
• Redirect requests
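A minimal boto3 sketch of turning a bucket into a static website with an index document, an error document and a simple redirect rule; the bucket name and document keys are hypothetical, and the bucket must separately allow public read access.

```python
import boto3

s3 = boto3.client("s3")

# Configure static website hosting on the bucket.
s3.put_bucket_website(
    Bucket="my-example-bucket",
    WebsiteConfiguration={
        "IndexDocument": {"Suffix": "index.html"},
        "ErrorDocument": {"Key": "error.html"},
        "RoutingRules": [
            {
                "Condition": {"KeyPrefixEquals": "old-docs/"},
                "Redirect": {"ReplaceKeyPrefixWith": "docs/"},
            }
        ],
    },
)
# The site is then served from the bucket's website endpoint, for example
# http://my-example-bucket.s3-website-eu-west-1.amazonaws.com
```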
S3- Transfer Acceleration
• Amazon S3 Transfer Acceleration enables fast, easy, and secure transfers of files over long distances between
your client and an S3 bucket.

• Transfer Acceleration takes advantage of Amazon CloudFront’s globally distributed edge locations.
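Transfer Acceleration is a per-bucket setting; a minimal boto3 sketch with a hypothetical bucket name:

```python
import boto3

s3 = boto3.client("s3")

# Enable Transfer Acceleration on the bucket.
s3.put_bucket_accelerate_configuration(
    Bucket="my-example-bucket",
    AccelerateConfiguration={"Status": "Enabled"},
)
# Accelerated uploads and downloads then use the edge-optimized endpoint
# my-example-bucket.s3-accelerate.amazonaws.com instead of the regional endpoint.
```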
S3- Event Trigger
S3- Permissions
• Access Control List:
1. Owner access
2. Access for other AWS account
3. Public access
4. S3 log delivery group

• Bucket Policy:

• CORS Configuration
S3- Management
• Lifecycle

• Analytics

• Metrics
AWS Cloud Front
What is Cloud Front ..?
• Amazon Cloud Front is a global content delivery network (CDN) service that securely delivers data, videos,
applications, and APIs to your viewers with low latency and high transfer speeds.

• Requests for your content are automatically routed to the nearest edge location, so content is delivered with the
best possible performance.

• Amazon Cloud Front is optimized to work with other Amazon Web Services, like Amazon Simple Storage
Service (S3), Amazon Elastic Compute Cloud (EC2), Amazon Elastic Load Balancing, and Amazon Route53.

• Amazon Cloud Front also works seamlessly with any non-AWS origin server, which stores the original,
definitive versions of your files.
Cloud Front Key Terminology
• Edge Location:

 This is the location where content will be cached.


 This is separate from an AWS Region/AZ.
 Edge locations are not just read-only; you can write to them too.
 Objects are cached for the life of the TTL (Time To Live).
 You can clear cached objects manually using Invalidation, but you will be charged.

• Origin : This is the origin of all the files that the CDN will distribute. This can be an S3 bucket, an EC2
instance, an Elastic Load Balancer or Route53.

• Distribution : This is the name given to the CDN, which consists of a collection of edge locations.
1. Web Distribution : Typically used for Websites.
2. RTMP : Used for Media Streaming.
Cloud Front Components
• Origins: S3, Elastic load balancer
• Behaviours
• Error pages
• Restrictions : Geo-restriction whitelists and blacklists.
• Invalidations
• WAF
• SSL Certificates.
• Price class: Use all edge locations or only particular ones.
AWS RDS
What is RDS ..?
• RDS Stands for Relational Database Service.

• Amazon Relational Database Service makes it easy to set up, operate and scale a relational database in the AWS
cloud.

• It provides cost-efficient and resizable capacity while managing time-consuming database administration tasks,
freeing you up to focus on your application and business.

• Amazon RDS doesn’t provide shell access to DB instances (no .pem key is generated).

• Databases supported: Microsoft SQL Server, MySQL, PostgreSQL, Oracle, MariaDB and Amazon Aurora.
AWS RDS Components
• DB Instances.

• Regions and Availability Zones.

• Security Groups.

• DB Parameter Groups: You manage the configuration of a DB engine by using a DB parameter group. A DB
parameter group contains engine configuration values that can be applied to one or more DB instances of the same
instance type. Amazon RDS applies a default DB parameter group if you don’t specify a DB parameter group when
you create a DB instance. The default group contains defaults for the specific database engine and instance class of
the DB instance.

• DB Option Groups: Some DB engines offer tools that simplify managing your databases and making the best use
of your data. Amazon RDS makes such tools available through option groups. Examples of available options are
Oracle Application Express (APEX), SQL Server Transparent Data Encryption, and MySQL memcached support.
AWS RDS Components
• Multi-AZ deployment: On Amazon RDS, a standby replica of your DB instance that can be used in the event of a
failover is called a Multi-AZ deployment.

• Backup Retention Period: Set the number of days you want automatic backups of your database to be retained.
The maximum retention period is 35 days.

• Backup Window: The daily time range (in UTC) during which automated backups are created if automated
backups are enabled.

• Auto Minor Version Upgrade: Enable automatic upgrades to new minor versions as they are released. The
automatic upgrades occur during the maintenance window for the DB instance.

• Maintenance Window: Select the period in which you want pending modifications (such as changing the DB
instance class) or patches applied to the DB instance by Amazon RDS. Any such maintenance should be started
and completed within the selected period. If you do not select a period, Amazon RDS will assign a period
randomly.
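The settings above map directly onto parameters of the RDS CreateDBInstance API. A minimal boto3 sketch; the identifier, credentials and window values are hypothetical placeholders, not recommended production values.

```python
import boto3

rds = boto3.client("rds")

# Launch a Multi-AZ MySQL instance with automated backups and explicit
# backup/maintenance windows (which must not overlap).
rds.create_db_instance(
    DBInstanceIdentifier="demo-mysql",            # hypothetical name
    Engine="mysql",
    DBInstanceClass="db.t3.micro",
    AllocatedStorage=20,                          # GiB
    MasterUsername="admin",
    MasterUserPassword="change-me-please",        # placeholder only
    MultiAZ=True,                                 # standby replica in another AZ
    BackupRetentionPeriod=7,                      # days; 0 disables, 35 is the max
    PreferredBackupWindow="02:00-03:00",          # UTC
    PreferredMaintenanceWindow="sun:04:00-sun:05:00",
    AutoMinorVersionUpgrade=True,
)
```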
Note
• Automated Backup: The number of days for which automated backups are retained. Setting this parameter to a
positive number enables backups; setting it to 0 disables automated backups.

Note
• The backup window and the maintenance window should not be the same.

Note
• If you choose to apply changes immediately, note that any changes in the pending queue are also applied. If any
of the pending modifications require downtime, choosing this option can cause unexpected downtime.
AWS RDS Snapshot
• Amazon RDS creates a storage volume snapshot of your DB instance, backing up the entire DB instance and not
just individual databases. Creating this DB snapshot on a Single-AZ DB instance results in a brief I/O suspension
that can last from a few seconds to a few minutes, depending on the size and class of your DB instance. Multi-AZ
DB instances are not affected by this I/O suspension since the backup is taken on the standby.
AWS Dynamo DB
What is Dynamo DB ..?
• Amazon Dynamo DB is a fully managed NoSQL database service that provides fast and predictable
performance with seamless scalability.

• Its flexible data model and reliable performance make it a great fit for mobile, web, gaming, ad-tech, IoT, and
many other applications.

• Amazon Dynamo DB Accelerator (DAX) is a fully managed, highly available, in-memory cache that can reduce
Amazon Dynamo DB response times from milliseconds to microseconds, even at millions of requests per
second.

• In Dynamo DB, tables, items, and attributes are the core components that you work with. A table is a collection
of items, and each item is a collection of attributes. Dynamo DB uses primary keys to uniquely identify each
item in a table and secondary indexes to provide more querying flexibility.
SQL vs NoSQL

1. SQL databases are categorized as Relational Database Management Systems (RDBMS). NoSQL databases are
categorized as non-relational or distributed database systems.

2. SQL databases have a fixed, static, predefined schema. NoSQL databases have a dynamic schema.

3. SQL databases present data in the form of tables, so they are known as table-based databases. NoSQL databases
present data as collections of key-value pairs, documents, graphs or wide-column stores.

4. SQL databases are vertically scalable. NoSQL databases are horizontally scalable.

5. SQL databases use a powerful language, Structured Query Language, to define and manipulate the data. NoSQL
databases query the data through collections of documents; this is sometimes called unstructured query language
and varies from database to database.

6. SQL databases are best suited for complex queries. NoSQL databases are not as good for complex queries
because their query languages are not as powerful as SQL.

7. SQL databases are not best suited for hierarchical data storage. NoSQL databases are best suited for hierarchical
data storage.

8. MySQL, Oracle, SQLite, PostgreSQL and MS SQL are examples of SQL databases. MongoDB, BigTable, Redis,
RavenDB, Cassandra, HBase, Neo4j and CouchDB are examples of NoSQL databases.
Dynamo DB Components
• Primary Key: The primary key uniquely identifies each item in the table, so that no two items can have the
same key.

• Primary key = partition key + sort key (optional)

• Dynamo DB supports two different kinds of primary keys:

1. Partition key (hash key): A simple primary key, composed of one attribute known as the partition key.

2. Composite primary key: This type of key is composed of two attributes. The first attribute is
the partition key (hash key), and the second attribute is the sort key (range key).
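A minimal boto3 sketch of creating a table with a composite primary key; the table and attribute names are hypothetical.

```python
import boto3

dynamodb = boto3.client("dynamodb")

# "artist" is the partition (hash) key and "song_title" is the sort (range) key.
dynamodb.create_table(
    TableName="Music",
    KeySchema=[
        {"AttributeName": "artist", "KeyType": "HASH"},       # partition key
        {"AttributeName": "song_title", "KeyType": "RANGE"},  # sort key
    ],
    AttributeDefinitions=[
        {"AttributeName": "artist", "AttributeType": "S"},
        {"AttributeName": "song_title", "AttributeType": "S"},
    ],
    ProvisionedThroughput={"ReadCapacityUnits": 5, "WriteCapacityUnits": 5},
)
```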
Dynamo DB Components

• Secondary Indexes: You can create one or more secondary indexes on a table. A secondary index lets you
query the data in the table using an alternate key, in addition to queries against the primary key.

1. Global secondary index: An index with a partition key and sort key that can be different from those on
the table.

2. Local secondary index: An index that has the same partition key as the table, but a different sort key.

• Note: You can define up to 5 global secondary indexes and 5 local secondary indexes per table.
Dynamo DB Stream

• Dynamo DB Streams is an optional feature that captures data modification events in Dynamo DB tables.

1. If a new item is added to the table, the stream captures an image of the entire item, including all of its
attributes.

2. If an item is updated, the stream captures the "before" and "after" image of any attributes that were
modified in the item.

3. If an item is deleted from the table, the stream captures an image of the entire item before it was deleted.
Dynamo DB API

• get_item()

i. The GetItem operation returns a set of attributes for the item with the given primary key.
ii. If there is no matching item, GetItem does not return any data and there will be no Item element in the
response.
iii. GetItem provides an eventually consistent read by default. If your application requires a strongly
consistent read, set ConsistentRead to true .
iv. Although a strongly consistent read might take more time than an eventually consistent read, it always
returns the last updated value.
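A minimal boto3 sketch of GetItem; the table, key names and values are the hypothetical ones from the create_table example above.

```python
import boto3

dynamodb = boto3.client("dynamodb")

# Fetch one item by its full primary key. ConsistentRead=True requests a
# strongly consistent read; omitting it gives the eventually consistent default.
response = dynamodb.get_item(
    TableName="Music",
    Key={
        "artist": {"S": "No One You Know"},
        "song_title": {"S": "Call Me Today"},
    },
    ConsistentRead=True,
)

item = response.get("Item")  # key is absent when no matching item exists
print(item)
```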
Dynamo DB API

• put_item()

i. Creates a new item, or replaces an old item with a new item.


ii. If an item that has the same primary key as the new item already exists in the specified table, the new item
completely replaces the existing item.
iii. When you add an item, the primary key attribute(s) are the only required attributes.
iv. String and Binary type attributes must have lengths greater than zero.
v. Set type attributes cannot be empty.
vi. Requests with empty values will be rejected with a ValidationException exception.
vii. To prevent a new item from replacing an existing item, use a conditional expression that contains the
attribute_not_exists function with the name of the attribute being used as the partition key for the table.
Since every record must contain that attribute, the attribute_not_exists function will only succeed if no
matching item exists.
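A minimal boto3 sketch of PutItem with the attribute_not_exists condition described above (hypothetical table and values):

```python
import boto3
from botocore.exceptions import ClientError

dynamodb = boto3.client("dynamodb")

try:
    # Write an item, but refuse to replace an existing one: the condition on the
    # partition key only succeeds when no item with this primary key exists yet.
    dynamodb.put_item(
        TableName="Music",
        Item={
            "artist": {"S": "No One You Know"},
            "song_title": {"S": "Call Me Today"},
            "year": {"N": "2015"},
        },
        ConditionExpression="attribute_not_exists(artist)",
    )
except ClientError as err:
    if err.response["Error"]["Code"] == "ConditionalCheckFailedException":
        print("Item already exists; not replaced.")
    else:
        raise
```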
Dynamo DB API
• query()

i. The Query operation finds items based on primary key values.


ii. You can query any table or secondary index that has a composite primary key (a partition key and a sort
key).
iii. Use the KeyConditionExpression parameter to provide a specific value for the partition key.
iv. The Query operation will return all of the items from the table or index with that partition key value. You
can optionally narrow the scope of the Query operation by specifying a sort key value and a comparison
operator in KeyConditionExpression .

• scan()

i. Scan operations proceed sequentially


ii. The Scan operation returns one or more items and item attributes by accessing every item in a table or a
secondary index.
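A minimal boto3 sketch contrasting Query and Scan on the hypothetical Music table:

```python
import boto3

dynamodb = boto3.client("dynamodb")

# Query: all items for one partition key, optionally narrowed by a sort key
# condition in KeyConditionExpression.
result = dynamodb.query(
    TableName="Music",
    KeyConditionExpression="artist = :a AND begins_with(song_title, :t)",
    ExpressionAttributeValues={
        ":a": {"S": "No One You Know"},
        ":t": {"S": "Call"},
    },
)
print(result["Items"])

# Scan: reads every item in the table (or index), so it is far more expensive
# on large tables.
scanned = dynamodb.scan(TableName="Music")
print(scanned["Count"], "items scanned")
```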
Dynamo DB API

• update_item()

i. Edits an existing item's attributes, or adds a new item to the table if it does not already exist.
ii. You can put, delete, or add attribute values. You can also perform a conditional update on an existing item

• delete_item()

i. Deletes a single item in a table by primary key.


ii. You can perform a conditional delete operation that deletes the item if it exists, or if it has an expected
attribute value.
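A minimal boto3 sketch of UpdateItem and a conditional DeleteItem (hypothetical table, attributes and values):

```python
import boto3

dynamodb = boto3.client("dynamodb")

key = {
    "artist": {"S": "No One You Know"},
    "song_title": {"S": "Call Me Today"},
}

# Update (or create) an item: SET adds or overwrites the "plays" attribute.
dynamodb.update_item(
    TableName="Music",
    Key=key,
    UpdateExpression="SET plays = :p",
    ExpressionAttributeValues={":p": {"N": "42"}},
)

# Conditional delete: only remove the item while it has fewer than 100 plays.
dynamodb.delete_item(
    TableName="Music",
    Key=key,
    ConditionExpression="plays < :limit",
    ExpressionAttributeValues={":limit": {"N": "100"}},
)
```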
Dynamo DB API

• batch_get_item()

i. The BatchGetItem operation returns the attributes of one or more items from one or more tables. You
identify requested items by primary key.
ii. A single operation can retrieve up to 16 MB of data, which can contain as many as 100 items.
iii. BatchGetItem will return a partial result if the response size limit is exceeded, the table's provisioned
throughput is exceeded, or an internal processing failure occurs. If a partial result is returned, the operation
returns a value for UnprocessedKeys . You can use this value to retry the operation starting with the next
item to get.
iv. If you request more than 100 items BatchGetItem will return a ValidationException with the message "Too
many items requested for the BatchGetItem call".
v. If none of the items can be processed due to insufficient provisioned throughput on all of the tables in the
request, then BatchGetItem will return a ProvisionedThroughputExceededException .
Dynamo DB API

• batch_write_item()

i. The BatchWriteItem operation puts or deletes multiple items in one or more tables.
ii. A single call to BatchWriteItem can write up to 16 MB of data, which can comprise as many as 25 put or
delete requests.
iii. Individual items to be written can be as large as 400 KB.
iv. BatchWriteItem cannot update items. To update items, use the UpdateItem action.
v. If any requested operations fail because the table's provisioned throughput is exceeded or an internal
processing failure occurs, the failed operations are returned in the UnprocessedItems response parameter.
vi. Note that if none of the items can be processed due to insufficient provisioned throughput on all of the
tables in the request, then BatchWriteItem will return a ProvisionedThroughputExceededException .
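A minimal boto3 sketch of BatchWriteItem that resubmits UnprocessedItems; in real code the retry loop should also use exponential backoff. Table and item values are hypothetical.

```python
import boto3

dynamodb = boto3.client("dynamodb")

# Up to 25 put/delete requests per call, 16 MB total, 400 KB per item.
request_items = {
    "Music": [
        {"PutRequest": {"Item": {
            "artist": {"S": "Artist A"}, "song_title": {"S": "Song 1"}}}},
        {"DeleteRequest": {"Key": {
            "artist": {"S": "Artist B"}, "song_title": {"S": "Song 2"}}}},
    ]
}

# Anything the service could not process (e.g. throttled writes) comes back in
# UnprocessedItems; resubmit until the map is empty.
while request_items:
    response = dynamodb.batch_write_item(RequestItems=request_items)
    request_items = response.get("UnprocessedItems", {})
```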
Dynamo DB Data Types
 Scalar types:
1. Number
2. String
3. Binary
4. Boolean
5. Null

 Document types:
1. List
2. Map

 Set types:
1. String Set
2. Number Set
3. Binary Set
Dynamo DB Read Consistency
• Eventually Consistent Reads (Default): When you read data from a Dynamo DB table, the response
might not reflect the results of a recently completed write operation. The response might include some stale
data. If you repeat your read request after a short time, the response should return the latest data.

• Strongly Consistent Reads: When you request a strongly consistent read, Dynamo DB returns a response
with the most up-to-date data, reflecting the updates from all prior write operations that were successful.
Dynamo DB Throughput Capacity
You specify throughput capacity in terms of read capacity units and write capacity units:

• One read capacity unit represents one strongly consistent read per second, or two eventually consistent reads
per second, for an item up to 4 KB in size. If you need to read an item that is larger than 4 KB, Dynamo DB will
need to consume additional read capacity units. The total number of read capacity units required depends on the
item size and on whether you want an eventually consistent or strongly consistent read.

• One write capacity unit represents one write per second for an item up to 1 KB in size. If you need to write an
item that is larger than 1 KB, Dynamo DB will need to consume additional write capacity units. The total
number of write capacity units required depends on the item size.
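A small worked example of this arithmetic, as a sketch in Python (the item sizes are hypothetical):

```python
import math

def read_capacity_units(item_size_kb: float, strongly_consistent: bool) -> float:
    """One RCU = one strongly consistent read per second of up to 4 KB;
    an eventually consistent read uses half as much."""
    units = math.ceil(item_size_kb / 4)
    return units if strongly_consistent else units / 2

def write_capacity_units(item_size_kb: float) -> int:
    """One WCU = one write per second of up to 1 KB."""
    return math.ceil(item_size_kb / 1)

print(read_capacity_units(7, True))    # 2   (7 KB rounds up to two 4 KB units)
print(read_capacity_units(7, False))   # 1.0 (eventually consistent costs half)
print(write_capacity_units(2.5))       # 3   (2.5 KB rounds up to three 1 KB units)
```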

• If you aren't using Dynamo DB auto scaling, you have to manually define your throughput requirements.
Dynamo DB Auto Scaling
• Dynamo DB auto scaling actively manages throughput capacity for tables and global secondary indexes. With
auto scaling, you define a range (upper and lower limits) for read and write capacity units. You also define a
target utilization percentage within that range. Dynamo DB auto scaling seeks to maintain your target
utilization, even as your application workload increases or decreases.

• With Dynamo DB auto scaling, a table or a global secondary index can increase its provisioned read and write
capacity to handle sudden increases in traffic, without request throttling. When the workload decreases, Dynamo
DB auto scaling can decrease the throughput so that you don't pay for unused provisioned capacity.

• If you use the AWS Management Console to create a table or a global secondary index, Dynamo DB auto
scaling is enabled by default.
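Dynamo DB auto scaling is driven by Application Auto Scaling. The sketch below registers a table's read capacity as a scalable target and attaches a target-tracking policy; the table name, limits and target utilization are hypothetical.

```python
import boto3

autoscaling = boto3.client("application-autoscaling")

# Register the table's read capacity as a scalable target with lower/upper limits.
autoscaling.register_scalable_target(
    ServiceNamespace="dynamodb",
    ResourceId="table/Music",
    ScalableDimension="dynamodb:table:ReadCapacityUnits",
    MinCapacity=5,
    MaxCapacity=100,
)

# Target-tracking policy: keep read capacity utilization around 70%.
autoscaling.put_scaling_policy(
    PolicyName="MusicReadScaling",
    ServiceNamespace="dynamodb",
    ResourceId="table/Music",
    ScalableDimension="dynamodb:table:ReadCapacityUnits",
    PolicyType="TargetTrackingScaling",
    TargetTrackingScalingPolicyConfiguration={
        "TargetValue": 70.0,
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "DynamoDBReadCapacityUtilization"
        },
    },
)
```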
Dynamo DB Reserved Capacity
• As a Dynamo DB customer, you can purchase reserved capacity in advance. With reserved capacity, you pay a
one-time upfront fee and commit to a minimum usage level over a period of time.

• By reserving your read and write capacity units ahead of time, you realize significant cost savings compared to
on-demand provisioned throughput settings.
DAX
• Dynamo DB Accelerator (DAX) delivers fast response times for accessing eventually consistent data.

• DAX is a Dynamo DB-compatible caching service that enables you to benefit from fast in-memory performance
for demanding applications.

• As an in-memory cache, DAX reduces the response times of eventually-consistent read workloads by an order
of magnitude, from single-digit milliseconds to microseconds.

• DAX does not support Transport Layer Security (TLS).


Dynamo DB Monitoring
• CloudWatch metrics:
1. Read capacity
2. Throttled read requests
3. Write capacity
4. Throttled write request
5. Get latency
6. Put latency
7. Query latency
8. Scan latency
9. TTL deleted items
10. Scan returned item count
11. System errors
12. User errors
13. Conditional check failed
14. Get records
Dynamo DB TTL
• Time To Live (TTL) for Dynamo DB allows you to define when items in a table expire so that they can be
automatically deleted from the database.

• TTL is provided at no extra cost as a way to reduce storage usage and reduce the cost of storing irrelevant data
without using provisioned throughput.
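A minimal boto3 sketch of enabling TTL and writing an item that expires; the table name and the "expires_at" attribute name are hypothetical.

```python
import time
import boto3

dynamodb = boto3.client("dynamodb")

# Enable TTL on the table, pointing at a numeric attribute that holds an
# epoch timestamp (in seconds) after which the item may be deleted.
dynamodb.update_time_to_live(
    TableName="Music",
    TimeToLiveSpecification={"Enabled": True, "AttributeName": "expires_at"},
)

# Write an item that expires roughly one hour from now; expired items are
# removed in the background without consuming write capacity.
dynamodb.put_item(
    TableName="Music",
    Item={
        "artist": {"S": "Artist A"},
        "song_title": {"S": "Temporary Song"},
        "expires_at": {"N": str(int(time.time()) + 3600)},
    },
)
```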
