Following on from our first post, we’re taking another look at S3 buckets. This time we’ll look at the different storage classes, why you might choose each one, and the pros and cons of each.
Storage Classes
S3 offers different storage classes depending on the needs of the user. Below we’ll look at the available options and the reasons you might choose each.
S3 Standard
- General-purpose storage
- Frequently accessed data
- Low latency and high throughput
- Data can be accessed in milliseconds
S3 Intelligent-Tiering
- Data with unknown or changing access patterns
- Automatically moves objects between frequent and infrequent access tiers as access patterns change
- This can be done without performance impact or operational overhead
- Has low latency and high throughput
S3 Standard-Infrequent Access (S3 Standard-IA)
- For long-lived but less frequently accessed data
- Also provides low latency and high throughput
S3 One Zone-IA
- For long-lived but less frequently accessed data
- Stores data in a single Availability Zone, so it is best for data that can be recreated
- Low latency and high throughput
Long Term Archive and Digital Preservation
S3 Glacier Instant Retrieval
- Delivers the lowest-cost storage for rarely accessed data that still needs millisecond retrieval
- Designed for archive data that needs immediate access
S3 Glacier Flexible Retrieval
- For archive data that does not require immediate access
- Expedited retrievals in minutes
- Free bulk retrievals in 5–12 hours
S3 Glacier Deep Archive
- For long-lived archive data that is rarely accessed
- Lowest-cost storage of all the S3 storage classes
- Standard retrievals within 12 hours, bulk retrievals within 48 hours
Each of these storage classes also supports specific data access levels for better control.
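To make the trade-offs above concrete, here is a minimal sketch of how the choice could be written down as a rule of thumb. The function name, the `pattern` labels, and the `max_wait_hours` cut-offs are assumptions made for illustration; only the returned strings are the `StorageClass` values the S3 API uses for the classes listed above.

```python
def choose_storage_class(pattern, recreatable=False, max_wait_hours=0):
    """Map a rough access pattern to an S3 StorageClass value.

    pattern: "frequent", "unknown", "infrequent" or "archive" (illustrative labels).
    recreatable: True if the data could be rebuilt after losing a single zone.
    max_wait_hours: how long a retrieval may take (only used for archives).
    """
    if pattern == "frequent":
        return "STANDARD"
    if pattern == "unknown":
        return "INTELLIGENT_TIERING"
    if pattern == "infrequent":
        # One Zone-IA is only sensible when the data can be recreated.
        return "ONEZONE_IA" if recreatable else "STANDARD_IA"
    if pattern == "archive":
        if max_wait_hours <= 0:
            return "GLACIER_IR"    # Glacier Instant Retrieval
        if max_wait_hours < 12:
            return "GLACIER"       # Glacier Flexible Retrieval
        return "DEEP_ARCHIVE"      # Glacier Deep Archive
    raise ValueError(f"unknown access pattern: {pattern!r}")


for args in [("frequent",), ("infrequent", True), ("archive", False, 24)]:
    print(args, "->", choose_storage_class(*args))
```

The returned value is the kind of string you would pass as the `StorageClass` parameter when uploading an object, for example with boto3's `put_object`.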
Management Tools For Granular Data Control
S3 Storage Class Analysis
- Discover data that should move to a lower-cost storage class based on access patterns
- Use this information to configure an S3 Lifecycle policy that transitions the data
- Lifecycle policies define when objects transition to another storage class and when objects expire
- This helps you manage your objects so that they are stored cost-effectively throughout their lifecycle
- You can use S3 Cross-Region Replication to replicate objects to other AWS Regions
- You can also replicate data within the same AWS Region
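As a rough illustration of what a lifecycle policy looks like, the sketch below builds one rule with transitions and an expiry. The helper name, rule ID, `logs/` prefix, and day counts are all made up for the example; the dictionary layout follows the shape boto3 expects for `put_bucket_lifecycle_configuration`.

```python
import json


def build_lifecycle_configuration(rule_id, prefix, transitions, expire_after_days=None):
    """Build a one-rule lifecycle configuration.

    transitions: list of (days_after_creation, storage_class) tuples.
    expire_after_days: delete objects after this many days (optional).
    """
    ordered = sorted(transitions)
    if expire_after_days is not None and ordered and expire_after_days <= ordered[-1][0]:
        raise ValueError("objects must expire after their last transition")

    rule = {
        "ID": rule_id,
        "Filter": {"Prefix": prefix},   # only objects under this prefix
        "Status": "Enabled",
        "Transitions": [
            {"Days": days, "StorageClass": storage_class}
            for days, storage_class in ordered
        ],
    }
    if expire_after_days is not None:
        rule["Expiration"] = {"Days": expire_after_days}
    return {"Rules": [rule]}


# Hypothetical policy: move logs to Standard-IA after 30 days,
# to Glacier Flexible Retrieval after 90, and delete them after a year.
config = build_lifecycle_configuration(
    "archive-old-logs", "logs/", [(30, "STANDARD_IA"), (90, "GLACIER")], 365
)
print(json.dumps(config, indent=2))

# With boto3 installed and credentials configured, it could be applied with:
#   boto3.client("s3").put_bucket_lifecycle_configuration(
#       Bucket="example-bucket", LifecycleConfiguration=config)
```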
S3 Object Lock
- Objects are locked for a defined retention period and are protected from deletion during that time
- Enforces a write-once, read-many (WORM) policy
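A small sketch of the idea, assuming a default retention rule on the bucket: the first helper builds a configuration in the shape boto3's `put_object_lock_configuration` accepts, and the second is a toy check (not an S3 call) of whether a retention period is still in force. Both helper names and the 30-day figure are illustrative.

```python
from datetime import datetime, timedelta, timezone


def default_retention(mode, days):
    """Build an Object Lock configuration with a default retention rule."""
    if mode not in ("GOVERNANCE", "COMPLIANCE"):
        raise ValueError("mode must be GOVERNANCE or COMPLIANCE")
    return {
        "ObjectLockEnabled": "Enabled",
        "Rule": {"DefaultRetention": {"Mode": mode, "Days": days}},
    }


def is_still_locked(created_at, retention_days, now=None):
    """Toy check: is an object created at `created_at` still under retention?"""
    now = now or datetime.now(timezone.utc)
    return now < created_at + timedelta(days=retention_days)


lock_config = default_retention("COMPLIANCE", 30)
print(lock_config)

# With boto3 installed, this could be applied to a bucket that was created
# with Object Lock enabled:
#   boto3.client("s3").put_object_lock_configuration(
#       Bucket="example-bucket", ObjectLockConfiguration=lock_config)
```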
S3 Inventory
- Reports list your stored objects, their metadata and their encryption status
Batch Operations
- S3 Batch Operations can change object properties and perform storage management tasks for billions of objects
- Using S3 API requests or the AWS console, you can perform batch operations such as:
  - Copy objects between buckets
  - Replace object tag sets
  - Modify access controls
  - Restore archived objects from S3 Glacier
- Since S3 works with AWS Lambda, you can:
  - Log activities
  - Define alerts
  - Automate workflows without additional infrastructure
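As an example of the "log activities" case, here is a minimal sketch of a Lambda handler that logs each object mentioned in an S3 event notification. The function names are arbitrary; the fields read from the event follow the S3 event notification format, where object keys arrive URL-encoded.

```python
import logging
from urllib.parse import unquote_plus

logger = logging.getLogger(__name__)
logger.setLevel(logging.INFO)


def summarize_s3_event(event):
    """Pull the interesting fields out of an S3 event notification."""
    summaries = []
    for record in event.get("Records", []):
        s3 = record["s3"]
        summaries.append({
            "event": record["eventName"],
            "bucket": s3["bucket"]["name"],
            # Keys are URL-encoded in the event, e.g. spaces arrive as "+".
            "key": unquote_plus(s3["object"]["key"]),
            # Delete events carry no size, so fall back to None.
            "size": s3["object"].get("size"),
        })
    return summaries


def handler(event, context):
    """Lambda entry point: log one line per affected object."""
    summaries = summarize_s3_event(event)
    for item in summaries:
        logger.info("%(event)s s3://%(bucket)s/%(key)s size=%(size)s", item)
    return {"processed": len(summaries)}
```

From here, defining an alert is mostly a matter of deciding which events deserve one, for example raising a notification when the event name starts with `ObjectRemoved`.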
Query-in-place services for analytics
- With query-in-place services you can run big data analytics directly on data stored in Amazon S3 and on other datasets in AWS
- You can query the data without needing to copy and load it into a separate analytics platform or data warehouse
- S3 is also compatible with AWS analytics services
  - [[Amazon Athena]]
    - Queries S3 data with SQL expressions
  - [[Amazon RedShift Spectrum]]
    - Runs SQL queries directly against data at rest in S3
    - More appropriate for complex queries and large data sets, up to exabytes
  - [[S3 Select]]
    - Retrieves a subset of an object's data instead of the entire object (objects can be up to 5 terabytes in size)
    - Designed to increase query performance by up to 400% and reduce querying costs by as much as 80%
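To show what "retrieving a subset of an object" looks like in practice, here is a hedged sketch of an S3 Select query over a CSV object. It assumes a boto3 S3 client is passed in; the helper name, bucket, key, and column names are invented for the example.

```python
def select_rows(s3_client, bucket, key, expression):
    """Run an S3 Select SQL expression against a CSV object with a header row.

    s3_client is expected to be a boto3 S3 client, e.g. boto3.client("s3").
    Returns the matching rows as one CSV-formatted string.
    """
    response = s3_client.select_object_content(
        Bucket=bucket,
        Key=key,
        ExpressionType="SQL",
        Expression=expression,
        # "USE" lets the query refer to columns by their header names.
        InputSerialization={"CSV": {"FileHeaderInfo": "USE"}},
        OutputSerialization={"CSV": {}},
    )
    chunks = []
    # The result arrives as a stream of events; only "Records" events carry data.
    for event in response["Payload"]:
        if "Records" in event:
            chunks.append(event["Records"]["Payload"])
    return b"".join(chunks).decode("utf-8")


# Hypothetical usage, assuming boto3 is installed and the object exists:
#   import boto3
#   rows = select_rows(
#       boto3.client("s3"), "example-bucket", "data/customers.csv",
#       "SELECT s.name FROM S3Object s WHERE s.country = 'UK'")
```

Only the matching rows travel over the network, which is where the performance and cost savings come from.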
Versioning
- A means of keeping multiple variants of an object in the same bucket
- Preserve, retrieve and restore every version of every object stored in an S3 bucket
- Versioning-enabled buckets
  - Recover objects from accidental deletion or overwrite
  - If you delete an object, it is not permanently deleted; S3 inserts a delete marker instead
  - If you overwrite an object, S3 creates a new object version in the bucket. You can restore the previous version as long as that version has not been permanently deleted
- Bucket states
  - Non-versioned (default)
  - Versioning enabled
    - The combination of bucket, object key and version ID uniquely identifies each object version
    - A bucket cannot return to a non-versioned state, but versioning can be suspended
  - Versioning suspended
    - Suspends the creation of version IDs for new objects
    - Preserves all the existing object versions
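The versioning rules above are easier to follow with a toy model. The class below is not the S3 API, just a small in-memory simulation (all names are made up) of how versions, delete markers and the three bucket states interact.

```python
import itertools


class ToyVersionedBucket:
    """In-memory imitation of S3 versioning behaviour, for illustration only."""

    def __init__(self):
        self.state = "non-versioned"
        self._versions = {}             # key -> [(version_id, body, is_marker)], newest last
        self._ids = itertools.count(1)

    def enable_versioning(self):
        self.state = "enabled"

    def suspend_versioning(self):
        if self.state == "non-versioned":
            raise ValueError("versioning was never enabled")
        self.state = "suspended"

    def _write(self, key, body, is_marker):
        versions = self._versions.setdefault(key, [])
        if self.state == "enabled":
            version_id = f"v{next(self._ids)}"
        else:
            # No new version IDs: the write replaces any existing "null" version.
            version_id = "null"
            versions[:] = [v for v in versions if v[0] != "null"]
        versions.append((version_id, body, is_marker))
        return version_id

    def put(self, key, body):
        return self._write(key, body, False)

    def delete(self, key, version_id=None):
        if version_id is not None:
            # Deleting a specific version (or delete marker) is permanent.
            versions = self._versions.get(key, [])
            versions[:] = [v for v in versions if v[0] != version_id]
            return version_id
        if self.state == "non-versioned":
            self._versions.pop(key, None)
            return None
        return self._write(key, None, True)     # insert a delete marker

    def get(self, key, version_id=None):
        versions = self._versions.get(key, [])
        if version_id is not None:
            for vid, body, is_marker in versions:
                if vid == version_id and not is_marker:
                    return body
            raise KeyError((key, version_id))
        if not versions or versions[-1][2]:
            raise KeyError(key)                 # missing, or hidden by a delete marker
        return versions[-1][1]
```

For example, after enabling versioning, two `put` calls on the same key leave both versions retrievable by ID; a plain `delete` hides the object behind a delete marker, and deleting that marker by its version ID brings the latest version back.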
