Wednesday, March 16, 2016

S3 Lifecycle Management Update – Support for Multipart Uploads and Delete Markers

It is still a bit of a shock to me to realize that Amazon S3 is now ten years old! The intervening decade has simply flown by.


For several years, you have been able to use S3’s Lifecycle Management feature to control the storage class and the lifetime of your objects. As you may know, you can set up rules on a per-bucket or per-prefix basis. Each rule specifies an action to be taken when objects reach a certain age.


Today we are adding two rules that will give you additional control over two special types of objects: incomplete multipart uploads and expired object delete markers. Before we go any further, I should define these objects!


Incomplete Multipart Uploads – S3’s multipart upload feature accelerates the uploading of large objects by allowing you to split them up into logical parts that can be uploaded in parallel.  If you initiate a multipart upload but never finish it, the in-progress upload occupies some storage space and will incur storage charges. However, these uploads are not visible when you list the contents of a bucket and (until today’s release) had to be explicitly removed.


Expired Object Delete Markers – S3’s versioning feature allows you to preserve, retrieve, and restore every version of every object stored in a versioned bucket. When you delete a versioned object, a delete marker is created. If all previous versions of the object subsequently expire, an expired object delete marker is left. These markers do not incur storage charges. However, removing unneeded delete markers can improve the performance of S3’s LIST operation.


New Rules
You can now exercise additional control over these objects using some new lifecycle rules, lowering your costs and improving performance in the process. As usual, you can set these up using the AWS Management Console, the S3 APIs, the AWS Command Line Interface (CLI), or the AWS Tools for Windows PowerShell.


Here’s how you set up a rule for incomplete multipart uploads using the Console. Start by opening the console and navigating to the desired bucket (mine is called jbarr):



Then click on Properties, open up the Lifecycle section, and click on Add rule:



Decide on the target (the whole bucket or the prefixed subset of your choice) and then click on Configure Rule:



Then enable the new rule and select the desired expiration period:



As a best practice, we recommend that you enable this setting even if you are not sure that you are actually making use of multipart uploads. Some applications will default to the use of multipart uploads when uploading files above a particular, application-dependent, size.


Here’s how you set up a rule to remove delete markers for expired objects that have no previous versions:



S3 Best Practices
While you are here, here are some best practices that you should consider using for your own S3-based applications:


Versioning – You can enable Versioning for your S3 buckets in order to be able to recover from accidental overwrites and deletes. With versioning turned on, you can preserve, retrieve, and restore earlier versions of your data.


Replication – Take advantage of S3’s Cross-Region Replication in order to meet your organization’s compliance policies by creating a replica of your data in a second AWS Region.


Performance -If you anticipate a consistently high number of PUT, LIST, DELETE, or GET requests against your buckets, you can optimize your application’s performance by implementing the tips outlined in the performance section of the Amazon S3 documentation.


Cost Management – You can reduce your costs by setting up S3 lifecycle policies that will transition your data to other S3 storage tiers or expire data that is no longer needed.


Jeff;

 



No comments:

Post a Comment