Delete all versions (except the latest) of all files in a versioned S3 bucket using the AWS CLI and jq.
#!/bin/bash
# Usage: pass the bucket name as the first argument.
set -e
bucket=$1
echo "Removing all versions from $bucket"
# Collect every non-latest version and every delete marker in the bucket.
versions=$(aws s3api list-object-versions --bucket "$bucket" | jq '.Versions | .[] | select(.IsLatest | not)')
markers=$(aws s3api list-object-versions --bucket "$bucket" | jq '.DeleteMarkers')
echo "removing files"
for version in $(echo "${versions}" | jq -r '@base64'); do
  version=$(echo "${version}" | base64 --decode)
  key=$(echo "$version" | jq -r .Key)
  versionId=$(echo "$version" | jq -r .VersionId)
  echo "aws s3api delete-object --bucket $bucket --key $key --version-id $versionId"
  # Quoted arguments keep keys that contain spaces in one piece.
  aws s3api delete-object --bucket "$bucket" --key "$key" --version-id "$versionId"
done
echo "removing delete markers"
for marker in $(echo "${markers}" | jq -r '.[] | @base64'); do
  marker=$(echo "${marker}" | base64 --decode)
  key=$(echo "$marker" | jq -r .Key)
  versionId=$(echo "$marker" | jq -r .VersionId)
  echo "aws s3api delete-object --bucket $bucket --key $key --version-id $versionId"
  aws s3api delete-object --bucket "$bucket" --key "$key" --version-id "$versionId"
done
@nashjain

nashjain commented Apr 8, 2020

There is actually a much simpler and faster approach:

bucket=$1
fileToDelete=$2
deleteBefore=$3
fileName='aws_delete.json'
# -f avoids an error when the file does not exist yet.
rm -f "$fileName"
versionsToDelete=$(aws s3api list-object-versions --bucket "$bucket" --prefix "$fileToDelete" --output json --query "Versions[?(LastModified<'$deleteBefore')].{Key: Key, VersionId: VersionId}")
cat << EOF > "$fileName"
{"Objects":$versionsToDelete, "Quiet":true}
EOF
aws s3api delete-objects --bucket "$bucket" --delete "file://$fileName"

s3api delete-objects can handle up to 1000 objects per request.

Want to do more advanced stuff? Check out my gist.
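
Because of the 1000-object limit mentioned above, a listing larger than that has to be split before it is sent to delete-objects. The following is only a rough sketch of one way to do the split, assuming jq is installed. `batch_payloads` and `delete_in_batches` are hypothetical helper names, not part of the AWS CLI or of the snippet above.

```shell
#!/bin/bash
# batch_payloads (hypothetical helper): read a JSON array of {Key, VersionId}
# objects on stdin and print one compact delete-objects payload per line,
# each holding at most $1 objects (default 1000, the per-request limit).
batch_payloads() {
  local size=${1:-1000}
  jq -c --argjson n "$size" \
    'range(0; length; $n) as $i | {Objects: .[$i:$i + $n], Quiet: true}'
}

# delete_in_batches (hypothetical helper): send each payload line read from
# stdin to S3 as a separate delete-objects request.
delete_in_batches() {
  local bucket=$1 payload
  while IFS= read -r payload; do
    # </dev/null keeps the CLI from consuming the remaining payload lines.
    aws s3api delete-objects --bucket "$bucket" --delete "$payload" </dev/null
  done
}
```

With the variables from the snippet above, this could be used as `echo "$versionsToDelete" | batch_payloads | delete_in_batches "$bucket"`. An empty or null listing produces no payloads, so no request is sent in that case.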

@pecigonzalo
Author

I do not agree that this is cleaner, to be honest, though it is probably faster given that it handles 1000 objects at a time. Your example does not seem to clean delete markers either, and it has deleteBefore and other stuff that is not really related to the intention of this script.

Regarding your gist, it's not really more advanced stuff, it's simply different stuff.

@nashjain

nashjain commented Apr 9, 2020

The delete-objects api does not leave delete markers. So you don't have to worry about them separately. Please correct me if I'm missing something here.

deleteBefore is an option you could pass in to delete only the versions older than a certain date. If you want to delete everything, you can pass today's date.

Regarding my gist, it has more server-side filtering options. For example, I want to leave 1 version of the file for each day and delete the rest of them. Just showing how you can do server-side filtering.

@pecigonzalo
Author

pecigonzalo commented Apr 9, 2020

The API itself might not, I can't remember to be honest, but if you had objects there, you potentially have delete markers from other things.

I understand what deleteBefore is doing, but it's not required in the context of this script, as it's a script to delete ALL object versions except the latest. No options are needed for this script.

@ranjiniganeshan

aws s3api delete-object --bucket ******* --key ****/*****/extensions/Common 1201 (Extension).Zip

Filenames with spaces are not getting deleted.

@pecigonzalo
Author

@ranjiniganeshan, try modifying lines 18 and 29 to ensure that $key is escaped when executed. Like \"$key\".

@ranjiniganeshan

@pecigonzalo Thanks for the help, but the issue is still not resolved.
aws s3api delete-object --bucket *************** --key "-/****/extensions/*****LIC 10.6.3 Build 101(Extension).zip" --version-id p7wu7kcqrqfuwvdM7zQhFvLgm9hIL8P9

Unknown options: Build, 101(Extension).zip", 10.6.3

@pecigonzalo
Author

As you can see from the split, something is probably stripping or escaping the double quotes. You will have to check for that; you might want to try using \' instead of \".
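
One likely cause of the split in the error above: when bash expands an unquoted string variable such as $cmd, it breaks the value on whitespace, and any quote characters stored inside the string are passed along as literal characters rather than being re-parsed. The sketch below illustrates this under the assumption that the script runs in bash. `count_args` is a hypothetical stand-in for `aws`, and the key is a made-up example.

```shell
#!/bin/bash
# count_args is a hypothetical stand-in for `aws`: it prints how many
# arguments it received.
count_args() { echo "$#"; }

key='extensions/Common 1201 (Extension).zip'   # made-up key containing spaces

# String form: the embedded quotes are literal characters, and the unquoted
# expansion still splits the key on every space.
cmd="count_args --key \"$key\""
string_count=$($cmd)

# Array form: each element stays exactly one argument when the array is
# expanded as "${cmd_arr[@]}".
cmd_arr=(count_args --key "$key")
array_count=$("${cmd_arr[@]}")

echo "string form: $string_count arguments"
echo "array form: $array_count arguments"
```

Applied to the script, the same idea would be to build the command as an array, for example cmd=(aws s3api delete-object --bucket "$bucket" --key "$key" --version-id "$versionId"), print it with echo "${cmd[*]}", and run it with "${cmd[@]}".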

@Tarvinder91

Tarvinder91 commented Nov 17, 2020

@nashjain your solution is more efficient as it deletes 1000 objects at a time, which is far faster than deleting one object at a time. I had millions of objects, and listing all of them made the list query time out, so we needed to limit the results.

Also very important: if you delete a delete marker while the object still has previous versions, the object will reappear in the bucket as if it had never been deleted. So delete markers should only be deleted once the previous versions of the object are fully deleted.
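
The ordering rule above can be checked before removing anything. The following is a sketch only, assuming jq is installed and that the JSON passed in is a complete (not truncated) list-object-versions result. `orphan_markers` is a hypothetical helper name.

```shell
#!/bin/bash
# orphan_markers (hypothetical helper): read `list-object-versions` JSON on
# stdin and print, as a compact JSON array, only the delete markers whose key
# has no remaining object versions in that same listing.
orphan_markers() {
  jq -c '((.Versions // []) | map(.Key)) as $live
         | [ (.DeleteMarkers // [])[]
             | select(.Key as $k | any($live[]; . == $k) | not)
             | {Key, VersionId} ]'
}
```

The resulting array has the same {Key, VersionId} shape that delete-objects expects in its Objects field, so it could be wrapped into a payload once the older versions are gone. Markers for keys that still have versions are left out on purpose.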

My comprehensive script to delete the previous versions of object except the latest one is here.

Another caveat: if you have more than half a million objects under a prefix, use this command to limit the results:
aws s3api list-object-versions --prefix "$prefix" --bucket "$bucket" --max-items 300000
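
When --max-items cuts a listing short, the AWS CLI includes a NextToken value in its output, and passing that value back with --starting-token resumes the listing. Below is a minimal sketch of paging through a prefix this way, assuming jq is installed and JSON output. `list_all_versions` is a hypothetical helper name, and the default page size of 1000 is an arbitrary choice.

```shell
#!/bin/bash
# list_all_versions (hypothetical helper): page through a prefix and print one
# compact JSON document per page.
# Arguments: bucket, prefix, optional page size (default 1000).
list_all_versions() {
  local bucket=$1 prefix=$2 page_size=${3:-1000} token="" page
  local -a args
  while :; do
    args=(s3api list-object-versions --bucket "$bucket" --prefix "$prefix"
          --output json --max-items "$page_size")
    # Resume from the previous page when a token is available.
    if [ -n "$token" ]; then
      args+=(--starting-token "$token")
    fi
    page=$(aws "${args[@]}")
    printf '%s\n' "$page" | jq -c .
    # NextToken is present only when more items remain.
    token=$(printf '%s\n' "$page" | jq -r '.NextToken // empty')
    [ -n "$token" ] || break
  done
}
```

Each printed line can then be processed on its own, for example by extracting the non-latest versions from that page and deleting them before the next page is fetched.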
