AWS Glacier – why to avoid

I use Glacier for long-term backups. I was pretty happy with the upload process (setup, performance) until I wanted to get rid of a vault I had provisioned for a long-term backup (1.5 to 2 years old). Unfortunately, you cannot simply delete a vault that is not empty. You have to remove all the data first, and that is a very slow process. Another disadvantage is slow data retrieval compared to competitors.

I found a solution at https://superuser.com/questions/687785/how-to-delete-all-glacier-data (6th answer). You need 4 steps (!) to empty and delete a vault, because the archives have to be deleted one by one.

The steps are:

  • Retrieve the inventory: you only request that it be prepared; the job runs remotely and took around 4-6 hours in my case
  • Get the inventory IDs: this is much faster, a few seconds to download a 200-300 MB JSON file with the IDs
  • Request deletion file by file, based on the retrieved IDs: this is the longest part
  • Delete the vault once it is empty
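For reference, here is a minimal Python sketch of the inventory and deletion steps. This is not the script from the linked answer, just my assumption of how it could look with the boto3 Glacier client. The function names and the `poll_seconds` parameter are made up for illustration. The `client` argument is expected to be whatever `boto3.client("glacier")` returns, and `accountId="-"` means "the account of the current credentials".

```python
import json
import time


def request_inventory(client, vault_name):
    """Ask Glacier to prepare the vault inventory and return the job ID."""
    response = client.initiate_job(
        accountId="-",
        vaultName=vault_name,
        jobParameters={"Type": "inventory-retrieval"},
    )
    return response["jobId"]


def wait_for_job(client, vault_name, job_id, poll_seconds=900):
    """Poll until the inventory job is completed (this can take hours)."""
    while True:
        job = client.describe_job(
            accountId="-", vaultName=vault_name, jobId=job_id
        )
        if job["Completed"]:
            return
        time.sleep(poll_seconds)


def fetch_archive_ids(client, vault_name, job_id):
    """Download the inventory JSON and extract the archive IDs."""
    output = client.get_job_output(
        accountId="-", vaultName=vault_name, jobId=job_id
    )
    inventory = json.loads(output["body"].read())
    return [archive["ArchiveId"] for archive in inventory["ArchiveList"]]


def delete_archives(client, vault_name, archive_ids):
    """Delete the archives one by one; returns how many were requested."""
    for archive_id in archive_ids:
        client.delete_archive(
            accountId="-", vaultName=vault_name, archiveId=archive_id
        )
    return len(archive_ids)
```

With a real client you would call these in order and finish with `client.delete_vault(accountId="-", vaultName=...)`. As far as I know, that last call only succeeds once Glacier has refreshed its own inventory of the vault, so expect to wait before the vault can actually be dropped. The sketch has no retries or error handling, which you would want for hundreds of thousands of archives.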

The entire process took over 3 weeks to remove 1.3 TB (over 600k files) and finally drop the vault. I had the impression that the less data remained, the faster it was removed (I have no idea why it behaved like that).

This experience shows that we have to be very careful about which solution we choose. Next time I will use Azure, as it has lower storage costs and better retrieval conditions (or GCP, but its multi-region prices are higher).
