Upload files to S3 using presigned URLs

Recently, I had to implement file uploads in one of my applications. Now, you would probably think: just use django-storages and be done with it. Yes, but actually no. There is one big problem with the setup of that package: you are always routing the files through your own server. This can be nice for validating files before storing them, but for large files or people with bad internet connections, it can be a major issue. Servers time out easily and bandwidth is often limited. Extending the time-out period is a bad idea for many reasons, one being that it is much easier to DDoS a server when it spends longer on each request.

So, what now? Create presigned URLs on the server and then let users upload to that specific URL. AWS S3 has API endpoints for this. It's fairly easy to set up and takes the heavy lifting off your server. On top of that, it can protect you against path traversal attacks.

Now, I will use AWS S3 for this tutorial, but you can use any provider that supports the full S3 API (Ceph and MinIO should both work fine). There are many reasons why someone would use AWS, but personally, I try to stay away from them as much as possible.

For this project, I am using a blank Django project with DRF installed. Make sure to install boto3 with pip install boto3.

Up next, create a new file called s3.py and put this class in it:

from django.conf import settings
import boto3
from botocore.config import Config


class S3:
    def __init__(self):
        self.client = boto3.client('s3',
                                   region_name=settings.AWS_REGION,
                                   endpoint_url=settings.AWS_S3_ENDPOINT_URL,
                                   aws_access_key_id=settings.AWS_ACCESS_KEY_ID,
                                   aws_secret_access_key=settings.AWS_SECRET_ACCESS_KEY,
                                   config=Config(signature_version='s3v4')
                                   )

    def get_presigned_url(self, key, time=3600):
        # Presigned PUT URL: anyone holding it can upload to this key until it expires.
        return self.client.generate_presigned_url(ClientMethod='put_object', ExpiresIn=time,
                                                  Params={'Bucket': settings.AWS_STORAGE_BUCKET_NAME, 'Key': key})

    def get_file(self, key, time=3600):
        # Presigned GET URL for temporary, read-only access to the file.
        return self.client.generate_presigned_url(ClientMethod='get_object', ExpiresIn=time,
                                                  Params={'Bucket': settings.AWS_STORAGE_BUCKET_NAME, 'Key': key})

    def delete_file(self, key):
        # Deletes happen from the server side, so no presigning is needed here.
        return self.client.delete_object(Bucket=settings.AWS_STORAGE_BUCKET_NAME, Key=key)

Also add these things in your settings:

AWS_REGION
AWS_STORAGE_BUCKET_NAME
AWS_S3_ENDPOINT_URL
AWS_ACCESS_KEY_ID
AWS_SECRET_ACCESS_KEY

Please do not put the credentials directly into your settings files; use something like django-environ to pull them from a .env file or environment variables.
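A minimal sketch of what that could look like in settings.py, assuming django-environ is installed and a .env file sits next to your settings (adjust the read_env() call to your project layout):

# settings.py
import environ

env = environ.Env()
environ.Env.read_env()  # reads variables from a .env file, if present

AWS_REGION = env('AWS_REGION')
AWS_STORAGE_BUCKET_NAME = env('AWS_STORAGE_BUCKET_NAME')
AWS_S3_ENDPOINT_URL = env('AWS_S3_ENDPOINT_URL')
AWS_ACCESS_KEY_ID = env('AWS_ACCESS_KEY_ID')
AWS_SECRET_ACCESS_KEY = env('AWS_SECRET_ACCESS_KEY')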

The class we just created will help us easily call the boto3 code. We can now do S3().get_presigned_url('pictures/hello.png') to get an upload endpoint for pictures/hello.png. Note that S3 will overwrite objects when you upload to the same key twice, so it's a good idea to add a unique identifier, such as the primary key or a UUID. Also note that the examples below are very basic and straight to the point. You will probably have to extend them to your needs.
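For instance, a minimal sketch with a UUID-based key (the pictures/ prefix and key layout are just one possible scheme, not something the S3 class requires):

import uuid

from .s3 import S3

# A uuid prefix keeps repeated uploads of 'hello.png' from overwriting each other.
key = f'pictures/{uuid.uuid4()}/hello.png'
upload_url = S3().get_presigned_url(key)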

To save files to models, you could create a model like this:

# models.py
from django.db import models
from django.db.models.signals import pre_delete
from django.dispatch import receiver

from .s3 import S3


class File(models.Model):
    name = models.CharField(max_length=100)
    ext = models.CharField(max_length=10)

    @property
    def key(self):
        # For the sake of simplicity, this assumes all files have the format <name>.<ext>.
        # Files without an extension would error here.
        return str(self.id) + '-' + self.name.split('.')[0] + '/' + self.name


@receiver(pre_delete, sender=File)
def remove_file(sender, instance, **kwargs):
    S3().delete_file(instance.key)

With that, the file is deleted from S3 automatically once you delete the object. We generate the key automatically based on the pk and the file name. This also means that people can't update the name with the current setup, as that would require the file on S3 to be moved/renamed.
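A quick sketch of how this behaves (the id value is illustrative):

f = File.objects.create(name='hello.png', ext='png')
f.key       # e.g. '1-hello/hello.png'
f.delete()  # the pre_delete signal also removes the object from S3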

Up next, we need to create the API endpoints and the JavaScript code to get it all to work. I won't go into detail on this as it's pretty basic stuff, but I will show you how you could do this in a very basic way. You could add more validation/error handling/confirmation yourself, if necessary.

Uploading files from JavaScript (VueJS):

this.uploading = true
this.$axios.post('api/get_upload_url/', { name: file.name }).then((response) => {
  this.$axios.put(response.data.url, file).then(() => {
    // optionally, you could let the app know that the file upload was successful and mark it as completed.
    this.uploading = false
  })
})

On the back end:

# views.py
from rest_framework.response import Response
from rest_framework.views import APIView

from .s3 import S3
from .serializers import FileSerializer


class FileView(APIView):
    def post(self, request):
        name = request.data['name']
        serializer = FileSerializer(data={'name': name, 'ext': name.rsplit('.', 1)[-1]})
        serializer.is_valid(raise_exception=True)
        obj = serializer.save()

        url = S3().get_presigned_url(obj.key)
        return Response({'url': url, 'id': obj.id})

# serializers.py
from rest_framework import serializers

from .models import File
from .s3 import S3


class FileSerializer(serializers.ModelSerializer):
    file_url = serializers.SerializerMethodField()

    class Meta:
        model = File
        fields = '__all__'

    def get_file_url(self, obj):
        # Presigned GET URL so the front end can show/download the file temporarily.
        return S3().get_file(obj.key)
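One thing not shown above is the URL wiring. A minimal sketch, assuming the 'api/get_upload_url/' path the JavaScript posts to and the module layout from the snippets above:

# urls.py
from django.urls import path

from .views import FileView

urlpatterns = [
    path('api/get_upload_url/', FileView.as_view()),
]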

You could add extra validators in the serializer to make sure the file matches a certain extension (be aware, though, that validating the extension won't block all upload attacks), as sketched below. But I will not get into that too much here.
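As a sketch, an extension whitelist in the serializer could look like this (the allowed set is just an example):

# serializers.py
from rest_framework import serializers

ALLOWED_EXTENSIONS = {'png', 'jpg', 'jpeg', 'pdf'}  # example whitelist


class FileSerializer(serializers.ModelSerializer):
    # ... fields and Meta as above ...

    def validate_name(self, value):
        ext = value.rsplit('.', 1)[-1].lower()
        if ext not in ALLOWED_EXTENSIONS:
            raise serializers.ValidationError(f'.{ext} files are not allowed.')
        return value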

The file_url field automatically adds a temporary download link to the serialized output, so you can use it on the front end.

That's it!

Written by Stan Triepels

Stan is a professional web developer working mainly with Django and VueJS. With years of experience under his belt, he is comfortable writing about his past mistakes and ongoing learnings.