If you ask me what the best way to upload files in a server-to-server application is, I would tell you to upload them directly to an external storage service like Amazon S3 with a Rails API.
However, we have two options. We can expose an endpoint that accepts files and let Rails handle the logic of saving them into an Amazon S3 bucket, or we can expose an endpoint that returns a presigned URL from S3 and upload the file directly to the S3 bucket.
But first, let’s explain what Amazon S3 is and what a presigned URL is.
Amazon S3 stands for Amazon Simple Storage Service, and according to AWS, “it is an object storage service that offers industry-leading scalability, data availability, security, and performance”.
Amazon S3 uses a presigned URL to give a user access to upload a specific object to a bucket without AWS security credentials or permissions. However, the creator of the presigned URL does need AWS security credentials and must specify a bucket name, an object key, an HTTP method, and an expiration date and time.
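To make the idea concrete, here is a deliberately simplified sketch of how such a URL can be produced. This is not the real AWS Signature Version 4 algorithm, and the helper name, secret, and URL shape are invented for illustration; it only shows the principle that the server signs the request details with a secret so the recipient can use the URL without holding credentials:

```ruby
require "openssl"
require "cgi"

# Toy illustration (NOT real AWS SigV4): the server signs the bucket, key,
# HTTP method, and expiry with a secret key and appends the signature to
# the URL. The storage service recomputes the signature on its side to
# validate the request, and rejects the URL once the expiry has passed.
def toy_presigned_url(bucket:, key:, method:, expires_in:, secret:)
  expires_at = Time.now.to_i + expires_in
  string_to_sign = "#{method}\n#{bucket}\n#{key}\n#{expires_at}"
  signature = OpenSSL::HMAC.hexdigest("SHA256", secret, string_to_sign)
  "https://#{bucket}.s3.amazonaws.com/#{CGI.escape(key)}" \
    "?expires=#{expires_at}&signature=#{signature}"
end

url = toy_presigned_url(bucket: "my-bucket", key: "uploads/avatar.png",
                        method: "PUT", expires_in: 600, secret: "s3cr3t")
```

Anyone holding `url` can now perform exactly that one PUT until the expiry, without ever seeing the secret.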
Now, let’s go back to the ways of uploading files that we were talking about.
Choosing between the two options we mentioned before really depends on the requirements of our application, and choosing one over the other can save you a lot of money.
Here are some things you might need to consider:
1. A lower request time-out
Services like Heroku terminate a request if it takes more than 30 seconds to complete, so we can run into time-out problems while uploading large files through our own server.
One solution for uploading a large file is to break it into small chunks, send them, and then merge them back together on the server. More could be said about that approach, but it’s a topic for another post.
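The split/merge idea can be sketched in plain Ruby. The helper names and the tiny chunk size are invented for illustration; real implementations (such as S3 multipart upload) also track part numbers and checksums:

```ruby
# Toy sketch of chunked upload: split a payload into fixed-size chunks
# (as a client would before sending), then reassemble them in order
# (as the server would after receiving them all).
CHUNK_SIZE = 5 # bytes, deliberately tiny for the example

def split_into_chunks(data, size)
  data.bytes.each_slice(size).map { |slice| slice.pack("C*") }
end

def merge_chunks(chunks)
  chunks.join
end

chunks = split_into_chunks("hello world, this is a large file", CHUNK_SIZE)
merged = merge_chunks(chunks)
```

Because each chunk is a small independent request, no single request risks hitting the 30-second limit.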
2. The round trip flow
From the moment we make a request to our Rails API, many steps happen before we get a response. The following diagram gives us a better idea of the process.
Diagram 1: The client makes a request to upload the binary files (1). The API processes the file to extract information (2) and then creates a record with the blob data (3). The API connects to Amazon S3 (4) and stores the file in the bucket (5). Amazon S3 sends back a success response (6), and the API creates a record to attach the blob information (7). Finally, the API sends a response to the client (8).
As we can see, the backend handles all of the logic to communicate with the storage service. It saves the files and creates the records in the database, so we can reference the saved objects in our application later on. But, this process can take more time and consume more resources than necessary in our server.
How does the direct upload flow work?
Perhaps you think the standard flow is better and faster because the client interacts with the server only once, unlike the direct upload flow, where it makes three requests. But is it better for the client or for the server?
With a direct upload flow, we leave more work to the client and less to the server!
There is now less load on our server, because it no longer has to receive the files in our Rails application and then pass them off to the S3 bucket. This can save a lot of time and improve the user experience.
Another advantage is that the presigned URL can be initialized with an expiration time, so you won’t have to worry about URLs that stay valid forever.
Diagram 2.1: The client makes the first request to send the file data (1), and the backend creates a record in the database (2). The backend then connects to Amazon S3 (3) to return a blob identifier and a presigned URL that expects a file with the specifications received (4), and the frontend saves the presigned URL (5) for the next step.
Diagram 2.2: For the second request, instead of connecting to our API, the client sends the binary files (1) directly to Amazon S3 with the URL received, S3 stores them in the bucket (2), and then returns a success or failure response (3).
Diagram 2.3: Rails needs the blob signed ID (generated in the first step) (1) to attach the file to a model with Active Storage (2), and then it responds to the client (3).
Rails makes direct uploads easy for us!
Active Storage provides a set of methods that make it easier to implement presigned URLs. Amazon also offers a gem called “aws-sdk-s3“, which provides different ways to generate presigned URLs, such as the Object class methods for POST or PUT requests, or the PresignedPost and Presigner classes, with which Active Storage is no longer necessary.
For this example we need to:
- Configure your AWS S3 Bucket.
- Create a Ruby on Rails API.
- Add the gem ‘aws-sdk-s3’.
- Add the gem ‘dotenv-rails’ to manage our secret keys.
- Create a User Model.
- Create the Active Storage tables and configure the Active Storage service to use Amazon.
- Create a service that creates presigned URLs.
- Create a controller to expose the endpoint.
For a step-by-step tutorial with an example, you can visit my GitHub repo and test it yourself.
This is the code that you should check:
- Controller: app/controllers/api/direct_upload_controller.rb
- Service: app/services/direct_upload.rb
- User model: app/models/user.rb
Controller – Api::DirectUploadController
Route: POST /api/presigned_url api/direct_upload#create
According to diagram 2.1 above, we need to expose an endpoint that receives the file data (blob_params) and returns a presigned URL.
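Based on that route, the controller might look roughly like this sketch (the exact code lives in the repo and may differ in details):

```ruby
# app/controllers/api/direct_upload_controller.rb
module Api
  class DirectUploadController < ApplicationController
    # POST /api/presigned_url
    def create
      # Delegate blob creation and URL generation to the service object.
      render json: DirectUpload.call(blob_params), status: :ok
    end

    private

    # The client sends the file's metadata here, not the file itself.
    def blob_params
      params.require(:file)
            .permit(:filename, :byte_size, :checksum, :content_type)
    end
  end
end
```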
Service – DirectUpload
Invoke this service with DirectUpload.call(…). It creates the blob record in the active_storage_blobs table and then, with the Active Storage Blob instance methods, builds the presigned_url (service_url_for_direct_upload) and the signed_id. When using these Active Storage methods, both the “Content-Type” and “Content-MD5” headers (service_headers_for_direct_upload) must be included in the PUT request to S3.
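Putting that description together, the service might look roughly like this sketch (the actual repo code may differ in details):

```ruby
# app/services/direct_upload.rb
class DirectUpload
  def self.call(blob_params)
    # 1. Create the blob record in the active_storage_blobs table.
    blob = ActiveStorage::Blob.create_before_direct_upload!(
      filename:     blob_params[:filename],
      byte_size:    blob_params[:byte_size],
      checksum:     blob_params[:checksum],
      content_type: blob_params[:content_type]
    )

    # 2. Return the presigned URL, the headers S3 expects on the PUT,
    #    and the signed ID the client sends back in the third request.
    {
      direct_upload: {
        url:     blob.service_url_for_direct_upload,
        headers: blob.service_headers_for_direct_upload
      },
      blob_signed_id: blob.signed_id
    }
  end
end
```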
You can test other SDK methods invoked in this service with my post_request_for_presigned_url() and put_request_for_presigned_url() methods (take a look at this class for more details). The first one returns an XML success response and the other one doesn’t.
Model – User
Finally, we need a helper to access the file’s public_url. For this example, we have a User that has an avatar, and we can access the avatar through the avatar_url instance method.
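A minimal sketch of such a model, assuming a has_one_attached avatar (the actual helper in the repo may differ):

```ruby
# app/models/user.rb
class User < ApplicationRecord
  has_one_attached :avatar

  # Returns a URL for the attached avatar, or nil if none is attached.
  def avatar_url
    Rails.application.routes.url_helpers.url_for(avatar) if avatar.attached?
  end
end
```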
And that is basically all we need to do in our Rails API to implement a direct upload to S3.
Testing our API endpoints
The first request looks like this:
And this is the AWS PUT request (don’t forget to include the headers):
The last step is to attach the blob (the record created in the active_storage_blobs table) to the User model, which means creating a new record in the active_storage_attachments table. To do that, we create a new User (see the CreateUser service) or update an existing one, passing the blob_signed_id instead of the image file.
As explained in the previous diagrams, we can see that the client makes three requests.
And that’s it! The direct upload flow is now complete.
I hope this article has been helpful to you.
About the author
Arely graduated in Computer Science Engineering from Universidad Don Bosco and has over two years of experience in software development. She’s currently a Ruby on Rails developer at Applaudo Studios.