In today’s digital landscape, delivering high-quality video content across a variety of devices and network conditions is more important than ever. Whether you're developing a streaming platform, an online learning portal, a social media app, or any application requiring video playback, seamless video streaming is essential for an optimal user experience.
Automating video conversion using a cloud-based encoding service allows you to effortlessly generate adaptive streams, ensuring the best video quality for your users while reducing infrastructure complexities. Let’s explore how you can implement this solution to meet the demands of your application. Our goal is to create a solution that’s easy to deploy, doesn’t require maintenance, and supports a growing user base.
We'll focus on HLS (HTTP Live Streaming), one of the most popular video streaming protocols. HLS relies on encoding and segmentation: the original video is encoded into multiple versions with different bitrates and resolutions, and each encoded version is divided into small chunks, typically 2 to 10 seconds long, with each chunk stored as an individual file. Playlists are then created: first, a playlist is generated for each encoded version of the video, containing the URLs of its chunks.
Next, a single master playlist is created, referencing the different versions of the stream (and their corresponding playlists) along with their resolutions and bitrates. This setup allows client apps to select the optimal stream based on factors like the client's device resolution, viewport size, and network conditions, ensuring only the necessary segments are downloaded.
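To make this concrete, here is roughly what a master playlist and one of its rendition playlists might look like (the file names, bandwidth values, and durations below are made up for illustration):

#EXTM3U
#EXT-X-STREAM-INF:BANDWIDTH=500000,RESOLUTION=640x360
video-360p.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=1200000,RESOLUTION=1280x720
video-720p.m3u8

A rendition playlist such as video-720p.m3u8 then lists the individual segments:

#EXTM3U
#EXT-X-VERSION:3
#EXT-X-TARGETDURATION:6
#EXTINF:5.0,
video-720p-00001.ts
#EXTINF:5.0,
video-720p-00002.ts
#EXT-X-ENDLIST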
There are many tools available for creating HLS streams from videos. These include running FFmpeg directly or using cloud-based services to handle the conversion and avoid infrastructure management. Examples of such services include AWS Elemental MediaConvert, Google Cloud Transcoder, Bitmovin, and others. In this post, we will focus on MediaConvert. Below is a possible workflow for automatically converting uploaded videos to HLS and making the streams available to users. As you go through the workflow, please refer to the attached diagram, where each step is labeled.
A user uploads a video to an S3 bucket using the mobile or web client app.
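How the upload itself happens is an implementation detail of the client app; one common approach is for the backend to hand the client a presigned S3 URL to upload to. Here is a minimal sketch of that idea (the bucket name, key scheme, and expiry are assumptions, not part of the workflow above):

import boto3

s3_client = boto3.client('s3')

def get_upload_url(video_id):
    # generate a presigned PUT URL that the client app can upload the video to;
    # 'video-uploads-bucket' is a placeholder for the uploads bucket name
    return s3_client.generate_presigned_url(
        'put_object',
        Params={'Bucket': 'video-uploads-bucket', 'Key': video_id},
        ExpiresIn=3600,  # URL is valid for one hour
    )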
A Lambda function is triggered by the ObjectCreated event in the Video Uploads S3 bucket. This function creates a MediaConvert job using the provided configuration and then exits (it doesn't wait for the video conversion to complete). The MediaConvert API offers various settings, including codec selection, bitrate, quality, audio processing, and more. It can also generate multiple renditions of the stream with different compression settings, such as 360p, 720p, 1080p, etc.
While selecting the encoding configuration is out of scope for this post, the code sample includes a basic HLS packaging job with one rendition at a 1 Mbps bitrate. The configuration can easily be extended to meet the requirements of each application. As for IAM permissions, this function needs access to the MediaConvert API and permission to pass the MediaConvert role (iam:PassRole); the MediaConvert role it passes needs read access to the source S3 bucket and write access to the destination S3 bucket.
import boto3
import re
import urllib.parse

output_bucket_name = 'converted-videos-bucket'
mediaconvert_role_arn = 'arn:aws:iam::123456789012:role/MediaConvertRole'  # role passed to MediaConvert; needs access to the source and output buckets
s3_client = boto3.client('s3')
mediaconvert_client = boto3.client('mediaconvert')
# expected suffix of the master playlist that MediaConvert writes to the output bucket
hls_main_playlist_suffix = '-hls.m3u8'
# regex used to normalize the object key for the client request token
client_request_token_symbols_to_skip = r'[^a-zA-Z0-9-_]'

def lambda_handler(event, context):
    # get S3 bucket name and object key from the event
    # (object keys in S3 event notifications are URL-encoded)
    bucket_name = event['Records'][0]['s3']['bucket']['name']
    object_key = urllib.parse.unquote_plus(event['Records'][0]['s3']['object']['key'])  # also used as media id
    # normalize the object key for the client request token
    client_request_token_obj_key = re.sub(client_request_token_symbols_to_skip, '_', object_key)
    # call MediaConvert to transcode the video
    create_job_response = mediaconvert_client.create_job(
        Role=mediaconvert_role_arn,
        ClientRequestToken=client_request_token_obj_key,
        Settings={
            'Inputs': [
                {
                    'FileInput': f's3://{bucket_name}/{object_key}',
                    'AudioSelectors': {
                        'Audio Selector 1': {
                            'DefaultSelection': 'DEFAULT',
                        },
                    },
                }
            ],
            'OutputGroups': [
                {
                    'Name': 'DefaultOutputGroup',
                    'OutputGroupSettings': {
                        'Type': 'HLS_GROUP_SETTINGS',
                        'HlsGroupSettings': {
                            # MediaConvert appends '.m3u8' to this path when naming
                            # the master playlist: '<object_key>-hls.m3u8'
                            'Destination': f's3://{output_bucket_name}/{object_key}-hls',
                            'DirectoryStructure': 'SUBDIRECTORY_PER_STREAM',
                            'SegmentLength': 5,
                            'MinSegmentLength': 2,
                            'SegmentsPerSubdirectory': 500,
                            'ProgressiveWriteHlsManifest': 'DISABLED',
                        },
                    },
                    'Outputs': [
                        {
                            'NameModifier': '-h264',
                            'ContainerSettings': {
                                'Container': 'M3U8',
                            },
                            'VideoDescription': {
                                'CodecSettings': {
                                    'Codec': 'H_264',
                                    'H264Settings': {
                                        'RateControlMode': 'VBR',
                                        'Bitrate': 1000000,
                                    },
                                },
                            },
                            'AudioDescriptions': [
                                {
                                    'AudioSourceName': 'Audio Selector 1',
                                    'CodecSettings': {
                                        'Codec': 'AAC',
                                        'AacSettings': {
                                            'Bitrate': 96000,
                                            'CodingMode': 'CODING_MODE_2_0',
                                            'SampleRate': 48000,
                                        },
                                    },
                                },
                            ],
                        },
                    ],
                }
            ],
        },
    )
    print('Created a MediaConvert job:', create_job_response)
    return {
        'statusCode': 200,
        'body': 'OK',
    }
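Regarding the IAM permissions mentioned above, a minimal policy sketch for this function's execution role could look like the following (the account id and role name are placeholders; CloudWatch logging permissions are omitted). The MediaConvert role itself is a separate role that trusts mediaconvert.amazonaws.com and carries the S3 read and write permissions:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "mediaconvert:CreateJob",
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": "iam:PassRole",
      "Resource": "arn:aws:iam::123456789012:role/MediaConvertRole"
    }
  ]
}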
MediaConvert processes the video and generates HLS playlists and video segments in the output S3 bucket. The output bucket is connected to a CDN that caches playlists and video segments. In this example, we're using CloudFront, but any CDN that can use an S3 bucket as an origin will work.
Another Lambda function is triggered by the ObjectCreated event in the output bucket. An object name filter is attached to that trigger to ensure the function only runs when the master playlist file is created (segment files and rendition playlists are ignored), as sketched below.
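One way to attach such a filter is a suffix rule on the S3 event notification, so only keys ending in '-hls.m3u8' (the master playlist) invoke the function. A sketch using boto3, with a placeholder Lambda ARN:

import boto3

s3_client = boto3.client('s3')

# subscribe the Lambda to ObjectCreated events in the output bucket,
# but only for keys ending in '-hls.m3u8' (the master playlist);
# note: this call replaces the bucket's existing notification configuration
s3_client.put_bucket_notification_configuration(
    Bucket='converted-videos-bucket',
    NotificationConfiguration={
        'LambdaFunctionConfigurations': [
            {
                'LambdaFunctionArn': 'arn:aws:lambda:us-east-1:123456789012:function:on-playlist-created',
                'Events': ['s3:ObjectCreated:*'],
                'Filter': {
                    'Key': {
                        'FilterRules': [
                            {'Name': 'suffix', 'Value': '-hls.m3u8'},
                        ],
                    },
                },
            },
        ],
    },
)
# the Lambda's resource policy must also allow S3 to invoke it, which is not shown here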
This function adds the playlist URL to the media record in the database. The storage layer is beyond the scope of this post, so in the code sample, the URL is simply printed.
import boto3
import urllib.parse

s3_client = boto3.client('s3')

def lambda_handler(event, context):
    # this function is triggered only when a playlist file
    # with an object key that looks like '<video_id>-hls.m3u8'
    # is created in the S3 bucket
    # get the object key from the event
    # (object keys in S3 event notifications are URL-encoded)
    object_key = urllib.parse.unquote_plus(event['Records'][0]['s3']['object']['key'])
    # extract the video id from the object key
    video_id = object_key.replace('-hls.m3u8', '')
    print(f'HLS playlist {object_key} created for video {video_id}')
    # TODO: update the video record in the database
    return {
        'statusCode': 200,
        'body': 'OK',
    }
When users open the video in the client app's UI, the client app retrieves the media record from the database using the API. This media record contains the master playlist URL.
The video player fetches the master playlist from the CDN and decides which stream to play based on factors such as viewport size, screen resolution, network conditions, etc. Then it fetches the stream playlist and the video segments from the CDN and starts playing the video.
This solution is very easy to deploy and requires no maintenance. Regarding scalability for many users, it's important to note that, by default, MediaConvert jobs are added to a single queue that can process 100–200 videos concurrently (depending on the region). Additional queues can be created (up to 10 per region), and priorities can be assigned to jobs when they are added to queues. There is also an option to request quota increases from AWS.
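If the default queue becomes a bottleneck, extra queues and job priorities can be used. A minimal sketch (the queue name and priority value are arbitrary; the role ARN and job settings are the same ones used in the Lambda above):

import boto3

mediaconvert_client = boto3.client('mediaconvert')

# create an additional on-demand queue for time-sensitive conversions
response = mediaconvert_client.create_queue(Name='high-priority-queue')
high_priority_queue_arn = response['Queue']['Arn']

def create_high_priority_job(role_arn, job_settings):
    # submit a job to the new queue; Priority ranges from -50 (lowest) to 50 (highest)
    return mediaconvert_client.create_job(
        Role=role_arn,
        Queue=high_priority_queue_arn,
        Priority=25,
        Settings=job_settings,
    )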
In conclusion, automating video conversion using cloud-based services like AWS Elemental MediaConvert is an effective way to deliver high-quality streaming content across devices without the burden of managing complex infrastructure. This approach not only simplifies the video encoding process but also enhances scalability, ensuring that your platform can handle growing demand.
By leveraging tools such as S3, Lambda functions, and CloudFront in conjunction with MediaConvert, you can efficiently generate and deliver adaptive HLS streams, providing users with an optimized viewing experience.