Without encoders, video file sizes would be massive! Powerful hardware is needed to balance image quality and size, and advanced editing isn’t easy either. Thankfully Intel Arc graphics have what creators need.
Tom Petersen and Tom Green take us through the advanced algorithms of video compression, leveraging computer science several decades in the making. As the work behind every digital video today gets more and more complex, specialized computer hardware – like the Intel Xe Media Engine inside Arc graphics – must handle billions of arithmetic and trigonometric calculations every second. Don’t have your calculator handy? Don’t worry, we let TAP explain the math behind video compression and editing.
What is a codec?
Short for “encoder/decoder,” a video codec makes video files smaller, and therefore easier to store, upload, download, and stream. Raw video data would take up too much disk space to ever be stored or transmitted en masse. So how much storage does uncompressed video actually take up?
As TAP and Tom explain, each pixel is made up of three color channels (red, green, and blue), and each channel takes 8 bits (one byte) at standard depth, for 3 bytes per pixel. Now multiply that by about 2 million, the number of pixels on a 1080p screen. Without compression, that’s about 11 GB per minute at 30 FPS, or enough to fill a 1 TB drive with just one full-length feature film. 4K quadruples that, and HDR color can increase the size by another 1.5x. Now multiply this by how much video is transmitted just by YouTube and you’ll see this doesn’t work… even if you used the world’s entire internet capacity!
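To check that arithmetic yourself, here is a minimal Python sketch. The function name raw_video_bytes and the 90-minute film length are illustrative assumptions; the resolution, frame rate, and 3 bytes per pixel come from the figures above.

```python
def raw_video_bytes(width, height, fps, seconds, bytes_per_pixel=3):
    """Size of uncompressed video: every pixel of every frame stored in full."""
    return width * height * bytes_per_pixel * fps * seconds


# One minute of 1080p at 30 FPS, 8 bits per color channel.
one_minute_1080p = raw_video_bytes(1920, 1080, 30, 60)
print(f"1080p30, 1 minute: {one_minute_1080p / 1e9:.1f} GB")    # 11.2 GB

# A feature film, assumed here to run 90 minutes.
film_1080p = raw_video_bytes(1920, 1080, 30, 90 * 60)
print(f"1080p30, 90 minutes: {film_1080p / 1e12:.2f} TB")       # 1.01 TB

# 4K has four times the pixels of 1080p, so four times the data.
one_minute_4k = raw_video_bytes(3840, 2160, 30, 60)
print(f"4K vs 1080p: {one_minute_4k // one_minute_1080p}x")     # 4x
```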
So different codecs have been created to address this problem. Codecs contain layers of algorithms which compress video files and then decompress them at their destination. Technologies have advanced over time, but the fundamentals are largely the same:
1. Color Space Conversion
Instead of storing true RGB values as a computer does, codecs change the color space to YUV, which better suits the perceptual characteristics of the human eye. Our eyes have dedicated rod cells for sensing brightness, represented by the Y component, or luma. Our eyes’ cone cells see color, represented in this space by the U and V components, or chroma.
Translating RGB to YUV can be a visually lossless process, and on its own it compresses nothing. The savings come from what happens next. Since our eyes are most sensitive to brightness due to those dedicated rod cells, compressed YUV formats pair full-resolution luma with reduced-resolution chroma for a 1.5x to 2x compression with very little perceptual difference.
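Here is a rough sketch of both halves of that step in Python. It assumes the full-range BT.601 weights used by JPEG and a 2x2 averaging scheme (commonly called 4:2:0). Neither choice comes from the video, real codecs support several variants, and the function names are illustrative.

```python
def clamp8(value):
    """Round to the nearest integer and keep it inside the 8-bit range."""
    return max(0, min(255, round(value)))


def rgb_to_yuv(r, g, b):
    """Convert one 8-bit RGB pixel to luma (Y) and chroma (U, V).

    Assumption: full-range BT.601 weights, as used by JPEG.
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    v = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return clamp8(y), clamp8(u), clamp8(v)


def subsample_420(plane):
    """Average each 2x2 block of a chroma plane, halving width and height.

    Assumes the plane has even dimensions.
    """
    height, width = len(plane), len(plane[0])
    return [
        [
            (plane[y][x] + plane[y][x + 1]
             + plane[y + 1][x] + plane[y + 1][x + 1]) // 4
            for x in range(0, width, 2)
        ]
        for y in range(0, height, 2)
    ]


def samples_420(width, height):
    """Samples per frame: full-resolution Y plus quarter-resolution U and V."""
    return width * height + 2 * (width // 2) * (height // 2)


print(rgb_to_yuv(255, 255, 255))            # white: full luma, neutral chroma
print(subsample_420([[10, 20], [30, 40]]))  # one 2x2 block becomes one sample
print(3 * 1920 * 1080 / samples_420(1920, 1080))  # compression from subsampling
```

With luma kept at full resolution and each chroma plane cut to a quarter, a frame needs half as many samples as RGB, which is the top of the 1.5x to 2x range quoted above.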
2. Spatial & Temporal Redundancy Search
The majority of subjects and their backgrounds in a video frame often appear again and again in subsequent video frames. Redundancy search algorithms detect which pixels change and which can stay the same, and storing only the changes reduces the video’s file size.
As this process happens, there are intraframes (I-frames), which are complete images used as the basis for future predictions. Predicted frames (P-frames) can be seen as a set of instructions to reconstruct a full image based on deltas, which we call residuals, relative to previous frames and I-frames. This is where much of the magic happens, with 5x-20x compression, but further steps are needed to correct errors and compress these residuals.
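One common way to find temporal redundancy is block matching: take a block of the current frame and look for the best-matching block nearby in the previous frame. The sketch below is a simplified illustration, not how any particular encoder works. It assumes grayscale frames stored as lists of rows, an exhaustive search, and the sum of absolute differences (SAD) as the matching cost, and all names are illustrative.

```python
def get_block(frame, top, left, size):
    """Cut a size x size block out of a frame stored as a list of rows."""
    return [row[left:left + size] for row in frame[top:top + size]]


def sad(block_a, block_b):
    """Sum of absolute differences: 0 means the two blocks are identical."""
    return sum(
        abs(a - b)
        for row_a, row_b in zip(block_a, block_b)
        for a, b in zip(row_a, row_b)
    )


def find_motion_vector(prev_frame, cur_frame, top, left, size=4, search=2):
    """Search prev_frame around (top, left) for the block that best matches
    the block at (top, left) in cur_frame. Returns (dy, dx, cost)."""
    target = get_block(cur_frame, top, left, size)
    height, width = len(prev_frame), len(prev_frame[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            t, l = top + dy, left + dx
            # Skip candidate blocks that fall outside the frame.
            if t < 0 or l < 0 or t + size > height or l + size > width:
                continue
            cost = sad(target, get_block(prev_frame, t, l, size))
            if best is None or cost < best[2]:
                best = (dy, dx, cost)
    return best


# An 8x8 test frame, and a second frame whose content moved 1 pixel right.
prev_frame = [[y * 16 + x for x in range(8)] for y in range(8)]
cur_frame = [[row[0]] + row[:-1] for row in prev_frame]

print(find_motion_vector(prev_frame, cur_frame, top=2, left=2))  # (0, -1, 0)
```

The result says the block can be copied from one pixel to the left in the previous frame with zero error, so the encoder only has to store a small motion vector instead of 16 pixel values.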
3. Generate Decoding Error Correction Terms
To make redundancy search even more powerful, encoders don’t just detect identical blocks of pixels, but also similar blocks. For example, a sun that has changed color since the previous image is still detected as a redundant block. Correction terms represent the difference, or error, compared to the original block, and this step defines how these errors from the redundancy search get resolved.
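In its simplest form, a correction term is just a per-pixel subtraction. The sketch below uses made-up 2x2 pixel values for a patch of sun that got slightly brighter between frames, and the function names are illustrative.

```python
def residual(actual, predicted):
    """Correction terms: what must be added to the prediction to get the truth."""
    return [
        [a - p for a, p in zip(row_a, row_p)]
        for row_a, row_p in zip(actual, predicted)
    ]


def reconstruct(predicted, correction):
    """Decoder side: prediction plus correction terms gives back the block."""
    return [
        [p + c for p, c in zip(row_p, row_c)]
        for row_p, row_c in zip(predicted, correction)
    ]


# Hypothetical values: the same patch of sun in two consecutive frames.
previous_block = [[200, 201], [202, 203]]
current_block = [[205, 206], [207, 209]]

correction = residual(current_block, previous_block)
print(correction)                                # [[5, 5], [5, 6]]
print(reconstruct(previous_block, correction) == current_block)  # True
```

The correction terms are much smaller numbers than the pixels themselves, which is what makes them cheap to compress in the following steps.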
4. Quantization in Frequency Domain
This is where handling the residuals gets real mathematical. Instead of storing pieces of the exact image, we store how much those sections match particular patterns. The typical method is a discrete cosine transform (DCT), which converts a block of pixels into a set of frequency coefficients.
Once this has been done to the whole image, high frequencies that contribute little to what the human eye perceives are discarded. Any pieces of the image which didn’t change keep the original information. This step gives a massive 2x-40x compression ratio depending on the exploited redundancy and the amount of lossy compression required to achieve the desired file sizes.
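To see the idea in miniature, the sketch below runs a one-dimensional DCT over a single row of eight values and then quantizes the result. This is a simplification: real codecs apply a two-dimensional transform to blocks of pixels and use tuned quantization tables. The sample values and the step size of 4 are arbitrary choices for illustration.

```python
import math


def _scale(k, n):
    """Normalization factor that makes the transform orthonormal."""
    return math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)


def dct(samples):
    """Forward DCT-II: samples in, frequency coefficients out."""
    n = len(samples)
    return [
        _scale(k, n) * sum(
            s * math.cos(math.pi * (i + 0.5) * k / n)
            for i, s in enumerate(samples)
        )
        for k in range(n)
    ]


def idct(coeffs):
    """Inverse transform (DCT-III): frequency coefficients back to samples."""
    n = len(coeffs)
    return [
        sum(
            _scale(k, n) * c * math.cos(math.pi * (i + 0.5) * k / n)
            for k, c in enumerate(coeffs)
        )
        for i in range(n)
    ]


def quantize(coeffs, step):
    """Lossy step: divide by the step size and round to whole numbers."""
    return [round(c / step) for c in coeffs]


def dequantize(levels, step):
    """Decoder side: scale the whole numbers back up."""
    return [level * step for level in levels]


row = [10, 11, 12, 13, 13, 12, 11, 10]   # a smooth row of sample values
levels = quantize(dct(row), step=4)
restored = idct(dequantize(levels, step=4))

print(levels)                             # mostly zeros after quantization
print([round(v, 1) for v in restored])    # close to the original row
```

Smooth content concentrates its energy in the first few coefficients, so after quantization the high-frequency entries round to zero. The restored row is not identical to the original, which is the lossy part of the trade.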
5. Symbol Coding
The last step stems from information theory, a field of computer science originating all the way back in the 1920s. Even after all the above transformations to the data, it’s still a bunch of 0s and 1s. Instead of storing every pattern of bits at its full length, patterns that repeat throughout the file can be represented with a shorter pattern of bits. By assigning fewer bits to sequences that appear with a high probability, this adds another 1x-2x compression to the process.
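Huffman coding is a classic example of this idea, and it is used in the sketch below because it is easy to follow. Modern video codecs typically use more elaborate context-adaptive schemes, so treat this as an illustration of the principle rather than what any specific codec does. The input values are made up to resemble quantized coefficients.

```python
import heapq
from collections import Counter


def huffman_codes(symbols):
    """Build a prefix-free code where frequent symbols get shorter bit strings."""
    counts = Counter(symbols)
    if len(counts) == 1:
        # A single distinct symbol still needs one bit per occurrence.
        return {next(iter(counts)): "0"}
    # Heap entries: (count, tie-breaker, {symbol: code so far}).
    heap = [(count, i, {sym: ""}) for i, (sym, count) in enumerate(counts.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        # Merge the two least frequent groups, prefixing their codes with 0 and 1.
        count_a, _, codes_a = heapq.heappop(heap)
        count_b, _, codes_b = heapq.heappop(heap)
        merged = {sym: "0" + code for sym, code in codes_a.items()}
        merged.update({sym: "1" + code for sym, code in codes_b.items()})
        heapq.heappush(heap, (count_a + count_b, next_id, merged))
        next_id += 1
    return heap[0][2]


# Hypothetical quantized values: mostly zeros, as the previous step tends to produce.
data = [0] * 12 + [1] * 3 + [-1]
codes = huffman_codes(data)
encoded = "".join(codes[value] for value in data)

fixed_bits = 2 * len(data)     # 3 distinct symbols need 2 bits each at fixed length
print(len(encoded), "bits instead of", fixed_bits)   # 20 bits instead of 32
```

Here the zero, which makes up three quarters of the data, gets a one-bit code, and the total shrinks by 1.6x.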
By the time all these layers have done their work, the final video file is 600x-1000x smaller using modern codecs compared to uncompressed video.
Encode and decode faster with Intel Arc graphics
Our engineers have a long history of developing acceleration hardware for media processing. That legacy spans from the original MMX instruction set to our first full hardware encoder and decoder – Quick Sync Video (QSV) – introduced with the 2nd Gen Intel Core processors.
It should come as no surprise that Intel Arc graphics have dedicated hardware for modern codecs, and in most cases run faster than other GPUs too. We pitted the Intel Arc A750 against the Nvidia RTX 4060, and the Intel® Core™ Ultra 155H with Intel Arc built-in against the AMD Ryzen 7 PRO 7840U, in several head-to-head transcoding races across a variety of programs and codecs.
The closer a program is to transcoding video directly, the faster it performs on Intel Arc graphics. HandBrake is an excellent example of that, running 1.5x-3.5x faster on Intel Arc graphics than on other comparable hardware.
Fully ready for modern video workflows
Supporting the most advanced space-saving and high-quality video file formats is only half of the work of a competent graphics card. Before a video is exported, your GPU needs to be ready for editing. Premiere Pro operations like cropping, timing adjustments, Warp Stabilization, and more all rely on a different set of processes than codecs do, each a proprietary transformation of the video you’re editing. Intel Arc graphics handle all these editing functions like a champ, especially with our dedicated AI hardware accelerating processes like Adobe Premiere Pro’s Scene Edit Detection.
With full support for the most popular editing software, their traditional and AI-powered adjustments, and super-fast encoding in the latest formats, Intel Arc graphics include a complete media engine. TAP and team have demystified the science behind your favorite YouTube videos, game streams, and streaming TV, and Intel Arc graphics have the capability to power your own creation.
Notices and Disclaimers
Performance varies by use, configuration and other factors. Learn more at www.intel.com/PerformanceIndex.
Performance results are based on testing as of dates shown in configurations and may not reflect all publicly available updates. See backup for configuration details.
AI features may require software purchase, subscription or enablement by a software or platform provider, or may have specific configuration or compatibility requirements. Details at www.intel.com/AIPC.
Results that are based on pre-production systems and components as well as results that have been estimated or simulated using an Intel Reference Platform (an internal example new system), internal Intel analysis or architecture simulation or modeling are provided to you for informational purposes only. Results may vary based on future changes to any systems, components, specifications or configurations.
Your costs and results may vary. No product or component can be absolutely secure. Intel technologies may require enabled hardware, software or service activation.
All product plans and roadmaps are subject to change without notice.
Intel® Arc™ GPU only available on select H-series Intel® Core™ Ultra processor-powered systems with at least 16GB of system memory in dual channel configuration. OEM enablement required; check with OEM or retailer for system configuration details.
© Intel Corporation. Intel, the Intel logo, and other Intel marks are trademarks of Intel Corporation or its subsidiaries.
Other names and brands may be claimed as the property of others.