Smart cropping for art-directed automatic image optimization

Art direction is essential to take advantage of display conditions

As the web has become visual, image optimization and management are turning into an inherent part of web development and administration. On the design side, visual quality and aesthetics are key issues. Images must look great, whatever the content of a site. The right lighting. The right view. The right crop. The finest visual quality. Everything matters when it comes to conveying the message, engaging, and convincing.

There is often a perceived conflict between optimization and responsiveness on one hand and image quality and art direction on the other, as if one always had to be traded off against the other.

Ideally, a site would serve each image in the best format, adapted to the display conditions of each of its many users. However, as the number of images has skyrocketed, doing this manually is simply not feasible.

In practice, widespread content management systems offer generic, convenient solutions. WordPress serves several versions of each image, resized and cropped from a fixed position to match the predefined sizes of the theme in use. Medium appears to do basically the same, as shown below, leaving almost no vestige of the original message.

This is what Medium does if I try to feature the image below for this post. It leaves barely any vestige of the original message in the image.
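To make the problem concrete, here is a minimal sketch of a fixed-position (centered) crop. It is an illustration of the general technique, not the actual code of WordPress or Medium, and the function name `center_crop_box` is hypothetical.

```python
def center_crop_box(src_w, src_h, dst_w, dst_h):
    """Return (left, top, right, bottom) of the largest centered region
    of the source that has the aspect ratio dst_w:dst_h."""
    if min(src_w, src_h, dst_w, dst_h) <= 0:
        raise ValueError("all dimensions must be positive")
    # Compare aspect ratios by cross-multiplication to stay in integers.
    if src_w * dst_h > src_h * dst_w:
        # Source is wider than the target: keep full height, trim the sides.
        crop_w = round(src_h * dst_w / dst_h)
        crop_h = src_h
    else:
        # Source is taller (or same ratio): keep full width, trim top and bottom.
        crop_w = src_w
        crop_h = round(src_w * dst_h / dst_w)
    left = (src_w - crop_w) // 2
    top = (src_h - crop_h) // 2
    return left, top, left + crop_w, top + crop_h


# A square thumbnail from a 4000x3000 photo keeps only columns 500..3500.
print(center_crop_box(4000, 3000, 1, 1))  # (500, 0, 3500, 3000)
```

The crop box depends only on the dimensions, never on the content. A subject standing at x = 200 in that photo is simply cut out of the square thumbnail.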

Often, compression policies are also put in place, limiting the maximum weight of images or imposing a single compression level for all of them. Again, WordPress compresses every JPEG at a fixed quality of 82 by default. There’s even a general recommendation from Photoshop on the quality settings to use, depending on the acceptable visual quality. Regarding image format, the common practice is to choose PNG for icons and text images and JPEG for the rest, with some sites also adding a WebP version just for Chrome browsers. When aesthetics really matter, key images are tweaked manually and size restrictions are imposed to keep art direction as consistent as possible. This, together with lazy loading, is basically what is going on.

The pitfalls of this practice are evident and numerous. Many images will be undercompressed, visually degraded, or both. Some images will show an awkward crop even though only a few sizes are available, or an equally awkward sizing on certain displays. Watermarks and text will be placed in fixed positions regardless of what they may hide…

How can we overcome this apparent conflict and improve current practices? How can we best reconcile the need for optimization with art direction?

In recent years, new approaches have been emerging that benefit from our progress in understanding and modeling perception.

The use of metrics related to perceived visual quality gives us greater control over the compression settings, based on the specific content of each image. This way, visual quality stays as high as possible for the displayed resolution while file size is kept to a minimum. Few compressors really do this, so it’s good practice to check that yours takes advantage of this possibility.
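As a rough sketch of what metric-driven compression means, the search below finds the lowest quality setting whose perceptual score still meets a target. Both `encode` and `score` are hypothetical callables supplied by the caller: in practice `encode` could wrap a JPEG encoder and `score` a metric such as SSIM computed against the original. The sketch assumes the score never decreases as quality increases.

```python
def lowest_quality_meeting_target(encode, score, target, lo=1, hi=100):
    """Binary-search the lowest quality whose score is >= target.

    encode(quality) -> encoded data; score(encoded) -> float.
    Assumes score(encode(q)) does not decrease as q increases.
    Returns (quality, encoded), or None if even `hi` misses the target.
    """
    best = None
    while lo <= hi:
        mid = (lo + hi) // 2
        encoded = encode(mid)
        if score(encoded) >= target:
            best = (mid, encoded)  # good enough: try a lower quality
            hi = mid - 1
        else:
            lo = mid + 1           # too degraded: raise the quality
    return best


# Toy stand-ins (not a real codec or metric): size and score grow with quality.
toy_encode = lambda q: b"\0" * q
toy_score = lambda data: len(data) / 100

quality, data = lowest_quality_meeting_target(toy_encode, toy_score, 0.75)
print(quality, len(data))  # 75 75
```

With a real encoder, each image ends up with its own quality setting instead of one fixed value for the whole site, at the cost of about seven trial encodings per image.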

Smart cropping has become a reality, automatically choosing the best image crop according to measures of visual importance that go beyond basic face detection. At Abraia, we have even managed to preserve the aesthetic quality of the final composition by choosing a good cut around the important object.

Smart cropping improves art direction and visual experience compared to cropping with fixed shapes and positions, and it even avoids unnecessary page weight compared to classic cropping done in the client browser.
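The idea of choosing a crop by visual importance can be sketched as follows. This is a simplified illustration, not Abraia's method: it assumes a saliency map is already available as a grid of importance scores (how that map is produced is out of scope here), and it simply picks the window with the highest total score. The name `best_crop` is hypothetical.

```python
def best_crop(saliency, crop_w, crop_h):
    """Return (left, top) of the crop_w x crop_h window with the highest
    total saliency. `saliency` is a list of equal-length rows of numbers."""
    rows, cols = len(saliency), len(saliency[0])
    if not (0 < crop_w <= cols and 0 < crop_h <= rows):
        raise ValueError("crop must fit inside the saliency map")
    # Summed-area table: sat[y][x] holds the sum of saliency[0:y][0:x],
    # so any window sum costs four lookups.
    sat = [[0] * (cols + 1) for _ in range(rows + 1)]
    for y in range(rows):
        for x in range(cols):
            sat[y + 1][x + 1] = (saliency[y][x] + sat[y][x + 1]
                                 + sat[y + 1][x] - sat[y][x])
    best_pos, best_sum = (0, 0), float("-inf")
    for top in range(rows - crop_h + 1):
        for left in range(cols - crop_w + 1):
            total = (sat[top + crop_h][left + crop_w] - sat[top][left + crop_w]
                     - sat[top + crop_h][left] + sat[top][left])
            if total > best_sum:  # ties keep the first window found
                best_pos, best_sum = (left, top), total
    return best_pos


# The important object sits at the right edge, far from the center.
saliency_map = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 5, 9],
    [0, 0, 0, 0, 7, 8],
    [0, 0, 0, 0, 0, 0],
]
print(best_crop(saliency_map, 2, 2))  # (4, 1)
```

A centered 2x2 crop of the same map would start at (2, 1) and contain nothing of interest. Maximizing raw saliency is only a starting point; keeping a pleasing composition around the object requires additional criteria.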

If Medium were to use smart cropping, this is what I would get instead of the awkward crop shown above.

If Medium happened to use smart cropping, I would be able to get this and feature the image I had in mind.

This cut is automatic too. The difference is that I wouldn’t have had to look for yet another nice picture on Pexels :) that simply happens to be well centered. I would be free to feature the image I originally had in mind.

A similar approach makes it possible to automatically place text or logo overlays on images without damaging key elements of the composition.
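One simple way to picture this, again assuming a precomputed saliency map and with the hypothetical name `least_salient_corner`, is to test the four corners and keep the one where the overlay hides the least important content. Real tools may consider many more positions and constraints.

```python
def least_salient_corner(saliency, box_w, box_h, margin=0):
    """Pick the corner where a box_w x box_h overlay covers the least
    saliency. Returns (corner_name, (left, top))."""
    rows, cols = len(saliency), len(saliency[0])
    if box_w <= 0 or box_h <= 0 or margin < 0:
        raise ValueError("box size must be positive and margin non-negative")
    if box_w + 2 * margin > cols or box_h + 2 * margin > rows:
        raise ValueError("overlay plus margins must fit inside the map")
    candidates = {
        "top-left": (margin, margin),
        "top-right": (cols - box_w - margin, margin),
        "bottom-left": (margin, rows - box_h - margin),
        "bottom-right": (cols - box_w - margin, rows - box_h - margin),
    }

    def covered(left, top):
        # Total saliency hidden by the overlay at this position.
        return sum(sum(row[left:left + box_w])
                   for row in saliency[top:top + box_h])

    name = min(candidates, key=lambda k: covered(*candidates[k]))
    return name, candidates[name]


saliency_map = [
    [3, 1, 0, 0, 0, 0],
    [2, 0, 0, 0, 5, 9],
    [0, 0, 0, 0, 7, 8],
    [1, 1, 0, 0, 0, 2],
]
print(least_salient_corner(saliency_map, 2, 2))  # ('bottom-left', (0, 2))
```

A watermark pinned to the bottom-right corner by default would cover the most important part of this image; the content-aware choice moves it to the quietest corner instead.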

These new tools are steps toward the promise of bridging design and code development, easing the collaboration between designers and developers.

The power of visual perception models in the hands of a designer has the potential to ensure a consistent art direction across devices and networks. This is about coping with complexity, not about replacing human decisions: models should always act as protectors of a master design in which the focus, aesthetics, and key creative variations have already been crafted and validated.

As AI makes further progress in predicting attention and perception patterns, designers can expect to regain much of their control over art direction, recovering capabilities from the old days of offline design while keeping the huge flexibility brought by all the new devices and media.
