Alt text and image descriptions made simple

Many years ago, I gave in to peer pressure to join Tumblr, and almost immediately, I noticed a problem. As a microblogging service, Tumblr is ideally suited to presenting images and video, and that’s a big part of its purpose in life. But when you upload images, it doesn’t present you with an option for alt tagging. I observed that no one really seemed to be doing anything about this, including people who ostensibly cared about disability and accessibility, and it bothered me.

So I did two things: I launched my ongoing campaign to lobby Tumblr to allow people to enter alt text, and I started adding a brief description in the text body accompanying my image. At first, people asked me why I wrote ‘Image description: Here is what is in this picture.’ in all my photo posts, but then it caught on like wildfire. Now, it seems to be in widespread use across Tumblr, whether people are using it for accessibility or any number of other reasons, and I’m really pleased that taking a simple step created such a revolution — especially since Twitter earlier this year enabled alt tagging, and, hilariously, called it ‘Image Descriptions,’ which suggests that my reach extended further than I could have ever imagined. I’m really proud of that accomplishment, and it illustrates how one person coming in at the right moment to promote accessibility can make a huge difference.

But that was just part of the struggle, because people still have a hard time writing alt text and image descriptions. I often get asked about how to do them well, and it’s something I actually cover in the disability trainings I lead for newsrooms and organizations. (You can email me if you want to hire me, FYI.)

So here’s what to know: The purpose of an image description is to provide information about what is going on in an image so that someone who cannot see it will still understand what is happening. This includes not just blind and low-vision people, but those who may have images turned off in their browsers for whatever reason. So when you write any image description, you need to think about the context of the image, why you are using it, and what’s critical for someone to know.

[Image: My cat Loki sunning himself.]

Let’s take this photo. The alt text is: ‘My cat Loki sunning himself.’ That pretty accurately describes what’s going on in this picture: It shows a cat sitting in the sun. Someone who can’t see it has a general idea of what’s going on.
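If you hand-code your pages rather than typing into a platform's alt text box, the description goes in the `alt` attribute of the HTML `img` tag. Here is a minimal sketch in Python of what that looks like. The file name `loki.jpg` and the helper `img_tag` are made up for illustration, not taken from any particular publishing tool.

```python
from html import escape


def img_tag(src: str, alt: str) -> str:
    """Build an HTML <img> tag, escaping quotes and angle brackets in both values."""
    return f'<img src="{escape(src, quote=True)}" alt="{escape(alt, quote=True)}">'


# Hypothetical file name; the alt text is the one used for the photo above.
print(img_tag("loki.jpg", "My cat Loki sunning himself."))
# prints: <img src="loki.jpg" alt="My cat Loki sunning himself.">
```

The escaping matters because a stray quotation mark in a description would otherwise cut the attribute short, and a screen reader would only get part of the sentence.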

What else could we say about the image? He’s lying on his belly with his paws crossed in front of himself. He’s mostly white, with black markings on his head and chin. The light is extremely bright, and his reflective fur throws the white balance off, so the photo looks a little washed out, except for his face, which is in such deep shadow that you can’t tell his eyes are yellow. There’s a window behind him, and you can see a leafy tree through it. Some sort of framed artwork is also visible behind him, along with the leaves of a house plant. He appears to be lying on a wooden surface. His ears are up and he looks alert, and maybe a little smug.

That is a lot of information. I could lovingly describe every single detail of the photo, but it would be overwhelming. So instead, I need to think about the three things discussed above.

In this case, I’m using the image to illustrate an article about writing clear, simple image descriptions. So the alt text I chose is also clear and simple, with the most basic and relevant information included. A casual reader doesn’t really need to know about all of the minuscule details in the photo. Remember: Someone’s screen reader is reading off that alt text. A huge, extremely detailed description can be distracting.

What if I was writing an article about the increased risk of skin cancer in white cats, especially around their ears? Then it might be relevant to mention that he is mostly white, because that adds further context to the image. What if it’s an article about the various positions cats sit, sleep, and lie in while they are sunning? Then it would be helpful for the reader to know that he’s lying on his stomach, ears up and alert.

What if I was writing an article about white balance and taking good photographs in challenging conditions? Then it would be relevant to add that the light is bleaching out his fur and darkening his face — and in the text, I would add that this is an example of a bad photograph. (Remember, you can use in-text discussions to add context too! This is important if you want everyone to know a piece of information about a photo!)

Does anyone really need to know about the houseplant in the background? Eh, not really. The window? Probably not.
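To make the idea concrete, here is a rough sketch of the same photo getting different alt text depending on what the article is about. The topic labels and the `alt_for` helper are invented for this example; in practice you would simply write the right description by hand each time.

```python
BASE_ALT = "My cat Loki sunning himself."

# One extra detail per (hypothetical) article topic, drawn from the examples above.
RELEVANT_DETAIL = {
    "skin cancer in white cats": "He is mostly white.",
    "cat sunning positions": "He is lying on his stomach with his paws crossed.",
    "white balance": "The bright light bleaches out his fur and darkens his face.",
}


def alt_for(topic: str) -> str:
    """Return the base description, plus one detail only if the topic calls for it."""
    detail = RELEVANT_DETAIL.get(topic)
    return f"{BASE_ALT} {detail}" if detail else BASE_ALT


print(alt_for("writing image descriptions"))
print(alt_for("white balance"))
```

Notice that the houseplant and the window never appear in any version, because no topic here makes them relevant.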

[Image: My cat Mr. Shadow lying across some books in the sun.]

Okay, so what about this photo? It’s not a particularly good or remarkable photo. What makes it special is that it’s one of the last pictures I took of Mr. Shadow before he died. I was reading a book and he came and sat on it, so I picked another book, and he sprawled out across both of them. You can see that he looks thin and a little pinched, and his legs are awkwardly positioned. All of these things suggest that he’s probably sick and not very comfortable. The alt text reads: ‘My cat Mr. Shadow lying across some books in the sun.’

Why didn’t I describe his physical appearance? Because I just did, in the text. I don’t need to say it twice. If the only textual context I provided was ‘this is one of the last photos I took of Mr. Shadow before he died,’ I would probably describe his physical condition so that people would understand that he looks ill. I also don’t describe things like the flooring, which isn’t really relevant to the image, or the titles of the books, in part because they’re not relevant, and in part because you can’t actually see them. If the titles of the books were relevant, I’d mention them in the text, because you can’t see them, and presumably I would want people to know.

I could choose to add ‘a grey tabby’ to the description, but again, the question of relevance is important. Remember: A screen reader is reading this description to someone. The longer and more involved it gets, the more difficult it becomes to follow. ‘My cat Mr. Shadow’ tells you who is depicted, and ‘lying across some books in the sun’ tells you what is depicted. His colouring may or may not be relevant.
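If you manage a lot of images, a quick length check can nudge you when a description is getting out of hand. The sketch below is only that, a nudge: the 20-word limit is an arbitrary number picked for illustration, not a standard, and the function name is made up.

```python
def alt_text_warnings(alt: str, max_words: int = 20) -> list[str]:
    """Return a list of warnings for an alt text; an empty list means no complaints.

    max_words is an arbitrary, illustrative threshold, not an accessibility rule.
    """
    warnings = []
    word_count = len(alt.split())
    if word_count > max_words:
        warnings.append(
            f"Alt text is {word_count} words (limit {max_words}); "
            "consider moving some detail into the article text."
        )
    return warnings


print(alt_text_warnings("My cat Mr. Shadow lying across some books in the sun."))
# prints: []
```

A check like this can only count words. Whether the words are the relevant ones is still a judgment call you have to make yourself.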

Any time you use an image, you need to think about why it’s so important to you. A picture is worth a thousand words, but you shouldn’t be using a thousand words to describe it. You should also think carefully about how you frame descriptions of people — race and gender are both more complicated than you realise, and unless you know the race and gender of the people depicted, you probably shouldn’t assume.

In the photograph that heads this article, two people are standing in the water taking a selfie. One appears to be of Asian descent, has long hair, and is wearing a dress. That’s all I know about that person: they could be any gender, could be from any number of countries, might be mixed race. The other person is barely visible; I can see part of a baseball cap and a bare midriff. I can’t really make suppositions beyond those. And I could describe this scene in exhaustive detail (there are islands in the background, the water is very clear, they are in up to their mid-calves, the dress is billowing in the breeze, they are using a selfie stick, the atmosphere is a little hazy…). But not all of that information is relevant, so remember to think about context. Am I writing about selfie sticks? Then it’s important to know that they are using one. Am I writing about traveling in Southeast Asia, where I know this picture was taken because the photographer says so? I should probably mention in the text that the image was taken in Thailand (per the image tags), so everyone knows the context; if I don’t, I would add it to the image description.

Remember: Simpler is better. Clearer is easier. If you find yourself going to great lengths in alt text, step back and ask yourself why — if this information is really important, should it perhaps be in the text itself? Always ask yourself: If I couldn’t see this picture, what would I want to know about it?

Image: Perfect Selfie, Anders Lejczak, Flickr