CELPIP Speaking Task 3: How to Describe a Scene Effectively

What Is CELPIP Speaking Task 3?

In CELPIP Speaking Task 3, you are shown an image — typically a scene with several people doing different things — and asked to describe what you see. You have 30 seconds to prepare and 60 seconds to speak.

The task sounds simple, but candidates frequently lose points not because their grammar is poor, but because their descriptions are disorganised: they jump between unrelated details, repeat themselves, or run out of things to say after 20 seconds.

This guide gives you a repeatable structure that fills the full 60 seconds with organised, detailed speech.

What Examiners Are Actually Listening For

Your response is scored on the same criteria as every other CELPIP Speaking task. In practical terms, raters are listening for four things:

Fluency and Coherence — Does your speech flow? Do you move from one detail to the next logically, or do you jump around?

Vocabulary Range — Do you use varied, descriptive language, or do you rely on "there is a person" and "they are doing things"?

Grammar Accuracy — Are your descriptions grammatically correct? Present continuous tense is the primary structure here — "A man is carrying a bag" — so accuracy in this tense matters.

Pronunciation — Are you intelligible? Natural stress and pacing matter more than accent.

The most common Task 3 deductions come from poor organisation and thin vocabulary — not from grammar errors.

The 30-Second Preparation Strategy

Do not spend your 30 seconds scripting your answer sentence by sentence. That produces robotic speech that sounds as if it is being read aloud. Instead, use the time to build a quick visual map of the scene:

Where are we? (Identify the setting: indoor/outdoor, type of location)
How many people? (Count quickly — 3 people? 6 people?)
What is happening overall? (One-sentence summary of the main activity)
Which details stand out? (Pick 2–3 specific ones to describe: interesting objects, interactions, background elements)

This 30-second map becomes the backbone of your 60-second response.

The Zone Method: How to Organise 60 Seconds

Describe the scene in zones — spatial areas of the image — rather than describing whatever catches your eye. Zones give your response a natural structure and prevent repetition.

Zone 1: Overall scene (10–12 seconds). Open with a general statement about where you are and what is happening. Set the scene before zooming in on details.

"This image appears to be taken in an outdoor market on what looks like a sunny afternoon. There are at least six people visible, and the scene has a relaxed, casual atmosphere."

Zone 2: Foreground details (20–25 seconds). Describe the people or objects closest to the viewer. Include what they are doing, what they are wearing (if visible), and any interaction between them.

"In the foreground, a woman wearing a green jacket is examining items on a display table. She appears to be in conversation with the vendor, who is pointing to something on the right side of the stall."

Zone 3: Background details (15–20 seconds). Move to the background and describe what is happening further away in the scene. This is where weaker candidates often stop, but the background gives you a second set of details that demonstrates careful observation.

"In the background, a group of two or three people are walking away from the market. One of them is carrying a large bag, which suggests they may have already made a purchase."

Closing observation (5–8 seconds). Finish with a brief general comment or inference about the overall mood or purpose of the scene.

"Overall, the image gives the impression of a busy community event where people are comfortable browsing at their own pace."
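If you like to check the arithmetic, the zone timings above can be added up in a few lines of Python. This is only a rough sketch: the zone names and second ranges come from this guide, while the helper names (`budget_range`, `midpoint_plan`) are made up for illustration.

```python
# Zone time budgets in seconds (low, high), copied from the ranges in this guide.
ZONES = {
    "Overall scene": (10, 12),
    "Foreground details": (20, 25),
    "Background details": (15, 20),
    "Closing observation": (5, 8),
}
SPEAKING_TIME = 60  # seconds available in Task 3


def budget_range(zones):
    """Return the (shortest, longest) total speaking time the ranges allow."""
    shortest = sum(low for low, _ in zones.values())
    longest = sum(high for _, high in zones.values())
    return shortest, longest


def midpoint_plan(zones):
    """Return a per-zone target that sits in the middle of each range."""
    return {name: (low + high) / 2 for name, (low, high) in zones.items()}


if __name__ == "__main__":
    shortest, longest = budget_range(ZONES)
    print(f"Total if every zone runs short: {shortest}s")
    print(f"Total if every zone runs long:  {longest}s")
    for name, seconds in midpoint_plan(ZONES).items():
        print(f"{name}: aim for about {seconds}s")
    print(f"Midpoint total: {sum(midpoint_plan(ZONES).values())}s of {SPEAKING_TIME}s")
```

The ranges add up to somewhere between 50 and 65 seconds, so aiming for the middle of each zone leaves a small buffer, while running long in every zone would overshoot the 60-second limit.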

Vocabulary: Describing Scenes With More Precision

Generic vocabulary is the fastest way to limit your Task 3 score. Here are three vocabulary categories to develop before your exam:

Describing what people are doing (instead of "they are doing something"):

examining, inspecting, browsing, selecting (for shopping scenes)
conversing, gesturing, pointing, leaning toward (for interactions)
walking toward, moving away from, approaching (for movement)
seated, crouched, standing near (for position)

Describing the setting:

"The scene appears to be set in…"
"The background suggests…"
"The lighting indicates this is…" (indoor/outdoor, time of day)
"The general atmosphere is…"

Making inferences (shows higher-order thinking):

"This suggests that…"
"It appears as though…"
"Based on their body language, it seems like…"
"The presence of [detail] indicates that…"

Using inference language is a strong vocabulary and coherence marker — it shows you are doing more than just listing visible objects.

Grammar: Present Continuous Is Your Main Tool

Task 3 responses primarily use present continuous tense:

"A man is carrying a briefcase."
"Two women are talking near the entrance."
"The vendor is arranging items on the table."

Common errors to avoid:

"A man carry a briefcase." → missing auxiliary verb "is" and the -ing ending
"Two women talked." → wrong tense (past instead of present continuous)
"There is a man that is carry…" → base form instead of the -ing form, inside a relative clause that is wordier than "A man is carrying…"

Practise present continuous descriptions until they feel completely automatic before exam day.

What to Do If You Run Out of Ideas

If you describe the full scene and still have 15–20 seconds remaining:

Go back to a detail you mentioned briefly and expand it — "I mentioned the woman in the green jacket earlier. It's also worth noting that she appears to be comparing two different items, which suggests she hasn't made a decision yet."
Comment on what is NOT in the scene — "Interestingly, there are no visible signs or price tags on the display, which is somewhat unusual for a market setting."
Make a broader inference — "The overall composition of the image suggests this could be from a documentary or lifestyle photography project, given the candid, unstaged nature of the interactions."

These strategies keep your speech going naturally without repeating yourself or padding with filler phrases.
