The OSINT prompt for AI-powered image geolocation
Mastering the art of specifying place in generative AI prompts
As a journalist, verifying the who, what, where, when and why of an image is paramount. Determining an image’s exact geolocation is a critical OSINT skill and forms one of the three pillars of social newsgathering verification.
LLMs can provide a wide range of research and analysis support for open-source journalists through targeted prompting.
Building prompts is not an exact science. Depending on the complexity of the task, few prompts will provide the format or output you require without iterations or continuous debugging analysis.
What is essential is adopting a structured framework that can be adapted across a wide range of different tasks, which will help you build consistency across your prompts.
The same can be adopted for geolocation work.
This tutorial breaks down an AI prompt that creates step-by-step analysis, transforming a simple AI response into a detailed investigative geolocation report.
R.A.F.T.
R.A.F.T. is a framework which has been adopted by the Eurovision Social Newswire and Spotlight teams for use in a wide range of daily research and analysis tasks. From geolocation to webscraping to data analysis, R.A.F.T. is a portable, reusable asset that provides a solid structure upon which to build from for non-editorial tasks and tailor the output to the journalist’s needs.
Role (R): Assign a persona (e.g., “Forensic Accountant,” “Legal Verification Expert”) - Ensures the AI uses the appropriate industry terminology and level of scepticism.
Action (A): Define the specific, measurable task - Prevents generic text. Focuses AI on the immediate, time-critical task.
Format (F): Demand a specific, usable output structure (e.g., Markdown table, JSON, bullet points) - Makes data instantly ready for a spreadsheet, map, or article bullet list.
Tone (T): Set the language, perspective, or scepticism (e.g., “Neutral and objective,” “Sceptical”) - Prevents the AI from injecting hyperbolic or opinionated language.
The geolocation prompt
The prompt is designed to put the AI into an analytical mindset, mandate a Chain-of-Thought (CoT) process, and structure the output for maximum usefulness.
Assume the role of an expert OSINT analyst specialising in image geolocation.
Core Task: “Analyse the provided image meticulously to determine its geographic location. Aim for country, city, and, if possible, specific street address or coordinates.”
Guided Instructions:
1.Identify and list all significant visual cues** present in the image. Focus on: identifiable landmarks, architectural styles, landscape features (mountains, coastlines, vegetation type), visible vehicles (note make/model, license plate details if readable), all readable text on signs or buildings (transcribe text and identify language), flags, distinctive clothing or uniforms, and overall weather/environmental conditions.
2. Based only on the visual cues identified, provide a step-by-step reasoning process (Chain-of-Thought) explaining how these cues lead to a potential geographic location or region. Discuss ambiguities or conflicting clues if present.
3. State your most likely geolocation estimate(s).Provide coordinates if confidence is high.
4. Finally, explain the confidence level (High/Medium/Low) for your estimate and briefly justify it based on the strength of the evidence.
(Optional Context - Use with Caution): “Consider the following potential context, but prioritise visual evidence:.” (Use this section only if you have external, non-visual information that might guide the AI, like “The photo was posted by an account located in Europe.”)
Deconstructing the prompt: How it works
The prompt leverages several key AI prompting techniques: Role assignment, task definition, feature extraction, chain-of-thought, and confidence assessment.
Role Assignment and Tone Setting
Assume the role of an expert OSINT analyst specialising in image geolocation.
Assigning an “expert” role dramatically increases the quality, specificity, and technical depth of the response. It primes the AI to use specialised terminology and analytical rigour, rather than providing a casual guess.
The Core Task
Analyse the provided image meticulously to determine its geographic location. Aim for country, city, and, if possible, specific street address or coordinates.
This section directly addresses the Action (A) element of the R.A.F.T. framework. It sets a clear, ambitious, and measurable goal: Analyse the provided image meticulously to determine its geographic location. Aim for country, city, and, if possible, specific street address or coordinates. “Meticulously” encourages detailed analysis, and the required output hierarchy (country → city → street/coordinates) ensures the AI doesn’t stop at a vague region.
The subsequent Guided Instructions (1-4) further break down this single “Action” into verifiable, step-by-step sub-tasks.Feature Extraction and Data Structuring (Guided Instruction 1)
Identify and list all significant visual cues present in the image. Focus on: identifiable landmarks, architectural styles, landscape features (mountains, coastlines, vegetation type), visible vehicles (note make/model, license plate details if readable), all readable text on signs or buildings (transcribe text and identify language), flags, distinctive clothing or uniforms, and overall weather/environmental conditions.
This is the most crucial part. You are training the AI on what to look for, ensuring it executes the first step of OSINT before jumping to a conclusion. You’re essentially providing the AI with a checklist of features it is good at identifying (text, architecture, cars, flags).
Tip: This structured list makes it easy to spot if the AI missed an obvious clue (e.g., a foreign-language sign).Mandatory Chain-of-Thought (Guided Instruction 2)
Based only on the visual cues identified, provide a step-by-step reasoning process (Chain-of-Thought) explaining how these cues lead to a potential geographic location or region. Discuss ambiguities or conflicting clues if present.
The Chain-of-Thought (CoT) technique is a powerful enhancer for complex tasks. It forces the AI to demonstrate its analytical process.
Cue 1: Text in Cyrillic. → Conclusion: Likely Eastern Europe or a Slavic country.
Cue 2: Specific vehicle model (Lada Niva). → Conclusion: Supports Eastern European region, common in Russia/Ukraine.
Cue 3: Distinctive mountain range. → Conclusion: Matches geography of the Carpathian mountains.
This not only explains the answer but also helps you (the journalist) verify the logic.
Final Estimate & Coordinates (Guided Instruction 3)
State your most likely geolocation estimate(s). Provide coordinates if confidence is high.
This forces the AI to move from the analytical process (CoT) to a definitive, actionable answer. Requiring coordinates only when confidence is high prevents speculative guessing, ensuring that the provided geographical data meets a threshold of certainty required for professional use.
Confidence Assessment (Guided Instruction 4)
Finally, explain the confidence level (High/Medium/Low) for your estimate and briefly justify it based on the strength of the evidence.
A geolocation estimate is useless without a confidence level. This section forces the AI to self-critique based on the quality of the evidence (e.g., High confidence due to a unique street sign; Low confidence due to blurry text and generic architecture).
Optional context (Use with Caution)
Consider the following potential context, but prioritise visual evidence: (Use this section only if you have external, non-visual information that might guide the AI, like "The photo was posted by an account located in Europe.")
Use this section only if you have external, non-visual information that might guide the AI, like “The photo was posted by an account located in Italy”.
Example
On October 4, a video emerged on YouTube showing a appearing to showing a protest in Israel. The video was simply titled חי צילום כנסים צילום אירועים שידור חי SATVIEW.TV רחף (translated: “Live Conference Photography Event Photography Live Broadcast SATVIEW.TV Hover”)
At the time of the video’s publication, it was a live broadcast and offered few visual clues about the exact location, apart from blurred text in both Hebrew and English.
In the top right corner of the screen, the text “Tel Aviv Museum of ” was visible on one of the buildings. This could be sufficient for a manual search on Google to find the location. Additionally, it provides an opportunity to take a quick screenshot of the area and use a geolocation prompt.
Process
Upload content
Start your interaction by uploading the image (in this case, a screenshot) you need to geolocate.
Paste the full prompt
Paste the entire prompt into the AI’s chat box, including the role assignment and all instructions. In this instance, we are going to also invoke the Optional context section of the prompt to include “Tel Aviv Museum of” observed in the screenshot.
Step 3: Review the output
Critically evaluate the AI’s output.
AI output (Gemini)
Based on the image analysis, here is the geolocation estimate:
Geolocation Estimate
The most likely location is is Sderot Sha’ul HaMelech, Tel Aviv-Yafo, Israel.
The demonstration is taking place in the plaza area along this boulevard, known as Hostages Square (Kikar HaHatufim), which is directly in front of the Tel Aviv Museum of Art complex.
Coordinates (Approximate Centre of Square): 32°04’37.4”N 34°47’14.6”E
1. Visual Cues Identification 🔍
Feature: Readable Text
Description & Detail: The text visible on the image, including the text below the image, is in Hebrew. Banners and signs contain Hebrew characters.
Clue Significance: High: Confirms the country/language of the event is Israel.
Feature: Flags:
Description & Detail: Multiple flags are visible, predominantly the Flag of Israel (Blue Star of David on a white field between two blue stripes).
Clue Significance: High: Strongly points to Israel.
Feature: Landmarks/Architecture:
Description & Detail: The imposing and well-lit institutional buildings visible in the background, though partially obscured, match the architecture found in the cultural and civic centre of Tel Aviv. The building in the background is part of the extensive complex that includes the Tel Aviv Museum of Art and the surrounding cultural institutions on Sderot Sha’ul HaMelech.
Clue Significance: Anchors the location to Tel Aviv’s institutional core along this boulevard.
Feature: Event Type
Description & Detail: The large gathering of people, presence of a stage, organised banners, and a live broadcast setup strongly suggest a major political demonstration or rally, consistent with events held at Hostages Square.
Clue significance: Medium: Consistent with frequent public gatherings at this prominent civic location.
2. Reasoning Process (Chain-of-Thought) 🤔
Country/City Identification: The presence of Hebrew text and Israeli flags definitively places the event in Israel, with the large scale suggesting Tel Aviv, the hub for major political gatherings.
Landmark Recognition: The location, Sderot Sha’ul HaMelech, is known for its large public plaza, which sits directly in front of the Tel Aviv Museum of Art (TAMA). This plaza, recently established as Hostages Square, is defined by the surrounding institutional buildings and is frequently utilised for massive public events and rallies.
Architectural Context: The visible background architecture is congruent with the complex that includes the Tel Aviv Museum of Art’s buildings and the adjacent cultural centres, establishing the exact location on this boulevard.
3. Confidence Level and Justification ✅
Confidence Level: High 🚀
Justification:
The combination of unambiguous national symbols (Israeli flags, Hebrew text) and the definitive identification of the location as Sderot Sha’ul HaMelech, a major civic location in Tel Aviv defined by the presence of the Tel Aviv Museum of Art and its public plaza, provides a high degree of confidence in the geolocation.
In this example, there are many visual clues to aid the AI’s response, with Google Maps confirms the location is correct. However, this will not always be the case, and in some instances, it will not yield the expected results.
It is vital to remember that detailed prompts won’t work miracles if the source content lacks distinguishable information. If an image is too blurry, too generic, or completely lacking in unique visual cues, even the best prompt can only state that geolocation is impossible and justify the low confidence level.
Always temper expectations based on the quality of the image itself.
However, the prompt shows how using the R.A.F.T. framework, you can move beyond simple guesses and obtain a robust, evidence-based OSINT report every time.
Master advanced investigative techniques in the “Prompting for Geolocation” session, part of the Specialised Context & Content module within the EBU Academy’s Advanced Prompting for Journalists course.
Instructor Derek Bowler leads this training, offering high-level skills essential for modern journalism. The full course curriculum also covers: Core Advanced Workflows, Editorial Integrity & Ethics, and Automation & Data Pipelines.
Demand is exceptionally high for the upcoming session on December 1, 2025. Spaces are limited—register today to secure your spot!