Building AI into the investigative workflow

onlinejournalist 38 views 31 slides Oct 09, 2024

About This Presentation

Talk presented to the AI in Journalism Education pre-conference, October 9 2024


Slide Content

Building AI into the
investigative workflow

Integrating AI into the investigative work process

What I’ll cover
●How AI can lower the barrier to entry in investigative journalism
●Assessing and addressing the risks

The story so far
●MA Data Journalism at BCU: machine learning used in student investigations since 2017
●2022: ChatGPT means AI becomes non-investigation-specific (a “gateway drug”)
●Accelerates students’ coding, etc.

Bradshaw (2024), AI in investigative journalism: mapping the field

The problem
●GenAI threatens core journalism values
●GenAI threatens journalism’s status
●GenAI threatens journalists’ jobs

Not all our work is about facts.
Journalism is also about creativity.
Succinctness and clarity.
Coding.
And reflexivity.
But…

Applications of genAI in the journalism process

1. Pre-production
Idea generation and stimulation: identify and map systems and rules, apply brainstorming frameworks (iceberg model, 5 whys, 8 angles of data journalism). Planning.

2. Research
Scope diverse sources, explore documents, form advanced searches, and write/fix code for scraping and analysis

3. Production
Identify jargon and bias; improve spelling, grammar, structure and brevity

4. Post-production
Optimisation, distribution and reversioning

Starting with systems:
You are an [investigative journalist] working for a [national broadcaster] aimed at an audience aged [20-40]. Identify parts of the [UK school] system that might be suitable for an investigation.

https://onlinejournalismblog.com/2024/07/10/investigative-journalism-and-chatgpt-using-generative-ai-for-story-ideas/
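The bracketed slots in the prompt above can be treated as template variables, so the same prompt is reusable across beats. What follows is a minimal sketch using Python's standard `string.Template`; the slot names (`role`, `outlet`, `audience_age`, `system`) and the function name are illustrative assumptions, not part of the talk.

```python
from string import Template

# Hypothetical template mirroring the bracketed slots on the slide.
SYSTEMS_PROMPT = Template(
    "You are an $role working for a $outlet aimed at an audience aged "
    "$audience_age. Identify parts of the $system system that might be "
    "suitable for an investigation."
)


def build_systems_prompt(role, outlet, audience_age, system):
    """Fill every slot; substitute() raises KeyError if a slot is left out."""
    return SYSTEMS_PROMPT.substitute(
        role=role, outlet=outlet, audience_age=audience_age, system=system
    )


# Values taken from the slide's own example.
prompt = build_systems_prompt(
    "investigative journalist", "national broadcaster", "20-40", "UK school"
)
print(prompt)
```

Swapping the four values (for example a different system or audience) gives a new prompt without rewriting the wording.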

Starting with rules:
You are a [UK journalist writing for a rural audience] looking for [feature] story ideas relating to [education]. What rules do [UK schools] have to follow in relation to [school uniform]?

Why rules?
●Rules having unintended consequences
●Rules not being followed
●Rules not being enforced
●The need for new rules

Applying a framework
Take idea 2. Apply the [SCAMPER method] to it: [suggest variations of the idea which substitute elements, combine it with other ideas, adapt, modify, put to another use, eliminate elements, or reverse].
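The SCAMPER prompt above asks for all seven moves at once. One possible variation, sketched here as an assumption rather than the method used in the talk, is to generate one follow-up prompt per move so each variation can be reviewed separately. The function name and prompt wording are hypothetical.

```python
# The seven SCAMPER moves, worded as they appear in the prompt on the slide.
SCAMPER_MOVES = [
    "substitute elements",
    "combine it with other ideas",
    "adapt",
    "modify",
    "put to another use",
    "eliminate elements",
    "reverse",
]


def scamper_prompts(idea_label):
    """Return one follow-up prompt per SCAMPER move for the given idea."""
    return [
        f"Take {idea_label}. Apply the SCAMPER method to it: "
        f"suggest one variation of the idea using the '{move}' step."
        for move in SCAMPER_MOVES
    ]


prompts = scamper_prompts("idea 2")
for p in prompts:
    print(p)
```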

https://medium.com/@headlinexplorer/investigative-journalism-iii-hypothesis-based-inquiry-557f48ea8dff

7 common angles for data stories
●Scale
●Change
●Ranking
●Variation
●Explore
●Relationships
●Bad/open data
(+ Leads)
Icons: the Noun Project: Becris (scale), Adrian Coquet (change and ranking), Kirby Wu (variation),
Aradila Studio (explore) Trevor Dsouza (relationships), Iconpai (bad data), Kirill Ulitin (leads)
https://onlinejournalismblog.com/2020/08/11/here-are-the-7-types-of-stories-most-often-found-in-data/

Injecting knowledge
A new story reports that [scientists have confirmed climate change is a major reason the UK suffered such a waterlogged winter]. You are a [data journalist] looking to do a [data-driven follow-up investigation] on this aimed at a [UK] audience. Come up with 5 short pitches for story ideas that you can present to your editor when you next meet her.

Research opportunities
●FOI
●Advanced search (‘Google dorks’)
●Scraping (help with code)
●Analysis (help with formulae, cleaning)
●Document sets (search across)
●Diverse sourcing
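As an illustration of the advanced search item in the list above, the sketch below assembles a query string from the widely used `site:` and `filetype:` operators plus a quoted exact phrase. The function name and the example search values are hypothetical and not taken from the talk.

```python
def build_search_query(terms, site=None, filetype=None, exact_phrase=None):
    """Assemble an advanced search query from optional operators."""
    parts = []
    if exact_phrase:
        # Quotation marks ask the search engine for an exact-phrase match.
        parts.append(f'"{exact_phrase}"')
    parts.extend(terms)
    if site:
        # Restrict results to one domain.
        parts.append(f"site:{site}")
    if filetype:
        # Restrict results to one file type, e.g. spreadsheets.
        parts.append(f"filetype:{filetype}")
    return " ".join(parts)


# Hypothetical example: spreadsheets about school uniform on government sites.
query = build_search_query(
    ["exclusions"], site="gov.uk", filetype="xlsx", exact_phrase="school uniform"
)
print(query)
```

A genAI tool can be asked to draft queries like this, but the result is easy to check by running the search itself.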

Opportunities vs risks

Story stage: Idea generation
Opportunities:
1. Little time for idea exploration
2. Low exposure to potential inspiration
3. Limited usage of models such as SCAMPER, systems thinking etc.
GenAI can increase exposure and reduce time required. Prompts can require application of models.
Risks:
●Hallucination: low (ideas can be easily rejected)
●Knowledge cutoff: medium (new information may need to be injected)
●Bias: high (prompts must specify diversity)
●Environmental: high (results can be achieved without genAI)

Story stage: Story research
Opportunities:
1. Research lacks diversity
2. Lack of time for document exploration
GenAI can increase diversity and assist with technical barriers.
Risks:
●Hallucination: medium (material must be checked)
●Knowledge cutoff: medium (new information may need to be injected)
●Bias: high (prompts must specify diversity)
●Environmental: medium (genAI required for some tasks)

Opportunities:
1. Lack of literacy/confidence in FOI, scraping and advanced search
2. Data needs cleaning
Risks:
●Hallucination: low (easily testable)
●Knowledge cutoff: low (not reliant on up to date info)
●Bias: low
●Environmental: high (results could be achieved without genAI, but would require training)
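The risk ratings above can be kept as a small lookup table so a newsroom or student can check which risks need attention before using genAI at a given stage. This is a sketch only: the ratings are copied from the slide, while the label for the third group and the function name are assumptions made for illustration.

```python
# Ratings copied from the slide. The third key is a hypothetical label:
# the slide does not name that row separately.
RISKS = {
    "Idea generation": {
        "Hallucination": "low",
        "Knowledge cutoff": "medium",
        "Bias": "high",
        "Environmental": "high",
    },
    "Story research": {
        "Hallucination": "medium",
        "Knowledge cutoff": "medium",
        "Bias": "high",
        "Environmental": "medium",
    },
    "Story research: FOI, scraping, search, cleaning": {
        "Hallucination": "low",
        "Knowledge cutoff": "low",
        "Bias": "low",
        "Environmental": "high",
    },
}


def risks_at_level(stage, level):
    """Return the risk names rated at the given level for one stage, sorted."""
    return sorted(name for name, rating in RISKS[stage].items() if rating == level)


high_for_ideas = risks_at_level("Idea generation", "high")
print(high_for_ideas)
```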

You are an experienced editor who hates it when a journalist gets too close to a story and ‘overwrites’ it - that is, uses too many words, or sentences that are redundant. You also hate jargon and prefer clear language that is accessible to a normal reader.

Identify any potential jargon or overwriting in this story and the steps the journalist can take to address those.

https://onlinejournalismblog.com/2024/09/26/using-generative-ai-as-a-spreadsheet-and-data-cleaning-assistant/

Your publication has just introduced a policy of ensuring that all stories reflect the diversity of the communities it seeks to serve. Those communities include people of different faiths, social class, ethnicity, gender, and sexuality. It includes both rural and urban readers, old and young, and people with disabilities.

Identify any potential bias in this article and steps the journalist can take to address those.

https://onlinejournalismblog.com/2024/10/10/identifying-bias-in-your-writing-with-generative-ai/

Key points
●Opportunities are highest, and risk lowest, in the earlier stages of investigations
●GenAI works best to extend thinking beyond limitations of experience and routine
●Risk assessment needed, including environmental impact of usage