The anti-bias machine: AI in journalism education

onlinejournalist 1,109 views 65 slides Oct 10, 2024

About This Presentation

AI is biased because we are. This presentation explores opportunities to address several forms of bias in journalism - including bias in access to the industry - and offers some practical tips for tackling bias with AI.

Keynote for the Nordic Journalism Education conference, Kristiansand, October 2024


Slide Content

The anti-bias
machine

AI in journalism education

I will talk about…
●AI is biased because we are biased
●Opportunities to widen access
●Some practical tips

But first.

Reuters Institute: Digital News Report 2018
It was being used before ChatGPT…

https://bfi.uchicago.edu/insights/the-adoption-of-chatgpt

AP: Generative AI in Journalism report (2024)

Cision State of the Media Report 2024
Journalists aren’t using it as much as it might seem

Cision State of the Media Report 2024
And they’re primarily using it like Google…

Consent in Crisis: The Rapid
Decline of the AI Data Commons
Whereas most other people…

Meanwhile…
…in the universities

“The University position is not to impose a blanket
restriction on the use of generative AI, but rather to:
Emphasise the expectation that assignments
should contain students’ own original work”
Edinburgh University (2024): Generative Artificial Intelligence guidance for students

“Note that if you rely on AI-generated precis of
sources, you won't enhance your own skills of
textual or data analysis. These are key skills for
university level study.”
York St John University https://www.yorksj.ac.uk/policies-and-documents/generative-artificial-intelligence/

York St John University https://www.yorksj.ac.uk/policies-and-documents/generative-artificial-intelligence/
“However, as a starting point, you could use genAI
tools to: … Provide an accessible summary of
concepts, ideas or bodies of literature freely available
online.”

Ulster University: Guidance
on use of AI for students

“An opportunity to move away from testing
knowledge and memory, and head towards
developing and assessing more highly valued
graduate attributes such as critical thinking, sound
judgement and solution creation.”
Birmingham City University (2024): AI Tools BCU Guidelines: Student Responsibilities

Q14: Can I use an AI tool to write code or a computer program for an
assignment?
No, you should not use an AI tool to produce code or programs that form part of an assessed piece of work,
unless clearly stated in the assessment brief. A programming or coding assignment is required to be your own
original work to demonstrate your understanding and expertise in whichever programming language you are
using.

Birmingham City University (2024): AI Tools BCU Guidelines: Student Responsibilities

“However, as a starting point, you could use genAI
tools to: … Debug code”
York St John University https://www.yorksj.ac.uk/policies-and-documents/generative-artificial-intelligence/

Choose a metaphor
●The cat is out of the bag
●The horse has bolted
●The elephant in the room

Generated with AI

Obligatory list of problems
●Hallucination
●Bias (more)
●Quality (including accuracy, knowledge cutoff)
●Environmental impact (more; more/research)
●Information security, copyright

“A dead end for responsible journalism”
Jonathan Soma’s framework for assessing AI

Consent in Crisis: The Rapid
Decline of the AI Data Commons
Apparent uses of ChatGPT are misaligned with the popular web domains language models are trained on.
Figure 4(a) shows the distribution of services provided by the web domains, broken down by whether those
domains are monetized. In contrast, Figure 4(b) shows how ChatGPT is used in the real world. The way that
users interact with ChatGPT is different in important ways from the types of content that is most frequently
represented in publicly available web-based training datasets. For instance, in over 30% of conversations,
users request creative compositions such as fictional story writing or continuation, role-playing, or poetry.
However, creative writing is poorly represented among the web data used for model training

https://bfi.uchicago.edu/insights/the-adoption-of-chatgpt

Guidelines: crude, safety-first
Writers cannot put confidential information into AI tools and should ensure
other sources they use are “free of AI-generated content.” AP staff are asked to
avoid accidentally using AI content created to spread misinformation and should
verify the veracity of the content they use. (The Verge)
The Telegraph “has forbidden staff from incorporating AI-generated text into
copy except in limited circumstances, which require the sign-off of top Telegraph
editors and the legal department.” (PG)
VG’s guidelines ban generating photorealistic images as “AI-generated content
must never undermine the credibility of journalistic photography.” (Nieman)

Guidelines: permissive, not proactive
Insider journalists may use generative AI for creating “story outlines, SEO
headlines, copyediting, interview question generation, explaining concepts,
and summarizing old coverage.”
“Allowed uses [at Nucleo include] summarizing texts, suggesting
“alternative posts on social networks,” and as a tool for researching “subject
and themes.” Aftonbladet’s and VG’s journalists may “use AI technology in their
work to produce illustrations, graphics, models.” (Nieman)

I have a problem with the
problems

Problems
●Focusing on AI bias distracts from human bias
●Research focuses on what is being done, not what
could be done
●Top-down rather than bottom-up
●A focus on risks, not opportunities, and a binary
‘use/don’t use’ instead of ‘how can we use this safely’

●Generating ideas
●Understanding systems
●Understanding conventions and codes
●Distinguishing between pro and amateur
●Being critical of their own work
●Seeking help and feedback
●Connecting practice with theory
●Translating theory
●Cognitive bias: confirmation bias, groupthink, associative coherence
Bottom-up: what problems do students face?

Opportunities vs risks

Story stage: Idea generation
Problems: 1. Little time for idea exploration. 2. Low exposure to potential inspiration. 3. Limited usage of models such as SCAMPER, systems thinking etc.
Opportunity: GenAI can increase exposure and reduce the time required. Prompts can require application of models.
Risks: Hallucination: low (ideas can be easily rejected). Knowledge cutoff: medium (new information may need to be injected). Bias: high (prompts must specify diversity). Environmental: high (results can be achieved without genAI).

Story stage: Story research (sources/docs)
Problems: 1. Research lacks diversity. 2. Lack of time for document exploration.
Opportunity: GenAI can increase diversity and assist with technical barriers.
Risks: Hallucination: medium (material must be checked). Knowledge cutoff: medium (new information may need to be injected). Bias: high (prompts must specify diversity). Environmental: medium (genAI required for some tasks).

Story stage: Story research (scraping/FOI)
Problems: 1. Lack of literacy/confidence in FOI, scraping and advanced search. 2. Data needs cleaning.
Risks: Hallucination: low (easily testable). Knowledge cutoff: low (not reliant on up-to-date info). Bias: low. Environmental: high (results could be achieved without genAI, but would require training).

Risk assessment & method statement

Story stage: Idea generation
●Hallucination: Identify assumptions underpinning ideas, and research needed to check those
●Knowledge cutoff: Inject new information
●Bias: Prompts must specify diversity
●Environmental: Attempt without genAI

Story stage: Story research (sources/docs)
●Hallucination: Ensure prompts include references that can be checked
●Knowledge cutoff: Inject new information
●Bias: Prompts must specify diversity
●Environmental: Attempt without AI first

Story stage: Story research (scraping/FOI)
●Hallucination: Test results
●Knowledge cutoff: Check for recent changes in laws
●Bias: Prompts must specify diversity
●Environmental: Consider training instead of genAI
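The stage-by-stage pairings above lend themselves to a simple lookup that generates a per-story checklist. A minimal sketch in Python (the stage names and wording are taken from the slides; the `checklist` function and its tick-box format are my own illustration):

```python
# Risk -> mitigation pairs for each story stage, as listed on the slide.
RISK_METHODS = {
    "Idea generation": [
        ("Hallucination", "Identify assumptions underpinning ideas, and research needed to check those"),
        ("Knowledge cutoff", "Inject new information"),
        ("Bias", "Prompts must specify diversity"),
        ("Environmental", "Attempt without genAI"),
    ],
    "Story research: sources/docs": [
        ("Hallucination", "Ensure prompts include references that can be checked"),
        ("Knowledge cutoff", "Inject new information"),
        ("Bias", "Prompts must specify diversity"),
        ("Environmental", "Attempt without AI first"),
    ],
    "Story research: scraping/FOI": [
        ("Hallucination", "Test results"),
        ("Knowledge cutoff", "Check for recent changes in laws"),
        ("Bias", "Prompts must specify diversity"),
        ("Environmental", "Consider training instead of genAI"),
    ],
}

def checklist(stage: str) -> list[str]:
    """Return tick-box lines for one story stage."""
    return [f"[ ] {risk}: {method}" for risk, method in RISK_METHODS[stage]]

for line in checklist("Idea generation"):
    print(line)
```

Students could be asked to tick off each line, and note the evidence, before a story is filed.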

Opportunities

Opportunities
●To tackle newsroom limitations
●To tackle our own bias
●To develop risk assessments and strategies

Chris Moran (head of editorial innovation, The Guardian):
How Claude is helping release my inner developer
At this point I needed to come clean with it and share the scale of my ignorance.
I barely even really know how to open Terminal on my Mac. I have no idea if I
have Python installed. I find the terminal impenetrable. Can you talk me through
every single detail step by step?
Sharing that with anyone reading is mortifying. But that's one of the reasons why
using Claude for the job of tutoring is so empowering. It doesn't judge and it
absolutely will not stop until you understand something (well, or until its context
window runs out).

The bias opportunity
To tackle our bias
(rather than being distracted from it)

Which biases?
●Bias in language
●Bias in sourcing
●Bias in access to the industry (students):
access to cultural capital and codes
●Cognitive biases

Your publication has just introduced a
policy of ensuring that all stories
reflect the diversity of the communities
it seeks to serve. Those communities
include people of different faiths,
social class, ethnicity, gender, and
sexuality. It includes both rural and
urban readers, old and young, and people
with disabilities.

Identify any potential bias in this
article and steps the journalist can
take to address those.

https://onlinejournalismblog.com/2024/10/10/identifying-bias-in-your-writing-with-generative-ai/
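A prompt like the one above can be parameterised so a publication's diversity policy is stated once and reused across stories. A hypothetical helper (the wording mirrors the slide; the `bias_check_prompt` name and parameters are my own):

```python
def bias_check_prompt(communities: list[str], article: str) -> str:
    """Build the bias-identification prompt from the slide, with the
    list of communities supplied as a parameter."""
    groups = ", ".join(communities)
    return (
        "Your publication has just introduced a policy of ensuring that "
        "all stories reflect the diversity of the communities it seeks "
        f"to serve. Those communities include people of different {groups}.\n\n"
        "Identify any potential bias in this article and steps the "
        "journalist can take to address those.\n\n"
        f"Article:\n{article}"
    )

prompt = bias_check_prompt(
    ["faiths", "social class", "ethnicity", "gender", "sexuality"],
    "DRAFT ARTICLE TEXT HERE",
)
```

Keeping the community list in one place means the policy can be updated without rewriting every prompt.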

“Selecting people to interview … is
something I've had trouble in the past
with.”

You are a specialist [health journalist]
with years of experience working in the
sector. You are planning an
investigative feature on the [influence
of the private sector on NHS dentistry]
and are especially keen to make sure
that the story draws on a more diverse
range of sources than stories normally
do.

Make a list of sources that you might
approach to get quotes and background
for the story.

“The suggestions it made helped me consider
sources I hadn’t thought of contacting.”

Make a list of data sources that you
might use. Make sure sources are
included which have data on
under-represented groups.

Suggest document-based sources for the
feature. Make sure sources are included
which relate to under-represented
groups.

“I turned to AI for suggestions on suitable
story angles, but the outcomes did not align
with my expectations. Nevertheless, this
experience prompted me to view the data
from a different perspective.”

Laura Hazard Owen: where diversity doesn’t exist
The Baltimore Times is a free weekly online newspaper that covers positive stories
about the African American community and has a “staff of three-and-a-half,” said
associate publisher Paris Brown. It uses AI to create diverse content when they
can’t get the real thing: it used AI to create “culturally relevant,” “diverse” digital
voices to read articles aloud. Brown also gave an example of a
prompt they gave image generator DALL-E:
“A-diverse-group-of-black-doctors-both-men-and-women-teaching-young-black-students-in-a-laboratory-setting-the-image-should-depict-the-doctors-wearing-white-jackets.”

The mirror

Just some biases journalists fall foul of:
●Out-group homogeneity bias
●Confirmation bias
●The anecdotal fallacy
●False consensus effect
●Clustering illusion
●Survivorship bias
●Illusory correlation bias
●The halo effect
●What you see is all there is

You are a data journalist who has
identified that the number of children
needing help with speech and language in
their first year of school has increased
substantially in the last two years. You
think this is because of the worldwide
pandemic, but suggest five alternative
hypotheses for why this might be the
case.

Here are five other potential reasons for this phenomenon:
1.Changes in screening and assessment: There may have been improvements or changes in how
schools assess speech and language skills, leading to more accurate identification of children who
need support. New assessment tools or protocols could be detecting issues that were previously
overlooked.
2.Increased awareness and reporting: Parents, teachers, and healthcare providers might be more
aware of speech and language development milestones. This increased awareness could lead to
more frequent reporting and referrals for assessment, rather than an actual increase in problems.
3.Demographic shifts: Changes in local demographics, such as an increase in immigrant populations
or families speaking languages other than the primary language of instruction, could contribute to
higher numbers of children needing language support in their first year of school.
4.Environmental factors: Increased exposure to screens and digital devices from an early age might
be impacting language development. This could be due to reduced face-to-face interactions and
conversations with caregivers and peers.
5.Socioeconomic changes: Economic downturns or shifts in employment patterns might have led to
changes in childcare arrangements or reduced access to early childhood education programs. This
could result in fewer opportunities for language development before entering school.

Checklists are
FANTASTIC.

“The Surgical Safety Checklist has been shown to
reduce complications and mortality by over 30
percent.”
https://www.who.int/teams/integrated-health-services/patient-safety/research/safe-surgery

You are an experienced editor who
previously trained in psychology. You
have a deep understanding of cognitive
biases and always look to identify
cognitive bias in stories that you edit.
Here is an explanation of just some of
those cognitive biases… [paste
explanation]

Provide feedback on this story
specifically targeted at identifying
cognitive biases that may affect the
writer of the piece.

Identify which cognitive biases relate
to which sections, and include concrete
suggestions for reducing the impact of
each cognitive bias on the story.
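The persona and the task above can be assembled into the system/user message pair that most chat-style LLM APIs accept. A sketch, assuming the common role/content message shape (the `editor_messages` helper is my own illustration, not part of any particular API):

```python
def editor_messages(bias_notes: str, story: str) -> list[dict]:
    """Assemble the cognitive-bias editor prompt from the slides as a
    system/user message pair in the role/content shape used by most
    chat APIs."""
    system = (
        "You are an experienced editor who previously trained in "
        "psychology. You have a deep understanding of cognitive biases "
        "and always look to identify cognitive bias in stories that "
        f"you edit. Here is an explanation of just some of those "
        f"cognitive biases: {bias_notes}"
    )
    user = (
        "Provide feedback on this story specifically targeted at "
        "identifying cognitive biases that may affect the writer of "
        "the piece. Identify which cognitive biases relate to which "
        "sections, and include concrete suggestions for reducing the "
        f"impact of each cognitive bias on the story.\n\n{story}"
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user},
    ]

msgs = editor_messages("Confirmation bias: ...", "STORY TEXT HERE")
```

Putting the persona in the system message and the story in the user message keeps the "editor" role stable across every story a student submits.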

Beyond ChatGPT
●Custom GPTs: creating ‘guardrails’ to restrict
undesired usage; quality control
●NotebookLM: an interface to knowledge
●Custom LLMs: renewed emphasis on
newsgathering (training material)
●Built-in AI: Colab’s code hints, formula
autosuggest

https://chatgpt.com/g/g-hG2J6f3V0-paul-s-mini-editor

https://notebooklm.google.com/notebook/c8ad478a-0328-40d5-9051-01dc4d4d7ec0?original_referer=https:%2F%2Fnotebooklm.google%23&pli=1

Next steps

Prompt design
●Teaching a language, not a tool
●Ensuring genAI “understands” us…
●…and we understand it
●New emphasis on newsgathering
(original/training material)
●New emphasis on verification

A review of teaching
●What is our objective?
●What barriers are students likely to face?
●What role can genAI play?
●Is it practical or realistic to restrict the use of genAI?
●What might be lost and gained by doing so?
●What risks are there in using genAI for that?
●What steps can be taken to reduce those?
●What opportunities are there for reflection and review?

“When reviewing the answers I was sure to
pay attention to the answers provided to
check for any potential bias in its answers”

Change
●GenAI means interacting with information, not
merely reading it
●Journalism’s role in a genAI world?
●Supply and demand: writing vs reporting

Key points
●Use genAI as an opportunity to hold up a
mirror to our own biases
●Explore risk assessment and checklists
●Identify where AI can lower barriers to access

Questions?