Hudson Vitale "AI Essentials: From Tools to Strategies: A 2025 NISO Training Series, Session Two - Integrity in an AI Era: From Research to Algorithms"


About This Presentation

This presentation was provided by Cynthia Hudson Vitale of Johns Hopkins University, for the second session of the NISO training series "AI Essentials: From Tools to Strategies." Session Two, "Integrity in an AI Era: From Research to Algorithms," was held on October 16, 2025.


Slide Content

AI Essentials: From Tools to Strategies

Week 2: Integrity in an AI Era: From Research to Algorithms

Examine integrity across research, publishing, and AI systems. Experience and assess real tools that promote trust, transparency, and accountability in the use of AI.

Integrity

The Singapore Statement on Research Integrity (2010)

Purpose: Provides a global framework for responsible research conduct. Intended to complement national and institutional policies. Developed at the 2nd World Conference on Research Integrity (Singapore, 2010).

Four Core Principles:
- Honesty in all aspects of research: reporting data, methods, and results truthfully.
- Accountability for the research itself, the resources used, and training of others.
- Professional Courtesy and Fairness in working with colleagues, collaborators, and students.
- Good Stewardship of Research on Behalf of Others: using funds, data, and materials responsibly.

Integrity

- Research integrity (the work itself): Are data/methods sound, results reproducible, contributions and authorship transparent?
- Publication integrity (the scholarly record): Do provenance, metadata, and peer-review processes keep the record trustworthy?
- Algorithmic integrity (the systems): Are the tools/models we use fair, explainable, and sustainable, and do they respect data provenance and consent?

Four Risk Zones

As AI becomes embedded across the research lifecycle, integrity is tested in four main risk zones:
- Research Fraud & Paper Mills
- Authorship & Attribution
- Algorithmic Bias & Transparency
- Environmental Impact

| Integrity Type | Focus Question | Primary Risk Zones That Challenge It |
| --- | --- | --- |
| Research Integrity (the work itself) | Are data and methods sound, results reproducible, and contributions transparent? | Research Fraud & Paper Mills; Authorship & Attribution |
| Publication Integrity (the scholarly record) | Do provenance, metadata, and peer-review processes keep the record trustworthy? | Research Fraud & Paper Mills; Authorship & Attribution; Algorithmic Bias & Transparency |
| Algorithmic Integrity (the systems and infrastructure) | Are the tools and models we use fair, explainable, and sustainable, and do they respect data provenance and consent? | Algorithmic Bias & Transparency; Environmental Impact |

Research Fraud & Paper Mills Threats: Fabricated manuscripts, data, or figures AI-generated “synthetic” papers with falsified content Coordinated paper mills exploiting weak peer review Integrity Infrastructure (for example) : Turnitin AI checker Grammarly Proofig ImageTwin

Authorship & Attribution Threats: Undisclosed AI assistance (“ghost authorship”) Inaccurate author contribution statements Image or data reuse without attribution Integrity infrastructure : ORCID CRediT Coalition for Content Provenance and Authenticity COPE Guidelines on AI Use in Scholarly Publications

Algorithmic Bias & Transparency Threats: Biased training data leading to skewed results Opaque model architectures and decision-making Over-reliance on black-box outputs in research assessment Integrity infrastructure: Model Cards (Hugging Face, Google) Dataset Nutrition Label AI Incident Database

Environmental Impact

Threats:
- High energy and water use in model training and inference
- Hidden compute costs across cloud infrastructure
- Environmental inequities: concentration of AI capacity in wealthy regions

Integrity infrastructure:
- Green Algorithms
- MLCO2 Impact Estimator
- CodeCarbon
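CodeCarbon instruments Python code directly: a tracker samples hardware power draw and local grid carbon intensity to estimate a run's footprint. A minimal sketch, assuming the codecarbon package is installed; the project name and workload are placeholders:

```python
# Minimal sketch: estimate the CO2-equivalent emissions of a block of
# compute by wrapping it in a CodeCarbon EmissionsTracker.
# Assumes: pip install codecarbon
from codecarbon import EmissionsTracker

tracker = EmissionsTracker(project_name="integrity-demo")
tracker.start()
try:
    # Stand-in for a real training or inference workload.
    total = sum(i * i for i in range(10_000_000))
finally:
    emissions_kg = tracker.stop()  # estimated kg CO2-eq for the block

print(f"Estimated emissions: {emissions_kg:.6f} kg CO2-eq")
```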

Tool Testing!

In breakout rooms, choose 2-3 tools or frameworks to explore and answer:
- What integrity problem does it address?
- What are its limitations?
- How might this tool or framework fit, or not fit, into your organization’s existing workflows?

Tool links

- Turnitin AI checker: https://www.turnitin.com/solutions/topics/ai-writing/
- Grammarly: https://www.grammarly.com/
- Proofig: https://www.proofig.com/
- ImageTwin: https://imagetwin.ai/
- ORCID: https://orcid.org/
- CRediT (Contributor Roles Taxonomy): https://credit.niso.org/
- Coalition for Content Provenance and Authenticity (C2PA): https://c2pa.org/
- COPE Guidelines on AI Use in Scholarly Publications: https://publicationethics.org/guidance/cope-position/authorship-and-ai-tools
- Hugging Face model cards: https://huggingface.co/docs/hub/en/model-cards
- Dataset Nutrition Label: https://arxiv.org/abs/1805.03677
- AI Incident Tracker: https://airisk.mit.edu/ai-incident-tracker
- CodeCarbon: https://codecarbon.io/
- Green Algorithms: https://www.green-algorithms.org/
- MLCO2 Impact calculator: https://mlco2.github.io/impact/

Report out

Scenario Integration

- Manuscript Review: A publisher is piloting a detection or provenance-tracking tool.
- Institutional Repository: A library wants to embed model cards or dataset labels for datasets or AI models it publishes.
- Research Office: The campus is drafting a disclosure policy for AI use in grant proposals or publications.
- Data Services: The library supports faculty with code- or model-based research and wants to measure sustainability.

Focus Questions

- Purpose & Fit: Are there any tools or frameworks that could aid in this scenario? What value would they add?
- Integration Pathways: What policies, partnerships, or infrastructure would need to be in place for this to work?
- Stakeholders: Who would need to be involved (IT, library, research office, faculty, vendors)?
- Risks & Trade-offs: What new challenges might it introduce (privacy, bias, governance, cost)?
- Next Step: What is one small, concrete step your organization could take to pilot or adapt this approach?

Wrap up!