FAIRSpectra - Enabling the FAIRification of Spectroscopy and Spectrometry

AlexHendersonManchester 92 views 28 slides May 07, 2024

About This Presentation

Presentation from the "Opening up Research" conference organised by the University of Manchester's Office for Open Research.
24 April 2024
Video - https://doi.org/10.48420/26678713.v1
https://www.openresearch.manchester.ac.uk/
https://fairspectra.net
https://alexhenderson.info


Slide Content

FAIRSpectra
Enabling the FAIRification of
Spectroscopy and Spectrometry
Alex Henderson
University of Manchester
Office for Open Research
https://fairspectra.net
https://alexhenderson.info

Thanks…
•For financial support
•University of Manchester’s Office for Open Research
•SurfaceSpectra Ltd.
•For in-kind support (free exhibition space)
•UK Surface Analysis Users Forum (UKSAF)
•SIMS Europe
•SpringSciX 2024
•101st IUVSTA Workshop (The International Union for Vacuum Science, Technique and Applications)
•Zulip (free upgrade)

What is FAIR?
The FAIR Guiding Principles
•Findable
•Accessible
•Interoperable
•Reusable
https://www.go-fair.org/

What are Spectroscopy and Spectrometry?
•Instrumentation-based chemical analysis
•mass spectrometry (MALDI, SIMS, DESI)
•UV-vis / infrared / Raman spectroscopies
•NMR
•X-ray diffraction
•…
•Variants of each technique have own requirements
•Need to consider combination of techniques → data fusion

Modalities
•Single spectrum
•Collections of spectra
•Spectral maps
•Multispectral images
•Hyperspectral images
•3D images
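
One way to picture how these modalities differ is by the shape of the array each one produces. The sketch below is illustrative only: the axis names are assumptions made for this example and are not a FAIRSpectra convention.

```python
# Illustrative only: axis names are assumptions, not an agreed standard.
MODALITY_AXES = {
    "single spectrum": ("channel",),
    "collection of spectra": ("spectrum", "channel"),
    "spectral map": ("y", "x", "channel"),
    "multispectral image": ("y", "x", "band"),      # a few discrete bands
    "hyperspectral image": ("y", "x", "channel"),   # a full spectrum per pixel
    "3D image": ("z", "y", "x", "channel"),
}


def spatial_rank(modality):
    """Count the spatial axes (x, y, z) a modality carries."""
    return sum(1 for axis in MODALITY_AXES[modality] if axis in ("x", "y", "z"))


def count_values(shape):
    """Number of stored intensity values for an array of the given shape."""
    total = 1
    for size in shape:
        total *= size
    return total


if __name__ == "__main__":
    for name, axes in MODALITY_AXES.items():
        print(f"{name}: axes={axes}, spatial rank={spatial_rank(name)}")
```

A file format that covers all six cases has to cope with anything from one to four axes, which is part of why a single-spectrum format does not stretch to imaging data.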

Typical infrared spectrum of tissue. Courtesy Peter Gardner @ Manchester

X-ray photoelectron spectra. Range of carbon functionalities in polymers. SurfaceSpectra Ltd.

Secondary ion mass spectrometry spectra from a range of urinary-tract infection bacteria. John Fletcher @ Manchester now Gothenburg

Co-located infrared and Raman spectroscopy spectra of neuroglioma cells. Photothermal Inc.

Secondary ion mass spectrometry image of arsenic, sulfur and phosphate abundance in rice. Katie Moore @ Manchester

What is hyperspectral imaging?
•Operates more like a camera, with multiple image elements
•128 × 128 pixels, liquid nitrogen cooled
•Mosaic these ‘pictures’ to cover large areas
Courtesy of Peter Gardner @ Manchester
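
The arithmetic behind mosaicking can be sketched as follows. Only the 128 × 128 tile size comes from the slide; the 10 × 10 tile grid, the 1,000 spectral channels and the 4-byte value size are assumptions chosen to show how quickly data volume grows.

```python
TILE_SIZE = 128  # pixels per side of one detector 'picture' (from the slide)


def mosaic_summary(tiles_x, tiles_y, n_channels, bytes_per_value=4):
    """Size of a mosaic built from tiles_x by tiles_y detector tiles."""
    width = tiles_x * TILE_SIZE
    height = tiles_y * TILE_SIZE
    n_spectra = width * height  # one spectrum per pixel
    n_bytes = n_spectra * n_channels * bytes_per_value
    return {
        "width_px": width,
        "height_px": height,
        "n_spectra": n_spectra,
        "n_bytes": n_bytes,
    }


def locate_pixel(x, y):
    """Map a mosaic pixel (x, y) to (tile column, tile row, local x, local y)."""
    tile_col, local_x = divmod(x, TILE_SIZE)
    tile_row, local_y = divmod(y, TILE_SIZE)
    return tile_col, tile_row, local_x, local_y


# Assumed example values: 10 x 10 tiles, 1,000 channels, 4-byte values.
summary = mosaic_summary(tiles_x=10, tiles_y=10, n_channels=1000)

if __name__ == "__main__":
    print(summary["n_spectra"])      # 1638400 spectra
    print(summary["n_bytes"] / 1e9)  # 6.5536 (GB, uncompressed)
```

Under these assumed numbers a single mosaic holds more than 1.6 million spectra, so the raw data run to several gigabytes before any processing.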

Infrared spectroscopy hyperspectral image of prostate cancer tissue. False coloured Random Forests classification of spectra. Peter Gardner @ Manchester
Figure labels: H&E optical FTIR. Classes: Epithelium, Smooth Muscle, Lymphocytes, Blood, Concretion, Fibrous Stroma, ECM.

Secondary ion mass spectrometry 3D depth profile image, topography corrected and false coloured: green = lipid (DPPC), red = nucleic acid (adenine). John Fletcher & Alex Henderson @ Manchester

Secondary ion mass spectrometry 3D depth profile image showing cholesterol distribution in surface of frog oocyte. John Fletcher @ Manchester, now Gothenburg

X-ray photoelectron spectroscopy image of nitrogen and specific carbon functionality in caffeine tablet. Alex Henderson @ Manchester with Kratos Analytical.

What are the issues?
Academia
Funders require ‘data’ to be deposited in (open) repositories
But…
•No dedicated repositories
•Metadata terms are patchy
•Instrument data in proprietary file formats
•Many software packages not compatible with open formats
Researchers willing to share, but don’t know how

What are the issues?
Commercial activity needs to be considered
Barriers
•FAIR often confused with Open
•In-house processes considered good enough
•Worry about certain metadata usage giving secrets away
Benefits
•Easier to share data in-house, between labs and (overseas) sites
•FAIR practices lead to better records retention
•Acquisitions and mergers become more straightforward
•Third-party (open source) software becomes easily accessible
•Incoming staff already familiar with systems
Instrument manufacturer buy-in is vital

Moving forward
https://xkcd.com/2116/

What is FAIRSpectra?
Community driven initiative
Focus on hyperspectral imaging techniques
•File formats for hyperspectral imaging
•No standards exist right now
•Software tools to support these
•Metadata requirements
•Education and training
•Raising awareness

What is FAIRSpectra?
https://fairspectra.net https://fairspectra.zulipchat.com https://github.com/FAIRSpectra

Which metadata are required?
Sample
•Sampling method
•Storage conditions
•Chemical modifications
•Physical state
•Pre-treatment
•…
Experiment
•Experiment plan
•Substrate material
•Mounting method
•Region analysed
•Instrument params
•…
Analysis
•Artifact removal
•Pre-processing
•Algorithm choice
•Hyperparameters
•Validation method
•…
Downstream reporting
Upstream sample provenance
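
The three metadata groups above could be captured in a structured record along the following lines. This is a minimal sketch: the class names, field types and example values are assumptions for illustration, not an agreed FAIRSpectra schema. Only the field names follow the slide.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class SampleMetadata:
    sampling_method: str = ""
    storage_conditions: str = ""
    chemical_modifications: str = ""
    physical_state: str = ""
    pre_treatment: str = ""


@dataclass
class ExperimentMetadata:
    experiment_plan: str = ""
    substrate_material: str = ""
    mounting_method: str = ""
    region_analysed: str = ""
    instrument_params: dict = field(default_factory=dict)


@dataclass
class AnalysisMetadata:
    artifact_removal: str = ""
    pre_processing: list = field(default_factory=list)
    algorithm_choice: str = ""
    hyperparameters: dict = field(default_factory=dict)
    validation_method: str = ""


@dataclass
class MeasurementRecord:
    """Hypothetical container joining the three metadata groups."""
    sample: SampleMetadata = field(default_factory=SampleMetadata)
    experiment: ExperimentMetadata = field(default_factory=ExperimentMetadata)
    analysis: AnalysisMetadata = field(default_factory=AnalysisMetadata)

    def to_json(self):
        """Serialise the whole record as human-readable JSON."""
        return json.dumps(asdict(self), indent=2, sort_keys=True)

    def missing_fields(self):
        """List 'group.field' names that are still empty."""
        missing = []
        for group, values in asdict(self).items():
            for name, value in values.items():
                if not value:
                    missing.append(f"{group}.{name}")
        return missing


# Placeholder values, invented for this example.
record = MeasurementRecord()
record.sample.physical_state = "solid"
record.experiment.substrate_material = "silicon wafer"

if __name__ == "__main__":
    print(record.to_json())
    print(record.missing_fields())
```

Listing the empty fields makes the gaps visible at deposit time, which is one possible way to support a minimum reporting requirement.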

Where do we start?
Born-digital metadata in data files
Limited/common options
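
Where a field has only a limited set of common options, it can be checked against a controlled vocabulary. The sketch below assumes two hypothetical fields and term lists; real vocabularies would need community agreement.

```python
# Hypothetical controlled vocabularies, invented for this example.
ALLOWED = {
    "technique": {"SIMS", "MALDI", "DESI", "infrared", "Raman", "XPS"},
    "intensity_unit": {"counts", "absorbance", "arbitrary"},
}


def validate(params):
    """Return a list of problems found in a dict of instrument parameters."""
    problems = []
    for key, allowed in ALLOWED.items():
        if key not in params:
            problems.append(f"missing: {key}")
        elif params[key] not in allowed:
            problems.append(f"unrecognised {key}: {params[key]!r}")
    return problems


if __name__ == "__main__":
    print(validate({"technique": "SIMS", "intensity_unit": "counts"}))
    print(validate({"technique": "sims"}))
```

The check is deliberately case-sensitive here, so "sims" is rejected. Whether terms should be normalised before comparison is a design decision this sketch leaves open.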

Where do we start?
Many workflows have common steps
Default hyperparameters
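
If common steps carry default hyperparameters, a workflow record only needs to state what was changed. A minimal sketch, assuming hypothetical step names and default values:

```python
# Hypothetical step names and default values, for illustration only.
DEFAULTS = {
    "baseline_correction": {"polynomial_order": 2},
    "normalisation": {"method": "vector"},
    "pca": {"n_components": 10},
}


def resolve_workflow(steps):
    """Expand [(step name, overrides), ...] into fully specified steps."""
    resolved = []
    for name, overrides in steps:
        if name not in DEFAULTS:
            raise KeyError(f"unknown step: {name}")
        params = {**DEFAULTS[name], **overrides}  # overrides win over defaults
        resolved.append(
            {"step": name, "params": params, "overridden": sorted(overrides)}
        )
    return resolved


if __name__ == "__main__":
    workflow = [("normalisation", {}), ("pca", {"n_components": 5})]
    for entry in resolve_workflow(workflow):
        print(entry)
```

Keeping the list of overridden names alongside the resolved values lets a reader see at a glance where an analysis departed from the defaults.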

Where do we start?
Samples are so varied that this is very difficult

Where do we start?
Workflows also include repeated steps with poorly defined break points

Where are we now – 6 months in?
Exhibition booths at 3 conferences/workshops
•More planned, another 3 before Christmas
Survey
•Positives
•Everyone wanted to see something done, not sure about how
•Barriers
•People have difficulty sharing
•Poor documentation
•Proprietary file formats – loss of information
•Raw data versus processed data – large file size
•Gazumping / IP & prior art / confidentiality
•Time consuming

Where are we now – 6 months in?
Data file formats
•Looking to re-purpose strategies from astronomy, climate science, and microscopy
•Some file convertors ready for testing
•Suitable test files an issue – everyone’s a critic
Two instrument vendors interested in getting involved
•Need to be careful not to go too fast
•Only one shot at changing instrument software
Discussions with journals just beginning
•Need minimum reporting requirement
•Need to convince referees this is important
Developing connections with BioFAIR, ELIXIR and ISO

Summary
The researchers are willing, but their resources are weak
•Few solutions currently exist
•Metadata terms missing
•Proprietary file formats are a barrier
•Instrument vendor buy-in required
•Lack of awareness persists
But…
•Some low-hanging fruit
•Opportunity to make an impact
•Even Closed FAIR can still have benefits to industry
There’s lots to do, but FAIRSpectra is just getting started!
https://fairspectra.net