berlin-april2024-epistemic-change-jws.pptx

Uploaded by ssuser7ddcbe, Aug 07, 2024 (30 slides)

About This Presentation

This presentation demonstrates, among other things, some of the many problems with the disruption index.


Slide Content

Algorithm for the comparative study of indicators on the Leiden database. Jesper W. Schneider, Emil B. Madsen & Jens Peter Andersen, Danish Centre for Studies in Research and Research Policy, Department of Political Science, Aarhus University, Denmark. [email protected]

Disclaimer

I am very critical of:
- the way we pretend to ‘measure’
- the way we misuse statistical inference
- the way we disregard theory
- the way we disregard our assumptions
- the way we ‘data mine’ noisy ‘Big Data’ and treat the (social) system as ergodic
- the way we mix up social and cognitive aspects of science
- the way we ‘oversell’ our descriptive claims and tell ‘stories’ about science and, along the way, claim to be ‘scientific’

‘Storytelling’, ‘cargo-cult science’ and ‘quantifauxcation’?

Motivation

Uzzi, B., Mukherjee, S., Stringer, M., & Jones, B. (2013). Atypical Combinations and Scientific Impact. Science, 342(6157), 468-472.

‘Teams’ translates into the less fancy term co-authors; ‘novelty’ translates into the less fancy rare combinations of journal references in the reference lists.

What is implied by ‘team’?
- Co-authorship ≠ ‘collaboration’
- Authorship = a mix of division of labor and QRPs
- The number of authors, which is generally rising, influences paper production, referencing and citing behaviors

What is implied by ‘novelty’? Besides the obvious fact that atypical combinations of cited journals are considered novel! Presumably, the assumption is that something in the paper beyond the reference list is ‘novel’? But what does the concept mean?
- Novelty as complexity? Novelty as difficulty? Novelty as surprise? Novelty as technical novelty? Novelty as usefulness or value?

A tremendous leap of faith is needed when inferring an epistemic quality of a paper from atypical cited-journal combinations in the reference list.
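Uzzi et al.'s actual measure z-scores journal pairs against randomized citation networks, which is computationally heavy (the resampling issue noted later in this deck). As a purely illustrative stand-in, not their procedure, one can score each journal pair in the reference lists by observed vs. expected co-occurrence under independence; pairs scoring far below 1 are the ‘atypical’ combinations. A minimal sketch, with all names our own:

```python
from collections import Counter
from itertools import combinations

def pair_commonness(reference_lists):
    """Score journal pairs by observed vs. expected co-occurrence.

    reference_lists: one list of cited-journal names per paper.
    Returns {(journal_a, journal_b): observed/expected}; values well
    below 1 mark rare ("atypical") combinations.  This independence
    ratio is a crude stand-in for Uzzi et al.'s z-scores against
    randomized citation networks.
    """
    pair_counts = Counter()
    journal_counts = Counter()
    n_papers = len(reference_lists)
    for refs in reference_lists:
        journals = set(refs)          # ignore duplicate cites of a journal
        journal_counts.update(journals)
        pair_counts.update(combinations(sorted(journals), 2))
    scores = {}
    for (a, b), observed in pair_counts.items():
        # expected co-occurrences if the two journals were cited independently
        expected = journal_counts[a] * journal_counts[b] / n_papers
        scores[(a, b)] = observed / expected
    return scores
```

On a toy corpus, the pair cited together less often than its journals' individual popularity predicts gets the lowest score, i.e. is the most ‘atypical’.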

Motivation

To us this looked like ‘storytelling’, and we wanted to scrutinize the ‘disruptive’ papers – a sort of conceptual replication.

Wu, L., Wang, D., & Evans, J. A. (2019). Large teams develop and small teams disrupt science and technology. Nature.

What is implied by ‘disruption’? Besides the obvious fact that a citing pattern is changed!
- The original idea comes from patents, where citing behavior is very different
- Many technical issues which are in themselves detrimental for the measures
- Presumably, the assumption is that something epistemic in the paper causes the citing pattern to change radically? But what does the concept mean?
- Could changes in citing patterns not just be a convenience issue? Most papers get scores around zero; the less citable potential, the more likely a paper will get a higher score

A tremendous leap of faith is needed when inferring an epistemic quality of a paper from a supposed radical change in citing patterns, and when assuming that the behaviors behind such changes are ergodic across the sciences.
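The ‘disruption’ score discussed here is the CD-type index of Funk & Owen-Smith used by Wu et al.: among later papers citing the focal paper or its references, n_i cite the focal paper but none of its references, n_j cite both, and n_k cite only the references; D = (n_i − n_j) / (n_i + n_j + n_k), running from −1 (developing/consolidating) to 1 (disruptive). A minimal sketch of that definition, our own illustration rather than any official implementation:

```python
def disruption_index(focal_id, focal_refs, later_papers):
    """Disruption (CD) index of a focal paper.

    focal_id:     identifier of the focal paper
    focal_refs:   set of ids the focal paper cites
    later_papers: iterable of reference-id sets, one per paper
                  published after the focal paper
    """
    n_i = n_j = n_k = 0
    for refs in later_papers:
        cites_focal = focal_id in refs
        cites_predecessors = bool(focal_refs & refs)
        if cites_focal and not cites_predecessors:
            n_i += 1   # cites focal paper only -> "disruptive" citer
        elif cites_focal:
            n_j += 1   # cites focal paper and its references
        elif cites_predecessors:
            n_k += 1   # cites only the focal paper's references
    denom = n_i + n_j + n_k
    return (n_i - n_j) / denom if denom else 0.0
```

Note how the score depends only on overlap patterns in reference lists: any restriction of `focal_refs` to ‘citable’ sources (as in the examples later in the deck) directly changes the result.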

Reproducibility: Challenges with open data and code

Openness

Openness
- Biased? Not validated
- Data processing issues? Not fully transparent
- Often fixed parameters – ‘push button’ like
- Perhaps somewhat suitable for ‘replication’ studies (error checking)
- Not particularly suitable for reproducibility, sensitivity, case and/or validation studies

Doing it ourselves: The ‘algorithm’ or ‘conceptual reproducibility’

Our setting and algorithm
- We use CWTS’ database infrastructure
- SQL database: structured and transparent, but challenging when it comes to intensive processing and computing
- Data processing is transparent (enabling deep analyses); flexibility in parameter choice
- Right now: implementations of variants of the ‘disruption’ index
- Next: implementation of Wang’s ‘novelty’ index; Uzzi’s is more problematic due to resampling
- This enables us to do comparative analyses in a ‘fixed’, transparent setting
- Moving from WoS to OpenAlex (same structure in the database)

Did we get it right? 0.86 / -0.58
- No. of references = 19 (some are notes); no. of citable references (CWTS journals covered, 1980) = 12
- No. of references = 6 (some are notes); no. of citable references (CWTS journals covered, 1980) = 2

Examining ‘disruption’ – technical issues: disruption score vs. counts [chart]

Examining ‘disruption’ – technical issues: disruption score vs. citable sources [chart]

Examining ‘disruption’ – technical issues: disruption scores across disruption index variants [chart]
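One published variant family, DI_l, presumably corresponds to the di5 scores shown two slides later: a citing paper only counts as ‘consolidating’ (n_j) if it cites at least l of the focal paper's references (di5 being l = 5). A sketch under that assumption, again our own illustration:

```python
def disruption_index_l(focal_id, focal_refs, later_papers, l=5):
    """DI_l variant of the disruption index.

    A later paper citing the focal paper falls into n_j only when it
    also cites at least `l` of the focal paper's references; with
    fewer shared references it counts as a "disruptive" citer (n_i).
    """
    n_i = n_j = n_k = 0
    for refs in later_papers:
        cites_focal = focal_id in refs
        shared = len(focal_refs & refs)
        if cites_focal and shared >= l:
            n_j += 1   # cites focal paper and >= l of its references
        elif cites_focal:
            n_i += 1   # cites focal paper, fewer than l references
        elif shared:
            n_k += 1   # cites references only
    denom = n_i + n_j + n_k
    return (n_i - n_j) / denom if denom else 0.0
```

With l = 1 this reduces to the plain index; raising l shrinks n_j, which is one reason variant choice matters for the score distributions compared above.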

The most ‘disruptive’ Danish papers from 2000 to 2015 (0.5-1)? Cool paper on why case studies are great!

Examining ‘disruption’ – conceptual issues

Examining ‘disruption’ – conceptual issues

https://www.science.org/content/blog-post/2022-nobel-prize-chemistry

“Let's start out with the concept of ‘click chemistry’ in general. That was an idea from Sharpless and co-workers that developed in the 1990s and was the subject of a widely-read article in Angewandte Chemie in 2001.”

“The same [chemical] reaction was discovered by Meldal and co-workers in Copenhagen, who were looking for new ways to assemble peptides and peptide-like compounds … Meldal’s work [was] published in 2002 …”

Examining ‘disruption’ – conceptual issues
- Sharpless and Meldal came up with the same ‘idea’ or ‘discovery’ or ‘innovation’ independently of each other at approximately the same time
- Only review delays dispersed their publications to 2001 and 2002, respectively
- The papers are different: Sharpless’ is conceptual (a review or summary), Meldal’s is empirical
- Sharpless’ has 215 references (notes) and 18 citable sources (CWTS)
- Meldal’s has 29 references and 8 citable sources (CWTS)

‘Disruption’ or ‘developer’ or just ‘quantifauxcation’?

di5  | Authors                               | pub_year | Title                                                                                                                                    | Source title                            | n_refs | citable sources | n_cits
0.30 | Tornoe, CW; Christensen, C; Meldal, M | 2002     | Peptidotriazoles on solid phase: [1,2,3]-triazoles by regiospecific copper(I)-catalyzed 1,3-dipolar cycloadditions of terminal alkynes to azides | JOURNAL OF ORGANIC CHEMISTRY            | 29     | 16              | 7211
0.02 | Kolb, HC; Finn, MG; Sharpless, KB     | 2001     | Click chemistry: Diverse chemical function from a few good reactions                                                                     | ANGEWANDTE CHEMIE-INTERNATIONAL EDITION | 215    | 116             | 10880

co-cited = 2,600+

A micro cluster

6 most cited papers:
- Kolb, HC; Finn, MG and Sharpless, KB (2001) – 10,730 citations
- Rostovtsev, VV; Green, LG; (...); Sharpless, KB (2002) – 9,586 citations
- Tornoe, CW; Christensen, C and Meldal, M (2002) – 6,877 citations
- Meldal, M and Tornoe, CW (2008) – 3,685 citations
- Hoyle, CE and Bowman, CN (2010) – 2,840 citations
- Kolb, HC and Sharpless, KB (2003) – 2,724 citations

Micro-level field: 1149. No. of pub. (2000-2022): 11,084. Main field: Physical sciences and engineering.
Journals: chemical communications; journal of the american chemical society; angewandte chemie-international edition; tetrahedron letters; organic letters.
Terms: triazole; click chemistry; click reaction; triazole derivative; azide alkyne cycloaddition.

What can we say about these articles?
- They discovered a new chemical technique and defined it as ‘click chemistry’ (Sharpless’ paper defined it)
- Both are very highly cited (somewhat contradictory to Merton’s claim)
- They seemingly defined their own micro-cluster (CWTS; WoS online)
- A ‘specialty’ has grown around it, but utilization goes beyond it, as the third laureate, Bertozzi, testifies

But are these papers ‘novel’, ‘breakthroughs’ or ‘disruptive’, and in relation to what?
- From an epistemic point of view, we would argue that they are surely ‘novel’
- There is a contradiction between ‘novelty’ and ‘reproducibility’ as epistemic criteria
- They are also ‘breakthroughs’ or ‘innovations’ with important consequences for subsequent knowledge production

Epistemic change

Epistemic change … some considerations
- We need to start with the concepts, then the measurement
- Until now, we have been doing ‘measurement by fiat’ (A. V. Cicourel), where we impose definitions and metrics on complex phenomena without adequately considering their complexity or the theoretical frameworks that give them meaning
- ‘Severe testing’ or ‘falsifiability’: if x is supposed to measure y, then it cannot do so haphazardly (which seems to be the case for many ‘sophisticated’ indicators)
- An indicator is a model, and a model comes with assumptions
- ‘All models are wrong, but some are useful’ – explain why they would be useful, where, and for whom
- Validate and calibrate the measurement
- Quality data over Big Data; the latter requires much more effort

Epistemic change … some considerations
- Social systems in general, and the science system in particular, are non-ergodic
- When we model publication and citing behavior, we are modeling social acts
- Modelling natural language in stylized scholarly texts situated in discourse communities is genre analysis, and scholarly writing is also a social act
- Titles and abstracts are very ‘tricky’, as language here is especially ‘framed’
- Manipulating such variables and calling them ‘interdisciplinarity’, ‘novelty’ or ‘disruption’ assumes that the act of writing, referencing or later citing patterns can reveal cognitive or epistemic properties of papers
- Citing behavior is notorious – a mix of rhetorical, self-serving, convenient, proper and relevant practices, disturbed by errors, restrictions and indexing preferences = very noisy!
- Designating them as proxies is no excuse for not dealing with the claimed concepts and their potential measurement – anything else is not scientific!

Epistemic change … some considerations
- Publications are packages of different types and combinations of knowledge – not knowledge in its purest form
- Knowledge production → publications (knowledge types and claims) → knowledge
- Scientometrics measures social organization and effort, but growth in effort or production does not necessarily imply growth in knowledge (measuring knowledge = epistemetrics)

Epistemic change … some considerations Fanelli, D. (2019). A theory and methodology to quantify knowledge. Royal Society Open Science, 6(4), 181055. doi:10.1098/rsos.181055 Note, not individual papers, but ‘knowledge systems’

Epistemic change … some considerations
- Epistemic change ≠ scientific progress (different theories)
- Publications are packages of different types and combinations of knowledge – not knowledge in its purest form
- The cult of the single paper: many papers make empirical claims, but at the end of the day, facts are more important than ‘novelty’

Are we not just mining noisy citing patterns? Can properties of essentially social objects correspond to epistemic qualities, whether embedded in a work or subsequently determined by peers’ perception and use?