"In Silico" Research: Software Engineering to the Rescue

serge_demeyerUA 6 views 38 slides Oct 02, 2024

About This Presentation

In the last decade, “In Silico” research has become a standard tool in the arsenal of scientific research. It complements the traditional “in vitro” and “in vivo” research, by careful construction of a simulation model for the phenomenon under investigation and executing this model on hi...


Slide Content

"In Silico" Research:
Software Engineering to the Rescue:
May 2024
Prof. Serge Demeyer
AnSyMo

"HIS WEEK
A dolphin's
demise
SCIENTIFIC PUBLISHING
A Scientist's Nightmare: Software Problem Leads to Five Retractions
Until recently, Geoffrey Chang's career was on a trajectory most young scientists only dream about. In 1999, at the age of 28, the protein crystallographer landed a faculty position at the prestigious Scripps Research Institute in San Diego, California. The next year, in a ceremony at the White House, Chang received a Presidential Early Career Award for Scientists and Engineers, the country's highest honor for young researchers. His lab generated a stream of high-profile papers detailing the molecular structures of important proteins embedded in cell membranes.
Then the dream turned into a nightmare. In September, Swiss researchers published a paper in Nature that cast serious doubt on a protein structure Chang's group had described in a 2001 Science paper. When he investigated, Chang was horrified to discover that a homemade data-analysis program had flipped two columns of data, inverting the electron-density map from which his team had derived the final protein structure. Unfortunately, his group had used the program to analyze data for other proteins. As a result, on page 1875, Chang and his colleagues retract three Science papers and report that two papers in other journals also contain erroneous structures.
"I've been devastated," Chang says. "I hope people will understand that it was a mistake, and I'm very sorry for it." Other researchers don't doubt that the error was unintentional, and although some say it has cost them time and effort, many praise Chang for setting the record straight promptly and forthrightly. "I'm very pleased he's done this because there has been some confusion" about the original structures, says Christopher Higgins, a biochemist at Imperial College London. "Now the field can really move forward."
The most influential of Chang's retracted publications, other researchers say, was the 2001 Science paper, which described the structure of a protein called MsbA, isolated from the bacterium Escherichia coli. MsbA belongs to a huge and ancient family of molecules that use energy from adenosine triphosphate to transport molecules across cell membranes. These so-called ABC transporters perform many essential biological duties and are of great clinical interest because of their roles in drug resistance. Some pump antibiotics out of bacterial cells, for example; others clear chemotherapy drugs from cancer cells. Chang's MsbA structure was the first molecular portrait of an entire ABC transporter, and many researchers saw it as a major contribution toward figuring out how these crucial proteins do their jobs. That paper alone has been cited by 364 publications, according to Google Scholar.
Two subsequent papers, both now being retracted, describe the structure of MsbA from other bacteria, Vibrio cholera (published in Molecular Biology in 2003) and Salmonella typhimurium (published in Science in 2005). The other retractions, a 2004 paper in the Proceedings of the National Academy of Sciences and a 2005 Science paper, described EmrE, a different type of transporter protein.
Crystallizing and obtaining structures of five membrane proteins in just over 5 years was an incredible feat, says Chang's former postdoc adviser Douglas Rees of the California Institute of Technology in Pasadena. Such proteins are a challenge for crystallographers because they are large, unwieldy, and notoriously difficult to coax into the crystals needed for x-ray crystallography. Rees says determination was at the root of Chang's success: "He has an incredible drive and work ethic. He really pushed the field in the sense of getting things to crystallize that no one else had been able to do." Chang's data are good, Rees says, but the faulty software threw everything off.
Ironically, another former postdoc in Rees's lab, Kaspar Locher, exposed the mistake. In the 14 September issue of Nature, Locher, now at the Swiss Federal Institute of Technology in Zurich, described the structure of an ABC transporter called Sav1866 from Staphylococcus aureus. The structure was dramatically, and unexpectedly, different from that of MsbA. After pulling up Sav1866 and Chang's MsbA from S. typhimurium on a computer screen, Locher says he realized in minutes that the MsbA structure was inverted. Interpreting the "hand" of a molecule is always a challenge for crystallographers, Locher notes, and many mistakes can lead to an incorrect mirror-image structure. Getting the wrong hand is "in the category of monumental blunders," Locher says.
On reading the Nature paper, Chang quickly traced the mix-up back to the analysis program, which he says he inherited from another lab. Locher suspects that Chang would have caught the mistake if he'd taken more time to obtain a higher resolution structure. "I think he was under immense pressure to get the first structure, and that's what made him push the limits of his data," he says. Others suggest that Chang might have caught the problem if he'd paid closer attention to biochemical findings that didn't jibe well with the MsbA structure. "When the first structure came out, we and others said, 'We really […]
[Figure caption] Flipping fiasco. The structures of MsbA (purple) and Sav1866 (green) overlap little (left) until MsbA is inverted (right).
3
1856 22 DECEMBER 2006 VOL 314 SCIENCE www.sciencemag.org
http://www.jstor.org/stable/20035062
[…] in a ceremony at the White House, Chang received a Presidential Early Career
Award for Scientists and Engineers, the country’s highest honor for young
researchers. His lab generated a stream of high-profile papers detailing the
molecular structures of important proteins embedded in cell membranes.
[…] Swiss researchers published a paper in Nature that cast serious doubt on a
protein structure Chang’s group had described in a 2001 Science paper.
[…] Chang was horrified to discover that a homemade data-analysis program had
flipped two columns of data, inverting the electron-density map from which his team
had derived the final protein structure.
Dr. Chang had to withdraw five high-profile, widely cited papers.
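To make the failure mode concrete, here is a minimal sketch of how a column mix-up can slip through a script that reads data by position, and how reading by header name guards against it. It is an illustration only, not a reconstruction of the program described in the article; the column names `x`, `y`, `value` and both reader functions are assumptions made up for this example.

```python
import csv
import io

REQUIRED = ("x", "y", "value")  # hypothetical column names, not from the article


def read_by_name(text):
    """Parse CSV text into (x, y, value) tuples, looked up by header name."""
    reader = csv.DictReader(io.StringIO(text))
    missing = [name for name in REQUIRED if name not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    return [tuple(float(row[name]) for name in REQUIRED) for row in reader]


def read_by_position(text):
    """Fragile variant: trusts that the columns always arrive in the same order."""
    rows = list(csv.reader(io.StringIO(text)))[1:]  # skip the header row
    return [tuple(float(cell) for cell in row) for row in rows]


ORIGINAL = "x,y,value\n1,2,10\n3,4,20\n"
SWAPPED = "y,x,value\n2,1,10\n4,3,20\n"  # same data, first two columns exchanged

# Reading by name is insensitive to the column order ...
same_by_name = read_by_name(ORIGINAL) == read_by_name(SWAPPED)
# ... while reading by position silently swaps x and y.
same_by_position = read_by_position(ORIGINAL) == read_by_position(SWAPPED)
```

The positional reader raises no error at all on the swapped file, which is the kind of silent defect that only an explicit check or a test with known expected output would reveal.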

Risk Analysis - RV Belgica
Author: Gustavo Carro

Case Study
Measuring Campaign - RV Belgica
Give ship information:
Duration of measuring campaign
Route of the Ship… etc!!

Case Study
Measuring Campaign - RV Belgica

Risk Assessment

"In Silico" Research: Software Engineering to the Rescue © Serge Demeyer
“In Silico” Research — Jupyter Notebooks
7

"In Silico" Research:
Software Engineering to the Rescue:
May 2024
Prof. Serge Demeyer
AnSyMo

"In Silico" Research: Software Engineering to the Rescue © Serge Demeyer
Data Processing Software
9

"In Silico" Research: Software Engineering to the Rescue © Serge Demeyer
Q&A Sites
10
https://superuser.com/?tags=microsoft-excel
https://discourse.jupyter.org/
https://community.rstudio.com/
https://discuss.python.org/
https://www.mathworks.com/matlabcentral/

"In Silico" Research: Software Engineering to the Rescue © Serge Demeyer
StackOverflow
11
https://programming.guide/java/formatting-byte-size-to-human-readable-format.html

© calebmarcelo — https://www.deviantart.com/

"In Silico" Research: Software Engineering to the Rescue © Serge Demeyer
Threats to Validity
13
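[Figure: experiment principles diagram. Theory level: a cause construct and an effect construct, linked by the cause-effect construct. Observation level: a treatment (independent variable) and an outcome (dependent variable), linked by the treatment-outcome construct. Further labels: experiment objective, experiment operation. The numbers 1 to 4 in the figure mark where each validity type listed below applies.]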
•1. Conclusion validity
•2. Internal validity
•3. Construct validity
•4. External validity

"In Silico" Research: Software Engineering to the Rescue © Serge Demeyer
Threats to Validity — Instrument Validity
14
[Figure: the experiment principles diagram from the previous slide, repeated.]
•1. Conclusion validity
•2. Internal validity
•3. Construct validity
•4. External validity
Reliability
•To what extent is the data and the analysis dependent on the researcher?
Instrument Validity
•To what extent does the instrument measure what it is designed to measure?

"In Silico" Research:
Software Engineering to the Rescue:
May 2024
Prof. Serge Demeyer
AnSyMo

"In Silico" Research: Software Engineering to the Rescue © Serge Demeyer
Software Engineering
16
Models
Quality
Tools
Process

Wordcloud via - https://www.wordclouds.com/

"In Silico" Research: Software Engineering to the Rescue © Serge Demeyer
Reproducibility Issues
18
In a nutshell, we found that only 31% of the tools and 22% of the models used as experimental subjects are accessible. […] We found none of the experimental results presented in these papers to be fully replicable, and 6% partially replicable.
We find that 74% of R files failed to complete without error in the initial execution, while 56% failed when code cleaning was applied […]
We were able to successfully run 24.11% of the unambiguous execution order Python notebooks. […] However, the rate is way smaller (4.03%) when we count only notebooks that produce the same results.

"In Silico" Research: Software Engineering to the Rescue © Serge Demeyer
Mission: Long Term Reproducibility
19
You are a data scientist involved in an “In Silico” project. You published your data set(s) and the scripts to process them in a reproduction package on figshare.
Your mission (should you choose to accept it) is to guarantee that these scripts will still produce the same results 5 to 10 years from now.
[This message will self-destruct in 5 minutes]
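One possible way to make "the same results" checkable is to store a fingerprint of the expected output next to the published data, so that a later rerun can be compared against it. The sketch below is an assumption-laden illustration: `process`, `publication_record` and `still_reproduces` are hypothetical names, and the processing step is a toy stand-in for a real script.

```python
import hashlib
import json
import platform


def fingerprint(results):
    """Return a SHA-256 digest of a JSON-serialisable result structure."""
    # Sorted keys and fixed separators give a canonical text form,
    # so the digest does not depend on dictionary ordering.
    canonical = json.dumps(results, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def process(measurements):
    """Hypothetical processing script: summarise a list of measurements."""
    return {
        "count": len(measurements),
        "maximum": max(measurements),
        # Rounding keeps tiny floating-point differences out of the digest.
        "mean": round(sum(measurements) / len(measurements), 6),
    }


def publication_record(measurements):
    """What one might store in the reproduction package at publication time."""
    return {
        "python": platform.python_version(),
        "digest": fingerprint(process(measurements)),
    }


def still_reproduces(measurements, record):
    """Rerun the processing and compare against the recorded digest."""
    return fingerprint(process(measurements)) == record["digest"]


record = publication_record([3, 1, 4, 1, 5])
```

A rerun years later would call `still_reproduces` with the archived data and the archived record; a mismatch signals that the scripts, the interpreter, or a dependency changed the outcome. Recording the interpreter version alongside the digest gives a starting point for diagnosing such a mismatch.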

"In Silico" Research: Software Engineering to the Rescue © Serge Demeyer
Smoke Test
20

"In Silico" Research: Software Engineering to the Rescue © Serge Demeyer
Testing Terminology
21
https://glossary.istqb.org
Smoke Test (a.k.a. sanity test, intake test, confidence test)
•A test suite that covers the main functionality of a component or system to determine whether it works properly before planned testing begins.
Regression Testing
•A type of change-related testing to detect whether defects have been introduced or uncovered in unchanged areas of the software.
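The difference between the two terms can be sketched for a data-processing script as follows. The function `summarise` and the recorded reference values are assumptions invented for this illustration: the smoke test only checks that the main path runs and yields sensible output, while the regression test compares against a result recorded from an earlier, trusted run.

```python
import math
import statistics


def summarise(samples):
    """Hypothetical analysis step standing in for a real processing script."""
    return {
        "n": len(samples),
        "mean": statistics.fmean(samples),
        "spread": statistics.pstdev(samples),
    }


# Reference input and output recorded from an earlier run (illustrative values).
REFERENCE_INPUT = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
REFERENCE_OUTPUT = {"n": 8, "mean": 5.0, "spread": 2.0}


def smoke_test():
    """Does the main functionality run at all and return plausible output?"""
    result = summarise([1.0, 2.0, 3.0])
    return set(result) == {"n", "mean", "spread"} and all(
        math.isfinite(value) for value in result.values()
    )


def regression_test():
    """Does an unchanged input still give the previously recorded output?"""
    result = summarise(REFERENCE_INPUT)
    return all(
        math.isclose(result[key], expected, rel_tol=1e-9)
        for key, expected in REFERENCE_OUTPUT.items()
    )
```

The smoke test is cheap enough to run before every analysis session; the regression test is the one to rerun after any change to the scripts or their dependencies.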

"In Silico" Research: Software Engineering to the Rescue © Serge Demeyer
DevOps — Continuous Testing
22

"In Silico" Research: Software Engineering to the Rescue © Serge Demeyer
Continuous Integration Pipeline
23
<<Breaking the Build>>
version control -> build -> developer tests -> deploy -> scenario tests -> deploy to production -> measure & validate

"In Silico" Research: Software Engineering to the Rescue © Serge Demeyer
version control -> build -> developer tests -> deploy -> scenario tests -> deploy to production -> measure & validate
Program the Workflow
24
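"Programming the workflow" can be read as turning the pipeline stages into code that runs them in a fixed order and stops at the first failure (the "breaking the build" moment). The following is a minimal sketch under that reading; `run_pipeline` and the three example stages are hypothetical and stand in for whatever a real continuous integration service would execute.

```python
def run_pipeline(stages):
    """Run (name, callable) stages in order and stop at the first failure."""
    log = []
    for name, stage in stages:
        try:
            stage()
        except Exception as error:
            # The build is broken: record why and skip all later stages.
            log.append((name, f"FAILED: {error}"))
            return False, log
        log.append((name, "ok"))
    return True, log


def build():
    """Hypothetical build stage; succeeds without doing anything."""


def developer_tests():
    """Hypothetical failing stage, to show how the build breaks."""
    raise RuntimeError("unit test testDeposit failed")


def scenario_tests():
    """Hypothetical later stage; must not run after a failure."""


succeeded, log = run_pipeline(
    [
        ("build", build),
        ("developer tests", developer_tests),
        ("scenario tests", scenario_tests),
    ]
)
```

Because the developer tests fail, the scenario tests never run and `log` ends with the failing stage, which is the feedback a developer needs to repair the build.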

"In Silico" Research: Software Engineering to the Rescue © Serge Demeyer
version control -> build -> developer tests -> deploy -> scenario tests -> deploy to production -> measure & validate
Testing Frameworks
25
PyUnit

"In Silico" Research: Software Engineering to the Rescue © Serge Demeyer
Example - testDeposit
26
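def testDeposit(self):
    # Input: set the owner and make a first deposit.
    self.b.set_owner('Iwena Kroka')
    self.b.deposit(10)
    # Expected output: balance and owner reflect the input.
    self.assertEqual(self.b.get_balance(), 10)
    self.assertEqual(self.b.owner, 'Iwena Kroka')
    # Input: two more deposits.
    self.b.deposit(100)
    self.b.deposit(100)
    # Expected output: the balance is the sum of all deposits.
    self.assertEqual(self.b.get_balance(), 210)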
Input
Expected output

"In Silico" Research: Software Engineering to the Rescue © Serge Demeyer
Unit Testing
Stimuli
self.b.set_owner('Iwena Kroka')
self.b.deposit(10)
self.b.deposit(100)
self.b.deposit(100)
Unit Under Test
Bank
Verification
self.assertEqual(self.b.get_balance(), 10)
self.assertEqual(self.b.get_balance(), 210)
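For readers who want to run the example, the sketch below wraps the test method in a complete PyUnit (`unittest`) test case. The slides do not show the `Bank` class, so the implementation here is a guess made purely for illustration; `run_bank_tests` is likewise a hypothetical helper.

```python
import unittest


class Bank:
    """Hypothetical unit under test; the slides do not show its implementation."""

    def __init__(self):
        self.owner = None
        self._balance = 0

    def set_owner(self, name):
        self.owner = name

    def deposit(self, amount):
        if amount <= 0:
            raise ValueError("deposit must be positive")
        self._balance += amount

    def get_balance(self):
        return self._balance


class BankTest(unittest.TestCase):
    def setUp(self):
        # A fresh unit under test for every test method.
        self.b = Bank()

    def testDeposit(self):
        self.b.set_owner("Iwena Kroka")  # stimuli
        self.b.deposit(10)
        self.assertEqual(self.b.get_balance(), 10)  # verification
        self.assertEqual(self.b.owner, "Iwena Kroka")
        self.b.deposit(100)  # stimuli
        self.b.deposit(100)
        self.assertEqual(self.b.get_balance(), 210)  # verification

    def testRejectsNegativeDeposit(self):
        # Verification of an error path: the stimulus must raise.
        with self.assertRaises(ValueError):
            self.b.deposit(-5)


def run_bank_tests():
    """Load and run the test case, returning the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(BankTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling `run_bank_tests()` runs both test methods; `setUp` guarantees that the balance asserted in one test is not affected by deposits made in another.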

"In Silico" Research: Software Engineering to the Rescue © Serge Demeyer
Code Coverage
28
© https://pytest-with-eric.com/pytest-best-practices/pytest-code-coverage-reports/
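The idea behind a coverage report can be sketched with the standard library alone: record which lines of a function execute while the tests run, then compare against the lines that exist. Real projects would normally rely on a dedicated coverage tool; the helper names below (`classify`, `executed_lines`, `coverage_percent`) are assumptions for this illustration.

```python
import sys


def classify(reading):
    if reading < 0:
        return "invalid"
    if reading > 100:
        return "high"
    return "normal"


# Line offsets of classify's body, counted from its def line.
BODY_LINES = {1, 2, 3, 4, 5}


def executed_lines(func, *args):
    """Run func(*args) and return the body-line offsets that were executed."""
    code = func.__code__
    hit = set()

    def tracer(frame, event, arg):
        if frame.f_code is not code:
            return None  # do not trace other functions
        if event == "line":
            hit.add(frame.f_lineno - code.co_firstlineno)
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        func(*args)
    finally:
        sys.settrace(previous)  # restore whatever tracer was active before
    return hit & BODY_LINES


def coverage_percent(inputs):
    """Line coverage of classify, in percent, for a list of test inputs."""
    covered = set()
    for value in inputs:
        covered |= executed_lines(classify, value)
    return 100 * len(covered) / len(BODY_LINES)
```

With a single ordinary input such as `50`, only three of the five body lines execute; adding a negative and an out-of-range input exercises the remaining two branches.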

"In Silico" Research: Software Engineering to the Rescue © Serge Demeyer
version control -> build -> developer tests -> deploy -> scenario tests -> deploy to production -> measure & validate
Testing Frameworks
29
PyUnit
Robot Framework

"In Silico" Research: Software Engineering to the Rescue © Serge Demeyer
Scenario Test (Via Browser)
30
*** Settings ***
Documentation  Login Functionality
Library  SeleniumLibrary
*** Variables ***
*** Test Cases ***
Verify Successful Login to OrangeHRM
    [Documentation]  This test case verifies that a user can successfully log in to OrangeHRM
    [Tags]  Smoke
    Open Browser  https://opensource-demo.orangehrmlive.com/  Chrome
    Wait Until Element Is Visible   id:txtUsername  timeout=5
    Input Text   id:txtUsername  Admin
    Input Password   id:txtPassword  admin123
    Click Element   id:btnLogin
    Element Should Be Visible   id:welcome  timeout=5
    Close Browser
MacBook-Air ~ % robot -d results Tests/Login.robot
================================================================================
Login:: Login Functionality
Verify Successful Login to OrangeHRM :: This test case verifies th... | PASS |
Login:: Login Functionality
| PASS |
1 tests, 1 passed, 0 failed
================================================================================

"In Silico" Research: Software Engineering to the Rescue © Serge Demeyer
Scenario Testing
Stimuli
Open Browser  https://opensource-demo.orangehrmlive.com/  Chrome
Input Text   id:txtUsername  Admin
Input Password   id:txtPassword  admin123
Click Element   id:btnLogin
System Under Test
Login
Verification
Wait Until Element Is Visible   id:txtUsername  timeout=5
Element Should Be Visible   id:welcome  timeout=5

"In Silico" Research: Software Engineering to the Rescue © Serge Demeyer
Coverage Report
32
© https://docs.robotframework.org/docs/reporting_alternatives

"In Silico" Research: Software Engineering to the Rescue © Serge Demeyer
Testing The Quality Assurance
33
© Brussels Airlines

"In Silico" Research: Software Engineering to the Rescue © Serge Demeyer
Inject Synthetic Items
34
© Neelanjan Bhowmik, Qian Wang, Yona Falinie A. Gaus, Marcin Szarek, Toby P. Breckon, "The Good, the Bad and the Ugly: Evaluating Convolutional Neural Networks for Prohibited Item Detection Using Real and Synthetically Composite X-ray Imagery"

"In Silico" Research Driving the Future © Serge Demeyer
Fault Injection
35
https://glossary.istqb.org
Fault Injection
•The process of intentionally adding a defect to a component or system to determine whether it can detect and possibly recover from it.
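Applied to the quality assurance itself, fault injection asks whether the test suite notices a deliberately planted defect. The sketch below shows that idea in its simplest form; the function `mean`, the planted defect and both test suites are assumptions chosen for illustration.

```python
def mean(values):
    """Original implementation under test."""
    return sum(values) / len(values)


def faulty_mean(values):
    """Same function with an injected defect: an off-by-one in the divisor."""
    return sum(values) / (len(values) + 1)


def weak_suite(fn):
    """Only checks that a number comes back, not that it is the right one."""
    return isinstance(fn([1.0, 2.0, 3.0]), float)


def strong_suite(fn):
    """Checks the actual values against known expected outputs."""
    return fn([1.0, 2.0, 3.0]) == 2.0 and fn([10.0]) == 10.0


def detects_fault(suite, original, mutant):
    """A suite detects the fault if it passes on the original and fails on the mutant."""
    return suite(original) and not suite(mutant)


weak_detects = detects_fault(weak_suite, mean, faulty_mean)
strong_detects = detects_fault(strong_suite, mean, faulty_mean)
```

Both suites pass on the original code, so neither looks worse in a normal test run; only the injected fault reveals that the weak suite would let a wrong result through.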

Wordcloud via - https://www.wordclouds.com/

"In Silico" Research: Software Engineering to the Rescue © Serge Demeyer
Take Away
37
The Bare Minimum
•Frequent smoke tests (manual? automatic?)
Efficient
•Automate regression tests (Unit + Scenario Tests)
Effective
•Monitor code coverage
•Inject faults

"In Silico" Research: Software Engineering to the Rescue © Serge Demeyer
version control -> build -> developer tests -> deploy -> scenario tests -> deploy to production -> measure & validate
BONUS: Code Reviewing
38
Create Pull Request -> Review -> Merge into Main Code