Owen Price - Excel - Solving a challenge step by step.pdf
LawrenceYalongi
23 views
17 slides
Jun 10, 2024
Slide 1 of 17
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
About This Presentation
Cleaning data in excel
Size: 505.02 KB
Language: en
Added: Jun 10, 2024
Slides: 17 pages
Slide Content
flexyourdata.com |youtube.com/@flexyourdata |linkedin.com/in/owenhprice
Step by Step:
Solving an Excel
data cleaning
challenge
flexyourdata.com |youtube.com/@flexyourdata |linkedin.com/in/owenhprice
Omid Motamedisedehposed this challenge
on Jun 72024
flexyourdata.com |youtube.com/@flexyourdata |linkedin.com/in/owenhprice
Typicallythese challenges invite some very
advanced use of Excel functions and formulas
flexyourdata.com |youtube.com/@flexyourdata |linkedin.com/in/owenhprice
These formulas are probably in the top 10% of
Excel users. But don’t be put off!
The good news is, it doesn’t have to be all or
nothing.
Some familiarity with newer functions can go a
long way –just use the space on the
worksheet to keep track of the steps!
The following pages are the step-by-step
solution that occurred to me as I worked
through.
flexyourdata.com | youtube.com/@flexyourdata | linkedin.com/in/owenhprice
Step 1 – split the rows on comma
Formula: =TEXTSPLIT(B3,", ") dragged down
It’s OK to have one
formula per row!
flexyourdata.com | youtube.com/@flexyourdata | linkedin.com/in/owenhprice
Step 2 – convert the dates stored as text to
date values
Formula: =DATEVALUE(I3:I9)
Using TEXTSPLIT means these
dates are text. We need them
to be Excel ‘date values’!
flexyourdata.com | youtube.com/@flexyourdata | linkedin.com/in/owenhprice
Step 3 – count the text cells on each row
Formula: =BYROW(J3:L9,COUNTA)
This formula might look daunting! It’s nothing to worry about! It’s
taking each row from the range and using the COUNTA function
on it. If you’re not using the Insiders Beta channel of Excel 365,
you would write it like this:
=BYROW(J3:L9,LAMBDA(row, COUNTA(row)))
flexyourdata.com | youtube.com/@flexyourdata | linkedin.com/in/owenhprice
Step 4 – repeat each date that number of
times, separated by a semi-colon
Formula: =REPT(";"&M3#,O3#)
The reason for doing this may not be
immediately clear. We are preparing
the dates to be expanded onto
separate rows in a later step
flexyourdata.com | youtube.com/@flexyourdata | linkedin.com/in/owenhprice
Step 5 – join all the dates together, separated
by semi-colons
Formula: =TEXTJOIN(";",TRUE,Q3:Q9)
Now we join all the dates from all the
rows into a single text. They’re
separated by semi-colons, which is
the delimiter we’ll use in the
TEXTSPLIT function in the next step!
flexyourdata.com | youtube.com/@flexyourdata | linkedin.com/in/owenhprice
Step 6 – split step 5 on semi-colon
Formula: =TEXTSPLIT(S3,,";",TRUE)
Now we have enough duplicates of
each date to align with the text that
was next to them in the beginning!
Two dates for two
pieces of text!
flexyourdata.com | youtube.com/@flexyourdata | linkedin.com/in/owenhprice
Step 7 – join all the non-date data, separated
by semi-colons
Formula: =TEXTJOIN(";",TRUE,J3:L9)
We follow a similar process for the
non-date data, only we don’t have to
use REPT to duplicate anything.
flexyourdata.com | youtube.com/@flexyourdata | linkedin.com/in/owenhprice
Step 8 – split the non-date data on semi-colon
Formula: =TEXTSPLIT(W3,,";",TRUE)
The non-date data are now on the
same rows as the dates! ☺
flexyourdata.com | youtube.com/@flexyourdata | linkedin.com/in/owenhprice
Step 9 – get the text before the space in the
non-date data.
Formula: =TEXTBEFORE(Y3#," ",,,,Y3#)
If there’s no space,
this last argument
just returns the
whole text
flexyourdata.com | youtube.com/@flexyourdata | linkedin.com/in/owenhprice
Step 10 – get the text after the space in the
non-date data.
Formula: =TEXTAFTER(Y3#," ",,,,1)
Return a 1 if
there’s no space
flexyourdata.com | youtube.com/@flexyourdata | linkedin.com/in/owenhprice
Step 11 – Join steps 6, 9 and 10 together. Use
VALUE to make sure the dates and the
numbers are in the right format.
Formula:
=HSTACK(VALUE(U3#),AA3#,VALUE(AC3#))
These are numbers stored as text. Using VALUE converts
them into the dates and numbers we ned
flexyourdata.com | youtube.com/@flexyourdata | linkedin.com/in/owenhprice
Check – Finally, check that the result is as
expected.
Formula: =AE3#=D3:F14
=Range1=Range2 is the simplest
way to check if the contents of
two ranges are the same.
flexyourdata.com | youtube.com/@flexyourdata | linkedin.com/in/owenhprice
Takeaways:
1.Solutions to Excel data cleaning
needn’t be complex
2.Solving step by step is a great
way to keep track of what’s
happening
3.There’s plenty of space!