SQL Portfolio - Assignments table in publicdb.pptx
jmcorao
8 views
5 slides
Aug 31, 2025
Slide 1 of 5
1
2
3
4
5
About This Presentation
Analyzing a datacamp database
Size: 550.85 KB
Language: en
Added: Aug 31, 2025
Slides: 5 pages
Slide Content
THE ASSIGNMENTS TABLE – from the dbpublic in Projects Data
Total records : SELECT COUNT(DISTINCT assignment_id ) AS nb_assignments , COUNT(*) AS total_records FROM assignments; Unique start and end : SELECT COUNT (DISTINCT start_date ) AS nb_start , COUNT (DISTINCT end_date ) AS count_endsFROM public.assignments ; Top Budget and imp_score : SELECT public.assignments.assignment_id , public.assignments.impact_score , region FROM public.assignments ORDER BY public.assignments.impact_score DESC LIMIT 5; MAX & MIN: SELECT MAX( public.assignments.budget ) max_bud , MIN( public.assignments.budget ) min_bud , MAX( public.assignments.impact_score ) max_imp_score , MIN( public.assignments.budget ) min_imp_score FROM public.assignments -No missing values BUT dates col are a mess, are negative budgets normal? -5000 records / 5000 unique assignments (total) -1974 unique start_date and 2315 unique end_date -Top 3 budget => 2750 (99 994.60$) / 2837 (99 992.35$) / 3364 (99 989.67$) -Top 5 imp_score => 10 (3547/316/664), 9.99 (14/1514) - M&M: negative budgets ok? Impact score seems ok. Data Cleaning + Overall info
ANALYZING BY REGION Assignments , budget, avg_imp_score by region SELECT region, COUNT( assignment_id ) AS nb_assignments , ROUND(AVG (budget),1) AS avg_budget , ROUND(AVG ( impact_score ),2) AS avg_impact_score FROM assignments GROUP BY region ORDER BY avg_impact_score DESC; Top 3 assignments budget/ imp_score by region SELECT region, public.assignments.impact_score , public.assignments.assignment_id FROM public.assignments WHERE region = 'West’ ORDER BY public.assignments.impact_score DESC LIMIT 3; -South has the biggest number of assignments : 1276 -South has the biggest avg budget of 25 698,8$ -West has the biggest average impact_score with 5.50 -Top 3 assignments budget by region (for South and north) South (975/150/1126) // North (2750/2837/1589) -Top 3 assignments imp_score by region (for East and West) East (316/1514/3624) // West (3764/2794/1477)
ANALYZING BY DATE Assignments KPI by year SELECT -- Extract the year from the start_date and name it as year EXTRACT(YEAR FROM CASE -- Case 1: If format is YYYY-MM-DD (e.g., '2021-10-17’) WHEN start_date ~ '^\d{4}-\d{2}-\d{2}$' THEN TO_DATE( start_date , 'YYYY-MM-DD’) -- Case 2: If format is DD/MM/YYYY (e.g., '26/10/2020’) WHEN start_date ~ '^\d{2}/\d{2}/\d{4}$' THEN TO_DATE( start_date , 'DD/MM/YYYY’) -- Case 3: If format is MM-DD-YYYY (e.g., '11-22-2020’) WHEN start_date ~ '^\d{2}-\d{2}-\d{4}$' THEN TO_DATE( start_date , 'MM-DD-YYYY’) ELSE NULL -- Handle any other cases END) AS start_year , -- Count the number of assignments in each year COUNT( assignment_id ) AS nb_assign_year , -- Calculate the average impact score ROUND(AVG( impact_score ),2) AS avg_imp_score_year , -- Calculate the average budget ROUND(AVG(budget),1) AS avg_budget_year FROM assignments GROUP BY start_year ORDER BY start_year ; -2020 : 2497 assignments started with an avg bud of 24847.8$ and an imp_score of 5.47 1171 assignments ended with an avg bud of 25771.7$ and an imp_score of 5.50 -2021 : 2503 assignments started with an avg bud of 23574.3$ and an imp_score of 5.42 2491 assignments ended with an avg bud of 23962.1$ and an imp_score of 5.38 -2022 : No assignments started 1338 assignments ended with an avg bud of 23305.8$ and an imp_score of 5.49 -Top 3 assign budget for 2020 in start_date : 2750/975/620 -Top 3 assign imp_score for 2022 in end_date : 664/316/1514
ANALYZING BY DATE Part II Top 3 assignments budget/ imp.sco for year YYYY WITH assignments_with_year AS ( SELECT -- Extract the year from the start_date and name it as year EXTRACT(YEAR FROM CASE -- Case 1: If format is YYYY-MM-DD (e.g., '2021-10-17’) WHEN end_date ~ '^\d{4}-\d{2}-\d{2}$' THEN TO_DATE( end_date , 'YYYY-MM-DD’) -- Case 2: If format is DD/MM/YYYY (e.g., '26/10/2020’) WHEN end_date ~ '^\d{2}/\d{2}/\d{4}$' THEN TO_DATE( end_date , 'DD/MM/YYYY’) -- Case 3: If format is MM-DD-YYYY (e.g., '11-22-2020’) WHEN end_date ~ '^\d{2}-\d{2}-\d{4}$' THEN TO_DATE( end_date , 'MM-DD-YYYY’) ELSE NULL -- Handle any other cases END) AS end_year , assignment_id , impact_score FROM assignments ) SELECT assignment_id , impact_score FROM assignments_with_year WHERE end_year = 2022 ORDER BY impact_score DESC LIMIT 3; -Top 3 assign imp_score for 2022 in end_date : 664/316/1514