Improving Organizational Decision Making Using a SAF-T based Business Intelligence System

BrunoOliveira631137 10 views 20 slides Apr 24, 2024

About This Presentation

SAFT based Data Warehouse


Slide Content

Improving Organizational Decision Making Using a SAF-T based Business Intelligence System
Bruno Oliveira ([email protected]), Mariana Carvalho, Rosa Silveira, Telmo Matos
CIICESI, School of Management and Technology, Porto Polytechnic
October 14-17, 2020, 20ª Conferência da Associação Portuguesa de Sistemas de Informação (20th Conference of the Portuguese Association for Information Systems)

Outline
Introduction
The Portuguese Standard Audit File for Tax Purposes (SAF-T (PT))
A Business Intelligence System for SAF-T (PT)
Case study
Data Exploration using Data Mining: The Recency, Frequency and Monetary (RFM) analysis for Customer Segmentation
Outcomes Evaluation
Conclusions and Future Work

Introduction
Today, analytical systems are an important asset that every company should have and use;
The Portuguese Tax Authority (PTA) requires all companies with organized accounting to submit the Standard Audit File for Tax Purposes (SAF-T (PT)) every month for validation;
SAF-T documents contain valuable data extracted from the companies' operational systems;
Despite having these data, several companies lack mechanisms to derive deeper knowledge from the data they produce.

The SAF-T (PT) document
The SAF-T (PT) is an XML (Extensible Markup Language) vocabulary created to standardize and promote the use of data for business control and inspection;
These documents collect all the tax-relevant data of a company, making a set of accounting records interoperable;
The SAF-T (PT) file supports accounting and/or billing applications, with specific schema rules applied to each one;
The billing SAF-T (PT) document is generated on a regular basis (every month) to communicate the companies' monthly invoicing.

The SAF-T (PT) document
[Figure: XML SAF-T (PT) file structure excerpt]
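Since the original excerpt figure did not survive extraction, the following is a minimal sketch of how billing lines might be read from such a file. The XML sample is a heavily simplified, hypothetical fragment: the element names are modeled on the public SAF-T (PT) schema as an assumption, the XML namespace that real files declare is omitted, and `extract_invoice_lines` is an illustrative helper, not part of the presented system.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical billing excerpt. Real SAF-T (PT) files declare an
# XML namespace and carry many more mandatory elements than shown here.
SAFT_SAMPLE = """
<AuditFile>
  <Header>
    <CompanyName>Example Lda</CompanyName>
  </Header>
  <SourceDocuments>
    <SalesInvoices>
      <Invoice>
        <InvoiceNo>FT A/1</InvoiceNo>
        <InvoiceDate>2020-01-15</InvoiceDate>
        <CustomerID>C001</CustomerID>
        <Line>
          <ProductCode>P01</ProductCode>
          <Quantity>2</Quantity>
          <UnitPrice>10.00</UnitPrice>
        </Line>
        <Line>
          <ProductCode>P02</ProductCode>
          <Quantity>1</Quantity>
          <UnitPrice>5.50</UnitPrice>
        </Line>
      </Invoice>
    </SalesInvoices>
  </SourceDocuments>
</AuditFile>
"""


def extract_invoice_lines(xml_text):
    """Return one flat record per invoice line (illustrative helper)."""
    root = ET.fromstring(xml_text)
    records = []
    for invoice in root.iter("Invoice"):
        for line in invoice.iter("Line"):
            quantity = float(line.findtext("Quantity"))
            unit_price = float(line.findtext("UnitPrice"))
            records.append({
                "invoice_no": invoice.findtext("InvoiceNo"),
                "invoice_date": invoice.findtext("InvoiceDate"),
                "customer_id": invoice.findtext("CustomerID"),
                "product_code": line.findtext("ProductCode"),
                "amount": quantity * unit_price,
            })
    return records


lines = extract_invoice_lines(SAFT_SAMPLE)
```

Flattening each invoice line into one record is a convenient intermediate shape for loading a fact table later on.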

A Business Intelligence System for SAF-T (PT) We are developing a Business Intelligence System organized into five components:

Case study
[Figure: Data Warehouse star schema used for the SAF-T (PT) case study, based on billing data]
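The schema figure is not available in this transcript, so the sketch below only illustrates what a star schema over billing data could look like. Every table and column name is an assumption chosen for illustration (the dimensions mirror the entities named in the case study: companies, customers, products, taxes and dates); it is not the authors' actual design.

```python
import sqlite3

# Hypothetical star schema: one fact table of sales lines surrounded by
# dimension tables. All names are assumptions, not the authors' schema.
SCHEMA = """
CREATE TABLE dim_company  (company_key  INTEGER PRIMARY KEY, name TEXT, region TEXT);
CREATE TABLE dim_customer (customer_key INTEGER PRIMARY KEY, customer_id TEXT, region TEXT);
CREATE TABLE dim_product  (product_key  INTEGER PRIMARY KEY, code TEXT, product_type TEXT);
CREATE TABLE dim_tax      (tax_key      INTEGER PRIMARY KEY, rate_percent REAL);
CREATE TABLE dim_date     (date_key     INTEGER PRIMARY KEY, iso_date TEXT);
CREATE TABLE fact_sales_line (
    company_key  INTEGER REFERENCES dim_company(company_key),
    customer_key INTEGER REFERENCES dim_customer(customer_key),
    product_key  INTEGER REFERENCES dim_product(product_key),
    tax_key      INTEGER REFERENCES dim_tax(tax_key),
    date_key     INTEGER REFERENCES dim_date(date_key),
    invoice_no   TEXT,
    amount       REAL
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# A handful of made-up rows, just enough to run one analytical query.
conn.execute("INSERT INTO dim_company VALUES (1, 'Example Lda', 'Norte')")
conn.execute("INSERT INTO dim_customer VALUES (1, 'C001', 'Norte')")
conn.execute("INSERT INTO dim_product VALUES (1, 'P01', 'Type A')")
conn.execute("INSERT INTO dim_tax VALUES (1, 23.0)")
conn.execute("INSERT INTO dim_date VALUES (20200115, '2020-01-15')")
conn.executemany(
    "INSERT INTO fact_sales_line VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        (1, 1, 1, 1, 20200115, "FT A/1", 20.0),
        (1, 1, 1, 1, 20200115, "FT A/1", 5.5),
    ],
)

# Total amount per customer: the kind of aggregate a star schema makes cheap.
totals = conn.execute(
    """
    SELECT c.customer_id, SUM(f.amount)
    FROM fact_sales_line AS f
    JOIN dim_customer AS c ON c.customer_key = f.customer_key
    GROUP BY c.customer_id
    """
).fetchall()
```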

Case study
Due to data protection laws, the experimental results were obtained by populating the Data Warehouse with randomly generated SAF-T (PT) documents.
100 companies and 100 customers were randomly generated across Portugal's regions;
Three tax rates were created, namely 6%, 13% and 23%;
1000 products were randomly generated over five product types;
Each SAF-T (PT) document has several invoices, and several sales lines connected with each invoice;
Invoices are generated considering a date period and a customer, randomly chosen;
Each sales line, linked to an invoice, is generated considering:
an amount, randomly generated between 1 and 10,000;
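The generation procedure described above can be sketched roughly as follows. The counts, tax rates and amount range come from the slide; everything else is an assumption made for illustration, including the function name `generate_invoices`, the number of lines per invoice, choosing a tax rate per line, and deriving the product type from the product id.

```python
import random
from datetime import date, timedelta

TAX_RATES = (6, 13, 23)   # percentages stated in the case study
N_CUSTOMERS = 100         # stated in the case study
N_PRODUCTS = 1000         # stated in the case study
N_PRODUCT_TYPES = 5       # stated in the case study


def generate_invoices(n_invoices, start, end, seed=0, max_lines=5):
    """Generate synthetic invoices; max_lines is an illustrative assumption."""
    rng = random.Random(seed)
    span_days = (end - start).days
    invoices = []
    for number in range(1, n_invoices + 1):
        invoice = {
            "invoice_no": number,
            # Random day inside the requested date period.
            "date": start + timedelta(days=rng.randint(0, span_days)),
            # Randomly chosen customer.
            "customer_id": rng.randint(1, N_CUSTOMERS),
            "lines": [],
        }
        for _ in range(rng.randint(1, max_lines)):
            product_id = rng.randint(1, N_PRODUCTS)
            invoice["lines"].append({
                "product_id": product_id,
                # Assumption: product type derived from the id (0..4).
                "product_type": product_id % N_PRODUCT_TYPES,
                # Assumption: one tax rate picked per line.
                "tax_rate": rng.choice(TAX_RATES),
                # Amount between 1 and 10,000, as on the slide.
                "amount": round(rng.uniform(1, 10_000), 2),
            })
        invoices.append(invoice)
    return invoices


invoices = generate_invoices(50, date(2020, 1, 1), date(2020, 1, 31))
```

Fixing the seed keeps the synthetic dataset reproducible between runs, which helps when comparing clustering results.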

RFM analysis for Customer Segmentation
To illustrate the potential of the proposed DW, and since we are dealing with SAF-T (PT) files, we can perform a customer segmentation analysis using the RFM (Recency, Frequency, Monetary) method [1];
This type of analysis can be performed with clustering (a data mining technique), which supports descriptive analysis of the data;
With this approach, we can group customers into clusters and describe each group of customers according to its RFM characteristics.
[1] - (Chen et al., 2009; Dursun & Caber, 2016; Maryani & Riana, 2017)

RFM analysis for Customer Segmentation
RFM analysis is based on the following measures:
Recency, the period since the customer's last purchase;
Frequency, the number of orders in a certain period;
Monetary, the amount that the customer spent in the same period.
To apply the RFM method, we resort to Data Mining, more specifically a clustering technique, to group customers based on their RFM values.
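As a minimal sketch of the three measures defined above, the function below derives an (R, F, M) triple per customer from a list of invoices. The input layout, the helper name `compute_rfm`, the choice of days as the recency unit and the reference date are all assumptions for illustration.

```python
from collections import defaultdict
from datetime import date


def compute_rfm(invoices, reference_date):
    """Map customer_id -> (recency_days, frequency, monetary).

    invoices: iterable of (customer_id, invoice_date, amount) tuples.
    """
    last_purchase = {}
    frequency = defaultdict(int)
    monetary = defaultdict(float)
    for customer_id, invoice_date, amount in invoices:
        previous = last_purchase.get(customer_id)
        if previous is None or invoice_date > previous:
            last_purchase[customer_id] = invoice_date
        frequency[customer_id] += 1      # one order per invoice
        monetary[customer_id] += amount  # total spent in the period
    return {
        customer_id: (
            (reference_date - last_purchase[customer_id]).days,
            frequency[customer_id],
            monetary[customer_id],
        )
        for customer_id in last_purchase
    }


# Hypothetical invoices for two customers.
invoices = [
    ("C001", date(2020, 1, 5), 100.0),
    ("C001", date(2020, 1, 20), 50.0),
    ("C002", date(2020, 1, 2), 400.0),
]
rfm = compute_rfm(invoices, reference_date=date(2020, 1, 31))
```

Note that a low recency value means a recent purchase, which is why active customers show low recency together with high frequency.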

RFM analysis for Customer Segmentation
[Figure: RFM analysis approach]

RFM analysis for Customer Segmentation
Determining the best k
With the relative distance criterion, the best k is identified by the most pronounced "elbow" formed on the downward curve;
The R^2 criterion is more objective, since the best k can be taken as the smallest number of clusters that retains a significant percentage of the total variability.
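The slides do not spell out how R^2 is computed. A common definition, assumed here, is the share of total variability explained by the clustering: R^2 = 1 - SSW / SST, where SSW is the within-cluster sum of squares and SST the total sum of squares. The sketch below applies it to a toy dataset with hand-assigned labels; the points and the function names are hypothetical.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def r_squared(points, labels):
    """Share of total variability retained by a clustering: 1 - SSW / SST."""
    overall = mean_point(points)
    total_ss = sum(squared_distance(p, overall) for p in points)
    clusters = {}
    for p, label in zip(points, labels):
        clusters.setdefault(label, []).append(p)
    within_ss = 0.0
    for members in clusters.values():
        center = mean_point(members)
        within_ss += sum(squared_distance(p, center) for p in members)
    return 1.0 - within_ss / total_ss


# Toy data: two obvious groups, far apart on the first coordinate.
points = [(0, 0), (0, 2), (10, 0), (10, 2)]
r2_k1 = r_squared(points, [0, 0, 0, 0])  # one cluster retains nothing
r2_k2 = r_squared(points, [0, 0, 1, 1])  # two clusters retain most variability
r2_k4 = r_squared(points, [0, 1, 2, 3])  # one cluster per point retains all
```

In this toy case R^2 jumps from k=1 to k=2 and gains little afterwards, so k=2 would be the smallest k that retains most of the variability.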

RFM analysis for Customer Segmentation
K-means application
The output generated is the point at the center of each cluster: the centroids.
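To make the notion of centroids concrete, here is a minimal pure-Python sketch of the standard k-means (Lloyd) iteration. It is not the tool or configuration used by the authors. The RFM vectors are hypothetical and assumed to be already scaled to [0, 1], and the initial centroids are simply the first k points, a simplification (practical implementations usually use random or k-means++ initialization).

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, max_iter=100):
    """Return (centroids, assignment) after Lloyd's iterations."""
    # Simplification: the first k points serve as initial centroids.
    centroids = [tuple(p) for p in points[:k]]
    assignment = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_assignment = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_assignment == assignment:
            break  # converged: no point changed cluster
        assignment = new_assignment
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = tuple(
                    sum(coords) / len(members) for coords in zip(*members)
                )
    return centroids, assignment


# Hypothetical scaled (recency, frequency, monetary) vectors.
points = [
    (0.1, 0.9, 0.2), (0.9, 0.1, 0.1), (0.1, 0.9, 0.9),
    (0.2, 0.8, 0.3), (0.8, 0.2, 0.2), (0.2, 0.8, 0.8),
]
centroids, assignment = kmeans(points, k=3)
```

Each returned centroid is the mean RFM vector of its cluster, which is what makes it possible to describe a cluster in terms of low or high recency, frequency and monetary values.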

RFM analysis for Customer Segmentation
Cluster 1, “Light spender”: low monetary values, low recency values, high frequency values

RFM analysis for Customer Segmentation
Cluster 2, “Heavy spender”: high monetary values, low recency values, high frequency values

RFM analysis for Customer Segmentation
Cluster 3, “Minor spender”: low monetary values, high recency values, low frequency values

Outcomes Evaluation of K-means
Validation with Decision Tree (DT)
A DT algorithm generates a tree structure [2];
Attribute values that correspond to each RFM metric;
Not all instances were correctly classified.
[2] - (Han, Kamber, and Pei 2012)
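The validation idea can be illustrated by comparing a tree's predictions with the cluster labels. In the sketch below the tree is hand-written with hypothetical thresholds rather than learned by a DT algorithm, and the labelled RFM vectors are made up. It only shows how agreement can be measured with a confusion count and an accuracy figure, not the authors' actual results.

```python
from collections import Counter


def tree_predict(recency, frequency, monetary):
    """Hand-written stand-in for a learned decision tree.

    The thresholds are hypothetical; frequency is unused by this toy tree.
    """
    if recency > 0.5:
        return "Minor spender"
    if monetary > 0.5:
        return "Heavy spender"
    return "Light spender"


# (scaled RFM vector, label assigned by the clustering step): made-up data.
labelled = [
    ((0.1, 0.9, 0.2), "Light spender"),
    ((0.2, 0.8, 0.8), "Heavy spender"),
    ((0.9, 0.1, 0.1), "Minor spender"),
    ((0.4, 0.6, 0.6), "Light spender"),  # borderline case the tree gets wrong
]

# Confusion counts keyed by (cluster label, tree prediction).
confusion = Counter((label, tree_predict(*rfm)) for rfm, label in labelled)
correct = sum(
    count for (actual, predicted), count in confusion.items()
    if actual == predicted
)
accuracy = correct / len(labelled)
```

Off-diagonal entries of the confusion count point to the customers whose cluster membership the tree's rules fail to reproduce.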

Conclusions and Future Work
We performed a customer segmentation analysis using the RFM criterion on SAF-T (PT) data properly structured in a DW repository;
We produced interesting findings, showing the effectiveness of a BI system that can be used on SAF-T (PT) data;
The RFM analysis proved effective on the SAF-T (PT) dataset: by understanding the characteristics of each group, it is possible to establish a relationship with customers and better direct marketing strategies;
We also want to extend the data analysis to the remaining SAF-T perspectives, providing a full package of analytical techniques applied to the different entities involved.

Acknowledgements
This work has been supported by national funds through FCT – Fundação para a Ciência e Tecnologia through project UIDB/04728/2020.
