Research Proposal
Yichen Hu (u5986120)
Project Title:
Developing statistical algorithm for linking dynamic data with temporal information
Introduction
Social genomes are the digital footprints that people leave in society [1]. From
birth, we interact with various social entities, including governments, businesses,
and individuals. In aggregate, the traces of these social interactions indicate
what kind of social beings we are, much as our genomes indicate what kind of
biological beings we are. Population informatics is the informatics of social
genomes: an emerging multidisciplinary field that draws on social science,
statistics, data science, and other disciplines. By analysing population databases
with computational tools and quantitative methods, population informatics answers
questions about society, forecasts future changes, and provides insights for
political and business decisions.
Problem to address
Population databases are scattered across various
organisations, and their representations change continuously as those
organisations evolve. To form a social genome with comprehensive information,
techniques are needed to link dynamic data across different population databases
for each individual. Data quality has a significant influence on system
performance [2], and appropriate data linkage techniques can improve data quality,
enrich data, and reduce the cost of data acquisition [3]. However, although
real world