Data Science – what it is and isn’t
Over the last few years, the term “data science” has become ubiquitous across a range of sectors and situations. The field grew out of the need to find insights in the huge amounts of data that were piling up within companies such as Yahoo, Facebook, and LinkedIn.
Data scientists do the “data wrangling”: hunting for sources of data, joining them together, and performing cleaning tasks over large data sets. They then use their subject matter expertise to analyse the data, draw out insights, and share those with decision-makers, who can decide whether to act on them or to create new features for operations or customers.
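As a rough illustration of the joining and cleaning steps, here is a minimal sketch in plain Python. The two tiny lists stand in for large data sets, and every field and function name (customer_id, clean_customers, join_orders and so on) is a hypothetical choice for this example rather than part of any particular tool.

```python
# Hypothetical data: two small "sources" standing in for large data sets.
customers = [
    {"customer_id": 1, "name": " Alice "},
    {"customer_id": 2, "name": "bob"},
    {"customer_id": 3, "name": None},       # missing name: dropped during cleaning
]
orders = [
    {"customer_id": 1, "amount": "19.99"},
    {"customer_id": 2, "amount": "5.00"},
    {"customer_id": 2, "amount": "n/a"},    # unparseable amount: skipped
    {"customer_id": 4, "amount": "7.50"},   # no matching customer: dropped by the join
]


def clean_customers(rows):
    """Drop rows without a name and normalise whitespace and casing."""
    return {
        row["customer_id"]: row["name"].strip().title()
        for row in rows
        if row["name"]
    }


def parse_amount(text):
    """Return the amount as a float, or None if it cannot be parsed."""
    try:
        return float(text)
    except ValueError:
        return None


def join_orders(customers_by_id, order_rows):
    """Inner-join orders to cleaned customers and total the spend per customer."""
    totals = {}
    for row in order_rows:
        name = customers_by_id.get(row["customer_id"])
        amount = parse_amount(row["amount"])
        if name is None or amount is None:
            continue  # skip rows that fail the join or the cleaning step
        totals[name] = totals.get(name, 0.0) + amount
    return totals


spend = join_orders(clean_customers(customers), orders)
print(spend)  # {'Alice': 19.99, 'Bob': 5.0}
```

In practice this work is usually done with a dataframe library or SQL rather than hand-written loops, but the steps are the same: clean each source, join on a shared key, then aggregate.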
Monitoring vs Reporting vs Analytics
Reporting is the process of using data to highlight events or trends that have already happened. This is in contrast with monitoring, which does the same for things that are happening now, and with predictive analytics, which tries to predict what will happen in the future based on the same data.
The difference between reporting and monitoring is only one of data latency. As such, monitoring is often referred to as real-time reporting, which further muddies the waters.
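To make the distinction concrete, the sketch below asks three different questions of the same small series. The readings, the threshold, and the function names are all assumptions made for illustration, and the straight-line fit is only the simplest possible stand-in for a predictive model.

```python
from statistics import mean

# Hypothetical hourly request counts, oldest first; the last entry is "now".
readings = [100, 110, 120, 130, 140]


def report(history):
    """Reporting: summarise what has already happened."""
    return {"mean": mean(history), "max": max(history)}


def monitor(history, threshold):
    """Monitoring: check the most recent value against a threshold."""
    latest = history[-1]
    return {"latest": latest, "alert": latest > threshold}


def predict_next(history):
    """Predictive analytics: extrapolate one step with a least-squares line."""
    n = len(history)
    xs = range(n)
    x_bar, y_bar = mean(xs), mean(history)
    covariance = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, history))
    variance = sum((x - x_bar) ** 2 for x in xs)
    slope = covariance / variance
    intercept = y_bar - slope * x_bar
    return intercept + slope * n  # value of the fitted line at the next time step


print(report(readings))
print(monitor(readings, threshold=135))
print(predict_next(readings))
```

Note that report and monitor differ only in how much of the history they look at and how fresh it is, which is the latency point made above.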
ELT vs ETL
ETL is the Extract, Transform, and Load process for data. ELT is the Extract, Load, and Transform process: the same steps, with the last two swapped.
In ETL, data moves from the data source to a staging area, where it is transformed, and then into the data warehouse. This can help with data privacy and compliance, because sensitive data can be masked or removed before it is loaded into the data warehouse. ELT loads the raw data first and leverages the data warehouse itself to do the transformations, so there is no need for a separate staging area.
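The difference in ordering can be sketched with an in-memory SQLite database standing in for the data warehouse. The source rows, table names, and the mask helper are hypothetical, and truncating a hash is shown only to illustrate masking before load; it is not a complete anonymisation scheme.

```python
import hashlib
import sqlite3

# Hypothetical source rows: (email, amount as text).
SOURCE = [("alice@example.com", "19.99"), ("bob@example.com", "5.00")]


def mask(email):
    """Replace an email with a short token (illustrative masking only)."""
    return hashlib.sha256(email.encode()).hexdigest()[:12]


def run_etl(rows):
    """ETL: transform in a staging step, then load only the cleaned rows."""
    staged = [(mask(email), float(amount)) for email, amount in rows]
    warehouse = sqlite3.connect(":memory:")
    warehouse.execute("CREATE TABLE sales (customer_token TEXT, amount REAL)")
    warehouse.executemany("INSERT INTO sales VALUES (?, ?)", staged)
    return warehouse


def run_elt(rows):
    """ELT: load the raw rows first, then transform inside the warehouse."""
    warehouse = sqlite3.connect(":memory:")
    warehouse.execute("CREATE TABLE raw_sales (email TEXT, amount TEXT)")
    warehouse.executemany("INSERT INTO raw_sales VALUES (?, ?)", rows)
    warehouse.execute(
        "CREATE TABLE sales AS "
        "SELECT email, CAST(amount AS REAL) AS amount FROM raw_sales"
    )
    return warehouse


etl_wh = run_etl(SOURCE)
elt_wh = run_elt(SOURCE)

# The ETL warehouse never sees a raw email; the ELT warehouse keeps the raw table.
print(etl_wh.execute("SELECT * FROM sales").fetchall())
print(elt_wh.execute("SELECT * FROM raw_sales").fetchall())
```

The trade-off shows up in what each warehouse contains: the ETL version holds only masked tokens, while the ELT version keeps the raw rows available for later transformations and therefore has to protect them inside the warehouse.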
Data is arguably the most important asset that organizations have. Data governance helps to ensure that data is usable, accessible and protected.