Data Explained
As the volume of data continues to grow and the call for better government services surges, good data management and evidence-informed decision-making have become top priorities for Canada. In 2023, the Government of Canada published the latest iteration of the Data Strategy for the Federal Public Service to enable departments to use data to its highest potential and deliver data-driven results. With data at its core, decision-making becomes more strategic and fact-based, resulting in better outcomes. Without it, everything becomes a gamble.
Think about it. You are unlikely to swipe right on a dating profile with no photos. When you shop online, you usually buy the product with the most positive reviews. And you generally don't waste time applying for jobs that don't list the salary range on the posting. We live in a data-driven world, where assumptions can sometimes lead to less-than-ideal dates, subpar products and empty bank accounts.
Data is everywhere. Everyone needs to use it, everyone needs to get at it, everyone needs to learn. So one [challenge] is the enormity of trying to enable everyone in an organization with data. Everyone needs the right kind of data literacy for their job in order for us to really get the value out of data.
Sandy Kyriakatos, Chief Information Officer, National Research Council Canada
Types of data: Qualitative and quantitative
Data is a common term, but its meaning can vary depending on context. For many, data refers to the Internet connectivity delivered to your cellphone. However, we can also define data as facts, figures, observations or recordings about an object or phenomenon that can take the form of an image, sound, text or a physical measurement such as a distance or weight. It is not limited to electronic or digital information.
A typical way to categorize data is to distinguish between quantitative and qualitative data.
Quantitative data is numerical information acquired through measuring or counting. It usually refers to a certain quantity, amount or range. The number of individuals attending a baseball game, time spent in traffic and your height are all quantitative data. Most government departments have quantitative data about the number of employees they have, the number of applications they process or the number of clients they serve each month.
Qualitative data includes descriptive statements that can be made about a subject based on observations, interviews or evaluations. The goal of such statements is to build an understanding of a phenomenon through meaning, context and lived experiences. Movie reviews and real estate listings are two examples. Images, videos and sound recordings can also be considered qualitative data.
The two types of data are complementary. By integrating them, analysts can develop comprehensive interpretations and make informed decisions that are grounded in both empirical evidence and contextual understanding.
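To make the distinction concrete, here is a minimal Python sketch of how the two types might sit side by side in a single set of survey responses. The responses, field names and the `summarize` function are all hypothetical and invented for illustration; they do not come from any real survey.

```python
from statistics import mean

# Hypothetical survey responses (invented for illustration).
# "rating" is quantitative: a number we can count and average.
# "comment" is qualitative: a description we have to read and interpret.
responses = [
    {"rating": 4, "comment": "Helpful staff, but the wait was long."},
    {"rating": 2, "comment": "The online form was confusing."},
    {"rating": 5, "comment": "Quick and easy to renew my permit."},
]


def summarize(responses):
    """Separate the numerical ratings from the free-text comments."""
    ratings = [r["rating"] for r in responses]
    comments = [r["comment"] for r in responses]
    return {
        "count": len(ratings),
        "average_rating": mean(ratings),
        "comments": comments,
    }


summary = summarize(responses)
print(f"{summary['count']} responses, average rating {summary['average_rating']:.1f}")
for comment in summary["comments"]:
    print("-", comment)
```

The average rating says how satisfied people are overall, while the comments suggest why. Neither tells the whole story alone.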
Data collection
There are many ways to collect data. Statistics Canada, for instance, primarily uses three methods of data collection:
- Census: A census is a way to collect information from every person in a specific group or population. It aims to gather data on various aspects like age, occupation and living conditions by asking questions of everyone in that group. This helps to get a complete picture of the entire population.
- Sample survey: A sample survey is a method used to gather information from a small group of people (called a sample) to learn about a larger group (called the population). Instead of asking everyone in the population, which can be time-consuming and expensive, researchers select a sample that represents the population. The answers from the sample help to understand the opinions, behaviours or characteristics of the entire population.
- Administrative data: Administrative data is collected as a result of an organization's day-to-day operations. Examples include data on births, deaths, tax, vehicle registrations and transactional data.
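The difference between a census and a sample survey can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the population is made up of randomly generated ages, the sample is a simple random sample, and the function names `census_mean` and `sample_survey_mean` are hypothetical. Real surveys use more careful sampling designs.

```python
import random
from statistics import mean

# A made-up population of 10,000 ages, generated with a fixed seed
# so the example gives the same result every time it runs.
rng = random.Random(42)
population_ages = [rng.randint(18, 90) for _ in range(10_000)]


def census_mean(population):
    """Census approach: use every member of the population."""
    return mean(population)


def sample_survey_mean(population, sample_size, rng):
    """Sample survey approach: use a simple random sample only."""
    sample = rng.sample(population, sample_size)
    return mean(sample)


full_result = census_mean(population_ages)
sample_result = sample_survey_mean(population_ages, 500, rng)

print(f"Census average age: {full_result:.1f}")
print(f"Sample average age (500 people): {sample_result:.1f}")
```

In this sketch the sample uses only 5% of the population, yet its average should land close to the census figure. That is the trade-off a sample survey makes: a small loss of precision in exchange for far less time and cost.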
However, there are many other data sources, such as:
- Crowdsourcing: Collecting information from a large community of users, based on the principle that citizens are the experts of their local environment. One example is Wikipedia, which relies on contributions from a diverse community of users to create and edit articles on various topics.
- Web scraping: Process through which information is gathered and copied from the web for further analysis. An example includes e-commerce companies monitoring competitor prices, allowing them to adjust their own pricing strategies accordingly.
- Remote sensing: Acquisition of information about an object or phenomenon from a distant point. Examples include the growth of vegetation observed using satellite imagery, weather radar systems that track storms and seismic arrays showing vibrations in the earth.
- Open data: Structured, machine-readable data that is freely shared and that can be used without restrictions.
- Big data: Data sets that have such a large number of records and variables that they exceed the capacity of traditional software to process the information within a reasonable amount of time.
How data is organized: Structured and unstructured data
Data is either structured or unstructured.
Structured data is highly organized and easily analyzed. When you think of structured data, think of things that would sit nicely in a spreadsheet. Examples include:
- dates
- phone numbers
- postal codes
- client names
- types of benefit
Unstructured data is just the opposite. It is raw, unorganized information that may have its own internal structure but does not fit neatly into a spreadsheet or database. It is usually text-heavy and more subjective, such as responses to open-ended questions, which may all be different and difficult to categorize. Examples of unstructured data include:
- audio, image and video file formats
- Word documents and PowerPoint presentations
- emails
- customer reviews
- text messages
- client notes and chat logs
- call centre recordings
Unstructured data presents challenges in terms of organization and analysis due to its variability. Despite the lack of structure, it can reveal valuable insights and trends when using tools like natural language processing, image recognition or sentiment analysis. Understanding the difference between structured and unstructured data helps organizations choose the best way to store, handle and analyze their data effectively.
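As a rough illustration of the difference, the Python sketch below places a structured record next to an unstructured client note, then pulls two structured fields out of the note. It uses simple pattern matching, which is far more basic than the natural language processing tools mentioned above. The record, the note, the field names and the `extract_fields` function are all hypothetical.

```python
import re

# Structured data: every value has a named field, like a row in a spreadsheet.
structured_record = {
    "client_name": "A. Tremblay",
    "date": "2024-03-15",
    "postal_code": "K1A 0B1",
    "benefit_type": "pension",
}

# Unstructured data: the same kind of facts, buried in free text.
client_note = (
    "Client called on 2024-03-15 about a delayed benefit payment. "
    "New address postal code is K1A 0B1."
)

# Assumed formats: dates written as YYYY-MM-DD and Canadian postal codes
# written as letter-digit-letter, optional space, digit-letter-digit.
DATE_PATTERN = re.compile(r"\d{4}-\d{2}-\d{2}")
POSTAL_CODE_PATTERN = re.compile(r"[A-Za-z]\d[A-Za-z] ?\d[A-Za-z]\d")


def extract_fields(note):
    """Return the first date and postal code found in a note, or None."""
    date_match = DATE_PATTERN.search(note)
    postal_match = POSTAL_CODE_PATTERN.search(note)
    return {
        "date": date_match.group() if date_match else None,
        "postal_code": postal_match.group() if postal_match else None,
    }


extracted = extract_fields(client_note)
print(extracted)
```

Reading a value from the structured record takes one step, while the note needs rules written in advance for each format it might contain. A note that wrote the date as "March 15" would slip past this sketch, which hints at why unstructured data is harder to organize and analyze.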
How we use the data
Data can be used in various ways, such as to:
- find solutions to queries
- guide sound decision-making
- tell a story
- support your conclusions with evidence
- simplify and clarify complex information
- reveal trends and relationships
- understand behaviours and why things happen
Interpreting data can help expand our knowledge and guide research. It provides insights into local and global events, allowing us to assess their potential impacts. By analyzing data, we can make informed decisions, uncover the root causes of a phenomenon and come up with solutions to various problems. Data analysis not only helps us understand what is happening but also helps us anticipate future trends and outcomes.
We have to use data. We have to get value out of it. We have to do our best for our citizens with the data we have, but we must protect people's privacy. We have to make sure we do it in a way that doesn't risk people's personal information.
Sandy Kyriakatos, Chief Information Officer, National Research Council Canada
Now, let's test your knowledge
Scenario 1: Employee satisfaction survey
You are conducting an employee satisfaction survey to gauge how employees feel about their work environment.
Which type of data are you collecting?
- (a) Qualitative data only
- (b) Quantitative data only
- (c) Both qualitative and quantitative data
- (d) Neither qualitative nor quantitative data
Answer:
The answer is (c) Both qualitative and quantitative data. The survey collects quantitative data through numerical ratings and qualitative data through open-ended questions.
Scenario 2: Managing provincial vehicle registration data
Your team is responsible for maintaining the provincial vehicle registration database. This administrative data includes detailed records of all registered vehicles in the province, such as owner information, vehicle specifications, registration dates and renewal status. This data is collected as a result of everyday operations and is crucial for policy making, monitoring compliance with transportation regulations and planning infrastructure projects.
Which type of data collection method is being used?
- (a) Census
- (b) Sample survey
- (c) Administrative data
- (d) Remote sensing
Answer:
The answer is (c) Administrative data. This type of data is collected through routine operations, like maintaining the vehicle registration database, and is essential for various administrative and policy-making purposes.
Scenario 3: Analyzing traffic patterns for urban planning
You are a data analyst at your provincial transport authority and have been tasked with improving traffic flow in a major city. Your supervisor has asked you to analyze traffic data collected from various intersections and highways over the past year. The goal is to identify congestion hotspots, optimize traffic signal timings and propose infrastructure improvements to enhance overall transportation efficiency.
How will you primarily use the data?
- (a) Find solutions to queries
- (b) Guide sound decision-making
- (c) Tell a story
- (d) Support your conclusions with evidence
Answer:
The answer is (b) Guide sound decision-making. In this scenario, you will use the traffic data to guide sound decision-making regarding traffic management strategies and infrastructure improvements.
Resources