Core Module Information
Module title: Data Management and Processing

SCQF level: 11
SCQF credit value: 20.00
ECTS credit value: 10

Module code: SET11823
Module leader: Carl Strathearn
School: School of Computing, Engineering and the Built Environment
Subject area group: Computer Science
Prerequisites

There are no prerequisites for this module.

Description of module content:

This module will explore and develop data management and processing solutions that work on dirty, complex, real-world data. It will examine the key concepts of data warehousing, data cleaning, and data processing in the context of business requirements, and focus on how to combine these steps into a coherent data processing pipeline.

First, modern tools and techniques in data management will be examined, with the emphasis on good practice and professional approaches to storing and handling data. Next, the module will examine ways of cleaning noisy real-world data in order to make it suitable for data processing. Finally, data processing and collation techniques such as machine learning or deep learning will be applied to the data to extract structure and elicit comprehension of the data. Throughout the module, the advantages and disadvantages of local and cloud approaches will be explored, alongside a discussion of common parallel approaches that facilitate faster solutions.

In short, the goal of this module is to allow students to understand a data processing pipeline from raw data to final delivery. It will cover:

• Data warehousing and storage techniques
• Data cleaning techniques
• A discussion of cloud approaches
• Data processing and collation techniques
• An introduction to parallel data pipeline approaches

Learning Outcomes for module:

Upon completion of this module you will be able to:

LO1: Critically assess different data warehousing techniques and technologies related to data management.

LO2: Appraise different methods of data cleaning in the context of large or complex data sets.

LO3: Critically reflect on the advantages of local and cloud solutions for data processing.

LO4: Display mastery of industry-standard data collation techniques.

LO5: Produce a data processing and management pipeline from raw data to a final delivery.

Full Details of Teaching and Assessment
2025/6, Trimester 1, Online (fully o,
Occurrence: 001
Primary mode of delivery: Online (fully o
Location of delivery: MERCHISTON
Partner:
Member of staff responsible for delivering module: Carl Strathearn
Module Organiser:


Student Activity (Notional Equivalent Study Hours (NESH))
Mode of activity | Learning & Teaching Activity | NESH (Study Hours) | NESH Description
Independent Learning | Lecture | 20 | Lectures focusing on data processing and warehousing methods.
Independent Learning | Practical classes and workshops | 20 | The workshops are designed in Java and use external libraries to create a pipeline for cleaning, reading and modelling data.
Online | Guided independent study | 160 | Guided independent study; students are also given a Slack channel for group study. Reading week is week 7, and teaching follows the standard SCEBE pattern of weeks 2-6 and 8-12.
Total Study Hours: 200
Expected Total Study Hours for Module: 200


Assessment
Type of Assessment | Weighting % | LOs covered | Week due | Length in Hours/Words | Description
Project - Practical | 30 | 1, 2 | Week 6 | 2,500 words | A 2,500-word report on how the pipeline functions within the context of a selected business concept, together with a zipped dataset and a zipped code base.
Project - Practical | 70 | 3, 4, 5 | Week 13 | 2,500 words | A final 2,500-word reflective report building on the first report, together with the datasets used to test ML approaches (i.e. sentiment analysis) and a zipped code base.
Component 1 subtotal: 100
Component 2 subtotal: 0
Module subtotal: 100

Indicative References and Reading List - URL:
Contact your module leader