Module title: Data Management and Processing

SCQF level: 11
SCQF credit value: 20.00
ECTS credit value: 10

Module code: SET11823
Module leader: Thomas Methven
School: School of Computing
Subject area group: Software Engineering
Prerequisites: N/A

2018/9, Trimester 3, ONLINE
Occurrence: 001
Primary mode of delivery: ONLINE
Location of delivery: WORLDWIDE
Partner:
Member of staff responsible for delivering module: Thomas Methven
Module Organiser:


Learning, Teaching and Assessment (LTA) Approach:
You will be supported by the Global Online team, who will provide general support, and by the module teams, who will provide module-specific online material and discussion forums using a variety of communication technologies such as Moodle and Skype (LO 1-5). You will be encouraged to develop your learning through peer and tutor interaction via electronic communication.
Self-study readings, supported by online discussion forums hosted on the VLE, will develop your skills as an independent learner (LO 1-5). Formative feedback will be provided via online quizzes (LO 1-5). The lecture programme will be enhanced by material from guest speakers (where appropriate) and will be made available online. The material for the lab-based practical sessions will be made available online with a support forum.
In addition to this, online students are provided with online support in the form of:
• Dedicated online administrators who will keep track of student progress and will help you if you are having any problems.
• A dedicated interactive database of frequently asked questions specific to the online learning environment.
• A regular ‘virtual office hour’ during which module staff will be available for you to contact.


Formative Assessment:
There will be a series of end-of-section online quizzes that will provide formative assessment throughout the course. In general, these will be weekly unless a section is broken into multi-part sessions. Formative feedback on practical work will also be provided through the online discussion forum and ‘virtual office hour’. Reflective exercises will enable you to apply theory to practice. These are not assessed, but they will support your personal and professional development.

Summative Assessment:
Summative assessment will be in the form of a single component, which will consist of an ongoing software development task and associated written documentation worth 100% of the final mark (covering LOs 1-5). This component will be broken down into two elements, due in weeks 7 and 13. The first element will be submitted in week 7 in order to give interim summative feedback (30%: LOs 1, 2). The second element will be submitted in week 13 (70%: LOs 3, 4, 5), after which final summative assessment will be given.

Student Activity (Notional Equivalent Study Hours (NESH))
Mode of activity       Learning & Teaching Activity       NESH (Study Hours)
Online                 Lecture                            12
Online                 Practical classes and workshops    12
Online                 Tutorial                           15
Independent Learning   Guided independent study           121
Online                 Guided independent study           40
Total Study Hours: 200
Expected Total Study Hours for Module: 200


Assessment
Type of Assessment    Weighting %   LOs covered   Week due   Length in Hours/Words
Project - Practical   30            1, 2          7          12 hours
Project - Practical   70            3, 4, 5       13         28 hours
Component 1 subtotal: 100
Component 2 subtotal: 0
Module subtotal: 100

Description of module content:

This module will explore and develop data management and processing solutions that will work on dirty, complex, real-world data. This module will examine the key concepts of data warehousing, data cleaning, and data processing in the context of business requirements and focus on how to combine these steps into a coherent data processing pipeline.
First, modern tools and techniques in data management will be examined, with the emphasis on good practice and professional approaches to storing and handling data. Next, the module will examine ways of cleaning noisy real-world data in order to make it suitable for data processing. Finally, data processing and collation techniques such as machine learning or deep learning will be applied to the data to extract structure and aid comprehension of the data. Throughout the module, the advantages and disadvantages of local and cloud approaches will be explored, alongside common parallel approaches that facilitate faster solutions.
In short, the goal of this module is to allow students to understand a data processing pipeline from raw data to final delivery. It will cover:
• Data warehousing and storage techniques
• Data cleaning techniques
• A discussion of cloud approaches
• Data processing and collation techniques
• An introduction to parallel data pipeline approaches
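
As an informal illustration of the pipeline idea described above (storage, then cleaning, then processing and collation), a minimal sketch might look like the following. It is not taken from the module materials: the sample data, the column names and the functions load, clean, collate and run_pipeline are all hypothetical, and only the Python standard library is assumed.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Hypothetical raw input (an assumption for illustration only): inconsistent
# casing, stray whitespace, a missing value and a non-numeric value stand in
# for "dirty" real-world data.
RAW_CSV = """city,temperature_c
Edinburgh, 11.5
edinburgh,12.5
Glasgow,
GLASGOW,10.0
Aberdeen,n/a
Aberdeen,9.0
"""


def load(text):
    """Storage stage: read raw rows from a CSV source."""
    return list(csv.DictReader(io.StringIO(text)))


def clean(rows):
    """Cleaning stage: normalise city names and drop unparseable readings."""
    cleaned = []
    for row in rows:
        city = (row["city"] or "").strip().title()
        try:
            temperature = float((row["temperature_c"] or "").strip())
        except ValueError:
            continue  # skip rows whose reading is missing or not a number
        if city:
            cleaned.append({"city": city, "temperature_c": temperature})
    return cleaned


def collate(rows):
    """Processing stage: collate the cleaned readings into a mean per city."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["city"]].append(row["temperature_c"])
    return {city: mean(values) for city, values in grouped.items()}


def run_pipeline(text):
    """Chain the three stages from raw data to a final result."""
    return collate(clean(load(text)))


if __name__ == "__main__":
    print(run_pipeline(RAW_CSV))
```

Keeping each stage as a separate function means a stage can be replaced, for example swapping the in-memory CSV for a database or cloud store, without changing the others.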

Learning Outcomes for module:

LO1: Critically assess different data warehousing techniques and technologies related to data management
LO2: Appraise different methods of data cleaning in the context of large or complex data sets
LO3: Critically reflect on the advantages of local and cloud solutions for data processing
LO4: Display mastery of industry-standard data collation techniques
LO5: Produce a data processing and management pipeline from raw data to a final delivery

Indicative References and Reading List - URL:
Contact your module leader