Core Module Information
Module title: Data Wrangling

SCQF level: 11
SCQF credit value: 20.00
ECTS credit value: 10

Module code: SET11121
Module leader: Dimitra Gkatzia
School: School of Computing, Engineering and the Built Environment
Subject area group: Computer Science


Description of module content:

The challenges of contemporary data acquisition and analysis have been characterised as “the four V’s of Big Data” (volume, variety, velocity and validity). These require the use of specialised data storage, aggregation and processing techniques. This module introduces a range of tools and techniques necessary for working with data in a variety of formats, with a view to developing data-driven applications. The module focuses primarily on developing applications using the Python scripting language and associated libraries, and will also introduce a range of associated data storage and processing technologies and techniques.

The module covers the following topics:

• Data types and formats: numerical and time series, graph, textual, unstructured
• Data sources and interfaces: open data, APIs, social media, web-based
• NoSQL databases such as document (MongoDB), graph and key-value stores
• Techniques for dealing with large data sets, including MapReduce
• Developing data-driven applications in Python

The Benchmark Statement for Computing specifies the range of skills and knowledge that should be incorporated in computing courses. This module encompasses cognitive skills in Computational Thinking, Modelling, Methods and Tools, and Requirements Analysis, and practical skills in specification, development and testing, the deployment and use of tools, and critical evaluation, in addition to providing useful generic skills for employment.

Learning Outcomes for module:

On completion of this module, students will be able to:
LO1: Critically evaluate the tools and techniques of data extraction, cleaning, interfacing, aggregation and processing
LO2: Select and apply a range of specialised data types, tools and techniques for data extraction, cleaning, interfacing, aggregation and processing
LO3: Experiment with specialised techniques for dealing with complex text data sets, such as Natural Language Processing and machine learning
LO4: Design, develop and critically evaluate data-driven applications in Python

Full Details of Teaching and Assessment
2022/3, Trimester 2, BLENDED, Edinburgh Napier University
Occurrence: 001
Primary mode of delivery: BLENDED
Location of delivery: MERCHISTON
Partner: Edinburgh Napier University
Member of staff responsible for delivering module: Dimitra Gkatzia
Module Organiser:

Learning, Teaching and Assessment (LTA) Approach:
You will attend for one day per month during term time for intensive face-to-face lectures, workshops, tutorials and computer-based practical sessions (LOs 1-4). Each of these days will typically involve two half-day sessions, each consisting of up to 1 hour of lecture followed by up to 2.5 hours of labs/practical work or tutorial/seminar/class group work. This will be further supported by online material and a discussion forum using communication technologies such as Moodle and Skype. You will be encouraged to develop your learning through peer and tutor interaction, either face to face or through electronic communication. Teaching comprises a blend of lectures and lab-based practical sessions, together with online materials and support. The lecture programme will be enhanced by material from guest speakers, which will also be made available online.

Teaching will concentrate on the critical analysis of the underlying principles and theories, and of their implementation in the Python language and relevant specialised code libraries (LOs 1-4). Students are expected to spend a substantial proportion of their time doing practical programming exercises and researching the underlying principles and theories and the related academic literature (LOs 1-4). The practical materials are selected and organised to enhance students’ understanding of the theories and principles covered.

Formative Assessment:
Formative assessment will be provided during lab-based practical sessions at the monthly face-to-face meetings. There will also be a series of online quizzes/exercises that will provide formative feedback.

Summative Assessment:
The summative assessment will comprise one practical coursework worth 100% of the final mark (covering LOs 1-4). An element of this coursework will be submitted around week 7 to give feedback (30%: LOs 1, 2), with the second part of the coursework to be submitted at the end of the module (70%: LOs 3, 4).

Student Activity (Notional Equivalent Study Hours (NESH))
Mode of activity      Learning & Teaching Activity      NESH (Study Hours)
Face To Face          Lecture                           6
Face To Face          Practical classes and workshops   15
Online                Tutorial                          12
Online                Practical classes and workshops   12
Independent Learning  Guided independent study          109
Online                Lecture                           6
Face To Face          Guided independent study          40
Total Study Hours                                       200
Expected Total Study Hours for Module                   200

Type of Assessment    Weighting %   LOs covered   Week due   Length in Hours/Words
Project - Practical   30            1,2           7          HOURS = 12, WORDS = 1000
Project - Practical   70            3,4           14/15      HOURS = 28, WORDS = 1000
Component 1 subtotal: 100
Component 2 subtotal: 0
Module subtotal: 100

Indicative References and Reading List - URL:
SET11121/SET11821/SET11521 Data Wrangling