Core Module Information
Module title: Scripting for Data Science

SCQF level: 07
SCQF credit value: 20.00
ECTS credit value: 10

Module code: SET07111
Module leader: Valerio Giuffrida
School: School of Computing, Engineering and the Built Environment
Subject area group: Computer Science
Prerequisites

N/A

Description of module content:

The aim of the module is to deepen the students' understanding of fundamental programming concepts, introduce more advanced concepts pertaining to script development, and develop an ability to utilise publicly available software libraries to solve problems. Throughout the module, the underlying concepts will be contextualised through case studies relevant to the students' Data Science programmes of study. The chosen scripting language is widely used by Data Scientists in both academia and industry and has a thriving community which provides supporting software packages of relevance to the programme of study.

The module provides a fundamental introduction to Python and makes no assumptions about students' prior exposure to it. The latter parts of the module will focus on applying these concepts to data processing, so that students develop insight into automating common statistical analyses on imported datasets.

The syllabus includes topics such as:
• An introduction to building scripts using a popular scripting language widely used in Data Science
• Core programming and language concepts, such as data types, control structures, functions, importing libraries, and re-usable design
• Techniques for creating robust scripts, including exception handling, testing and debugging
• Importing and working with externally sourced data (e.g. text and CSV files)
• The use of open-source libraries for automating basic data processing (e.g. calculating point statistics, plotting histograms)
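As a rough illustration of the robust-scripting and data-import topics above, the sketch below reads one numeric column from CSV text with Python's standard csv module and skips entries that cannot be parsed as numbers. The function name load_column, the column names and the sample data are hypothetical and are not taken from the module materials.

```python
import csv
import io


def load_column(csv_text, column):
    """Return (values, skipped) for one named column of CSV text.

    Entries that cannot be parsed as numbers are counted and skipped,
    so a single bad record does not abort the whole script.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    if reader.fieldnames is None or column not in reader.fieldnames:
        raise KeyError(f"column {column!r} not found in header")

    values, skipped = [], 0
    for row in reader:
        try:
            values.append(float(row[column]))
        except (TypeError, ValueError):
            # Missing or non-numeric entry: count it and carry on.
            skipped += 1
    return values, skipped


# Made-up sample standing in for a downloaded dataset (an assumption).
SAMPLE = "city,temp_c\nEdinburgh,11.5\nGlasgow,n/a\nAberdeen,9.0\nDundee,\n"

values, skipped = load_column(SAMPLE, "temp_c")
print(values, skipped)
```

Returning the count of skipped entries, rather than silently discarding them, lets the calling script decide whether the data is clean enough to analyse.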
Indicative case studies:
• How to download, format, and import open-source datasets using the scripting language.
• Answering basic questions relating to open datasets, such as what the median, mode and mean values and the interquartile ranges are, and why these values are important.
• Basic plotting to understand the distribution of the underlying data, with examples of how point statistics may be misleading.
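The case studies above might be sketched as follows, assuming only Python's standard statistics module. The two samples are made up so that they share the same mean while having very different shapes, which is one way a single point statistic can mislead. The helper names summarise and text_histogram are hypothetical, and the interquartile range shown depends on the default "exclusive" method of statistics.quantiles.

```python
import statistics
from collections import Counter


def summarise(data):
    """Return common point statistics for a list of numbers."""
    q1, _, q3 = statistics.quantiles(data, n=4)  # quartile cut points
    return {
        "mean": statistics.mean(data),
        "median": statistics.median(data),
        "mode": statistics.mode(data),
        "iqr": q3 - q1,
    }


def text_histogram(data):
    """Return one line per distinct value, with one '#' per occurrence."""
    counts = Counter(data)
    return [f"{value:>3} {'#' * counts[value]}" for value in sorted(counts)]


# Made-up samples (an assumption), chosen so that both have a mean of 5.
symmetric = [3, 4, 5, 5, 6, 7]
skewed = [1, 1, 1, 2, 2, 23]

for name, sample in (("symmetric", symmetric), ("skewed", skewed)):
    print(name, summarise(sample))
    print("\n".join(text_histogram(sample)))
```

Both samples report a mean of 5, but the median, the interquartile range and the text histogram show that the second sample is dominated by a single large value.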

Learning Outcomes for module:

LO1: Design, implement and test software scripts which solve problems relating to statistics and data science.
LO2: Employ good practice programming and scripting techniques to develop well-written modular code which is reusable, well documented and uses comprehensive error handling techniques.
LO3: Solve applied problems through abstraction by identifying, utilising and integrating publicly available software libraries as appropriate.

Full Details of Teaching and Assessment
2022/3, Trimester 2, FACE-TO-FACE
Occurrence: 001
Primary mode of delivery: FACE-TO-FACE
Location of delivery: MERCHISTON
Partner:
Member of staff responsible for delivering module: Valerio Giuffrida
Module Organiser:


Learning, Teaching and Assessment (LTA) Approach:
Key concepts will be explained in lectures, where the subject matter will be illustrated with examples and interactive demonstrations (LO1,2). A key feature is to present the principles behind the applications (LO1,2,3), discuss the concepts with the students (LO1,2,3) and, for each case study, discuss a starting solution and suitable approaches (LO1,2,3). Practical labs focus on problem solving and case studies to provide practice in the application of theory (LO1,2,3). Students develop their own applications, initially often by extending a starting solution provided. As the module progresses, this will gradually require more independent work and research of advanced concepts. Throughout the labs, students will be encouraged to interact with staff and peers to explore concepts in more depth and receive feedback on their progress and understanding. In addition to timetabled classes, students should undertake private study to work through the learning materials and gain further practice at solving conceptual and technical problems (LO1,3). To provide an integrated understanding of the subject matter, each topic area will centre on a case study relevant to the students' subject area. Topics will reuse and integrate functions and modules developed earlier and emphasise exception handling (LO2). The latter parts of the module will shift the emphasis from programming fundamentals to applied problem solving in the Data Science domain, giving students a better idea of how to apply the coding techniques they have learned (LO3). Moodle will be used to distribute course materials, including starting solutions, and to point the students to selected third-party resources.

Formative and Summative Assessment:
Assessment comprises a series of practical exercises accompanied by a log book (LOs 1, 2, 3) and a final coursework (LOs 1, 3). Students will be expected to complete a log book of their solutions to exercises contained in part 1 of the core text, which will be discussed during practical classes, and also to complete a solution to a specific problem created for the module.

Student Activity (Notional Equivalent Study Hours (NESH))
Mode of activity        Learning & Teaching Activity       NESH (Study Hours)
Face To Face            Lecture                            24
Face To Face            Practical classes and workshops    24
Independent Learning    Guided independent study           152
Total Study Hours: 200
Expected Total Study Hours for Module: 200


Assessment
Type of Assessment             Weighting %    LOs covered    Week due    Length in Hours/Words
Practical Skills Assessment    30             1,2,3          8           HOURS= 30.00, WORDS= 0
Practical Skills Assessment    70             1,2,3          13          HOURS= 30.00, WORDS= 0
Component 1 subtotal: 30
Component 2 subtotal: 70
Module subtotal: 100

Indicative References and Reading List - URL:
Contact your module leader