Core Module Information
Module title: Scripting for Data Science

SCQF level: 08
SCQF credit value: 20.00
ECTS credit value: 10

Module code: SET08423
Module leader: Valerio Giuffrida
School: School of Computing, Engineering and the Built Environment
Subject area group: Computer Science

Good understanding of fundamental programming/scripting concepts and some understanding of how they can be used to automate statistical analysis.

Description of module content:

The aim of the module is to deepen the students' understanding of fundamental programming concepts, introduce more advanced concepts pertaining to script development, and develop an ability to utilise publicly available software libraries to solve problems. Throughout the module, the underlying concepts will be contextualised through case studies relevant to the students' Data Science programmes of study. The chosen scripting language is widely used by Data Scientists in both academia and industry and has a thriving community which provides supporting software packages of relevance to the programme of study.
The module provides a fundamental introduction to the chosen scripting language and makes no assumptions about students' prior exposure to it. The latter parts of the module will focus on applying these concepts to data processing, such that students will develop insight into automating common statistical analyses on imported datasets.

The syllabus includes topics such as:
• An introduction to building scripts using a popular scripting language widely used in Data Science
• Core programming and language concepts, such as data types, control structures, functions, importing libraries, and re-usable design
• Techniques for creating robust scripts, including exception handling, testing and debugging
• Importing and working with externally sourced data (e.g. text and CSV files)
• The use of open-source libraries for automating basic data processing (e.g. calculating point statistics, plotting histograms)
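The robustness and data-import topics above can be sketched as follows. This is a minimal illustration only, assuming Python (the language named under Summative Assessment) and its standard `csv` module. The function name `read_numeric_column` and the sample data are hypothetical and are not taken from the module materials.

```python
import csv
import io


def read_numeric_column(source, column):
    """Return the values of `column` as floats, skipping rows that cannot be parsed."""
    reader = csv.DictReader(source)
    if reader.fieldnames is None or column not in reader.fieldnames:
        # Fail early with a clear message if the expected column is absent.
        raise KeyError(f"column {column!r} not found")
    values = []
    for row in reader:
        try:
            values.append(float(row[column]))
        except (TypeError, ValueError):
            # Missing or non-numeric entry: skip it rather than abort the import.
            continue
    return values


# Hypothetical in-memory data standing in for a downloaded CSV file.
sample = io.StringIO("city,temperature\nEdinburgh,11.5\nGlasgow,n/a\nAberdeen,9.0\n")
temperatures = read_numeric_column(sample, "temperature")
print(temperatures)
```

Skipping unparseable rows is one possible design choice. A stricter script might instead log or count the rejected rows so that data quality problems are not hidden.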
Indicative case studies:
• How to download, format, and import open-source datasets using the scripting language.
• Answering basic questions relating to open datasets, such as what the median, mode, mean and interquartile range are, and why these values are important.
• Basic plotting to understand the distribution of the underlying data, with examples of how point statistics may be misleading.
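The case studies above can be illustrated with a short sketch, assuming Python and its standard `statistics` module. The helper names `summarise` and `text_histogram` and both datasets are hypothetical and were chosen only to show two samples that share a mean but have different distributions. A text histogram stands in for a plotting library here.

```python
import statistics
from collections import Counter


def summarise(values):
    """Return common point statistics for a list of numbers."""
    # statistics.quantiles with n=4 returns the three quartile cut points.
    q1, _, q3 = statistics.quantiles(values, n=4)
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "iqr": q3 - q1,
    }


def text_histogram(values):
    """Return a simple text histogram, one line per distinct value."""
    counts = Counter(values)
    return "\n".join(f"{v:>3} {'#' * counts[v]}" for v in sorted(counts))


# Hypothetical datasets with the same mean but different shapes.
symmetric = [1, 2, 3, 3, 4, 5]
skewed = [1, 1, 1, 1, 1, 13]

for name, data in (("symmetric", symmetric), ("skewed", skewed)):
    print(name, summarise(data))
    print(text_histogram(data))
```

Both samples have a mean of 3, yet the median and mode of the skewed sample are 1. Taken alone, the mean gives a misleading picture of the skewed data, which the histogram makes visible.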

Learning Outcomes for module:

LO1: Design, implement and test substantial software scripts which solve problems relating to statistics and data science.
LO2: Employ good practice programming and scripting techniques to develop well-written modular code which is reusable, well documented and uses comprehensive error handling techniques.
LO3: Solve complex, applied problems through abstraction by identifying, utilising and integrating publicly available software libraries as appropriate.

Full Details of Teaching and Assessment
2022/3, Trimester 2, BLENDED, Edinburgh Napier University
Occurrence: 001
Primary mode of delivery: BLENDED
Location of delivery: MERCHISTON
Partner: Edinburgh Napier University
Member of staff responsible for delivering module: Valerio Giuffrida
Module Organiser:

Learning, Teaching and Assessment (LTA) Approach:
Key concepts will be explained in lectures, where the subject matter will be illustrated with examples and interactive demonstrations (LO1,2). A key feature is to present the principles behind the applications (LO1,2,3), discuss the concepts with the students (LO1,2,3) and, for each case study, discuss a starting solution and suitable approaches (LO1,2,3).
Practical labs focus on problem solving and case studies to provide practice in the application of theory (LO1,2,3). Students develop their own applications, initially often by extending a starting solution provided. As the module progresses, this will gradually require more independent work and research of advanced concepts. Throughout the labs, students will be encouraged to interact with staff and peers to explore concepts in depth and receive feedback on their progress and understanding.
In addition to timetabled classes, students should undertake private study to work through the learning materials and gain further practice at solving conceptual and technical problems (LO1,3). To provide an integrated understanding of the subject matter, each topic area will centre on a case study relevant to the students' subject area. Topics will reuse and integrate functions and modules developed earlier and emphasise exception handling (LO2).
The latter parts of the module will shift the emphasis from programming fundamentals to applied problem solving in the Data Science domain, giving students a better idea of how to apply coding techniques they have learned (LO3).
Moodle will be used to distribute course materials including starting solutions and to point the students to selected third party resources.

Formative Assessment:
Interactive elements of lectures encourage students to test their understanding continuously. There will be additional formative challenges such as quizzes. Continuous feedback is given by staff through discussions in the labs.
To support the first summative assessment, a practice test will be available with immediate, automated feedback.
The two assessments are partly based on earlier lab exercises, for which feedback is available during the practical sessions.

Summative Assessment:
Summative assessment consists of a single component made up of two practical skills assessments, each weighted at 50% of the final mark.
The first practical skills assessment (50% of the final mark) is designed to cover most of the fundamental theory of the module, covering LO1,2. This assessment will require students to apply basic concepts of Python programming. Students are required to submit their script as a solution via Moodle. A summary of the feedback is recorded in Moodle using a set of rubrics tailored to emphasise the specific skills assessed.
The second practical skills assessment, also weighted at 50% of the final mark, requires the students to submit their scripts as solutions to a case study and to apply and integrate many of the concepts learned. It will be designed to reinforce LO1,2 and will also assess LO3. As in the first assessment, students are required to submit their script as a solution via Moodle, and a summary of the feedback is recorded in Moodle using a tailored set of rubrics.

Student Activity (Notional Equivalent Study Hours (NESH))
Mode of activity      Learning & Teaching Activity      NESH (Study Hours)
Face To Face          Lecture                           24
Face To Face          Practical classes and workshops   24
Independent Learning  Guided independent study          152
Total Study Hours: 200
Expected Total Study Hours for Module: 200

Type of Assessment           Weighting %  LOs covered  Week due  Length in Hours/Words
Practical Skills Assessment  50           1,2          7         HOURS = 12.00
Practical Skills Assessment  50           1,2,3        13        HOURS = 30.00
Component 1 subtotal: 100
Component 2 subtotal: 0
Module subtotal: 100

Indicative References and Reading List - URL:
Contact your module leader