# 1st Semester

###### (choose all 4)

### Instructors:

Spiros Skiadopoulos (UoP), Christos Tryfonopoulos (UoP)

### Topics per week

1. Overview

2. Entity-relationship model

3. Relational model

4. Relational algebra

5. SQL

6. Query processing

7-8. Query optimisation

9. Primary and secondary storage

10. Tree-structured indexes

11. Hash-based structures

12-13. Database tuning and physical design for massive datasets

### Instructor:

Ioannis Moscholios (UoP)

### Topics per week

1. Review on basic probability theorems

2. Discrete and continuous random variables

3. Bayesian inference and the posterior distribution

4. Point estimation, hypothesis testing, and the MAP Rule

5. Bayesian least mean squares estimation

6. Bayesian linear least mean squares estimation

7. Statistical inference

8. Classical parameter estimation

9. Linear regression

10. Binary hypothesis testing

11. Significance testing

12-13. Introduction to multivariate models

### Instructors:

Anastasia Krithara (NCSR Demokritos), George Petasis (NCSR Demokritos)

### Topics per week

1. Different types of learning algorithms:

- Supervised learning

- Unsupervised learning

2. Basic machine learning algorithms:

- Linear Regression

- Decision Trees

- Logistic Regression

- KNN (K-Nearest Neighbours)

- K-Means

- Hierarchical clustering

- Naïve Bayes

- Support Vector Machines

- Dimensionality Reduction

3. Ensembles:

- Voting

- Random Forests

- AdaBoost

- XGBoost

4. Applied machine learning:

- Exploratory Analysis

- Data Cleaning/Data Wrangling

- Feature Engineering

- Feature selection

- Algorithm selection

- Model training

- Model evaluation

### Instructor:

Iraklis Klampanos (NCSR Demokritos)

### Topics per week

1. Introduction to data programming

2-3. Python programming

4-5. Data stream processing

6-7. Data acquisition: web services, streams, data transfer

8-9. Octave/Matlab/R for data analysis

10-11. Optimisation considerations, vectorisation, GPUs

12-13. Use-case combining batch processing, streaming and analysis

# 2nd Semester

###### (choose all 4)

### Instructors:

Christos Tryfonopoulos (UoP), Spiros Skiadopoulos (UoP)

### Topics per week

1. Getting to know your (Big) Data

2-3. Architectures for Big Data

4. Distributed object location

5. Distributed file systems (Cassandra, BigTable, HBase)

6. The Map/Reduce paradigm

7-9. Parallel data processing with Hadoop

10. Parallel graph processing (Pregel, Hama)

11. NoSQL databases (key-value/document/graph stores)

12. Column stores

13. Distributed stream processing

### Instructors:

Nikos Platis (UoP), Iraklis Klampanos (NCSR Demokritos)

### Topics per week

1. Visual perception

2-3. Visualization techniques

4. Interactive visualizations

5-6. Visualization software (Tableau and other tools)

7. Visual communication

8-9. Visualization in Python

10-12. Case studies

13. Presentations of student projects

### Instructors:

Anastasia Krithara (NCSR Demokritos), George Petasis (NCSR Demokritos)

### Topics per week

1. Introduction to natural language processing

2. Architectures for natural language processing

3. Big Data analysis

4. Named-entity recognition

5. Disambiguation

6. Sentiment analysis and opinion mining

7. Information extraction and topic modelling

8. Summarization

9. Question answering

10. Natural language processing of Big Data

11. Parallel natural-language processing with Hadoop

12. Large knowledge bases

13. Deep learning for language processing

### Instructor:

George Giannakopoulos (NCSR Demokritos)

### Topics per week

1. Data mining basic concepts

2. Data types and features

3. Use-cases: text representation, representing data from bioinformatics

4. Data preprocessing and cleaning

5-6. Data classification and clustering

7. Relations and sequences

8. Similarity-based data mining

9. Outliers and concept drift

10. Evaluation in data mining

11. Human evaluation, automatic and semi-automatic evaluation problems

12. Big data mining

13. Big data mining tools

# 3rd Semester

###### Obligatory Seminar

### Instructors:

George Giannakopoulos (NCSR Demokritos), Alexandros Nousias (NCSR Demokritos)

### Topics per week

1. Scientific method overview

2. Hypotheses and testing

3. Risks in hypothesis testing

4. Scientific error and scientific lies

5. Reviewing scientific work: the peer-review process; how to do a good review; how to review one’s own work

6. Communicating scientific results: clarifying science; risks in publication of results

7. Legal and ethical issues overview: overview of legal and ethical risks

8. Data licensing, sharing, openness: how to share or reuse data; licences and their meaning

9. Emerging data formats and publishing (nano-publications; semantic web)

10. Anonymization and profiling: data aggregation and anonymization; discovering user identity through profiling

11. Privacy and security concerns: difference between privacy and security; privacy in data publication; sensitive data

12. Ethics considerations in data analysis: the effect and impact of scientific discovery; ethics and data analysis

13. Social understanding of data and ethics

###### Optional Courses (Choose 3 from the following 4)

### Instructor:

Theodoros Giannakopoulos (NCSR Demokritos)

### Topics per week

1. General signal and image processing issues

2. Audio representations and feature extraction

3. Audio content characterization: classification, segmentation, clustering and alignment

4. Music Information Retrieval

5. Speech recognition

6. Image introduction and representations

7. Image segmentation: thresholding, edge-based, region-based

8. Image classification and retrieval

9. Video analysis: motion analysis, flow extraction, temporal event recognition, tracking

10. Deep-learning-based image and video characterization

11. Audio-visual fusion

12-13. Implementation based on open-source audio-visual libraries

### Instructors:

Nikos Kolokotronis (UoP), Konstantinos Limniotis (UoP)

### Topics per week

1. Introduction to security

2. Cyber-threat landscape

3. Cryptography for big data: introduction

4. Federated identity management

5. Decentralised systems security: systems (e.g. Hadoop)

6. Decentralised systems security: network

7. Automated trust negotiation

8. Privacy in big data: introduction

9. Privacy in big data: privacy-preserving data mining

10. Cryptography for big data: advanced topics

11. Secure data sharing/outsourcing

12. Secure searching over big data

13. Securing big data in the cloud

### Instructors:

Alexandros Artikis (NCSR Demokritos), Nikos Katzouris (NCSR Demokritos)

### Topics per week

1. Introduction to complex event recognition.

2. Case study: complex event recognition for maritime monitoring.

3. Complex event recognition languages.

4. Automata-based event recognition.

5. Temporal reasoning systems.

6. Big Data complex event recognition.

7. Uncertainty handling.

8. Probabilistic programming.

9. Markov Logic Networks.

10. Complex event pattern learning.

11-12. Online learning over relational streams.

13. Complex event forecasting.

### Instructors:

Iraklis Klampanos (NCSR Demokritos), George Petasis (NCSR Demokritos)

### Topics per week

1. Introduction to deep learning and indicative examples

2. Revisiting ML basics

3. Deep feedforward networks

4. Regularisation for deep learning

5. Training deep neural networks

6. Convolutional networks

7. Recurrent and recursive networks

8. Linear factor models

9. Unsupervised learning and autoencoders

10. Representation learning

11. Approximate inference

12. Example use-case

13. Practical issues and methodology

# 4th Semester

MSc Data Science Thesis