BDA5010 Big Data and Hadoop Environment
Bahçeşehir University
Degree Programs: Big Data Analytics and Management (English, Non-Thesis)
Master TR-NQF-HE: Level 7 QF-EHEA: Second Cycle EQF-LLL: Level 7

Course Introduction and Application Information

Course Code: BDA5010
Course Name: Big Data and Hadoop Environment
Semester: Fall
Theoretical: 3
Practical: 0
Credit: 3
ECTS: 8
This catalog is for information purposes only. Course status is determined by the relevant department at the beginning of the semester.

Basic information

Language of instruction: English
Type of course: Departmental Elective
Course Level:
Mode of Delivery: Face to face
Course Coordinator: Asst. Prof. Dr. SERKAN AYVAZ
Course Objectives: This course provides an overview of the fields of big data analytics and data science. Topics covered in the context of data analytics include the terminology and core concepts behind big data problems, applications, and systems. Students learn how to use Hadoop and related big data processing tools, which have made scalable big data analysis easier and more accessible.

Learning Outcomes

Students who succeed in this course will:
1-) Understand the architectural components and programming models used for scalable big data analysis.
2-) Be able to install Hadoop and run MapReduce programs using Hadoop.
3-) Describe the components and uses of the core Hadoop environment, including HDFS and the MapReduce programming model.
4-) Perform the frequent data operations required for various big data applications, including data collection, monitoring, storage, and analysis.
5-) Be able to explain the differences between a traditional Database Management System and a Big Data Management System.

Course Content

This course covers Hadoop Ecosystem Fundamentals, Hadoop Architecture and HDFS, MapReduce Programming, Hadoop Administration, Spark Programming and RDDs, NoSQL Databases and Distributed Data Storage, and Fast Data and Stream Processing with Spark.

Weekly Detailed Course Contents

Week Subject Related Preparation
1) Introduction to Big Data & Hadoop Environment
2) Hadoop Ecosystem Fundamentals
3) Hadoop architecture and HDFS
4) Introduction to MapReduce
5) MapReduce Programming
6) Hadoop Cluster Administration
7) Introduction to Apache Spark
8) Spark Programming and RDDs
9) NoSQL Databases and distributed data storage
10) Distributed data operations and integration
11) Machine Learning with Apache Spark
12) Fast Data and Stream Processing with Apache Spark
13) Project Presentations
14) Project Presentations


Course Notes / Textbooks: • Hadoop: The Definitive Guide. Tom White. 2012. ISBN-13: 978-1449311520
• Big Data SMACK: A Guide to Apache Spark, Mesos, Akka, Cassandra, and Kafka. Isaac Ruiz and Raul Estrada. 2016.
• Big Data Analytics with Spark: A Practitioner’s Guide to Using Spark for Large Scale Data Analysis. Mohammed Guller.
• Data-Intensive Text Processing with MapReduce. Jimmy Lin and Chris Dyer.
References: • MapReduce Design Patterns. Donald Miner and Adam Shook. O'Reilly Media. ISBN: 978-1-4493-2717-0
• Learning Spark. Holden Karau, Andy Konwinski, Patrick Wendell, and Matei Zaharia. O'Reilly Media. ISBN: 978-1-4493-5862-4
• Big Data Science & Analytics: A Hands-On Approach. A. Bahga and V. Madisetti. 2016.

Evaluation System

Semester Requirements | Number of Activities | Level of Contribution
Homework Assignments | 1 | 10%
Project | 1 | 30%
Midterms | 1 | 20%
Final | 1 | 40%
Total | | 100%

Contribution of Learning Outcomes to Programme Outcomes

No Effect, 1 Lowest, 2 Low, 3 Average, 4 High, 5 Highest
Program Outcomes Level of Contribution