BIG DATA ANALYTICS AND MANAGEMENT (ENGLISH, NONTHESIS)
Master | TR-NQF-HE: Level 7 | QF-EHEA: Second Cycle | EQF-LLL: Level 7 |
Course Code | Course Name | Semester | Theoretical | Practical | Credit | ECTS |
BDA5010 | Big Data and Hadoop Environment | Fall | 3 | 0 | 3 | 8 |
This catalog is for information purposes only. Course status is determined by the relevant department at the beginning of the semester. |
Language of instruction: | English |
Type of course: | Departmental Elective |
Course Level: | |
Mode of Delivery: | Face to face |
Course Coordinator: | Assist. Prof. SERKAN AYVAZ |
Course Objectives: | This course provides an overview of big data analytics and data science. Topics covered in the context of data analytics include the terminology and core concepts behind big data problems, applications, and systems. Students learn how to use Hadoop and related big data processing tools, which have made scalable big data analysis easier and more accessible. |
Students who succeed in this course will be able to: 1) Understand the architectural components and programming models used for scalable big data analysis. 2) Install Hadoop and run MapReduce programs on it. 3) Describe the components and uses of the core Hadoop environment, including the HDFS file system and the MapReduce programming model. 4) Perform frequent data operations, including data collection, monitoring, storage, and analysis, required for various big data applications. 5) Explain the differences between a traditional Database Management System and a Big Data Management System. |
This course covers Hadoop ecosystem fundamentals, Hadoop architecture and HDFS, MapReduce programming, Hadoop administration, Spark programming and RDDs, NoSQL databases and distributed data storage, and fast data and stream processing with Spark. |
Week | Subject | Related Preparation |
1) | Introduction to Big Data & Hadoop Environment | |
2) | Hadoop Ecosystem Fundamentals | |
3) | Hadoop Architecture and HDFS | |
4) | Introduction to MapReduce | |
5) | MapReduce Programming | |
6) | Hadoop Cluster Administration | |
7) | Introduction to Apache Spark | |
8) | Spark Programming and RDDs | |
9) | NoSQL Databases and Distributed Data Storage | |
10) | Distributed Data Operations and Integration | |
11) | Machine Learning with Apache Spark | |
12) | Fast Data and Stream Processing with Apache Spark | |
13) | Project Presentations | |
14) | Project Presentations |
Course Notes / Textbooks: | • Hadoop: The Definitive Guide. Tom White. 2012. ISBN-13: 978-1449311520. • Big Data SMACK: A Guide to Apache Spark, Mesos, Akka, Cassandra, and Kafka. Isaac Ruiz and Raul Estrada. 2016. • Big Data Analytics with Spark: A Practitioner’s Guide to Using Spark for Large Scale Data Analysis. Mohammed Guller. • Data-Intensive Text Processing with MapReduce. Jimmy Lin and Chris Dyer. |
References: | • MapReduce Design Patterns. Donald Miner and Adam Shook. O'Reilly Media. ISBN: 978-1-4493-2717-0. • Learning Spark. Holden Karau, Andy Konwinski, Patrick Wendell, and Matei Zaharia. O'Reilly Media. ISBN: 978-1-4493-5862-4. • Big Data Science & Analytics: A Hands-On Approach. A. Bahga and V. Madisetti. 2016. |
Semester Requirements | Number of Activities | Level of Contribution |
Homework Assignments | 1 | 10% |
Project | 1 | 30% |
Midterms | 1 | 20% |
Final | 1 | 40% |
Total | 100% | |
PERCENTAGE OF SEMESTER WORK | 30% | |
PERCENTAGE OF FINAL WORK | 70% | |
Total | 100% |
No Effect | 1 Lowest | 2 Low | 3 Average | 4 High | 5 Highest |
Program Outcomes | Level of Contribution |