Name: trendnologies
Brand: trendnologies
SKU: NA
Price: 1000 INR
Availability: InStoreOnly
Rating: 5 (200 reviews)

Syllabus

Topic 1: Overview of Big Data & HDFS Concepts along with Linux Commands
- Big Data Intro
- HDFS Architecture
- Hadoop Setup
- Linux commands
- HDFS commands
- Quiz
- Assessment
Topic 2: Linux Scripting in detail
- History
- Architecture
- Development Commands
- Env Variables
- File Management
- Directories Management
- Admin Commands
- Advanced Commands
- Shell Scripting
- Group and User Management
- Permissions
- Important directory structure
- Disk utilities
- Compression Techniques
- Misc Commands
- Kernel, Shell
- Terminal, SSH, GUI
- Automation & Scripting in Linux
- Hands On Exercises
- Quiz
- Assessment
Topic 3: Deep Dive in Hadoop
- What is Hadoop?
- Evolution of Hadoop
- Features of Hadoop
- Characteristics of Hadoop
- Hadoop compared with Traditional Distributed Systems
- When to use Hadoop
- Limitations of Hadoop
- Components of Hadoop (HDFS, MapReduce, YARN)
- Hadoop Architecture
- Daemons in Hadoop Version 1 & 2
- How Data is stored in Hadoop (Cluster, Datacenter, Split, Block, Rack Awareness, Replication, Heartbeat)
- Hadoop 1.0 Limitation
- NameNode High Availability
- Quiz
- Assessment
Topic 4: MapReduce - Distributed Computing Framework
- MapReduce Intro
- MapReduce - Theory / Depth
- MapReduce Programming concept
- MapReduce Practicals
- Different types of files supported (Text, Sequence, Map and Avro)
- MapReduce Job submission in YARN Cluster in detail
- Tweaking mappers and reducers
- MapReduce packaging and deployment
- Quiz
- Assessment
Topic 5: Apache Sqoop - Moving Data into Hadoop (and Vice Versa)
- Sqoop Fundamentals
- Sqoop Exercises
- Sqoop Export
- Sqoop Incremental
- Sqoop Job
- Sqoop Merge
- Best practices & performance tuning
- Sqoop Use cases
- Quiz
- Assessment
Topic 6: Apache Hive Basics
- Introduction to Apache Hive
- Understanding Apache Hive
- Hive Practical
- More to know about Hive
- Quiz
Topic 7: Advanced Hive & Components - Part 1
- Hive Definition Level Optimizations : Theory
- Hive Definition Level Optimizations : Practical
- Hive Query Level Optimizations : Theory
- Hive Query Level Optimizations : Practical
- Hive Windowing Functions
- Hive Ranking
- Hive Sorting
- Quiz
- Assessment
Topic 8: Advanced Hive & Components - Part 2
- Hive File Format
- Hive File Format - Practicals
- Hive Compression Techniques
- Hive Vectorization & Changing the Hive Engine
- Hive Thrift Server
- Hive MSCK Repair
- Hive Miscellaneous
- Hive Optimization Techniques REWIND
- Hive SCD
- Hive Sqoop, HBase Integration
- Hive Schema evolution (AVSC) use cases using AVRO dataset
- Quiz
- Assessment
Topic 9: Hue, Ambari, Cloudera Manager
- Introduction
- Cluster formation guide and implementation
- Deployment in Cloud
- Full Visibility into Cluster Health
- Metrics & Dashboards
- Heat Maps
- Configurations
- Services, Alerts, Admin activities
- Provisioning, Managing and Monitoring Hadoop Clusters
- Hue Introduction
- Access Hive
- Query executor
- Data browser
- Access Hive, HCatalog, Oozie, File Browser
- Hortonworks/Cloudera
- Cluster Design
- Different nodes (Gateway, Ingestion, Edge)
- System consideration
- Commands (fsck, job, dfsadmin, distcp, balancer)
- Schedulers in RM (Capacity, Fair, FIFO)
- View all services in Ambari & Cloudera Manager
Topic 10: Intro to NOSQL - HBase
- HBase Basics
- CAP Theorem
- HBase Architecture
- HBase Practicals
- Storage Hierarchy – Characteristics
- Table Design
- HMaster & Regions
- Region Server & Zookeeper
- Inside Region Server (Memstore, Blockcache, HFile, WAL)
- Minor/Major Compactions
- Role of Zookeeper
- HBase Shell
- Introduction to Filters
- Row Key Design
- Performance Tuning
- Cassandra Overview
- Integration with Hive
- Integration with Hadoop (Mini Project)
- Quiz
- Assessment
Topic 11: Phoenix
- Overview of Phoenix
- Introduction
- Architecture
- History
- Phoenix HBase Integration
- HBase table, view creation
- SQL & UDFs
- SQL Line & PSQL Line of Phoenix
- Phoenix Load & Query engine
- Understanding coprocessor Configurations
- Hive -> HBase -> Phoenix integration
- Creation of views in Phoenix
- Load bulk data using psql
- Server log aggregation use case
Topic 12: Oozie (Workflow Orchestration)
- Introduction
- History - Why Oozie
- Components
- Architecture
- Workflow Engine
- Nodes
- Workflow
- Coordinator
- Action (MapReduce, Hive, Spark, Shell & Sqoop)
- Introduction to Bundle
- Email Notification
- Error Handling
- Installation
- Workouts
- Orchestration of end to end tools
- Scheduling of data pipeline
- Invoking shell script, Sqoop, Hive & Spark
Topic 13: Python
- Python Introduction
- Evolution
- Application
- Features
- Installation & Configuration
- Objectives
- Flow Control
- Variables
- Data types
- Functions
- Modules
- OOPS
- Python for Spark
- Structures
- Collection types
- Looping Constructs
- Dictionary & Tuples
- File I/O
Topic 14: Learning Scala - A Guide to Functional Programming
- Scala and Spark Setup
- Scala Basics
- Scala Functional Programming
- Scala Object Oriented Sessions
- Quiz
- Assessment
Topic 15: Apache Spark - General Purpose Cluster Computing Framework
- Scala Interview Prep Series
- Spark Fundamental Theory
- Spark Fundamental Practical
- Quiz
- Assessment
Topic 16: YARN
- Introduction to YARN
- YARN Architecture
- YARN Components
- YARN Long-lived & Short-lived Daemons
- YARN Schedulers
- Job Submission under YARN
- Multi tenancy support of YARN
- YARN High Availability
- YARN Fault tolerance handling
- MapReduce job submission using YARN
- YARN UI
- History Server
- YARN Dynamic allocation
- Containerization of YARN
- Quiz
- Assessment
Topic 17: Apache Spark Use Cases in Depth
- Spark Real-Time Examples
- Spark Shared Variables
- YARN Rewind
- Spark on YARN Architecture
- Spark in depth
- Quiz
- Assessment
Topic 18: Spark Structured API - Part 1
- Spark in depth continued
- Spark DataFrames, DataSets
- Quiz
- Assessment
Topic 19: Spark Structured API - Part 2
- Spark in depth continued
- Quiz
- Assessment
Topic 20: Spark Performance Tuning - Part 1
- Spark Performance Tuning
- Quiz
- Assessment
Topic 21: Spark Performance Tuning - Part 2
- Spark Broadcast Join With Low level API (RDD)
- Spark Broadcast Join With Structured API (DataFrames)
- Spark Submit: Client Mode vs Cluster Mode
- Spark Join Optimizations
- Spark Advanced Optimization: Sort Aggregate vs Hash Aggregate
- Spark Catalyst, Tungsten, AST Optimizer
- Spark Connecting to External Source
- Quiz
Topic 22: Spark Streaming
- Spark Real Time Processing
- Understanding Discretized Stream (DStream) in Spark Streaming
- Stream Processing in Spark - Word Count Example
- Understanding Stateless and Stateful Transformations in Spark Streaming
- Stateless Transformation - Word Count Example Using Eclipse IDE
- Stateful Transformation - Word Count Example
- Working with Sliding Windows
- Quiz
Topic 23: Spark Advanced Streaming - Structured
- Spark Structured Streaming - Part1
- Spark Structured Streaming - Part2
- Spark Structured Streaming - Part3
- Quiz
Topic 24: Apache Kafka
- Kafka Introduction
- Applications, Cluster Setup
- Broker fault tolerance
- Architecture
- Components
- Partitions & Replication
- Distribution of messages
- Producer & Consumer workload distribution
- Topics management
- Brokers
- Installation
- Workouts
- Console Publishing
- Console Consuming
- Topic options
- Offset Management
- Cluster deployment in cloud
Topic 25: NiFi
- NiFi Introduction
- Core Components
- Architecture
- NiFi Installation & Configuration
- Fault tolerance
- Data provenance, routing, mediation & transformation
- NiFi -> Kafka -> Spark integration
- Workouts
- Scheduling
- Real time streaming
- Kafka producer & consumer
- File streaming with HDFS integration
- Data provenance
- Packaging NiFi templates
- Rest API Integration
- Twitter data capture
- Quiz
- Assessment
Topic 26: Big Data on Cloud Part 1 (AWS S3, EMR, Athena+Glue)
- Introduction to Cloud Computing And Running Spark Code on AWS EMR
- Fundamentals of AWS for Big Data Developers
- AWS Storage, Networking & CLI
- AWS EMR: Launch an EMR Cluster Using Advanced Options
- AWS Athena Session-1
- AWS Athena Session-2
- AWS Athena with Glue Session-3
- Quiz
Topic 27: Big Data on Cloud Part 2 (Redshift, Glue, Airflow)
- Database vs Datawarehouse vs Data lake
- AWS Redshift Sessions
- AWS Glue
- Apache Airflow
- Apache Airflow - Workflow Management Platform
- Airflow Fundamentals Sessions
- Airflow Practical Pipeline Sessions
Topic 28: CI/CD Pipeline (GitHub, Maven, & Jenkins)
- DevOps Basics
- Versioning
- Create and use a repository
- Start and manage a new branch
- Make changes to a file and push them to GitHub as commits
- Open and merge a pull request
- Create Story boards
- Desktop integration
- Maven integration with Git
- Create project in Maven
- Add Scala nature
- Maven operations
- Adding and updating POM
- Managing dependencies with the Maven repository
- Building and installing Maven
- Maven fat & lean jar build with submit
Topic 29: Real Time Project classes

Real-time Big Data projects that are diverse in nature, covering data sets from multiple domains such as banking, healthcare, telecommunications, social media, insurance, and e-commerce.
Topic 30: Real Time Projects on Big Data
- Project Statement
- Dataset
- Architectural Diagram and Solution
- Task segregation
- Project Demo Session
- Code and snippets
- Documentation
Topic 31: Interview Questions
Topic 32: Mock Interview and Mock Interview Answers
Topic 33: Resume Prep session and sample resumes

Big Data Hadoop, Spark full stack specialization program