
Analyzing Big Data with Hive

The main goal of this course is to help you understand the complex architecture of Hadoop and its components, point you in the right direction to get started, and get you working with Hadoop and its components quickly. It covers what a beginner in Big Data needs: the Big Data market, the different job roles, technology trends, the history of Hadoop, HDFS, the Hadoop ecosystem, Hive, and Pig. We will look at how a beginner should approach Hadoop, and the course includes many hands-on examples to help you learn it quickly.

Analyzing Big Data
Introduction
Motivation for Hadoop
Distributed Computing Challenges
Hadoop File System (HDFS)
MapReduce
Word Count Example
Demo: Basic Hadoop Commands and Environment Setup
Introduction to Hive
Hive Motivation
Hive Architecture
Hive Principles: Schema-on-Read
Hive Warehouse

Hive Query Language Basics
Creating Databases and Tables with Hive
Working with Hive Tables and Loading Data into the Warehouse
Loading Data into Hive and Managing External Tables
Data Types
Type Conversions
Managed Partitioned Tables
External Partitioned Tables
Multiuser and Dynamic Partition Inserts
Loading Data Use Case
Data Retrieval: Group-By Function
Sorting and Controlling Data Flow
The Command Line and Variable Substitution
Bucketing
Bucketing and Block Sampling

Joins
Joins in Depth and Join Optimization
Map-side Joins for Bucketed Tables
Distributed Cache
UDTFs: Explode and Lateral View
Extending Hive: Creating Your Own UDF
Hive: Compiling and Testing a Custom UDF
Extending Hive: Custom UDF
Hive Initialization File
Accessing the Distributed Cache
Hadoop Streaming and Transform
Windowing and Analytic Functions