Cover page
HDInsight Essentials Second Edition
Credits
About the Author
About the Reviewers
www.PacktPub.com
Support files, eBooks, discount offers, and more
Preface
What this book covers
What you need for this book
Who this book is for
Conventions
Reader feedback
Customer support
Chapter 1. Hadoop and HDInsight in a Heartbeat
Data is everywhere
Hadoop concepts
Hadoop distributions
HDInsight overview
Hadoop on Windows deployment options
Summary
Chapter 2. Enterprise Data Lake using HDInsight
Enterprise Data Warehouse architecture
The next generation Hadoop-based Enterprise data architecture
Journey to your Data Lake dream
Tools and technology for Hadoop ecosystem
Use case powered by Microsoft HDInsight
Summary
Chapter 3. HDInsight Service on Azure
Registering for an Azure account
Azure storage
Provisioning an HDInsight cluster
HDInsight management dashboard
Exploring clusters using the remote desktop
Deleting the cluster
HDInsight Emulator for the development
Summary
Chapter 4. Administering Your HDInsight Cluster
Monitoring cluster health
Name Node status
Hadoop Service Availability
YARN Application Status
Azure storage management
Azure PowerShell
Summary
Chapter 5. Ingest and Organize Data Lake
End-to-end Data Lake solution
Ingesting to Data Lake using HDFS command
Loading data to Azure Blob storage using Azure PowerShell
Loading files to Data Lake using GUI tools
Using Sqoop to move data from RDBMS to Data Lake
Organizing your Data Lake in HDFS
Managing file metadata using HCatalog
Summary
Chapter 6. Transform Data in the Data Lake
Transformation overview
Tools for transforming data in Data Lake
Transformation for the OTP project
Other tools used for transformation
Summary
Chapter 7. Analyze and Report from Data Lake
Data access overview
Analysis using Excel and Microsoft Hive ODBC driver
Analysis using Excel Power Query
Other BI features in Excel
Ad hoc analysis using Hive
Other alternatives for analysis
Summary
Chapter 8. HDInsight 3.1 New Features
HBase
Storm
Apache Tez
Summary
Chapter 9. Strategy for a Successful Data Lake Implementation
Challenges on building a production Data Lake
The success path for a production Data Lake
Architectural considerations
Online resources
Summary
Index
Last updated: 2021-08-06 19:27:14