
            Course Catalog: Hadoop for Developers and Administrators Training
            Course Outline:

               Hadoop for Developers and Administrators Training

            Module 1. Introduction to Hadoop
            The Hadoop Distributed File System (HDFS)
            The Read Path and The Write Path
            Managing Filesystem Metadata
            The Namenode and the Datanode
            The Namenode High Availability
            Namenode Federation
            The Command-Line Tools
            Understanding REST Support
            Module 2. Introduction to MapReduce
            Analyzing the Data with Hadoop
            Map and Reduce Pattern
            Java MapReduce
            Scaling Out
            Data Flow
            Developing Combiner Functions
            Running a Distributed MapReduce Job
            Module 3. Planning a Hadoop Cluster
            Picking a Distribution and Version of Hadoop
            Versions and Features
            Hardware Selection
            Master and Worker Hardware Selection
            Cluster Sizing
            Operating System Selection and Preparation
            Deployment Layout
            Setting up Users, Groups, and Privileges
            Disk Configuration
            Network Design
            Module 4. Installation and Configuration
            Installing Hadoop
            Configuration: An Overview
            The Hadoop XML Configuration Files
            Environment Variables and Shell Scripts
            Logging Configuration
            Managing HDFS
            Optimization and Tuning
            Formatting the Namenode
            Creating a /tmp Directory
            Planning for Namenode High Availability
            The Fencing Options
            Automatic Failover Configuration
            Format and Bootstrap the Namenodes
            Namenode Federation
            Module 5. Understanding Hadoop I/O
            Data Integrity in HDFS
            Understanding Codecs
            Compression and Input Splits
            Using Compression in MapReduce
            The Serialization Mechanism
            File-Based Data Structures
            The SequenceFile Format
            Other File Formats and Column-Oriented Formats
            Module 6. Developing a MapReduce Application
            The Configuration API
            Setting Up the Development Environment
            Managing Configuration
            GenericOptionsParser, Tool, and ToolRunner
            Writing a Unit Test with MRUnit
            The Mapper and Reducer
            Running Locally on Test Data
            Testing the Driver
            Running on a Cluster
            Packaging and Launching a Job
            The MapReduce Web UI
            Tuning a Job
            Module 7. Identity, Authentication, and Authorization
            Managing Identity
            Kerberos and Hadoop
            Understanding Authorization
            Module 8. Resource Management
            What Is Resource Management?
            HDFS Quotas
            MapReduce Schedulers
            Anatomy of a YARN Application Run
            Resource Requests
            Application Lifespan
            YARN Compared to MapReduce 1
            Scheduling in YARN
            Scheduler Options
            Capacity Scheduler Configuration
            Fair Scheduler Configuration
            Delay Scheduling
            Dominant Resource Fairness
            Module 9. MapReduce Types and Formats
            MapReduce Types
            The Default MapReduce Job
            Defining the Input Formats
            Managing Input Splits and Records
            Text Input and Binary Input
            Managing Multiple Inputs
            Database Input (and Output)
            Output Formats
            Text Output and Binary Output
            Managing Multiple Outputs
            The Database Output
            Module 10. Using MapReduce Features
            Using Counters
            Reading Built-in Counters
            User-Defined Java Counters
            Understanding Sorting
            Using the Distributed Cache
            Module 11. Cluster Maintenance and Troubleshooting
            Managing Hadoop Processes
            Starting and Stopping Processes with Init Scripts
            Starting and Stopping Processes Manually
            HDFS Maintenance Tasks
            Adding a Datanode
            Decommissioning a Datanode
            Checking Filesystem Integrity with fsck
            Balancing HDFS Block Data
            Dealing with a Failed Disk
            MapReduce Maintenance Tasks
            Killing a MapReduce Job
            Killing a MapReduce Task
            Managing Resource Exhaustion
            Module 12. Monitoring
            The Available Hadoop Metrics
            The Role of SNMP
            Health Monitoring
            Host-Level Checks
            HDFS Checks
            MapReduce Checks
            Module 13. Backup and Recovery
            Data Backup
            Distributed Copy (distcp)
            Parallel Data Ingestion
            Namenode Metadata
