
            Course catalog: Data Science for Big Data Analytics Training
            Course outline:


            Introduction to Data Science for Big Data Analytics
            Data Science Overview
            Big Data Overview
            Data Structures
            Drivers and complexities of Big Data
            Big Data ecosystem and a new approach to analytics
            Key technologies in Big Data
            Data Mining process and problems
            Association Pattern Mining
            Data Clustering
            Outlier Detection
            Data Classification
            Introduction to Data Analytics lifecycle
            Discovery
            Data preparation
            Model planning
            Model building
            Presentation/Communication of results
            Operationalization
            Exercise: Case study
            From this point on, most of the training time (80%) is spent on examples and exercises in R and related big data technologies.
            Getting started with R
            Installing R and RStudio
            Features of R language
            Objects in R
            Data in R
            Data manipulation
            Big data issues
            Exercises
            Getting started with Hadoop
            Installing Hadoop
            Understanding Hadoop modes
            HDFS
            MapReduce architecture
            Hadoop related projects overview
            Writing programs in Hadoop MapReduce
            Exercises
            Integrating R and Hadoop with RHadoop
            Components of RHadoop
            Installing RHadoop and connecting with Hadoop
            The architecture of RHadoop
            Hadoop streaming with R
            Data analytics problem solving with RHadoop
            Exercises
            Pre-processing and preparing data
            Data preparation steps
            Feature extraction
            Data cleaning
            Data integration and transformation
            Data reduction – sampling, feature subset selection, dimensionality reduction
            Discretization and binning
            Exercises and Case study
            Exploratory data analytic methods in R
            Descriptive statistics
            Exploratory data analysis
            Visualization – preliminary steps
            Visualizing single variable
            Examining multiple variables
            Statistical methods for evaluation
            Hypothesis testing
            Exercises and Case study
            Data Visualizations
            Basic visualizations in R
            Packages for data visualization: ggplot2, lattice, plotly
            Formatting plots in R
            Advanced graphs
            Exercises
            Regression (Estimating future values)
            Linear regression
            Use cases
            Model description
            Diagnostics
            Problems with linear regression
            Shrinkage methods, ridge regression, the lasso
            Generalizations and nonlinearity
            Regression splines
            Local polynomial regression
            Generalized additive models
            Regression with RHadoop
            Exercises and Case study
            Classification
            Classification-related problems
            Bayesian refresher
            Naïve Bayes
            Logistic regression
            K-nearest neighbors
            Decision tree algorithms
            Neural networks
            Support vector machines
            Diagnostics of classifiers
            Comparison of classification methods
            Scalable classification algorithms
            Exercises and Case study
            Assessing model performance and selection
            Bias, Variance and model complexity
            Accuracy vs Interpretability
            Evaluating classifiers
            Measures of model/algorithm performance
            Hold-out method of validation
            Cross-validation
            Tuning machine learning algorithms with the caret package
            Visualizing model performance with Profit, ROC, and Lift curves
            Ensemble Methods
            Bagging
            Random Forests
            Boosting
            Gradient boosting
            Exercises and Case study
            Support vector machines for classification and regression
            Maximal Margin classifiers
            Support vector classifiers
            Support vector machines
            SVMs for classification problems
            SVMs for regression problems
            Exercises and Case study
            Identifying unknown groupings within a data set
            Feature Selection for Clustering
            Representative-based algorithms: k-means, k-medoids
            Hierarchical algorithms: agglomerative and divisive methods
            Probabilistic model-based algorithms: EM
            Density-based algorithms: DBSCAN, DENCLUE
            Cluster validation
            Advanced clustering concepts
            Clustering with RHadoop
            Exercises and Case study
            Discovering connections with Link Analysis
            Link analysis concepts
            Metrics for analyzing networks
            The PageRank algorithm
            Hyperlink-Induced Topic Search (HITS)
            Link Prediction
            Exercises and Case study
            Association Pattern Mining
            Frequent Pattern Mining Model
            Scalability issues in frequent pattern mining
            Brute Force algorithms
            Apriori algorithm
            The FP-growth approach
            Evaluation of Candidate Rules
            Applications of Association Rules
            Validation and Testing
            Diagnostics
            Association rules with R and Hadoop
            Exercises and Case study
            Constructing recommendation engines
            Understanding recommender systems
            Data mining techniques used in recommender systems
            Recommender systems with the recommenderlab package
            Evaluating recommender systems
            Recommendations with RHadoop
            Exercise: Building a recommendation engine
            Text analysis
            Text analysis steps
            Collecting raw text
            Bag of words
            Term Frequency–Inverse Document Frequency (TF-IDF)
            Determining Sentiments
            Exercises and Case study
