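Course Outline
Introduction
Understanding Big Data
Overview of Spark
Overview of Python
Overview of PySpark
- Distributing data using the Resilient Distributed Datasets (RDD) framework
- Distributing computation using Spark API operators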
Setting Up Python with Spark
Setting Up PySpark
Using Amazon Web Services (AWS) EC2 Instances for Spark
Setting Up Databricks
Setting Up an AWS EMR Cluster
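Learning the Basics of Python Programming
- Getting started with Python
- Using Jupyter Notebook
- Using variables and simple data types
- Working with lists
- Using if statements
- Using user input
- Working with while loops
- Implementing functions
- Working with classes
- Working with files and exceptions
- Working with projects, data, and APIs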
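Learning the Basics of Spark DataFrame
- Getting started with Spark DataFrames
- Implementing basic operations with Spark
- Using GroupBy and aggregate operations
- Working with timestamps and dates
Working on a Spark DataFrame Project Exercise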
Understanding Machine Learning with MLlib
Working with MLlib, Spark, and Python for Machine Learning
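Understanding Regressions
- Learning linear regression theory
- Implementing regression evaluation code
- Working on a sample linear regression exercise
- Learning logistic regression theory
- Implementing logistic regression code
- Working on a sample logistic regression exercise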
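Understanding Random Forest and Decision Trees
- Learning tree methods theory
- Implementing decision tree and Random Forest code
- Working on a sample Random Forest classification exercise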
Working with K-means Clustering
- Understanding K-means clustering theory
- Implementing K-means clustering code
- Working on a sample clustering exercise
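Working with Recommender Systems
Implementing Natural Language Processing
- Understanding Natural Language Processing (NLP)
- Overview of NLP tools
- Working on a sample NLP exercise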
Streaming with Spark on Python
- Overview: streaming with Spark
- Sample Spark Streaming exercise
Summary and Conclusion
Requirements
- General programming skills
Audience
- Developers
- IT professionals
- Data scientists
Testimonials (6)
I liked that it was practical. Loved to apply the theoretical knowledge with practical examples.
Aurelia-Adriana - Allianz Services Romania
Course - Python and Spark for Big Data (PySpark)
The course covered a series of very complex, related topics, and Pablo has in-depth expertise in each of them. Sometimes nuances were lost in communication and/or due to time pressures, and possibly expectations were not quite met because of this. There were also some UHG/Azure Databricks setup issues; however, Pablo and UHG resolved these quickly once they became apparent, which to me showed a high level of understanding and professionalism between UHG and Pablo.
Michael Monks - Tech NorthWest Skillnet
Course - Python and Spark for Big Data (PySpark)
Individual attention.
ARCHANA ANILKUMAR - PPL
Course - Python and Spark for Big Data (PySpark)
Hands-on training.
Abraham Thomas - PPL
Course - Python and Spark for Big Data (PySpark)
The lessons were taught in a Jupyter notebook. The topics were structured in a logical sequence that naturally developed the session from the easier parts to the more complex ones. I'm already an advanced Python user with a background in Machine Learning, so I found the course easier to follow than some of my classmates possibly did. I appreciate that some of the most elementary concepts were skipped and that he focused on the most substantial matters.
Angela DeLaMora - ADT, LLC
Course - Python and Spark for Big Data (PySpark)
practice tasks