
Taming Big Data with Apache Spark and Python - Hands On!

Dive right in with 15+ hands-on examples of analyzing large data sets with Apache Spark, on your desktop or on Hadoop!
By Frank Kane (Sundog Education by Frank Kane)

What you'll learn

  • Use DataFrames and Structured Streaming in Spark 3
  • Frame big data analysis problems as Spark problems
  • Use Amazon's Elastic MapReduce service to run your job on a cluster with Hadoop YARN
  • Install and run Apache Spark on a desktop computer or on a cluster
  • Use Spark's Resilient Distributed Datasets to process and analyze large data sets across many CPUs
  • Implement iterative algorithms such as breadth-first search using Spark
  • Use the MLlib machine learning library to answer common data mining questions
  • Understand how Spark SQL lets you work with structured data
  • Understand how Spark Streaming lets you process continuous streams of data in real time
  • Tune and troubleshoot large jobs running on a cluster
  • Share information between nodes on a Spark cluster using broadcast variables and accumulators
  • Understand how the GraphX library helps with network analysis problems

Description
New! Updated for Spark 3 and with a hands-on structured streaming example.
"Big data" analysis is a hot and highly valuable skill, and this course will teach you the hottest technology in big data: Apache Spark. Employers including Amazon, eBay, NASA JPL, and Yahoo all use Spark to quickly extract meaning from massive data sets across a fault-tolerant Hadoop cluster. You'll learn those same techniques, using your own Windows system right at home. It's easier than you might think.

Learn to frame data analysis problems as Spark problems through more than 15 hands-on examples, then scale them up to run on cloud computing services. You'll be learning from a former engineer and senior manager at Amazon and IMDb.

  • Learn the concepts of Spark's Resilient Distributed Datasets
  • Develop and run Spark jobs quickly using Python
  • Translate complex analysis problems into iterative or multi-stage Spark scripts
  • Scale up to larger data sets using Amazon's Elastic MapReduce service
  • Understand how Hadoop YARN distributes Spark across computing clusters
  • Learn about other Spark technologies, like Spark SQL, Spark Streaming, and GraphX

By the end of this course, you'll be running code that analyzes gigabytes' worth of information, in the cloud, in a matter of minutes.
This course uses the familiar Python programming language; if you'd rather use Scala to get the best performance out of Spark, see my "Apache Spark with Scala - Hands On with Big Data" course instead.