{"product_id":"dataproc-cookbook-running-spark-and-hadoop-workloads-in-google-cloud-9781098157708","title":"Dataproc Cookbook: Running Spark and Hadoop Workloads in Google Cloud","description":"\u003cp\u003eGet up to speed with Dataproc, the fully managed and highly scalable service for running open source big data tools and frameworks, including Hadoop, Spark, Flink, and Presto. This cookbook shows data engineers, data scientists, data analysts, and cloud architects how to use Dataproc, integrated with Google Cloud, for data lake modernization, ETL, and secure data science at a fraction of the cost. \u003c\/p\u003e\u003cp\u003e Narasimha Sadineni from Google and former Googler Anu Venkataraman show you how to set up and run Hadoop and Spark jobs on Dataproc. You'll learn how to create Dataproc clusters and run data engineering and data science workloads in long-running, ephemeral, and serverless ways. In the process, you'll gain an understanding of Dataproc, orchestration, logging and monitoring, Spark History Server, and migration patterns. \u003c\/p\u003e\u003cp\u003e This cookbook includes hands-on examples for configuring, logging, securing clusters, and migrating from on-prem to Dataproc. \nYou'll learn how to: \u003c\/p\u003e\u003cul\u003e \u003cli\u003eCreate Dataproc clusters on Compute Engine and Kubernetes Engine \u003c\/li\u003e\n\u003cli\u003eRun data science workloads on Dataproc \u003c\/li\u003e\n\u003cli\u003eExecute Spark jobs on Dataproc Serverless \u003c\/li\u003e\n\u003cli\u003eOptimize Dataproc clusters to be cost effective and performant \u003c\/li\u003e\n\u003cli\u003eMonitor Spark jobs in various ways \u003c\/li\u003e\n\u003cli\u003eOrchestrate various workloads and activities \u003c\/li\u003e\n\u003cli\u003eUse different methods for migrating data and workloads from existing Hadoop clusters to Dataproc \u003c\/li\u003e\n\u003c\/ul\u003e\u003cbr\u003e\u003cbr\u003e\u003cb\u003eBinding Type:\u003c\/b\u003e Paperback\u003cbr\u003e\u003cb\u003ePublisher:\u003c\/b\u003e O'Reilly Media\u003cbr\u003e\u003cb\u003ePublished:\u003c\/b\u003e 07\/15\/2025\u003cbr\u003e\u003cb\u003eISBN:\u003c\/b\u003e 9781098157708\u003cbr\u003e\u003cb\u003ePages:\u003c\/b\u003e 410","brand":"Narasimha Sadineni, Anuyogam Venkataraman","offers":[{"title":"Default Title","offer_id":45396560674997,"sku":"9781098157708","price":67.99,"currency_code":"USD","in_stock":true}],"thumbnail_url":"\/\/cdn.shopify.com\/s\/files\/1\/0473\/0804\/6492\/files\/img_27e66dfd-1051-45fa-9ae2-4aa4a18beeff.jpg?v=1747750381","url":"https:\/\/pastforward.org\/products\/dataproc-cookbook-running-spark-and-hadoop-workloads-in-google-cloud-9781098157708","provider":"Past Forward","version":"1.0","type":"link"}