000 a
999 _c31348
_d31348
008 230220b xxu||||| |||| 00| 0 eng d
020 _a9789352137060
082 _a006.31
_bCHA
100 _aChambers, Bill
245 _aSpark : the definitive guide : big data processing made simple
260 _aMumbai :
_bShroff Publishers,
_c2018
300 _axxvi, 574 p.;
_bill.
_c24 cm
365 _b1800.00
_cINR
_d1.00
504 _aIncludes index.
520 _aLearn how to use, deploy, and maintain Apache Spark with this comprehensive guide, written by the creators of the open-source cluster-computing framework. With an emphasis on improvements and new features in Spark 2.0, authors Bill Chambers and Matei Zaharia break down Spark topics into distinct sections, each with unique goals. You’ll explore the basic operations and common functions of Spark’s structured APIs, as well as Structured Streaming, a new high-level API for building end-to-end streaming applications. Developers and system administrators will learn the fundamentals of monitoring, tuning, and debugging Spark, and explore machine learning techniques and scenarios for employing MLlib, Spark’s scalable machine-learning library. Get a gentle overview of big data and Spark. Learn about DataFrames, SQL, and Datasets (Spark’s core APIs) through worked examples. Dive into Spark’s low-level APIs, RDDs, and execution of SQL and DataFrames. Understand how Spark runs on a cluster. Debug, monitor, and tune Spark clusters and applications. Learn the power of Structured Streaming, Spark’s stream-processing engine. Learn how you can apply MLlib to a variety of problems, including classification or recommendation.
650 _aBig data
650 _aComputer Science
650 _aData Processing
650 _aHardware General
650 _aInformation Technology
650 _aData mining
650 _aInformation retrieval
650 _aSpark
650 _aAdvanced analytics
650 _aAggregations
650 _aApache Hive
650 _aCluster manager
650 _aConfiguration options
650 _aDataframe
650 _aDatasets
650 _aDecision trees
650 _aGraphframes
650 _aHadoop distributed file system
650 _aHyperparameters
650 _aJSON data
650 _aLinear regression
650 _aLogistic regression
650 _aMachine learning
650 _aMLlib
650 _aNullIf function
650 _aPython
650 _aRandom forests
650 _aRDD method
650 _aStream processing
650 _aTimestamp type class
650 _aUnsupervised learning
650 _aWatermarks
700 _aZaharia, Matei
942 _2ddc
_cBK