MARC View

000			a
999			_c33764 _d33764
008			250305b xxu||||| |||| 00| 0 eng d
020			_a9781484219096
082			_a005.74 _bKOI
100			_aKoitzsch, Kerry
245			_aPro Hadoop data analytics : designing and building big data systems using the Hadoop ecosystem
260			_aNew York : _bApress, _c2017
300			_axxi, 298 p. : _bill. (some col.) ; _c26 cm
365			_b39.99 _c€ _d93.20
504			_aIncludes bibliographical references at the end of each chapter and index.
520			_aLearn advanced analytical techniques and leverage existing toolkits to make your analytic applications more powerful, precise, and efficient. This book provides the right combination of architecture, design, and implementation information to create analytical systems which go beyond the basics of classification, clustering, and recommendation. In Pro Hadoop Data Analytics, best practices are emphasized to ensure coherent, efficient development. A complete example system will be developed using standard third-party components, consisting of the toolkits, libraries, visualization and reporting code, as well as support glue to provide a working and extensible end-to-end system. The book emphasizes four important topics: (1) The importance of end-to-end, flexible, configurable, high-performance data pipeline systems with analytical components as well as appropriate visualization results. Deep-dive topics will include Spark, H2O, Vowpal Wabbit (NLP), Stanford NLP, and other appropriate toolkits and plugins. (2) Best practices and structured design principles. This will include strategic topics as well as the how-to example portions. (3) The importance of mix-and-match or hybrid systems, using different analytical components in one application to accomplish application goals. The hybrid approach will be prominent in the examples. (4) Use of existing third-party libraries is key to effective development. Deep-dive examples of the functionality of some of these toolkits will be showcased as you develop the example system.
650			_aApache Hadoop
650			_aCloud computing
650			_aSoftware development
650			_aData mining
650			_aDatabase management
650			_aAnalytical engine
650			_aBig data analytics
650			_aData pipeline
650			_aEnvironment variable
650			_aHadoop ecosystem
650			_aSpring Framework
942			_2ddc _cBK