Apache Hadoop is ideal for organizations with a growing need to store and process massive application datasets. Hadoop: The Definitive Guide is a comprehensive resource for using Hadoop to build reliable, scalable, distributed systems. Programmers will find details for analyzing large datasets with Hadoop, and administrators will learn how to set up and run Hadoop clusters. The book includes case studies that illustrate how Hadoop solves specific problems.
Organizations large and small are adopting Apache Hadoop to deal with huge application datasets, and Hadoop: The Definitive Guide provides the key to unlocking the wealth this data holds. Until now, information on this open-source project has been lacking, especially with regard to best practices.
With case studies that illustrate how Hadoop solves specific problems, this book helps you:
* Learn the Hadoop Distributed File System (HDFS), including ways to use its many APIs to transfer data
* Write distributed computations with MapReduce, Hadoop's most vital component
* Become familiar with Hadoop's data and I/O building blocks for compression, data integrity, serialization, and persistence
* Learn the common pitfalls and advanced features for writing real-world MapReduce programs
* Design, build, and administer a dedicated Hadoop cluster
* Use HBase, Hadoop's database for structured and semi-structured data
And more. Hadoop: The Definitive Guide is still in progress, but you can get started on this technology with the Rough Cuts edition, which lets you read the book online or download it in PDF format as the manuscript evolves.