Want to tap the power behind search rankings, product recommendations, social bookmarking, and online matchmaking? This fascinating book demonstrates how you can build Web 2.0 applications to mine the enormous amount of data created by people on the Internet. With the sophisticated algorithms in this book, you can write smart programs to access interesting datasets from other web sites, collect data from users of your own applications, and analyze and understand the data once you've found it.

Programming Collective Intelligence takes you into the world of machine learning and statistics, and explains how to draw conclusions about user experience, marketing, personal tastes, and human behavior in general -- all from information that you and others collect every day. Each algorithm is described clearly and concisely with code that can immediately be used on your web site, blog, Wiki, or specialized application. This book explains:

* Collaborative filtering techniques that enable online retailers to recommend products or media
* Methods of clustering to detect groups of similar items in a large dataset
* Search engine features -- crawlers, indexers, query engines, and the PageRank algorithm
* Optimization algorithms that search millions of possible solutions to a problem and choose the best one
* Bayesian filtering, used in spam filters for classifying documents based on word types and other features
* Using decision trees not only to make predictions, but to model the way decisions are made
* Predicting numerical values rather than classifications to build price models
* Support vector machines to match people in online dating sites
* Non-negative matrix factorization to find the independent features in a dataset
* Evolving intelligence for problem solving -- how a computer develops its skill by improving its own code the more it plays a game

Each chapter includes exercises for extending the algorithms to make them more powerful.
Go beyond simple database-backed applications and put the wealth of Internet data to work for you.

"Bravo! I cannot think of a better way for a developer to first learn these algorithms and methods, nor can I think of a better way for me (an old AI dog) to reinvigorate my knowledge of the details." -- Dan Russell, Google

"Toby's book does a great job of breaking down the complex subject matter of machine-learning algorithms into practical, easy-to-understand examples that can be directly applied to analysis of social interaction across the Web today. If I had this book two years ago, it would have saved precious time going down some fruitless paths." -- Tim Wolters, CTO, Collective Intellect
Toby Segaran works as a Data Magnate at Metaweb Technologies. Prior to working at Metaweb, he started a biotech software company called Incellico, which was later acquired by Genstruct. His book, "Programming Collective Intelligence", has been the best-selling AI book on Amazon for several months. He is the recipient of a National Interest Waiver for "People of Exceptional Abilit...
Next, get a list of random people to make up the dataset. Fortunately, Hot or Not provides an API call that returns a list of people with specified criteria. In this example, the only criteria will be that the people have “meet me” profiles, since only from these profiles can you get other information like location and interests. Add this function to hotornot.py:
-- Quoted from page 162
What Does This Have to Do with the Articles Matrix?
So far, what you have is a matrix of articles with word counts. The goal is to factorize this matrix, which means finding two smaller matrices that can be multiplied together to reconstruct this one. The two smaller matrices are:
The features matrix: This matrix has a row for each feature and a column for each word. The values indicate how important a word is to a feature. Each feature should represent a theme that emerged from a set of articles, so you might expect an article about a new TV show to have a high weight for the word “television.”
The weights matrix: This matrix maps the features to the articles matrix. Each row is an article and each column is a feature. The values state how much each feature applies to each articl...
-- Quoted from page 234
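The factorization described in the excerpt above can be sketched in a few lines of plain Python. This is a minimal illustration, not the book's own listing: it assumes the widely used multiplicative update rules for non-negative matrix factorization, and the names `factorize`, `matmul`, `transpose`, and `cost`, the sample matrix, and the iteration count are all hypothetical choices made for this sketch. Here `w` plays the role of the weights matrix (one row per article, one column per feature) and `h` the features matrix (one row per feature, one column per word).

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(m):
    """Swap rows and columns."""
    return [list(col) for col in zip(*m)]


def cost(a, b):
    """Sum of squared differences between two same-shaped matrices."""
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def factorize(v, n_features, iterations=200, seed=0):
    """Approximate v (articles x words) as w (articles x features)
    times h (features x words), keeping every entry non-negative.
    Assumes multiplicative updates; other solvers are possible."""
    rng = random.Random(seed)
    rows, cols = len(v), len(v[0])
    w = [[rng.random() for _ in range(n_features)] for _ in range(rows)]
    h = [[rng.random() for _ in range(cols)] for _ in range(n_features)]
    eps = 1e-9  # guards against division by zero

    for _ in range(iterations):
        # Update the features matrix: h <- h * (w^T v) / (w^T w h)
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(cols)]
             for i in range(n_features)]

        # Update the weights matrix: w <- w * (v h^T) / (w h h^T)
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(matmul(w, h), ht)
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps)
              for j in range(n_features)]
             for i in range(rows)]
    return w, h


# Made-up word counts: 4 articles (rows) by 4 words (columns).
articles = [[3, 0, 1, 0],
            [2, 0, 1, 0],
            [0, 4, 0, 2],
            [0, 3, 0, 1]]
weights, features = factorize(articles, n_features=2)
print("reconstruction error:", cost(articles, matmul(weights, features)))
```

Multiplying `weights` by `features` gives an approximation of the original articles matrix; the number of features (2 here) is a tuning choice, and a larger value lets the product fit the counts more closely at the expense of less distinct themes.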
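Contents: Cute Puppies (1); Cute Puppies (2); A Cozy World (1); A Cozy World (2); The Story of the Bear Sweethearts; The Three of Us: Cat + Bear = Giant Panda?; The Fashionable Trio; Retro and Trendy Good Frien...
Returning to Chinese Medicine (回归中医). Synopsis: The book collects 29 of the teacher's recent topical papers. Its content can be summarized broadly in the following parts: 1. It sets out the place of Chinese medicine in the medicine of the future: Chinese medicine belongs not only to the past but also to the present, and even more...
1. A principal founder of modern social science: Max Weber. Max Weber was a renowned German sociologist and one of the most influential and enduring thinkers of the modern era; together with Karl Marx and Émile Durkheim, he is regarded as a founder of modern sociology...
If the subway in your city tightened its security screening for safety reasons, would you cooperate willingly or resist passively? If a celebrity or KOL you like enthusiastically recommended a product to you, would you order at once or think it over carefully? If your...
2014 Comprehensive Medicine: Clinical Physician Exam Practice Problems, designated text for the National Medical Licensing Examination (two volumes, with CD), includes a 200-yuan Jingshi Online School study card. Features: China's medical licensing examination...
Pharmaceutics of Chinese Materia Medica (中药药剂学). Synopsis: This book is part of the series "Companion Study Guides to the New Century National Textbooks for Higher Institutions of Chinese Medicine". It follows the latest syllabus for Pharmaceutics of Chinese Materia Medica closely and, chapter by chapter, is divided into key and difficult points, detailed analysis of knowledge points, comprehensive...
Writer, poet, literary youth, drinks only beer. Has worked at well-known advertising and internet companies, with rich life experience, wide-ranging knowledge, and keen insight into human nature. A well-known verified Weibo account and a known commentator on workplace and internet-industry topics, and frequently...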
This 2011 volume argues that the commitment to justice is a fundamental motive and that, althou...
The fox knows many things, the Greeks said, but the hedgehog knows one big thing...
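The author, Alexandre Koyré (1892-1964), was a Russian-born French philosopher and historian of science. In the history of science, Koyré's standing is no less than that of George Sarton, the father of the discipline. Koyré's writings and his ... at the Princeton institute...
3-Color Palette Quick Reference Handbook (《3色配色速查手册》). Synopsis: ◆ Solve color-matching worries with only three colors; a look-up-and-use quick reference for color schemes ◆ Every color is labeled with CMYK, RGB, and HEX values, suited to many...
Seth Godin: former vice president at Yahoo, prolific international bestselling author, and one of the most influential business thinkers of our time. Columnist for Fast Company magazine, has...
Diseases of the Ear, Nose, Throat, and Mouth (耳鼻喉及口腔科疾病). Synopsis: This book is part of the series "Standards for Integrated Chinese and Western Medicine Diagnosis and Treatment for Community Physicians". It covers common otorhinolaryngological and oral diseases comprehensively, from diagnosis to the standards of Chinese and Western treatment. This book...
Robert E. Lerner, professor of history at Southwest University (西南大学), director of the university's "Human Project" (《人类计划》) program, member of the American society for medieval studies, and member of the American society for Roman studies. Author of The Age of Adversity, The Fourteenth Century, The Later Middle Ages...
Alice Arisugawa (有栖川有栖), born in 1959 in Osaka, Japan, is a leading figure of the shin-honkaku (new orthodox) school and served as the first president of the shin-honkaku mystery writers' club. He stresses rigorous reasoning and novel tricks, and has been called "the Ellery Queen of Japan"...
The Art of Officialdom (《官术》). Synopsis: For anyone in an official career, the pursuit of power is inevitable, and the protagonist Xue Bing is no exception. After being disciplined, he is posted to a mountain district as town Party secretary; the harsh environment, the pressure from political rivals...
About the author: Tariq Rashid holds a bachelor's degree in physics and a master's degree in machine learning and data mining. He has long been active in London's technology scene, leading and organizing the London Python meet...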
When Marcella Sanchez, a senior detective in the homicide division of the Los Angeles Police De...
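Winner of the C. Wright Mills Award, sociology's highest honor; a Los Angeles Times Book of the Year; on social science course reading lists at universities worldwide; strongly recommended by William Whyte, author of Street Corner Society: "《人行道王国》 (Sidewalk) continues the best tradition of participant observation...
Yvon Chouinard, founder of the world-leading outdoor brand Patagonia, which has been called "the Gucci of the outdoors"; sold in American outdoor retail stores and at REI, the largest outdoor retail chain in the United States...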