Want to tap the power behind search rankings, product recommendations, social bookmarking, and online matchmaking? This fascinating book demonstrates how you can build Web 2.0 applications to mine the enormous amount of data created by people on the Internet. With the sophisticated algorithms in this book, you can write smart programs to access interesting datasets from other web sites, collect data from users of your own applications, and analyze and understand the data once you've found it.

Programming Collective Intelligence takes you into the world of machine learning and statistics, and explains how to draw conclusions about user experience, marketing, personal tastes, and human behavior in general -- all from information that you and others collect every day. Each algorithm is described clearly and concisely with code that can immediately be used on your web site, blog, Wiki, or specialized application. This book explains:

* Collaborative filtering techniques that enable online retailers to recommend products or media
* Methods of clustering to detect groups of similar items in a large dataset
* Search engine features -- crawlers, indexers, query engines, and the PageRank algorithm
* Optimization algorithms that search millions of possible solutions to a problem and choose the best one
* Bayesian filtering, used in spam filters for classifying documents based on word types and other features
* Using decision trees not only to make predictions, but to model the way decisions are made
* Predicting numerical values rather than classifications to build price models
* Support vector machines to match people in online dating sites
* Non-negative matrix factorization to find the independent features in a dataset
* Evolving intelligence for problem solving -- how a computer develops its skill by improving its own code the more it plays a game

Each chapter includes exercises for extending the algorithms to make them more powerful.
Go beyond simple database-backed applications and put the wealth of Internet data to work for you.

"Bravo! I cannot think of a better way for a developer to first learn these algorithms and methods, nor can I think of a better way for me (an old AI dog) to reinvigorate my knowledge of the details." -- Dan Russell, Google

"Toby's book does a great job of breaking down the complex subject matter of machine-learning algorithms into practical, easy-to-understand examples that can be directly applied to analysis of social interaction across the Web today. If I had this book two years ago, it would have saved precious time going down some fruitless paths." -- Tim Wolters, CTO, Collective Intellect
Toby Segaran works as a Data Magnate at Metaweb Technologies. Prior to working at Metaweb, he started a biotech software company called Incellico, which was later acquired by Genstruct. His book, "Programming Collective Intelligence," has been the best-selling AI book on Amazon for several months. He is the recipient of a National Interest Waiver for "People of Exceptional Abilit...
Next, get a list of random people to make up the dataset. Fortunately, Hot or Not provides an API call that returns a list of people with specified criteria. In this example, the only criteria will be that the people have “meet me” profiles, since only from these profiles can you get other information like location and interests. Add this function to hotornot.py:
-- Quoted from page 162
What Does This Have to Do with the Articles Matrix?

So far, what you have is a matrix of articles with word counts. The goal is to factorize this matrix, which means finding two smaller matrices that can be multiplied together to reconstruct this one. The two smaller matrices are:

The features matrix: This matrix has a row for each feature and a column for each word. The values indicate how important a word is to a feature. Each feature should represent a theme that emerged from a set of articles, so you might expect an article about a new TV show to have a high weight for the word “television.”

The weights matrix: This matrix maps the features to the articles matrix. Each row is an article and each column is a feature. The values state how much each feature applies to each articl...
-- Quoted from page 234
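The factorization described in the excerpt above can be illustrated with a small pure-Python sketch. It is not the book's code: the function names (`factorize`, `matmul`, `transpose`, `cost`), the toy word counts, and the choice of the multiplicative update rule are assumptions made for illustration. It starts from random non-negative weights and features matrices and repeatedly nudges them so that their product moves closer to the articles matrix.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b] for row in a]

def transpose(m):
    """Swap the rows and columns of a matrix."""
    return [list(col) for col in zip(*m)]

def cost(a, b):
    """Sum of squared differences between two equally sized matrices."""
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def factorize(v, n_features, iterations=200, seed=0):
    """Return (weights, features) whose product approximates v.

    weights has one row per article and one column per feature;
    features has one row per feature and one column per word.
    """
    rng = random.Random(seed)
    n_rows, n_cols = len(v), len(v[0])
    w = [[rng.random() for _ in range(n_features)] for _ in range(n_rows)]
    h = [[rng.random() for _ in range(n_cols)] for _ in range(n_features)]
    eps = 1e-9  # guards against division by zero
    for _ in range(iterations):
        # Update the features matrix: h <- h * (w^T v) / (w^T w h)
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(n_cols)]
             for i in range(n_features)]
        # Update the weights matrix: w <- w * (v h^T) / (w h h^T)
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(matmul(w, h), ht)
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(n_features)]
             for i in range(n_rows)]
    return w, h

# Hypothetical word counts: rows are articles, columns are the words
# "television", "show", "election", "vote".
articles = [
    [5, 4, 0, 0],
    [4, 5, 0, 1],
    [0, 0, 5, 4],
    [0, 1, 4, 5],
]

weights, features = factorize(articles, n_features=2)
print("reconstruction error:", cost(articles, matmul(weights, features)))
```

With two features, `weights` comes out as a 4 x 2 matrix (articles by features) and `features` as a 2 x 4 matrix (features by words), matching the row and column roles given in the excerpt. The error printed at the end measures how closely their product reconstructs the original counts.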