Want to tap the power behind search rankings, product recommendations, social bookmarking, and online matchmaking? This fascinating book demonstrates how you can build Web 2.0 applications to mine the enormous amount of data created by people on the Internet. With the sophisticated algorithms in this book, you can write smart programs to access interesting datasets from other web sites, collect data from users of your own applications, and analyze and understand the data once you've found it.

Programming Collective Intelligence takes you into the world of machine learning and statistics, and explains how to draw conclusions about user experience, marketing, personal tastes, and human behavior in general -- all from information that you and others collect every day. Each algorithm is described clearly and concisely with code that can immediately be used on your web site, blog, Wiki, or specialized application. This book explains:

* Collaborative filtering techniques that enable online retailers to recommend products or media
* Methods of clustering to detect groups of similar items in a large dataset
* Search engine features -- crawlers, indexers, query engines, and the PageRank algorithm
* Optimization algorithms that search millions of possible solutions to a problem and choose the best one
* Bayesian filtering, used in spam filters for classifying documents based on word types and other features
* Using decision trees not only to make predictions, but to model the way decisions are made
* Predicting numerical values rather than classifications to build price models
* Support vector machines to match people in online dating sites
* Non-negative matrix factorization to find the independent features in a dataset
* Evolving intelligence for problem solving -- how a computer develops its skill by improving its own code the more it plays a game

Each chapter includes exercises for extending the algorithms to make them more powerful.
Go beyond simple database-backed applications and put the wealth of Internet data to work for you.

"Bravo! I cannot think of a better way for a developer to first learn these algorithms and methods, nor can I think of a better way for me (an old AI dog) to reinvigorate my knowledge of the details." -- Dan Russell, Google

"Toby's book does a great job of breaking down the complex subject matter of machine-learning algorithms into practical, easy-to-understand examples that can be directly applied to analysis of social interaction across the Web today. If I had this book two years ago, it would have saved precious time going down some fruitless paths." -- Tim Wolters, CTO, Collective Intellect
Toby Segaran works as a Data Magnate at Metaweb Technologies. Prior to working at Metaweb, he started a biotech software company called Incellico, which was later acquired by Genstruct. His book, "Programming Collective Intelligence," has been the best-selling AI book on Amazon for several months. He is the recipient of a National Interest Waiver for "People of Exceptional Abilit...
Next, get a list of random people to make up the dataset. Fortunately, Hot or Not provides an API call that returns a list of people with specified criteria. In this example, the only criteria will be that the people have “meet me” profiles, since only from these profiles can you get other information like location and interests. Add this function to hotornot.py:
-- Quoted from page 162
What Does This Have to Do with the Articles Matrix?

So far, what you have is a matrix of articles with word counts. The goal is to factorize this matrix, which means finding two smaller matrices that can be multiplied together to reconstruct this one. The two smaller matrices are:

The features matrix: This matrix has a row for each feature and a column for each word. The values indicate how important a word is to a feature. Each feature should represent a theme that emerged from a set of articles, so you might expect an article about a new TV show to have a high weight for the word “television.”

The weights matrix: This matrix maps the features to the articles matrix. Each row is an article and each column is a feature. The values state how much each feature applies to each articl...
-- Quoted from page 234
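The shapes described in the excerpt can be sketched in a few lines of Python. This is only an illustration and not code from the book: the word list, the number of features, and every value are made up, and the two factors are written by hand rather than found by a factorization algorithm. It shows that a weights matrix (articles by features) multiplied by a features matrix (features by words) gives back an articles matrix (articles by words).

```python
# Hypothetical vocabulary; each column of the articles matrix is one word.
words = ["television", "show", "election", "vote"]

# Features matrix: one row per feature, one column per word.
# Values are invented; feature 0 is "about TV", feature 1 is "about politics".
features = [
    [1, 2, 0, 0],
    [0, 0, 3, 1],
]

# Weights matrix: one row per article, one column per feature.
# Values are invented; article 2 mixes both themes.
weights = [
    [2, 0],
    [0, 1],
    [1, 1],
]


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    inner = len(b)
    if any(len(row) != inner for row in a):
        raise ValueError("column count of a must equal row count of b")
    return [
        [sum(a[i][k] * b[k][j] for k in range(inner)) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


# Articles matrix: one row per article, one column per word (word counts).
articles = matmul(weights, features)

for row in articles:
    print(dict(zip(words, row)))
```

In a real factorization the articles matrix is the known input and the two factors are the unknowns; here the direction is reversed purely to make the row and column roles concrete.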