Text Processing in Python describes techniques for manipulating text using the Python programming language. At the broadest level, text processing is simply taking textual information and doing something with it: restructuring or reformatting it, extracting smaller bits of information from it, or performing calculations that depend on it. Text processing is arguably what most programmers spend most of their time doing, and as the amount of data everywhere continues to increase, it becomes more and more of a challenge. Because Python is clear, expressive, and object-oriented, it is a perfect language for text processing, even better than Perl. This book is not a tutorial on Python. It has two other goals: helping the programmer get the job done pragmatically and efficiently, and giving the reader an understanding, both theoretical and conceptual, of why what works works and what doesn't work doesn't work. Mertz provides practical pointers and tips that emphasize efficient, flexible, and maintainable approaches to the text processing tasks that working programmers face daily.
From the Back Cover:
Text Processing in Python is an example-driven, hands-on tutorial that carefully teaches programmers how to accomplish numerous text processing tasks using the Python language. Filled with concrete examples, this book provides efficient and effective solutions to specific text processing problems and practical strategies for dealing with all types of text processing challenges.
Text Processing in Python begins with an introduction to text processing and contains a quick Python tutorial to get you up to speed. It then delves into essential text processing subject areas, including string operations, regular expressions, parsers and state machines, and Internet tools and techniques. Appendixes cover such important topics as data compression and Unicode. A comprehensive index and plentiful cross-referencing offer easy access to available information. In addition, exercises throughout the book provide readers with further opportunity to hone their skills either on their own or in the classroom. A companion Web site (http://gnosis.cx/TPiP) contains source code and examples from the book.
Here is some of what you will find in this book:
* When do I use formal parsers to process structured and semi-structured data? Page 257
* How do I work with full text indexing? Page 199
* What patterns in text can be expressed using regular expressions? Page 204
* How do I find a URL or an email address in text? Page 228
* How do I process a report with a concrete state machine? Page 274
* How do I parse, create, and manipulate internet formats? Page 345
* How do I handle lossless and lossy compression? Page 454
* How do I find codepoints in Unicode? Page 465
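To give a flavor of one question from the list above, finding a URL or an email address in text, here is a minimal sketch using Python's standard `re` module. It is not taken from the book: the patterns are deliberately loose, and the names `find_urls` and `find_emails` and the address `someone@example.com` are hypothetical, chosen only for illustration.

```python
import re

# Deliberately simple patterns for illustration (an assumption, not the
# book's approach); real URL and email grammars are far more permissive.
URL_PATTERN = re.compile(r"https?://[^\s<>()]+")
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def find_urls(text):
    """Return http/https URLs found in text, minus trailing punctuation."""
    return [match.rstrip(".,;:!?") for match in URL_PATTERN.findall(text)]


def find_emails(text):
    """Return strings in text that look like email addresses."""
    return EMAIL_PATTERN.findall(text)


# Hypothetical sample text; only the URL comes from the description above.
sample = (
    "Source code is at http://gnosis.cx/TPiP. "
    "Write to someone@example.com for details."
)

print(find_urls(sample))
print(find_emails(sample))
```

Patterns like these trade accuracy for readability. They will miss some valid addresses and accept some invalid ones, so they suit quick extraction better than validation.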