Python for Data Analysis (Reprint Edition)
Publication date: June 2013
Pages: 472
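“The scientific and data analysis communities have been waiting for this book for years: loaded with concrete practical advice, yet full of insight about how all the pieces fit together. It should become a canonical reference for scientific computing in Python for years to come.”
Fernando Perez
Assistant researcher at UC Berkeley and one of the original creators of IPython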
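Are you looking for a complete guide to manipulating, processing, cleaning, and crunching structured data in Python? This book is packed with case studies that show you how to solve a broad set of data analysis problems efficiently, using several Python libraries (including NumPy, pandas, matplotlib, and IPython).
Python for Data Analysis is written by Wes McKinney, the main author of the pandas library. It is also a practical guide to scientific computing in Python for data-intensive applications. It is suited to analysts who are new to Python and to Python programmers who are new to scientific computing.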
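· Use the IPython interactive shell as your primary development environment
· Learn basic and advanced NumPy (Numerical Python) features
· Get started with the data analysis tools in the pandas library
· Use high-performance tools to load, clean, transform, merge, and reshape data
· Create scatter plots and static or interactive visualizations with matplotlib
· Apply the pandas groupby facility to slice, dice, and summarize datasets
· Learn how to solve problems in web analytics, social sciences, finance, and economics through detailed examples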
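Wes McKinney is the main author of pandas, the popular open source Python library for data analysis. He started out as a quantitative analyst at AQR Capital Management and later founded Lambda Foundry, an enterprise data analysis company. Wes is an active speaker and participant in the Python and open source communities.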
- Chapter 1: Preliminaries
  - What Is This Book About?
  - Why Python for Data Analysis?
  - Essential Python Libraries
  - Installation and Setup
  - Community and Conferences
  - Navigating This Book
  - Acknowledgements
- Chapter 2: Introductory Examples
  - 1.usa.gov data from bit.ly
  - MovieLens 1M Data Set
  - US Baby Names 1880-2010
  - Conclusions and The Path Ahead
- Chapter 3: IPython: An Interactive Computing and Development Environment
  - IPython Basics
  - Using the Command History
  - Interacting with the Operating System
  - Software Development Tools
  - IPython HTML Notebook
  - Tips for Productive Code Development Using IPython
  - Advanced IPython Features
  - Credits
- Chapter 4: NumPy Basics: Arrays and Vectorized Computation
  - The NumPy ndarray: A Multidimensional Array Object
  - Universal Functions: Fast Element-wise Array Functions
  - Data Processing Using Arrays
  - File Input and Output with Arrays
  - Linear Algebra
  - Random Number Generation
  - Example: Random Walks
- Chapter 5: Getting Started with pandas
  - Introduction to pandas Data Structures
  - Essential Functionality
  - Summarizing and Computing Descriptive Statistics
  - Handling Missing Data
  - Hierarchical Indexing
  - Other pandas Topics
- Chapter 6: Data Loading, Storage, and File Formats
  - Reading and Writing Data in Text Format
  - Binary Data Formats
  - Interacting with HTML and Web APIs
  - Interacting with Databases
- Chapter 7: Data Wrangling: Clean, Transform, Merge, Reshape
  - Combining and Merging Data Sets
  - Reshaping and Pivoting
  - Data Transformation
  - String Manipulation
  - Example: USDA Food Database
- Chapter 8: Plotting and Visualization
  - A Brief matplotlib API Primer
  - Plotting Functions in pandas
  - Plotting Maps: Visualizing Haiti Earthquake Crisis Data
  - Python Visualization Tool Ecosystem
- Chapter 9: Data Aggregation and Group Operations
  - GroupBy Mechanics
  - Data Aggregation
  - Group-wise Operations and Transformations
  - Pivot Tables and Cross-Tabulation
  - Example: 2012 Federal Election Commission Database
- Chapter 10: Time Series
  - Date and Time Data Types and Tools
  - Time Series Basics
  - Date Ranges, Frequencies, and Shifting
  - Time Zone Handling
  - Periods and Period Arithmetic
  - Resampling and Frequency Conversion
  - Time Series Plotting
  - Moving Window Functions
  - Performance and Memory Usage Notes
- Chapter 11: Financial and Economic Data Applications
  - Data Munging Topics
  - Group Transforms and Analysis
  - More Example Applications
- Chapter 12: Advanced NumPy
  - ndarray Object Internals
  - Advanced Array Manipulation
  - Broadcasting
  - Advanced ufunc Usage
  - Structured and Record Arrays
  - More About Sorting
  - NumPy Matrix Class
  - Advanced Array Input and Output
  - Performance Tips
- Appendix: Python Language Essentials
  - The Python Interpreter
  - The Basics
  - Data Structures and Sequences
  - Functions
  - Files and the operating system
Title: Python for Data Analysis (Reprint Edition)
Domestic publisher: Southeast University Press
Publication date: June 2013
Pages: 472
ISBN: 978-7-5641-4204-9
Original title: Python for Data Analysis
Original publisher: O'Reilly Media
Wes McKinney
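Wes McKinney is a New York-based data analysis expert and entrepreneur. After earning a bachelor's degree in mathematics from MIT in 2007, he went to work in quantitative finance at AQR Capital Management in Greenwich, CT. Frustrated by cumbersome data analysis tools, he began learning Python and started building the pandas project in 2008. He is now an active member of the scientific Python community and an advocate for the use of Python in data analysis, finance, and statistical applications.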
The animal on the cover of Python for Data Analysis is a golden-tailed, or pen-tailed, tree shrew (Ptilocercus lowii). The golden-tailed tree shrew is the only species in the genus Ptilocercus and the family Ptilocercidae; all the other tree shrews are of the family Tupaiidae. Tree shrews are identified by their long tails and soft red-brown fur. As its nickname suggests, the golden-tailed tree shrew has a tail that resembles the feather on a quill pen. Tree shrews are omnivores, feeding primarily on insects, fruit, seeds, and small vertebrates.
Found predominantly in Indonesia, Malaysia, and Thailand, these wild mammals are known for their chronic consumption of alcohol. Malaysian tree shrews were found to spend several hours consuming the naturally fermented nectar of the bertam palm, equalling about 10 to 12 glasses of wine with 3.8% alcohol content. Despite this, no golden-tailed tree shrew has ever been intoxicated, thanks largely to their impressive ethanol breakdown, which includes metabolizing the alcohol in a way not used by humans. Their brain-to-body mass ratio is also more impressive than that of any of their mammal counterparts, including humans.
Despite its name, the golden-tailed tree shrew is not a true shrew; it is more closely related to primates. Because of this close relation, tree shrews have become an alternative to primates in medical experimentation for myopia, psychosocial stress, and hepatitis.