Python for Data Analysis, 3rd Edition (Reprint Edition)
Publication date: November 2022
Pages: 561
“Wes has thoroughly updated this new edition, ensuring that the book remains the go-to resource for data analysis with Python and pandas. I strongly recommend it.”
-Paul Barry
Lecturer and author of Head First Python (O'Reilly)
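This is the definitive handbook for manipulating, processing, cleaning, and crunching datasets in Python. Updated for Python 3.10 and pandas 1.4, the third edition of this hands-on guide includes practical case studies that show you how to solve a broad range of data analysis problems effectively. Along the way, you will learn the latest versions of pandas, NumPy, and Jupyter.
Written by Wes McKinney, the creator of the pandas project, this book is a practical, up-to-date introduction to data science tools in Python. It is ideal for analysts new to Python and for Python programmers new to data science and scientific computing. Data files and related material are available on GitHub.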
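● Use the Jupyter notebook and IPython shell for exploratory computing
● Learn basic and advanced features of NumPy
● Learn the data analysis tools in the pandas library
● Use flexible tools to load, clean, transform, merge, and reshape data
● Create informative visualizations with matplotlib
● Apply the pandas groupby facility to slice, dice, and summarize datasets
● Analyze and manipulate regular and irregular time series data
● Learn how to solve real-world data analysis problems through thorough, detailed examples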
- Preface
- 1. Preliminaries
- 1.1 What Is This Book About?
- 1.2 Why Python for Data Analysis?
- 1.3 Essential Python Libraries
- 1.4 Installation and Setup
- 1.5 Community and Conferences
- 1.6 Navigating This Book
- 2. Python Language Basics, IPython, and Jupyter Notebooks
- 2.1 The Python Interpreter
- 2.2 IPython Basics
- 2.3 Python Language Basics
- 2.4 Conclusion
- 3. Built-In Data Structures, Functions, and Files
- 3.1 Data Structures and Sequences
- 3.2 Functions
- 3.3 Files and the Operating System
- 3.4 Conclusion
- 4. NumPy Basics: Arrays and Vectorized Computation
- 4.1 The NumPy ndarray: A Multidimensional Array Object
- 4.2 Pseudorandom Number Generation
- 4.3 Universal Functions: Fast Element-Wise Array Functions
- 4.4 Array-Oriented Programming with Arrays
- 4.5 File Input and Output with Arrays
- 4.6 Linear Algebra
- 4.7 Example: Random Walks
- 4.8 Conclusion
- 5. Getting Started with pandas
- 5.1 Introduction to pandas Data Structures
- 5.2 Essential Functionality
- 5.3 Summarizing and Computing Descriptive Statistics
- 5.4 Conclusion
- 6. Data Loading, Storage, and File Formats
- 6.1 Reading and Writing Data in Text Format
- 6.2 Binary Data Formats
- 6.3 Interacting with Web APIs
- 6.4 Interacting with Databases
- 6.5 Conclusion
- 7. Data Cleaning and Preparation
- 7.1 Handling Missing Data
- 7.2 Data Transformation
- 7.3 Extension Data Types
- 7.4 String Manipulation
- 7.5 Categorical Data
- 7.6 Conclusion
- 8. Data Wrangling: Join, Combine, and Reshape
- 8.1 Hierarchical Indexing
- 8.2 Combining and Merging Datasets
- 8.3 Reshaping and Pivoting
- 8.4 Conclusion
- 9. Plotting and Visualization
- 9.1 A Brief matplotlib API Primer
- 9.2 Plotting with pandas and seaborn
- 9.3 Other Python Visualization Tools
- 9.4 Conclusion
- 10. Data Aggregation and Group Operations
- 10.1 How to Think About Group Operations
- 10.2 Data Aggregation
- 10.3 Apply: General split-apply-combine
- 10.4 Group Transforms and “Unwrapped” GroupBys
- 10.5 Pivot Tables and Cross-Tabulation
- 10.6 Conclusion
- 11. Time Series
- 11.1 Date and Time Data Types and Tools
- 11.2 Time Series Basics
- 11.3 Date Ranges, Frequencies, and Shifting
- 11.4 Time Zone Handling
- 11.5 Periods and Period Arithmetic
- 11.6 Resampling and Frequency Conversion
- 11.7 Moving Window Functions
- 11.8 Conclusion
- 12. Introduction to Modeling Libraries in Python
- 12.1 Interfacing Between pandas and Model Code
- 12.2 Creating Model Descriptions with Patsy
- 12.3 Introduction to statsmodels
- 12.4 Introduction to scikit-learn
- 12.5 Conclusion
- 13. Data Analysis Examples
- 13.1 Bitly Data from 1.USA.gov
- 13.2 MovieLens 1M Dataset
- 13.3 US Baby Names 1880–2010
- 13.4 USDA Food Database
- 13.5 2012 Federal Election Commission Database
- 13.6 Conclusion
- A. Advanced NumPy
- A.1 ndarray Object Internals
- A.2 Advanced Array Manipulation
- A.3 Broadcasting
- A.4 Advanced ufunc Usage
- A.5 Structured and Record Arrays
- A.6 More About Sorting
- A.7 Writing Fast NumPy Functions with Numba
- A.8 Advanced Array Input and Output
- A.9 Performance Tips
- B. More on the IPython System
- B.1 Terminal Keyboard Shortcuts
- B.2 About Magic Commands
- B.3 Using the Command History
- B.4 Interacting with the Operating System
- B.5 Software Development Tools
- B.6 Tips for Productive Code Development Using IPython
- B.7 Advanced IPython Features
- B.8 Conclusion
- Index
Title: Python for Data Analysis, 3rd Edition (Reprint Edition)
Publisher in China: Southeast University Press
ISBN: 978-7-5766-0250-0
Original title: Python for Data Analysis, 3e
Original publisher: O'Reilly Media
Wes McKinney
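Wes McKinney is a New York-based data analysis expert and entrepreneur. After earning a bachelor's degree in mathematics from MIT in 2007, he went to work in quantitative finance at AQR Capital Management in Greenwich, CT. Frustrated by cumbersome data analysis tools, he learned Python and began building the pandas project in 2008. He is now an active member of the Python scientific computing community and an advocate for using Python in data analysis, finance, and statistical applications.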
The animal on the cover of Python for Data Analysis is a golden-tailed, or pen-tailed, tree shrew (Ptilocercus lowii). The golden-tailed tree shrew is the only species in the genus Ptilocercus and the family Ptilocercidae; all other tree shrews belong to the family Tupaiidae. Tree shrews are identified by their long tails and soft red-brown fur. As the name "pen-tailed" suggests, the golden-tailed tree shrew has a tail that resembles the feather on a quill pen. Tree shrews are omnivores, feeding primarily on insects, fruit, seeds, and small vertebrates.
Found predominantly in Indonesia, Malaysia, and Thailand, these wild mammals are known for their chronic consumption of alcohol. Malaysian tree shrews were found to spend several hours consuming the naturally fermented nectar of the bertam palm, equalling about 10 to 12 glasses of wine with 3.8% alcohol content. Despite this, no golden-tailed tree shrew has ever been intoxicated, thanks largely to their impressive ability to break down ethanol, which includes metabolizing the alcohol in a way not used by humans. Also more impressive than any of their mammal counterparts, including humans, is their brain-to-body mass ratio.
Despite its name, the golden-tailed tree shrew is not a true shrew; it is more closely related to primates. Because of this close relation, tree shrews have become an alternative to primates in medical experimentation on myopia, psychosocial stress, and hepatitis.