Mathematical Foundations of Artificial Intelligence (Reprint Edition)
Publication date: March 2024
Pages: 568
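“The technology and AI markets are like a river, where some parts move faster than others. Successfully applying AI requires the skill to assess the direction of the flow, complemented by a solid foundation, and this book delivers that in an engaging, delightful, and inclusive way. Hala brings the joy of mathematics to the many participants in the future of AI!”
Adri Purkayastha
Head of AI Operational Risk and Digital Risk Analytics, BNP Paribas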
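Many sectors and industries are eager to integrate AI and data-driven technologies into their systems and operations. But to build truly successful AI systems, you need a firm grasp of the underlying mathematics. This comprehensive guide bridges the real gap between the unlimited potential and applications of AI and the mathematical foundations behind them.
Rather than discussing dense academic theory, author Hala Nelson focuses on real-world applications and state-of-the-art models to present the mathematics needed to thrive in the AI field. You will explore regression, neural networks, convolution, optimization, probability, Markov processes, differential equations, and other topics within an exclusive AI context. Engineers, data scientists, mathematicians, and scientists will gain a solid foundation for success in the AI and math fields.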
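You will be able to:
● Speak the languages of AI, machine learning, data science, and mathematics fluently
● Unify machine learning models and natural language models under one mathematical structure
● Handle graph and network data with ease
● Explore real data, visualize space transformations, reduce dimensions, and process images
● Choose suitable models for different data-driven projects
● Explore the various implications and limitations of AI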
- Preface
- 1. Why Learn the Mathematics of AI?
- What Is AI?
- Why Is AI So Popular Now?
- What Is AI Able to Do?
- What Are AI’s Limitations?
- What Happens When AI Systems Fail?
- Where Is AI Headed?
- Who Are the Current Main Contributors to the AI Field?
- What Math Is Typically Involved in AI?
- Summary and Looking Ahead
- 2. Data, Data, Data
- Data for AI
- Real Data Versus Simulated Data
- Mathematical Models: Linear Versus Nonlinear
- An Example of Real Data
- An Example of Simulated Data
- Mathematical Models: Simulations and AI
- Where Do We Get Our Data From?
- The Vocabulary of Data Distributions, Probability, and Statistics
- Continuous Distributions Versus Discrete Distributions (Density Versus Mass)
- The Power of the Joint Probability Density Function
- Distribution of Data: The Uniform Distribution
- Distribution of Data: The Bell-Shaped Normal (Gaussian) Distribution
- Distribution of Data: Other Important and Commonly Used Distributions
- The Various Uses of the Word “Distribution”
- A/B Testing
- Summary and Looking Ahead
- 3. Fitting Functions to Data
- Traditional and Very Useful Machine Learning Models
- Numerical Solutions Versus Analytical Solutions
- Regression: Predict a Numerical Value
- Logistic Regression: Classify into Two Classes
- Softmax Regression: Classify into Multiple Classes
- Incorporating These Models into the Last Layer of a Neural Network
- Other Popular Machine Learning Techniques and Ensembles of Techniques
- Performance Measures for Classification Models
- Summary and Looking Ahead
- 4. Optimization for Neural Networks
- The Brain Cortex and Artificial Neural Networks
- Training Function: Fully Connected, or Dense, Feed Forward Neural Networks
- Loss Functions
- Optimization
- Regularization Techniques
- Hyperparameter Examples That Appear in Machine Learning
- Assessing the Significance of the Input Data Features
- Summary and Looking Ahead
- 5. Convolutional Neural Networks and Computer Vision
- Convolution and Cross-Correlation
- Convolution from a Systems Design Perspective
- Convolution and One-Dimensional Discrete Signals
- Convolution and Two-Dimensional Discrete Signals
- Linear Algebra Notation
- Pooling
- A Convolutional Neural Network for Image Classification
- Summary and Looking Ahead
- 6. Singular Value Decomposition: Image Processing, Natural Language Processing, and Social Media
- Matrix Factorization
- Diagonal Matrices
- Matrices as Linear Transformations Acting on Space
- Three Ways to Multiply Matrices
- The Big Picture
- The Ingredients of the Singular Value Decomposition
- Singular Value Decomposition Versus the Eigenvalue Decomposition
- Computation of the Singular Value Decomposition
- The Pseudoinverse
- Applying the Singular Value Decomposition to Images
- Principal Component Analysis and Dimension Reduction
- Principal Component Analysis and Clustering
- A Social Media Application
- Latent Semantic Analysis
- Randomized Singular Value Decomposition
- Summary and Looking Ahead
- 7. Natural Language and Finance AI: Vectorization and Time Series
- Natural Language AI
- Preparing Natural Language Data for Machine Processing
- Statistical Models and the log Function
- Zipf’s Law for Term Counts
- Various Vector Representations for Natural Language Documents
- Cosine Similarity
- Natural Language Processing Applications
- Transformers and Attention Models
- Convolutional Neural Networks for Time Series Data
- Recurrent Neural Networks for Time Series Data
- An Example of Natural Language Data
- Finance AI
- Summary and Looking Ahead
- 8. Probabilistic Generative Models
- What Are Generative Models Useful For?
- The Typical Mathematics of Generative Models
- Shifting Our Brain from Deterministic Thinking to Probabilistic Thinking
- Maximum Likelihood Estimation
- Explicit and Implicit Density Models
- Explicit Density-Tractable: Fully Visible Belief Networks
- Explicit Density-Tractable: Change of Variables Nonlinear Independent Component Analysis
- Explicit Density-Intractable: Variational Autoencoders Approximation via Variational Methods
- Explicit Density-Intractable: Boltzmann Machine Approximation via Markov Chain
- Implicit Density-Markov Chain: Generative Stochastic Network
- Implicit Density-Direct: Generative Adversarial Networks
- Example: Machine Learning and Generative Networks for High Energy Physics
- Other Generative Models
- The Evolution of Generative Models
- Probabilistic Language Modeling
- Summary and Looking Ahead
- 9. Graph Models
- Graphs: Nodes, Edges, and Features for Each
- Example: PageRank Algorithm
- Inverting Matrices Using Graphs
- Cayley Graphs of Groups: Pure Algebra and Parallel Computing
- Message Passing Within a Graph
- The Limitless Applications of Graphs
- Random Walks on Graphs
- Node Representation Learning
- Tasks for Graph Neural Networks
- Dynamic Graph Models
- Bayesian Networks
- Graph Diagrams for Probabilistic Causal Modeling
- A Brief History of Graph Theory
- Main Considerations in Graph Theory
- Algorithms and Computational Aspects of Graphs
- Summary and Looking Ahead
- 10. Operations Research
- No Free Lunch
- Complexity Analysis and O() Notation
- Optimization: The Heart of Operations Research
- Thinking About Optimization
- Optimization on Networks
- The n-Queens Problem
- Linear Optimization
- Game Theory and Multiagents
- Queuing
- Inventory
- Machine Learning for Operations Research
- Hamilton-Jacobi-Bellman Equation
- Operations Research for AI
- Summary and Looking Ahead
- 11. Probability
- Where Did Probability Appear in This Book?
- What More Do We Need to Know That Is Essential for AI?
- Causal Modeling and the Do Calculus
- Paradoxes and Diagram Interpretations
- Large Random Matrices
- Stochastic Processes
- Markov Decision Processes and Reinforcement Learning
- Theoretical and Rigorous Grounds
- Summary and Looking Ahead
- 12. Mathematical Logic
- Various Logic Frameworks
- Propositional Logic
- First-Order Logic
- Probabilistic Logic
- Fuzzy Logic
- Temporal Logic
- Comparison with Human Natural Language
- Machines and Complex Mathematical Reasoning
- Summary and Looking Ahead
- 13. Artificial Intelligence and Partial Differential Equations
- What Is a Partial Differential Equation?
- Modeling with Differential Equations
- Numerical Solutions Are Very Valuable
- Some Statistical Mechanics: The Wonderful Master Equation
- Solutions as Expectations of Underlying Random Processes
- Transforming the PDE
- Solution Operators
- AI for PDEs
- Hamilton-Jacobi-Bellman PDE for Dynamic Programming
- PDEs for AI?
- Other Considerations in Partial Differential Equations
- Summary and Looking Ahead
- 14. Artificial Intelligence, Ethics, Mathematics, Law, and Policy
- Good AI
- Policy Matters
- What Could Go Wrong?
- How to Fix It?
- Distinguishing Bias from Discrimination
- The Hype
- Final Thoughts
- Index
Title: Mathematical Foundations of Artificial Intelligence (Reprint Edition)
Publisher (China): Southeast University Press
Publication date: March 2024
Pages: 568
ISBN: 978-7-5766-1223-3
Original title: Essential Math for AI
Original publisher: O'Reilly Media
Hala Nelson
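Hala Nelson is an associate professor of mathematics at James Madison University. She specializes in mathematical modeling and consults for the public sector on emergency and infrastructure services. She holds a PhD in mathematics from the Courant Institute of Mathematical Sciences at New York University.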
The animal on the cover of Essential Math for AI is a harnessed bushbuck (Tragelaphus scriptus scriptus), an antelope found throughout sub-Saharan Africa. The animals live in many types of habitat, such as woodland, savanna, and rainforest. The harnessed bushbuck is named for a pattern of white stripes and spots along its back and flanks that resembles a saddle or harness. These white patches also appear on the animal’s neck, ears, and chin.
The harnessed bushbuck is the smallest of eight bushbuck subspecies, generally standing about 30 inches tall at the shoulder and weighing 70–100 pounds. Its coat is reddish-brown, though females tend to be lighter in color and have more conspicuous white markings. Male bushbucks also sport horns, which appear around the age of 10 months and eventually develop a single twist. Bushbucks graze on the leaves of trees and shrubs, as well as flowering plants—it is uncommon for them to eat grass.
The bushbuck is most active during the day and lives a solitary life within a defined territory. However, while they don’t gather in groups, neither are these animals overly aggressive. The male’s horns can be used in mating displays, to drive away competitors when a female is in heat, and for the rare territorial dispute, but adult bushbuck tend to avoid contact with each other. Female bushbucks bear one calf at a time, and hide the young one very carefully after birth, only visiting it to nurse. The mother also eats the calf’s dung so predators are not drawn to the area. After about four months, the calf begins to accompany its mother to graze and play.