Designing Machine Learning Systems (Reprint Edition)
Publication date: September 2022
Pages: 367
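"Simply put, this is an excellent book on how to build, deploy, and scale machine learning models at a company for maximum impact."
-- Josh Wills
Software Engineer at WeaveGrid and former Director of Data Engineering at Slack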
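"In a thriving yet chaotic ecosystem, this principled view of end-to-end machine learning is both your map and your compass: whether or not you work at a large tech company, this is a must-read."
-- Jacopo Tagliabue
Director of AI at Coveo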
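Machine learning systems are both complex and unique. They are complex because they consist of many components and involve many different stakeholders. They are unique because they depend on data, and data varies widely from one use case to the next. In this book, you will learn a holistic approach to designing machine learning systems that are reliable, scalable, maintainable, and adaptive to changing environments and business requirements.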
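Author Chip Huyen, co-founder of Claypot AI, considers each design decision (such as how to process and create training data, which features to use, how often to retrain models, and what to monitor) in the context of how it helps the system as a whole achieve its objectives. The iterative framework in this book uses real case studies backed by ample references.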
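This book will help you handle scenarios such as:
● Engineering data and choosing the right metrics to solve a business problem
● Automating the process of continually developing, evaluating, deploying, and updating models
● Developing a monitoring system to quickly detect and resolve issues your models might encounter in production
● Building a machine learning platform that serves multiple use cases
● Developing reliable machine learning systems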
- Preface
- 1. Overview of Machine Learning Systems
  - When to Use Machine Learning
  - Understanding Machine Learning Systems
  - Summary
- 2. Introduction to Machine Learning Systems Design
  - Business and ML Objectives
  - Requirements for ML Systems
  - Iterative Process
  - Framing ML Problems
  - Mind Versus Data
  - Summary
- 3. Data Engineering Fundamentals
  - Data Sources
  - Data Formats
  - Data Models
  - Data Storage Engines and Processing
  - Modes of Dataflow
  - Batch Processing Versus Stream Processing
  - Summary
- 4. Training Data
  - Sampling
  - Labeling
  - Class Imbalance
  - Data Augmentation
  - Summary
- 5. Feature Engineering
  - Learned Features Versus Engineered Features
  - Common Feature Engineering Operations
  - Data Leakage
  - Engineering Good Features
  - Summary
- 6. Model Development and Offline Evaluation
  - Model Development and Training
  - Model Offline Evaluation
  - Summary
- 7. Model Deployment and Prediction Service
  - Machine Learning Deployment Myths
  - Batch Prediction Versus Online Prediction
  - Model Compression
  - ML on the Cloud and on the Edge
  - Summary
- 8. Data Distribution Shifts and Monitoring
  - Causes of ML System Failures
  - Data Distribution Shifts
  - Monitoring and Observability
  - Summary
- 9. Continual Learning and Test in Production
  - Continual Learning
  - Test in Production
  - Summary
- 10. Infrastructure and Tooling for MLOps
  - Storage and Compute
  - Development Environment
  - Resource Management
  - ML Platform
  - Build Versus Buy
  - Summary
- 11. The Human Side of Machine Learning
  - User Experience
  - Team Structure
  - Responsible AI
  - Summary
- Epilogue
- Index
Title: Designing Machine Learning Systems (Reprint Edition)
Publisher in China: Southeast University Press
ISBN: 978-7-5766-0224-1
Original title: Designing Machine Learning Systems
Original publisher: O'Reilly Media
Chip Huyen
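Chip Huyen is a co-founder of Claypot AI, a real-time machine learning platform. Through her work at NVIDIA, Netflix, and Snorkel AI, she has helped some of the world's largest organizations develop and deploy machine learning systems. This book is based on her lecture notes for "Machine Learning Systems Design" (CS329S), a course she teaches at Stanford University.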
The animal on the cover of Designing Machine Learning Systems is a red-legged partridge (Alectoris rufa), also known as a French partridge.
Bred for centuries as a gamebird, this economically important, largely nonmigratory member of the pheasant family is native to western continental Europe, though populations have been introduced elsewhere, including England, Ireland, and New Zealand.
Relatively small but stout-bodied, the red-legged partridge boasts ornate coloration and feather patterning, with light brown to gray plumage along its back, a light pink belly, a cream-colored throat, a brilliant red bill, and rufous or black barring on its flanks.
Feeding primarily on seeds, leaves, grasses, and roots, but also on insects, red-legged partridges breed each year in dry lowland areas, such as farmland, laying their eggs in ground nests. Though they continue to be bred in large numbers, these birds are now considered near threatened due to steep population declines attributed, in part, to overhunting and disappearance of habitat. Like all animals on O’Reilly covers, they’re vitally important to our world.