Learning Data Science (Reprint Edition)
Sam Lau, Joseph Gonzalez, Deborah Nolan
Publication date: March 2024
Pages: 594
"I wish this book had existed when we first used the term 'data scientist' to describe our work. If you want to work in data science/engineering, AI, or machine learning, this book is your starting point."
DJ Patil, PhD
First US Chief Data Scientist

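As an aspiring data scientist, you understand why organizations rely on data for their important decisions, whether they are companies designing websites, cities deciding how to improve services, or scientists working to stop the spread of disease. You need the skills to distill a messy pile of data into actionable insights. We call this the data science lifecycle: the process of collecting, wrangling, and analyzing data, and drawing conclusions from it.
This is the first book to cover foundational skills in both programming and statistics across the entire data science lifecycle. It is aimed at readers who want to become data scientists or work alongside them, and at data analysts who want to cross the "technical/nontechnical" divide. If you have a basic knowledge of Python programming, you will learn how to work with data using industry-standard tools such as pandas.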
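● Refine a question of interest into one that can be studied with data
● Carry out data collection, which may involve text processing, web scraping, and related techniques
● Gain valuable insights through data cleaning, exploration, and visualization
● Learn how to use modeling to describe data
● Generalize findings beyond the data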
  1. Preface
  2. Part I. The Data Science Lifecycle
  3. 1. The Data Science Lifecycle
  4. The Stages of the Lifecycle
  5. Examples of the Lifecycle
  6. Summary
  7. 2. Questions and Data Scope
  8. Big Data and New Opportunities
  9. Target Population, Access Frame, and Sample
  10. Instruments and Protocols
  11. Measuring Natural Phenomena
  12. Accuracy
  13. Summary
  14. 3. Simulation and Data Design
  15. The Urn Model
  16. Example: Simulating Election Poll Bias and Variance
  17. Example: Simulating a Randomized Trial for a Vaccine
  18. Example: Measuring Air Quality
  19. Summary
  20. 4. Modeling with Summary Statistics
  21. The Constant Model
  22. Minimizing Loss
  23. Summary
  24. 5. Case Study: Why Is My Bus Always Late?
  25. Question and Scope
  26. Data Wrangling
  27. Exploring Bus Times
  28. Modeling Wait Times
  29. Summary
  30. Part II. Rectangular Data
  31. 6. Working with Dataframes Using pandas
  32. Subsetting
  33. Aggregating
  34. Joining
  35. Transforming
  36. How Are Dataframes Different from Other Data Representations?
  37. Summary
  38. 7. Working with Relations Using SQL
  39. Subsetting
  40. Aggregating
  41. Joining
  42. Transforming and Common Table Expressions
  43. Summary
  44. Part III. Understanding The Data
  45. 8. Wrangling Files
  46. Data Source Examples
  47. File Formats
  48. File Encoding
  49. File Size
  50. The Shell and Command-Line Tools
  51. Table Shape and Granularity
  52. Summary
  53. 9. Wrangling Dataframes
  54. Example: Wrangling CO2 Measurements from the Mauna Loa Observatory
  55. Quality Checks
  56. Missing Values and Records
  57. Transformations and Timestamps
  58. Modifying Structure
  59. Example: Wrangling Restaurant Safety Violations
  60. Summary
  61. 10. Exploratory Data Analysis
  62. Feature Types
  63. What to Look For in a Distribution
  64. What to Look For in a Relationship
  65. Comparisons in Multivariate Settings
  66. Guidelines for Exploration
  67. Example: Sale Prices for Houses
  68. Summary
  69. 11. Data Visualization
  70. Choosing Scale to Reveal Structure
  71. Smoothing and Aggregating Data
  72. Facilitating Meaningful Comparisons
  73. Incorporating the Data Design
  74. Adding Context
  75. Creating Plots Using plotly
  76. Other Tools for Visualization
  77. Summary
  78. 12. Case Study: How Accurate Are Air Quality Measurements?
  79. Question, Design, and Scope
  80. Finding Collocated Sensors
  81. Wrangling and Cleaning AQS Sensor Data
  82. Wrangling PurpleAir Sensor Data
  83. Exploring PurpleAir and AQS Measurements
  84. Creating a Model to Correct PurpleAir Measurements
  85. Summary
  86. Part IV. Other Data Sources
  87. 13. Working with Text
  88. Examples of Text and Tasks
  89. String Manipulation
  90. Regular Expressions
  91. Text Analysis
  92. Summary
  93. 14. Data Exchange
  94. NetCDF Data
  95. JSON Data
  96. HTTP
  97. REST
  98. XML, HTML, and XPath
  99. Summary
  100. Part V. Linear Modeling
  101. 15. Linear Models
  102. Simple Linear Model
  103. Example: A Simple Linear Model for Air Quality
  104. Fitting the Simple Linear Model
  105. Multiple Linear Model
  106. Fitting the Multiple Linear Model
  107. Example: Where Is the Land of Opportunity?
  108. Feature Engineering for Numeric Measurements
  109. Feature Engineering for Categorical Measurements
  110. Summary
  111. 16. Model Selection
  112. Overfitting
  113. Train-Test Split
  114. Cross-Validation
  115. Regularization
  116. Model Bias and Variance
  117. Summary
  118. 17. Theory for Inference and Prediction
  119. Distributions: Population, Empirical, Sampling
  120. Basics of Hypothesis Testing
  121. Bootstrapping for Inference
  122. Basics of Confidence Intervals
  123. Basics of Prediction Intervals
  124. Probability for Inference and Prediction
  125. Summary
  126. 18. Case Study: How to Weigh a Donkey
  127. Donkey Study Question and Scope
  128. Wrangling and Transforming
  129. Exploring
  130. Modeling a Donkey’s Weight
  131. Summary
  132. Part VI. Classification
  133. 19. Classification
  134. Example: Wind-Damaged Trees
  135. Modeling and Classification
  136. Modeling Proportions (and Probabilities)
  137. A Loss Function for the Logistic Model
  138. From Probabilities to Classification
  139. Summary
  140. 20. Numerical Optimization
  141. Gradient Descent Basics
  142. Minimizing Huber Loss
  143. Convex and Differentiable Loss Functions
  144. Variants of Gradient Descent
  145. Summary
  146. 21. Case Study: Detecting Fake News
  147. Question and Scope
  148. Obtaining and Wrangling the Data
  149. Exploring the Data
  150. Modeling
  151. Summary
  152. Additional Material
  153. Data Sources
  154. Index
Title: Learning Data Science (Reprint Edition)
Domestic publisher: Southeast University Press
Publication date: March 2024
Pages: 594
ISBN: 978-1098113001
Original title: Learning Data Science
Original publisher: O'Reilly Media
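Sam Lau

Sam Lau is an assistant teaching professor at the Halicioglu Data Science Institute at the University of California, San Diego. Sam has ten years of teaching experience and has designed and taught first-rate data science courses at UC Berkeley and UC San Diego.

Joseph Gonzalez

Joey Gonzalez is an associate professor in the Department of Electrical Engineering and Computer Sciences at UC Berkeley, a member of the Berkeley AI Research group, and a founding member of the Berkeley RISE Lab. He also cofounded Turi Inc. and Aqueduct, which build tools for data scientists.

Deborah Nolan

Deborah Nolan is professor emerita of statistics and associate dean for students in the College of Computing, Data Science, and Society at UC Berkeley.
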
The animal on the cover of Learning Data Science is an edible dormouse (Glis glis). As you might suspect, these creatures have wound up in human cuisine. The edible dormouse was served grilled as a delicacy in ancient Rome and is still consumed today in Croatia and Slovenia. Edible dormice have squirrel-like bodies with small ears, short legs, large feet, and long, bushy tails. Their front feet have four digits and their hind feet have five. They are predominantly covered in gray to gray-brown fur with white underbellies. Their feet have naked soles that secrete a sticky substance that enables climbing.
These nocturnal creatures spend most of their time in trees. They can be found across Europe and in parts of western and central Asia. While the IUCN categorizes edible dormice as a species of Least Concern, they are threatened by illegal hunting and habitat loss. Many of the animals on O’Reilly covers are endangered; all of them are important to the world. The cover illustration is by Karen Montgomery, based on an antique line engraving from Lydekker’s Royal Natural History.
Purchase options
Price: 169.00 yuan
ISBN: 978-1098113001
Publisher: Southeast University Press