Introduction to Data Science, by Harvard biostatistics professor Rafael A. Irizarry, is an open-source book that provides, as you might have guessed, an introduction to data science:

The demand for skilled data science practitioners in industry, academia, and government is rapidly growing. This book introduces concepts and skills that can help you tackle real-world data analysis challenges. It covers concepts from probability, statistical inference, linear regression, and machine learning.