SQL for data science : data cleaning, wrangling and analytics with relational databases (Record no. 30489)

000 -LEADER
fixed length control field	a
008 - FIXED-LENGTH DATA ELEMENTS--GENERAL INFORMATION
fixed length control field	211026b xxu||||| |||| 00| 0 eng d
020 ## - INTERNATIONAL STANDARD BOOK NUMBER
International Standard Book Number	9783030575915
082 ## - DEWEY DECIMAL CLASSIFICATION NUMBER
Classification number	005.74
Item number	BAD
100 ## - MAIN ENTRY--PERSONAL NAME
Personal name	Badia, Antonio
245 ## - TITLE STATEMENT
Title	SQL for data science : data cleaning, wrangling and analytics with relational databases
260 ## - PUBLICATION, DISTRIBUTION, ETC. (IMPRINT)
Name of publisher, distributor, etc	Springer,
Date of publication, distribution, etc	2020
Place of publication, distribution, etc	Cham :
300 ## - PHYSICAL DESCRIPTION
Extent	xi, 285 p. ;
Other physical details	ill.,
Dimensions	24 cm
365 ## - TRADE PRICE
Price amount	49.99
Price type code	EUR
Unit of pricing	90.50
490 ## - SERIES STATEMENT
Series statement	Data-centric systems and applications,
Volume number/sequential designation	2197-9723
504 ## - BIBLIOGRAPHY, ETC. NOTE
Bibliography, etc	Includes bibliographical references and index.
520 ## - SUMMARY, ETC.
Summary, etc	This textbook explains SQL within the context of data science and introduces the different parts of SQL as they are needed for the tasks usually carried out during data analysis. Using the framework of the data life cycle, it focuses on the steps that are very often given short shrift in traditional textbooks, like data loading, cleaning and pre-processing. The book is organized as follows. Chapter 1 describes the data life cycle, i.e. the sequence of stages from data acquisition to archiving, that data goes through as it is prepared and then actually analyzed, together with the different activities that take place at each stage. Chapter 2 gets into databases proper, explaining how relational databases organize data. Non-traditional data, like XML and text, are also covered. Chapter 3 introduces SQL queries, but unlike traditional textbooks, queries and their parts are described around typical data analysis tasks like data exploration, cleaning and transformation. Chapter 4 introduces some basic techniques for data analysis and shows how SQL can be used for some simple analyses without too much complication. Chapter 5 introduces additional SQL constructs that are important in a variety of situations and thus completes the coverage of SQL queries. Lastly, chapter 6 briefly explains how to use SQL from within R and from within Python programs. It focuses on how these languages can interact with a database, and how what has been learned about SQL can be leveraged to make life easier when using R or Python. All chapters contain a lot of examples and exercises on the way, and readers are encouraged to install the two open-source database systems (MySQL and Postgres) that are used throughout the book in order to practice and work on the exercises, because simply reading the book is much less useful than actually using it. This book is for anyone interested in data science and/or databases. It just demands a bit of computer fluency, but no specific background in databases or data analysis. All concepts are introduced intuitively and with a minimum of specialized jargon. After going through this book, readers should be able to profitably learn more about data mining, machine learning, and database management from more advanced textbooks and courses.
650 ## - SUBJECT ADDED ENTRY--TOPICAL TERM
Topical term or geographic name as entry element	Database Management

Topical term or geographic name as entry element	Big Data Analytics

Topical term or geographic name as entry element	SQL

Topical term or geographic name as entry element	Computer program language

Topical term or geographic name as entry element	Python

Topical term or geographic name as entry element	Association Rule

Topical term or geographic name as entry element	Binning

Topical term or geographic name as entry element	Duplicate data

Topical term or geographic name as entry element	Foreign Key

Topical term or geographic name as entry element	Outliers

Topical term or geographic name as entry element	Subquery

Topical term or geographic name as entry element	Unstructured data

Topical term or geographic name as entry element	Big data

Topical term or geographic name as entry element	Meta data

Topical term or geographic name as entry element	Data cleaning
942 ## - ADDED ENTRY ELEMENTS (KOHA)
Source of classification or shelving scheme
Item type	Books

Holdings
Withdrawn status	Lost status	Source of classification or shelving scheme	Damaged status	Not for loan	Permanent location	Current location	Date acquired	Cost, normal purchase price	Total Checkouts	Full call number	Barcode	Date last seen	Date last borrowed	Koha item type
					DAIICT	DAIICT	2021-10-21	4524.10	6	005.74 BAD	032640	2022-10-06	2022-09-22	Books
