CS3352-FOUNDATIONS OF DATA SCIENCE IMPORTANT QUESTIONS

COURSE OBJECTIVES:
• To understand the data science fundamentals and process.
• To learn to describe the data for the data science process.
• To learn to describe the relationship between data.
• To utilize the Python libraries for data wrangling.
• To present and interpret data using visualization libraries in Python.

UNIT I INTRODUCTION
Data Science: Benefits and uses – facets of data – Data Science Process: Overview – Defining
research goals – Retrieving data – Data preparation – Exploratory Data Analysis – Building the model –
presenting findings and building applications – Data Mining – Data Warehousing – Basic Statistical
descriptions of Data

UNIT II DESCRIBING DATA
Types of Data – Types of Variables – Describing Data with Tables and Graphs – Describing Data
with Averages – Describing Variability – Normal Distributions and Standard (z) Scores

UNIT III DESCRIBING RELATIONSHIPS
Correlation – Scatter plots – correlation coefficient for quantitative data – computational formula for
correlation coefficient – Regression – regression line – least squares regression line – Standard
error of estimate – interpretation of r² – multiple regression equations – regression towards the mean

UNIT IV PYTHON LIBRARIES FOR DATA WRANGLING
Basics of NumPy arrays – aggregations – computations on arrays – comparisons, masks, boolean
logic – fancy indexing – structured arrays – Data manipulation with Pandas – data indexing and
selection – operating on data – missing data – Hierarchical indexing – combining datasets –
aggregation and grouping – pivot tables

UNIT V DATA VISUALIZATION
Importing Matplotlib – Line plots – Scatter plots – visualizing errors – density and contour plots –
Histograms – legends – colors – subplots – text and annotation – customization – three dimensional
plotting – Geographic Data with Basemap – Visualization with Seaborn

UNIT I

1. Exploratory data analysis
2. Data preparation
3. Data mining and data warehousing
4. Data science process

UNIT II

1. Describing data with tables
[Frequency distribution for quantitative data, constructing a frequency distribution, outliers, relative and cumulative frequency distributions, frequency distribution for qualitative data]
2. Graphs for quantitative data
3. Problems
[Standard deviation, interquartile range, z scores]
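
The Unit II problem types (standard deviation, interquartile range, z scores) can be checked numerically with a minimal sketch like the one below. The `scores` values are made up for illustration. The sketch uses the population standard deviation (dividing by n) and NumPy's default linear-interpolation quartiles; a textbook that divides by n - 1 or uses a different quartile convention will give slightly different numbers.

```python
import numpy as np

# Hypothetical data set, chosen only for illustration.
scores = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])

mean = scores.mean()

# Population standard deviation (divides by n).
# Use scores.std(ddof=1) for the sample version (divides by n - 1).
std = scores.std()

# Interquartile range: third quartile minus first quartile.
# np.percentile interpolates linearly between data points by default.
q1, q3 = np.percentile(scores, [25, 75])
iqr = q3 - q1

# z score: distance of each value from the mean in standard deviation units.
z_scores = (scores - mean) / std

print("mean:", mean)
print("standard deviation:", std)
print("IQR:", iqr)
print("z scores:", z_scores)
```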

UNIT III

1. Interpretation of r²
2. Standard error of estimate (problem)
3. Regression and least squares regression (problem)
4. Coefficient of correlation (problem)
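
For the Unit III problems, the sketch below works through the correlation coefficient, the least squares regression line, the standard error of estimate, and r² on a small made-up data set. The `x` and `y` values are hypothetical, and the standard error here divides by n - 2, which is one common textbook convention; check which formula the prescribed text uses.

```python
import numpy as np

# Hypothetical paired observations, chosen only for illustration.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])
n = len(x)

# Sums of squares and sum of cross products about the means.
ss_x = np.sum((x - x.mean()) ** 2)
ss_y = np.sum((y - y.mean()) ** 2)
sp_xy = np.sum((x - x.mean()) * (y - y.mean()))

# Pearson correlation coefficient.
r = sp_xy / np.sqrt(ss_x * ss_y)

# Least squares regression line: y' = intercept + slope * x.
slope = sp_xy / ss_x
intercept = y.mean() - slope * x.mean()

# Standard error of estimate (assumed convention: divide by n - 2).
residuals = y - (intercept + slope * x)
std_error = np.sqrt(np.sum(residuals ** 2) / (n - 2))

# r squared: proportion of the variance in y predictable from x.
r_squared = r ** 2

print("r:", r)
print("regression line: y' =", intercept, "+", slope, "* x")
print("standard error of estimate:", std_error)
print("r squared:", r_squared)
```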

UNIT IV

1. NumPy arrays
2. Comparisons, masks, Boolean arrays
3. Structured arrays, hierarchical indexing
4. Data manipulation with Pandas
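
A short sketch of three of the Unit IV topics (Boolean masks, structured arrays, and hierarchical indexing) is given below. All names and values (`values`, `people`, `enrolment`, the departments and years) are hypothetical and exist only to illustrate the syntax.

```python
import numpy as np
import pandas as pd

# Boolean mask: combine comparisons element-wise with & (and), | (or), ~ (not).
values = np.array([3, 8, 1, 9, 4])
mask = (values > 2) & (values < 9)
selected = values[mask]

# Structured array: each element is a record with named, typed fields.
people = np.zeros(2, dtype=[("name", "U10"), ("age", "i4")])
people["name"] = ["Asha", "Ravi"]
people["age"] = [21, 23]
older_names = people[people["age"] > 22]["name"]

# Hierarchical (multi-level) index on a pandas Series.
index = pd.MultiIndex.from_tuples(
    [("A", 2023), ("A", 2024), ("B", 2023)], names=["dept", "year"]
)
enrolment = pd.Series([60, 65, 55], index=index)
dept_a = enrolment.loc["A"]  # selects the outer level, leaving a Series indexed by year

print(selected)
print(older_names)
print(dept_a)
```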

UNIT V

1. Line plots and scatter plots
2. Geographic data with Basemap – visualization with Seaborn
3. Density and contour plots
4. Histograms – legends – three dimensional plotting
