Programming For Data Science
NUMBER OF CREDITS: 2
DURATION: 8 weeks
ENROLLMENT REQUIREMENTS
This Module is designed as an applied introduction that includes some statistical concepts. The work will be done in the R language, so you
will need to install the programming language and a code editor, both of which are free.
No previous knowledge is required, although experience in computer programming and statistics may make the material feel more familiar.
All the contents are structured so that you can learn from scratch. There are a great many topics and resources for learning R, so you
will often find references to manuals and online resources.
This Module will provide you with the basic tools you need to later delve into disciplinary applications.
Understanding the basic operation of programming languages for data science, with a focus on importing, managing, cleaning, preparing,
and exploring data, as necessary tools for the technical development of data science projects.
UNIT 1 COMPETENCY
Understanding the types of data involved in a data science project in order to choose the proper programming tool for processing them
efficiently and effectively.
SCENARIO 1 SCENARIO 2
“This document is the intellectual property of POLITECNICO GRANCOLOMBIANO; its total or partial reproduction is prohibited
without written authorization from the Rector's Office. ANY DOCUMENT PRINTED OR DOWNLOADED FROM THE
SYSTEM IS CONSIDERED AN UNCONTROLLED COPY.”
PROCESS: Design and Development of Academic Programs
FORMAT: SYLLABUS FOR VIRTUAL UNDERGRADUATE AND GRADUATE MODULES
Code: JD-RG-002
Version: 3
UNIT 2 COMPETENCY
Understanding the tools used to take raw data, clean it, and transform it for the subsequent modeling that will extract knowledge.
SCENARIO 3 SCENARIO 4
UNIT 3 COMPETENCY
Understanding the main strategies and instruments for initial data exploration, in order to draw hypotheses, select variables, and plan the
design of experiments with data mining tools.
SCENARIO 5
Understanding and developing multivariate descriptive analysis.
Runs multivariate analysis using programming code.
SCENARIO 6
Understanding data simulation tools, sampling, and stochastic processes.
Understands and applies the tools of simulation, sampling, and stochastic processes.
UNIT 4 COMPETENCY
Understanding and using the different packages for graphical data exploration, whether to represent data descriptively or to explain
results.
SCENARIO 7 SCENARIO 8
Understanding the use of Understands code that generates Presenting descriptive reports Presents descriptive reports with
Setup
Starting tasks Basic R
Types of data in R
Tibbles
Importing data
Loops and iterations
Conditional declarations (conditional statements)
Functions
Statistics in geometric spaces
Areal data
Some models applied to data analysis in economics
Decision trees
Missing values
Verification of the type of variable
Approach to outliers and missing data
Grouping
Functions applied to an entire data frame
V. REFERENCE MATERIALS
BIBLIOGRAPHY
Wickham, H. & Grolemund, G. (2016). R for data science: import, tidy, transform, visualize, and model data. O’Reilly Media, Inc.
Laude, H. (2017). Data Scientist y lenguaje R Guía de autoformación para el uso de Big Data. Eni.
De Jonge, E., & Van Der Loo, M. (2013). An introduction to data cleaning with R. Statistics Netherlands Heerlen.
Burns, E. (2021). Data Cleaning in R Made Simple. Towards Data Science. https://towardsdatascience.com/data-cleaning-in-r-made-simple-1b77303b0b17
Rincón, L. (2007). Curso elemental de probabilidad y estadística. Universidad UNAM.
Kabacoff, R. (2020). Data visualization with R. Wesleyan University.
Plotly. (n.d.). Plotly R Open Source Graphing Library. Plotly. https://plotly.com/r/
VI. APPENDICES
1. Didactic Development of Modules in Virtual Undergraduate Programs
2. Evaluation of Modules in Virtual Undergraduate Programs
3. Didactic Development of Modules in Virtual Graduate Programs
4. Evaluation of Modules in Virtual Graduate Programs