CoderzColumn : Data Science Tutorials (Page: 1)

Data Science Tutorials


Data science is an interdisciplinary field that uses scientific methods, procedures, algorithms, and systems to extract knowledge and insights from noisy, structured, and unstructured data, and applies them across a wide range of application domains.

The tutorials in this section cover data visualization libraries such as matplotlib, Bokeh, bqplot, Plotnine, cufflinks, Altair, hvplot, Holoviews, seaborn, and more.

They also cover interactive charts such as sunburst charts, Sankey diagrams (alluvial), candlestick charts, network charts, chord diagrams, parallel coordinates plots, radar charts, connection maps, treemaps, choropleth maps, and scatter and bubble maps.

Apart from this, you will find tutorials about time series data and its applications, creating dashboards, and other concepts.

For an in-depth understanding of the above concepts, check out the sections below.

Recent Data Science Tutorials


Tags candlestick, mplfinance, plotly, bokeh, bqplot, c…
Candlestick Chart in Python (mplfinance, plotly, bokeh, bqplot & cufflinks)
Data Science

A simple guide to creating candlestick charts in Python using the data visualization libraries mplfinance (matplotlib), Plotly, Bokeh, Bqplot, and Cufflinks. The tutorial covers a simple styling guide as well. Charts created using mplfinance are static, whereas charts created with the other libraries are interactive.

Sunny Solanki
Tags data-visualizaton, altair
Altair - Basic Interactive Plotting in Python
Data Science

A simple guide on how to create interactive charts in a Jupyter notebook using the Python library Altair, with simple examples. Altair is a Python data visualization library built on top of the JavaScript libraries Vega and Vega-Lite. The tutorial covers basic charts like scatter charts, bar charts, line charts, area charts, histograms, box plots, pie charts, heatmaps, etc. It is a good starting point for beginners.

Sunny Solanki
Tags time-series, dates, times, timezones
Dates, Timestamps, Timedeltas, Periods & Time Zone Handling in Python using Pandas
Data Science

A simple guide to working with dates, timestamps, periods, time deltas, and time zones using the Python library Pandas. The tutorial covers creating date-time ranges, timestamps, time deltas, periods, and period ranges; adding time deltas to and subtracting them from dates and periods; adding a time zone to dates; converting dates between time zones; and more.
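
The operations listed above can be sketched roughly as follows. This is a minimal illustration with made-up dates, assuming only that pandas is installed; it is not taken from the tutorial itself, and the variable names are hypothetical.

```python
import pandas as pd

# A single point in time and a fixed-length duration (illustrative values).
ts = pd.Timestamp("2024-01-15 09:30")
delta = pd.Timedelta(days=2, hours=3)
shifted = ts + delta  # adding a time delta to a timestamp

# A range of daily timestamps and a range of monthly periods.
days = pd.date_range(start="2024-01-01", periods=5, freq="D")
months = pd.period_range(start="2024-01", periods=3, freq="M")

# A period represents a span of time; adding an integer moves it
# forward by that many whole periods.
next_month = pd.Period("2024-01", freq="M") + 1

# Attach a time zone to a naive timestamp, then convert it to another zone.
ts_utc = ts.tz_localize("UTC")
ts_ny = ts_utc.tz_convert("America/New_York")

print(shifted)
print(list(months))
print(next_month)
print(ts_ny)
```

Note that 'tz_localize()' only labels a naive timestamp with a zone, while 'tz_convert()' expresses the same instant in a different zone.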

Sunny Solanki
Tags time-series, resampling, moving-window-functions
Time Series - Resampling & Moving Window Functions in Python using Pandas
Data Science

A detailed guide to resampling time series data using the Python library Pandas. The tutorial covers the pandas functions 'asfreq()' and 'resample()' for upsampling and downsampling time series data. Apart from resampling, the tutorial explains how to apply moving window functions ('rolling()', 'expanding()' & 'ewm()') to time series data. Rolling windows, expanding windows, and exponential moving averages are all covered.
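
As a rough sketch of the functions named above, the snippet below downsamples and upsamples a small synthetic daily series and then applies the three window functions. The data, bin sizes, and window sizes are assumptions chosen for illustration, not values from the tutorial.

```python
import numpy as np
import pandas as pd

# Hypothetical daily series: the values 0..13 over two weeks.
index = pd.date_range("2024-01-01", periods=14, freq="D")
s_daily = pd.Series(np.arange(14.0), index=index)

# Downsample: aggregate daily values into 7-day bins.
weekly_sum = s_daily.resample("7D").sum()

# Upsample: move the 7-day series back to a daily frequency.
# Without a fill method the new rows are NaN; 'ffill' carries values forward.
upsampled = weekly_sum.asfreq("D")
upsampled_filled = weekly_sum.asfreq("D", method="ffill")

# Moving window functions on the daily series.
rolling_mean = s_daily.rolling(window=3).mean()   # fixed-size sliding window
expanding_mean = s_daily.expanding().mean()       # window grows from the start
ewm_mean = s_daily.ewm(span=3).mean()             # exponentially weighted average

print(weekly_sum)
print(rolling_mean.head())
```

The first two rolling values are NaN because a window of three observations is not yet full; passing 'min_periods=1' to 'rolling()' is one way to change that.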

Sunny Solanki
Tags sunburst-chart, plotly
How to Create Sunburst Chart / Diagram in Python [Plotly]?
Data Science

A simple guide to creating sunburst charts in Python using the interactive data visualization library Plotly. The tutorial explains how to use the library's Plotly Express and graph objects APIs to create sunburst charts. The sunburst chart is also referred to by other names, such as multi-level pie chart, ring chart, donut chart, doughnut chart, or radial treemap.

Sunny Solanki
Tags time-series, trend, seasonality, pandas
How to Remove Trend & Seasonality from Time-Series Data in Python?
Data Science

The tutorial provides a brief guide to detecting stationarity (the absence of trend and seasonality) in time series data. After checking for stationarity, it explains various ways to remove trend and seasonality from a time series to make it stationary.
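
One common way to remove trend and seasonality is differencing, sketched below on a synthetic series. This is only an illustration under stated assumptions: the summary above does not say which methods the tutorial uses, the series (a linear trend plus a pattern that repeats every 4 steps) is made up, and no stationarity test is shown.

```python
import numpy as np
import pandas as pd

# Synthetic series: linear trend plus a seasonal pattern of period 4.
t = np.arange(24)
trend = 0.5 * t
seasonal = np.tile([1.0, -1.0, 2.0, -2.0], 6)
index = pd.date_range("2024-01-01", periods=24, freq="D")
y = pd.Series(trend + seasonal, index=index)

# Seasonal differencing (lag equal to the season length) cancels the
# repeating pattern; the linear trend shrinks to a constant 0.5 * 4 = 2.0.
seasonal_diff = y.diff(4).dropna()

# A further first-order difference removes that remaining constant.
stationary = seasonal_diff.diff().dropna()

print(seasonal_diff.unique())
print(stationary.unique())
```

On real data the result is rarely exactly constant, so the differenced series would normally be checked again for stationarity before modeling.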

Sunny Solanki
Tags cufflinks, plotly, pandas
cufflinks [Python] - How to create plotly charts from pandas dataframe with one line of code?
Data Science

A detailed guide on how to use the Python library "cufflinks" to create interactive data visualizations and charts. Cufflinks is built on top of Plotly and lets us create charts by calling the 'iplot()' method on a Pandas dataframe. The 'iplot()' method mimics the matplotlib-based 'plot()' API of Pandas dataframes but generates charts with Plotly.

Sunny Solanki
Tags sankey-diagram, holoviews, plotly
How to Create Sankey Diagrams (Alluvial) in Python (holoviews & plotly)?
Data Science

A detailed guide to creating Sankey diagrams (alluvial diagrams) using the Python data visualization libraries Plotly and Holoviews (with its Bokeh and Matplotlib backends). The charts are interactive and are displayed in Jupyter notebooks.

Sunny Solanki
Tags missing-data, visualization
missingno - Visualize Missing Values (NaNs / Null Values) Distribution in Datasets [Python]
Data Science

The tutorial explains how to use the Python module "missingno" to analyze the distribution of missing data (NaNs/NULLs/None values) in datasets. It lets us create various charts that show the spread of missing data from different angles, which can help us make better decisions.

Sunny Solanki
Tags dashboard, streamlit, cufflinks, plotly
How to Create Basic Dashboard using Streamlit and Cufflinks (Plotly)?
Data Science

Sunny Solanki
Python Data Visualization Libraries


Data visualization is the graphical representation of information and data. It is one of the most efficient ways of communicating information, because humans are quite good at spotting patterns in visual form.

Python has many libraries that can help us create data visualizations. Some of these libraries (matplotlib, seaborn, plotnine, etc.) generate static charts, whereas others (bokeh, plotly, bqplot, altair, holoviews, cufflinks, hvplot, etc.) generate interactive charts. Most basic visualizations, such as bar charts, line charts, scatter plots, histograms, box plots, and pie charts, are supported by all of these libraries. Many libraries also support advanced visualizations, widgets, and dashboards.

Advanced Data Visualizations using Python


Basic data visualizations like bar charts, line charts, scatter plots, histograms, box plots, pie charts, etc. are quite good at representing information and exploring relationships between data variables.

But sometimes these visualizations are not enough, and we need to analyze data from different perspectives. For this purpose, many advanced visualizations have been developed over time, such as Sankey diagrams, candlestick charts, network charts, chord diagrams, sunburst charts, radar charts, and parallel coordinates charts. Python has many data visualization libraries that let us create such advanced data visualizations.

Dashboards using Python


Dashboards are used almost everywhere. A dashboard is a GUI that combines various visualizations and metrics and can be used to monitor key performance indicators. Dashboards have applications in a very wide range of fields.

Python has many libraries (dash, panel, streamlit, bokeh, etc.) that let us create dashboards and include widgets and interactive data visualizations in them.

Work with Time Series Data in Python


A time series is a type of data where data points are recorded in time order or at specified time intervals. Many real-world datasets, such as stock prices, weather indicators, heights of ocean tides, and retail sales, are time series.

Time series analysis involves various tasks like resampling time series, applying moving window functions, forecasting, classification, etc.

Python has various libraries (pandas, statsmodels, etc.) that let us load and work with time series data efficiently.

Visualize Maps using Python


Maps are one of the best ways to display and analyze geospatial data. They help us see geographic patterns and trends more clearly, which can lead to better decision-making.

Many different types of maps have been developed over time to analyze data from different perspectives. Some common map visualization types are choropleth maps, scatter maps, bubble maps, connection maps, etc. Apart from these, we can also include pins on maps to identify locations.

Python has many different libraries (geopandas, folium, ipyleaflet, cartopy, geoviews, geoplot, bokeh, altair, plotly, hvplot, etc) that let us create static as well as interactive maps.

Exploratory Data Analysis using Python


Exploratory data analysis (commonly referred to as EDA) is an initial analysis of data to look for various relationships, anomalies, missing values, distributions, basic statistics, etc. It helps us understand data better to make further decisions. Various stats are calculated and statistical visualizations are created during EDA to understand data.

Python provides many different tools / libraries (Sweetviz, missingno, seaborn, pandas, etc) for performing EDA. It's quite common to use more than one of these tools to perform EDA.
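
A first EDA pass of the kind described above might look like the sketch below. It uses only pandas on a small made-up dataframe; the column names and values are hypothetical, and the other tools mentioned (Sweetviz, missingno, seaborn) are not shown.

```python
import numpy as np
import pandas as pd

# Hypothetical dataset with a few missing values.
df = pd.DataFrame({
    "height_cm": [150.0, 160.0, np.nan, 180.0, 170.0],
    "weight_kg": [50.0, 60.0, 65.0, np.nan, 70.0],
    "group": ["a", "b", "a", "b", None],
})

# Basic statistics (count, mean, std, quartiles) for numeric columns.
summary = df.describe()

# Number of missing values in each column.
missing = df.isna().sum()

# Pairwise correlation between numeric columns; rows with NaN in either
# column are skipped for that pair.
corr = df[["height_cm", "weight_kg"]].corr()

# Distribution of a categorical column (missing values are excluded).
group_counts = df["group"].value_counts()

print(summary)
print(missing)
print(corr)
print(group_counts)
```

These tables are usually followed by plots (histograms, box plots, missing-value charts) built with one of the visualization libraries listed above.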