Data Visualization in Python - Biotec
Data Visualization in Python
Violinplot. Michael Waskom. https://blog.modeanalytics.com/images/post-images/viz-libraries-02.png
Visualizing Information
http://mapdesign.icaci.org/wp-content/uploads/2014/01/MapCarte32_ise_large.png
Guide for Visitors to Ise Shrine. Adapted from Edward R. Tufte: Envisioning Information
„All communication [...] [to] readers of an image must now take place on a two-dimensional surface.“
Graphical Excellence
Tableaux Graphiques et Cartes Figuratives de M. Minard. Adapted from Edward R. Tufte: The Visual Display of Quantitative Information
„Graphical excellence consists of complex ideas communicated with clarity, precision, and efficiency“
Tools for generating plots
• R (statistics software) + ggplot2
• Python with matplotlib or seaborn or ...
• Online tools: e.g. plot.ly
• (Excel)
=> Since you are already familiar with Python syntax, why not use it directly for visualization?
Matplotlib
Material adapted from Matplotlib Pyplot tutorial
• 2D plotting library for Python
• Many plot types: scatter, histogram, bar...
• You have control over everything
  • Data in the plot
  • Fonts, styling
  • Axes
  • Additional elements (lines, etc.)
• Large user base offering examples, tutorials, help
• Seaborn as an addition for more advanced and nicer layouts
• Website: matplotlib.org
Simple Usage Example
import matplotlib.pyplot as plt
xdata = [1, 2, 3, 4]
ydata = [1, 4, 9, 16]
plt.plot(xdata, ydata, "ro")
plt.axis([0, 6, 0, 20])
plt.savefig("scatterplot.pdf")  # save before show(); saving afterwards can produce an empty file
plt.show()
Code for simple scatterplot
Import just one module from matplotlib and give it the shorter alias plt
"ro" = red dots
Save generated plot as PDF
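The format string in the example above can be varied. The following is a minimal sketch, assuming an off-screen "Agg" backend (an assumption made here so that no window is needed); the variable names are illustrative only. It shows how a color code, a marker code, and a line style code combine in one short string.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: render off-screen, no display required
import matplotlib.pyplot as plt

xdata = [1, 2, 3, 4]
ydata = [1, 4, 9, 16]

# A format string combines a color, a marker and a line style; each part is optional.
(red_dots,) = plt.plot(xdata, ydata, "ro")        # red circles, no connecting line
(green_dashed,) = plt.plot(xdata, ydata, "g--")   # green dashed line, no markers
(blue_tri_line,) = plt.plot(xdata, ydata, "b^-")  # blue triangles joined by a solid line

# Inspect what matplotlib derived from each format string.
for line in (red_dots, green_dashed, blue_tri_line):
    print(line.get_color(), line.get_marker(), line.get_linestyle())
```

plot() returns a list of line objects, one per plotted data set, which is why each result is unpacked with a one-element tuple.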
A short note on libraries
• All functions we have used so far come from the Python standard library
  • It is shipped with Python, no installation necessary
  • E.g. sys, random, os, math
• Matplotlib is an external library and needs to be installed
  • Fortunately, Python has a package system which makes this process easy
pip3 install --user matplotlib
System terminal command for installation of matplotlib (already installed on computer pool machines!)
pip = "pip installs packages"
This also works for other packages, e.g. seaborn
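Before running the pip command, it can help to check whether a library is already available. This is a small sketch using only the standard library; is_installed is a hypothetical helper name chosen for this example, not something provided by pip or matplotlib.

```python
import importlib.util


def is_installed(module_name):
    """Return True if Python can locate a top-level module with this name."""
    return importlib.util.find_spec(module_name) is not None


# math ships with Python; matplotlib and seaborn are external packages.
for name in ("math", "matplotlib", "seaborn"):
    status = "available" if is_installed(name) else "missing (install it with pip3)"
    print(f"{name}: {status}")
```

The check only looks the module up; it does not import it, so it is cheap to run.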
Workflow I: Choose plot type
data.csv:
x,y
1,1
2,4
3,9
4,16
http://matplotlib.org/users/screenshots.html
Bar chart -> bar()
Scatter/line plot -> plot()
Histogram -> hist()
Anything else?
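To compare the three plot types, the same small data set can be drawn with each function. The sketch below assumes the off-screen "Agg" backend and a hypothetical output file name, plot_types.png; neither comes from the original slides.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: render off-screen, no display required
import matplotlib.pyplot as plt

xdata = [1, 2, 3, 4]
ydata = [1, 4, 9, 16]

plt.figure(figsize=(9, 3))

plt.subplot(1, 3, 1)          # first of three panels in one row
plt.bar(xdata, ydata)         # one bar per (x, y) pair
plt.title("bar()")

plt.subplot(1, 3, 2)
plt.plot(xdata, ydata, "ro")  # one red dot per (x, y) pair
plt.title("plot()")

plt.subplot(1, 3, 3)
counts, bin_edges, _ = plt.hist(ydata, bins=4)  # how many y values fall into each bin
plt.title("hist()")

plt.tight_layout()
plt.savefig("plot_types.png")  # hypothetical file name
```

Note that hist() takes a single list of values and counts them per bin, whereas bar() and plot() take paired x and y values.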
Workflow II: Prepare Data
data.csv:
x,y
1,1
2,4
3,9
4,16
xdata, ydata = [], []
with open('data.csv', 'r') as f:
    for line in f:
        if not line.startswith('x'):  # skip the header line
            x, y = line.strip().split(',')
            xdata.append(int(x))  # convert the text to numbers
            ydata.append(int(y))
print(xdata, ydata)
[1, 2, 3, 4] [1, 4, 9, 16]
Storing your data in lists is a good idea *
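As an alternative to splitting lines by hand, the csv module from the standard library can do the parsing. This is a sketch under one assumption: the file content is held in an in-memory string (csv_text, a stand-in for data.csv) so that the example runs without a file on disk.

```python
import csv
import io

# Stand-in for open('data.csv'): the same rows, kept in memory for this sketch.
csv_text = "x,y\n1,1\n2,4\n3,9\n4,16\n"

xdata, ydata = [], []
with io.StringIO(csv_text) as f:
    reader = csv.DictReader(f)  # the first row ("x,y") becomes the column names
    for row in reader:
        xdata.append(int(row["x"]))  # values arrive as text, so convert them
        ydata.append(int(row["y"]))

print(xdata, ydata)
```

With a real file, replacing io.StringIO(csv_text) by open('data.csv', newline='') should be the only change needed.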
Workflow III: construct basic plot
• Read the documentation of the plot function
• E.g. for scatter plot -> pyplot.plot()
• matplotlib.org/api/pyplot_api.html#matplotlib.pyplot.plot
• The plot() function is very flexible concerning its input
import matplotlib.pyplot as plt
xdata = [1,2,3,4]
ydata = [1,4,9,16]
plt.plot(xdata,ydata)

...
plt.plot(xdata,ydata,'ro')

import matplotlib.pyplot as plt
data = [1,2,3,4]
plt.plot(data)
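When plot() receives a single list, as in the last snippet, it treats the values as Y data and uses the list indices as X values. A short sketch to confirm this, assuming the off-screen "Agg" backend; x_used and y_used are names introduced only for this example.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: render off-screen, no display required
import matplotlib.pyplot as plt

data = [1, 2, 3, 4]

(line,) = plt.plot(data)  # only one list given: it is used as the Y values

x_used = list(line.get_xdata())  # X values that matplotlib filled in (the indices)
y_used = list(line.get_ydata())  # the values we passed
print("x:", x_used)
print("y:", y_used)
```

The X values therefore start at 0, not at 1, which is worth keeping in mind when the position of a value carries meaning.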
Workflow III: construct basic plot
• In our case, we want to plot the X against the Y values
• Dots look best for this application
• Also, let's use some additional data
import matplotlib.pyplot as plt
xdata = [1,2,3,4,5,6,7,8]
ydata = [1,4,9,16,25,36,49,64]
plt.plot(xdata, ydata, 'go')
plt.savefig('scatter.pdf')
This time, it's green dots
However, several things could be improved...
We don't have a title
No axis labels
Would be nice to have larger dots and a line
First/last data points almost hidden
Workflow IV: Add elements
import matplotlib.pyplot as plt
xdata = [1,2,3,4,5,6,7,8]
ydata = [1,4,9,16,25,36,49,64]
plt.plot(xdata, ydata, color='g',
         marker='o', markersize=10)
plt.title("Square Function")
plt.xlabel("X")
plt.ylabel("Y")
plt.xlim([0,9])
plt.ylim([-1,70])
plt.savefig('scatter.pdf')
Standard line with additional markers (dots) for the data points; markersize makes the markers larger (size in points)
Adds title and axis labels
Adjusts axis range
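The axis ranges above are typed in by hand. One possible way to derive them from the data instead is sketched below; padded_limits is a hypothetical helper written for this example (not a matplotlib function), and the 10 % margin is an arbitrary assumption.

```python
def padded_limits(values, fraction=0.1):
    """Return (low, high) covering all values plus a margin on each side.

    The margin is `fraction` of the data range. Hypothetical helper for
    this sketch; it is not part of matplotlib.
    """
    low, high = min(values), max(values)
    margin = (high - low) * fraction
    return low - margin, high + margin


xdata = [1, 2, 3, 4, 5, 6, 7, 8]
ydata = [1, 4, 9, 16, 25, 36, 49, 64]

x_limits = padded_limits(xdata)
y_limits = padded_limits(ydata)
print(x_limits, y_limits)
```

The resulting pairs can be passed on as plt.xlim(x_limits) and plt.ylim(y_limits), so that the first and last data points are no longer cut off at the border.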
Practical Example: Bar Plots
import matplotlib.pyplot as plt
plt.style.use("grayscale")
data = {"Streptomyces": 72, "Halobacterium": 67, "Plasmodium": 20}
positions = range(len(data))
plt.barh(
    positions,
    list(data.values()),
    align="center",
    alpha=0.4,
)
plt.yticks(positions, list(data.keys()))
plt.title("Average GC Content by Organism")
plt.xlabel("GC Content in %")
plt.tight_layout()
plt.show()
predefined gray color scheme
GC content data (x-axis)
bar positions (y-axis)
horizontal bar chart. For vertical, use bar()
Add ticks/labels for y-axis
Auto optimize space
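For the vertical variant mentioned above, the roles of the axes swap: bar() replaces barh(), xticks() replaces yticks(), and the value label moves to the Y axis. A minimal sketch, assuming the off-screen "Agg" backend and a hypothetical output file name:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: render off-screen, no display required
import matplotlib.pyplot as plt

data = {"Streptomyces": 72, "Halobacterium": 67, "Plasmodium": 20}
positions = range(len(data))

# Vertical bars: positions run along the X axis, values along the Y axis.
bars = plt.bar(positions, list(data.values()), align="center", alpha=0.4)
plt.xticks(positions, list(data.keys()))  # organism names as X tick labels
plt.title("Average GC Content by Organism")
plt.ylabel("GC Content in %")
plt.tight_layout()
plt.savefig("gc_content_vertical.png")  # hypothetical file name

heights = [bar.get_height() for bar in bars]  # bar heights equal the data values
```

Horizontal bars tend to suit long category names better, since the labels can be read without rotating them.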
Summary
• Visualization supports our understanding
• Different Python libraries and modules for visualization
• Matplotlib's popular Pyplot module for creating 2-dimensional plots (scatter, bar...)
• Matplotlib needs to be installed using pip3 (Python package manager)
• Many different ways to prepare data and to configure output
• Pyplot's styles for uniform plot styling
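As a sketch of the last point, the available styles can be listed, and a style can be applied to a single plot only. The example assumes the off-screen "Agg" backend and a hypothetical output file name, styled.png.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: render off-screen, no display required
import matplotlib.pyplot as plt

# Names that can be passed to plt.style.use() or plt.style.context().
available_styles = sorted(plt.style.available)
print(available_styles)

xdata = [1, 2, 3, 4]
ydata = [1, 4, 9, 16]

# The style applies only inside the with-block; later plots keep the defaults.
with plt.style.context("grayscale"):
    plt.plot(xdata, ydata, marker="o")
    plt.title("Square Function (grayscale style)")
    plt.savefig("styled.png")  # hypothetical file name
```

plt.style.use("grayscale"), as used in the bar plot example, changes the style for everything that follows, while the context form limits it to one block.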