python how to format data for use with seaborn
# Short answer:
# Seaborn's plotting functions work most naturally with a long-form
# ("tidy") pandas dataframe: the rows from all samples are stacked in
# the same columns, and a label column records which sample each row
# came from.
# Example process:
# Say you're starting with 2 samples with x and y data in lists:
sample_1_x = [1,2,1,1.6,1.4,3,2,0.9,2.2,2.6,3,2]
sample_1_y = [2,2,1.5,1.6,1.4,3,2,3,2.2,2.6,3,2]
sample_2_x = [1,1.7,1,1.6,1.4,3,2,1,2.2,2.6,3,2.1]
sample_2_y = [2,3,1,1.6,1.7,3,2,0.9,2.3,2.6,2.5,2]
# First, import packages and make sample-specific pandas dataframes:
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
sample_1_df = pd.DataFrame({'x':sample_1_x, 'y':sample_1_y})
sample_2_df = pd.DataFrame({'x':sample_2_x, 'y':sample_2_y})
# Second, add a column of labels to distinguish data later on:
sample_1_df['labels'] = 'Sample_1'
sample_2_df['labels'] = 'Sample_2'
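# Concatenate the dataframes vertically (ignore_index=True renumbers the
# rows so the combined index has no duplicate labels):
vertical_concat = pd.concat([sample_1_df, sample_2_df], axis=0, ignore_index=True)
# View final format (middle rows omitted here):
print(vertical_concat)
#       x    y    labels
# 0   1.0  2.0  Sample_1
# 1   2.0  2.0  Sample_1
# 2   1.0  1.5  Sample_1
# ..  ...  ...       ...
# 12  1.0  2.0  Sample_2
# 13  1.7  3.0  Sample_2
# 14  1.0  1.0  Sample_2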
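# With more than two samples, the same label-then-concatenate steps can be
# done in a loop. A minimal sketch, assuming the samples are held in a dict
# keyed by label; the dict layout, the name `samples`, and the Sample_3
# values are made up for illustration:

```python
import pandas as pd

# Hypothetical container: one entry per sample, each holding x and y lists.
samples = {
    'Sample_1': {'x': [1, 2, 1], 'y': [2, 2, 1.5]},
    'Sample_2': {'x': [1, 1.7, 1], 'y': [2, 3, 1]},
    'Sample_3': {'x': [0.5, 2.5], 'y': [1, 2]},  # illustrative values only
}

# Build one labeled dataframe per sample, then stack them in a single call.
frames = [pd.DataFrame(data).assign(labels=name) for name, data in samples.items()]
long_df = pd.concat(frames, ignore_index=True)  # renumber rows 0..n-1
```

# long_df has the same x / y / labels layout as the two-sample example,
# so it can be passed to seaborn the same way.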
# Make plots in which samples are distinguished by their labels:
sns.scatterplot(data=vertical_concat, x='x', y='y', hue='labels')
# Display the figure (needed when running as a script rather than in a notebook):
plt.show()
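# If the data starts out as a wide table (one column per sample) rather
# than as separate lists, pandas' melt can produce the same labeled layout.
# A rough sketch, assuming all samples share the same x values; the table
# `wide_df` and its numbers are hypothetical:

```python
import pandas as pd

# Hypothetical wide-form table: shared x column, one y column per sample.
wide_df = pd.DataFrame({
    'x': [1, 2, 3],
    'Sample_1': [2.0, 2.0, 1.5],
    'Sample_2': [2.0, 3.0, 1.0],
})

# Stack the per-sample columns: the old column names become the 'labels'
# column and their values become the 'y' column.
long_df = wide_df.melt(id_vars='x', var_name='labels', value_name='y')
```

# The result has columns x, labels, y and can be plotted with the same
# hue='labels' argument used above.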